[Xapian-discuss] Re: Best way to index relational table

Sebastian Araya cbimax at gmail.com
Mon Jul 24 16:00:48 BST 2006


Michael Schlenker <schlenk <at> isn-oldenburg.de> writes:

> 
> Sebastian Araya wrote:
> > Hi!
> > 
> > 
> >   I'm facing the following issue: I need to index a relational table with
> > Xapian. The table has a few enumerated fields and two text fields:
> > 
> > E1      E2      E3      T1      T2
> > ----------------------  ----------
> >       Enumerated           Text
> > 
> >   I want to index the table so that I can narrow a search by specific fields
> > (e.g. for a given text in T1, and by enumerated fields E2 and E3). For
> > example, if E2 is the author's code, E3 is the theme's code and T1 is the
> > chapter text, I want to issue:
> > 
> > "there and back again author:tolkien theme:fantasy"
> > 
> >   I was playing with Lucene, which has API calls to index exact terms
> > (Lucene::Keyword) and to index text contents (Lucene::Text), so I can do the
> > following:
> > 
> >   addField( Lucene::Keyword, 'author:tolkien' )
> >   addField( Lucene::Keyword, 'theme:fantasy' )
> >   addField( Lucene::Text, <textofthebook> )
> > 
> >   I think it is possible to index term names and term data in Xapian, but I
> > haven't found the right way... could you give an example or a little sample?
> 
> Yes, you can do that easily with Xapian. Depending on your needs a
> combination of a standard relational table for keywords and xapian for
> the fulltext fields could be the best solution.
> 
> For the Tcl binding for example that translates to:
> 
> xapian::Document doc
> doc add_term author:Tolkien
> doc add_term theme:fantasy
> 
Hello Michael,

  Thanks for your quick answer!

  I'm working in PHP, so I have to ask you a terribly basic question about
Tcl... when you issue:

doc add_term author:Tolkien
doc add_term theme:fantasy
doc add_data $textOfTheBook

  'author:Tolkien' and 'theme:fantasy' are treated as strings, right? And
$textOfTheBook is a variable which holds the text (OK?). So, when I want to
perform a search I could issue:

there and back again author:Tolkien theme:fantasy

  Now, suppose that I need to narrow my search to two themes or categories,
like 'theme:fantasy' and 'theme:filology'. How can I create the query? I
don't think this will work:

$query = new_Query( Query_OP_AND, 'there and back again author:Tolkien 
theme:fantasy OR theme:filology' );
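
  To make the intended structure explicit, here is a minimal sketch of the
boolean logic being asked for: text AND author AND (theme:fantasy OR
theme:filology). It is a toy model in plain Python, not the Xapian API: the
sample index, the document ids and the helpers postings, match_all and
match_any are all hypothetical. In Xapian the analogous step would be to
build sub-queries and combine them with the AND and OR operators instead of
passing one big string.

```python
# Toy model only: plain Python sets stand in for posting lists.
# None of these names are part of the Xapian API.

# Hypothetical index: term -> ids of the documents indexed under that term.
index = {
    "there": {1, 2, 4},
    "and": {1, 2, 3, 4},
    "back": {1, 2, 4},
    "again": {1, 2, 4},
    "author:tolkien": {1, 2, 3, 4},
    "theme:fantasy": {1},
    "theme:filology": {2, 3},
    "theme:history": {4},
}


def postings(term):
    """Return the ids of documents indexed under `term` (empty if unknown)."""
    return index.get(term, set())


def match_all(terms):
    """AND: documents that contain every term."""
    sets = [postings(t) for t in terms]
    return set.intersection(*sets) if sets else set()


def match_any(terms):
    """OR: documents that contain at least one term."""
    return set().union(*(postings(t) for t in terms))


text_hits = match_all(["there", "and", "back", "again"])
author_hits = match_all(["author:tolkien"])
theme_hits = match_any(["theme:fantasy", "theme:filology"])

# text AND author AND (theme:fantasy OR theme:filology)
result = text_hits & author_hits & theme_hits
print(sorted(result))
```

  The point of the sketch is only the grouping: the two theme terms are
OR-ed together first, and that group is then AND-ed with the text and author
parts.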

  The other question concerns my whole problem, R(E1,E2,E3,T1,T2): I need
to run a query against T1 only, against T2 only, and against both T1 and T2.
But T1 and T2 aren't indexed with term names.

  Let me explain again: E1, E2, E3 are enumerated fields (author:, isbn:,
theme:, etc.) and T1 and T2 are text fields (like chapter text and editor's
commentary). So, how can I properly index this table in order to search in
any of the text fields (but without overlaps)?
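
  One way to picture "without overlaps" is to give each text field its own
term prefix, so that a word from T1 and the same word from T2 become two
different terms. The sketch below is a toy inverted index in plain Python,
not the Xapian API. The prefixes "T1:" and "T2:", the sample rows and the
search helper are all hypothetical, and "T1 and T2" is read here as "the
word may occur in either field".

```python
from collections import defaultdict

# Toy model only: the sample rows and all names here are hypothetical.
rows = {
    1: {"T1": "there and back again", "T2": "notes on the journey"},
    2: {"T1": "a long journey", "T2": "there and back again, annotated"},
}

# Index every word under a field-specific prefix, e.g. "T1:journey".
index = defaultdict(set)
for doc_id, fields in rows.items():
    for field, text in fields.items():
        for word in text.lower().replace(",", " ").split():
            index[f"{field}:{word}"].add(doc_id)


def search(words, fields):
    """Documents where every word occurs in at least one of the given fields."""
    result = set(rows)
    for word in words:
        hits = set()
        for field in fields:
            hits |= index.get(f"{field}:{word}", set())
        result &= hits
    return result


print(sorted(search(["journey"], ["T1"])))
print(sorted(search(["journey"], ["T2"])))
print(sorted(search(["journey"], ["T1", "T2"])))
```

  Because the same word is stored once per field, a search restricted to T1
never picks up a match that exists only in T2, and the reverse.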

  I really appreciate your help.

Sebastián.






