[Xapian-discuss] Best way to index relational table

Michael Schlenker schlenk at isn-oldenburg.de
Mon Jul 24 15:31:53 BST 2006


Sebastian Araya wrote:
> Hi!
> 
> 
>   I'm facing the following issue: I need to index a relational table with 
> Xapian. It has a few enumerated fields and two text fields:
> 
> E1	E2	E3	T1	T2
> ------------------	----------
>     Enumerated             Text
> 
>   I want to index the table in order to narrow by specific field (e.g. for a 
> given text in T1 and by enumerated fields E2 and E3). For example, if E2 is 
> author's code, E3 is theme's code and T1 is the chapter text, I want to issue:
> 
> "there and back again author:tolkien theme:fantasy"
> 
>   I was playing with Lucene and there is an api call to index by term like 
> Lucene::Keyword and Lucene::Text to index text (contents), so I can do the 
> following:
> 
>   addField( Lucene::Keyword, 'author:tolkien' )
>   addField( Lucene::Keyword, 'theme:fantasy' )
>   addField( Lucene::Text, <textofthebook> )
> 
>   I think it is possible in Xapian to index term names and term data, but I 
> haven't found the right way. Could you give an example or a small sample?

Yes, you can do that easily with Xapian. Depending on your needs, a
combination of a standard relational table for the keywords and Xapian for
the full-text fields could be the best solution.

With the Tcl bindings, for example, that translates to:

xapian::Document doc
# Terms are case-sensitive, so lowercase the values to match
# queries such as author:tolkien
doc add_term author:tolkien
doc add_term theme:fantasy

# I am not sure whether Lucene::Text only stores the text or also
# indexes it by breaking it down into terms.
# set_data only stores the full text with the document; it does not
# break it down into terms. The examples dir has a proc that does
# the indexing.
doc set_data $textOfTheBook
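
To make the difference between storing and indexing concrete, here is a
minimal sketch of breaking the text down into terms with positions. The
proc name text_to_postings is hypothetical, not part of Xapian, and it runs
in plain tclsh without the bindings loaded. It is only an illustration, not
the proc from the examples dir.

```tcl
# Hypothetical helper (not part of Xapian): split text into lowercase
# word terms, each paired with its 1-based position in the text.
proc text_to_postings {text} {
    set postings {}
    set pos 0
    # \w+ matches runs of letters, digits and underscores, so ':' and
    # other punctuation never end up inside a normal term.
    foreach word [regexp -all -inline {\w+} [string tolower $text]] {
        incr pos
        lappend postings [list $word $pos]
    }
    return $postings
}

set postings [text_to_postings "There and Back Again"]
puts $postings
# With the Xapian Tcl bindings loaded, each pair could then be added
# to the document, for example (assuming the bindings expose
# Document::add_posting under the same name as the C++ API):
#   foreach p $postings { doc add_posting [lindex $p 0] [lindex $p 1] }
```

Adding the terms with positions rather than with plain add_term should
also allow phrase searches such as "there and back again".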

You simply give your category terms a unique prefix that is not used in
normal terms. In this example you would disallow ':' in normal terms;
uppercase prefixes are another possibility if you lowercase all your
normal terms while indexing.
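
As a rough sketch of that prefixing scheme, assuming the enumerated fields
arrive as a flat name/value list, something like the following builds the
full term list for one row. The proc name row_to_terms and the field layout
are made up for illustration.

```tcl
# Hypothetical helper: build the term list for one table row.
# fields is a flat list of category name/value pairs,
# e.g. {author Tolkien theme fantasy}.
proc row_to_terms {fields text} {
    set terms {}
    foreach {name value} $fields {
        # Category terms carry a "name:" prefix; the value is
        # lowercased so author:Tolkien and author:tolkien are
        # indexed as the same term.
        lappend terms "${name}:[string tolower $value]"
    }
    # Normal terms: lowercase words only. \w+ never matches ':',
    # so a normal term cannot collide with a prefixed one.
    foreach word [regexp -all -inline {\w+} [string tolower $text]] {
        lappend terms $word
    }
    return $terms
}

puts [row_to_terms {author Tolkien theme fantasy} "There and back again"]
# Each resulting term could then be passed to: doc add_term $term
```

On the query side, Xapian's QueryParser has add_boolean_prefix to map a
field name such as author to whatever term prefix you chose at index time.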

Michael







