[Xapian-discuss] Re: Best way to index relational table
Sebastian Araya
cbimax at gmail.com
Mon Jul 24 16:00:48 BST 2006
Michael Schlenker <schlenk <at> isn-oldenburg.de> writes:
>
> Sebastian Araya wrote:
> > Hi!
> >
> >
> > I'm facing the following issue: I need to index with xapian a relational
> > table, which has a few enumerated fields and two text fields:
> >
> > E1 E2 E3 T1 T2
> > ------------------ ----------
> > Enumerated Text
> >
> > I want to index the table in order to narrow by specific field (e.g. for a
> > given text in T1 and by enumerated fields E2 and E3). For example, if E2 is
> > author's code, E3 is theme's code and T1 is the chapter text, I want to issue:
> >
> > "there and back again author:tolkien theme:fantasy"
> >
> > I was playing with Lucene and there is an api call to index by term like
> > Lucene::Keyword and Lucene::Text to index text (contents), so I can do the
> > following:
> >
> > addField( Lucene::Keyword, 'author:tolkien' )
> > addField( Lucene::Keyword, 'theme:fantasy' )
> > addField( Lucene::Text, <textofthebook> )
> >
> > I think it is possible to do in Xapian to index termnames and termdata, but I
> > don't found the right way... could you give an example or a little sample ??
>
> Yes, you can do that easily with Xapian. Depending on your needs a
> combination of a standard relational table for keywords and xapian for
> the fulltext fields could be the best solution.
>
> For the Tcl binding for example that translates to:
>
> xapian::Document doc
> doc add_term author:Tolkien
> doc add_term theme:fantasy
>
Hello Michael,
thanks for your quick answer!
I'm working in PHP, so I have to ask you some terribly basic questions about
Tcl. When you issue:
doc add_term author:Tolkien
doc add_term theme:fantasy
doc add_data $textOfTheBook
'author:Tolkien' and 'theme:fantasy' are treated as strings, right? And
$textOfTheBook is a variable which holds the text (OK?). So, when I want to
perform a search, I could issue:
there and back again author:Tolkien theme:fantasy
Now, suppose that I need to narrow my search to two themes or categories,
like 'theme:fantasy' and 'theme:philology'. How can I create the query? I
don't think this will work:
$query = new_Query( Query_OP_AND, 'there and back again author:Tolkien theme:fantasy OR theme:philology' );
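The query being asked for has the shape "all the text words AND the author AND (one theme OR the other)". A minimal sketch of that logic follows, as a toy in-memory index in plain Python. It is not the Xapian API: `add_row`, `all_of`, `any_of` and the sample rows are hypothetical names and data made up for illustration.

```python
from collections import defaultdict

# Toy inverted index: term -> set of row ids. Illustration only, not Xapian.
index = defaultdict(set)

def add_row(row_id, text, author, themes):
    """Index free text as plain terms and enumerated fields as prefixed terms."""
    for word in text.lower().split():
        index[word].add(row_id)
    index["author:" + author.lower()].add(row_id)
    for theme in themes:
        index["theme:" + theme.lower()].add(row_id)

def all_of(terms):
    """Rows containing every term (AND)."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def any_of(terms):
    """Rows containing at least one term (OR)."""
    return set().union(*(index.get(t, set()) for t in terms))

# Hypothetical sample rows.
add_row(1, "There and back again", "Tolkien", ["fantasy"])
add_row(2, "There and back again a study of words", "Tolkien", ["philology"])
add_row(3, "There and back again", "Lewis", ["fantasy"])
add_row(4, "The Silmarillion", "Tolkien", ["fantasy"])

# text AND author AND (theme:fantasy OR theme:philology)
text_hits = all_of("there and back again".split())
hits = text_hits & all_of(["author:tolkien"]) & any_of(["theme:fantasy", "theme:philology"])
print(sorted(hits))  # [1, 2]
```

The point of the sketch is that the OR over the two themes is built as its own sub-query and then combined with the rest, instead of being written inside one flat string.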
The other question concerns my whole problem: R(E1,E2,E3,T1,T2). I need to be
able to run a query against T1 only, against T2 only, and against both T1 and
T2. But T1 and T2 aren't indexed with term names.
Let me explain again: E1, E2 and E3 are enumerated fields (author:, isbn:,
theme:, etc.) and T1 and T2 are text fields (like the chapter text and the
editor's commentary). So, how can I properly index this table in order to
search in either text field (but without overlaps)?
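One way to keep the two text fields apart is to give every word a per-field prefix at indexing time, so that a word from T1 and the same word from T2 become different terms. The sketch below shows only that idea, again as a toy index in plain Python and not the Xapian API. The `T1:` and `T2:` prefixes, the function names and the sample rows are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Toy inverted index: prefixed term -> set of row ids. Illustration only.
index = defaultdict(set)

def index_text(row_id, field, text):
    """Index each word of a text field under a field-specific prefix."""
    for word in text.lower().split():
        index[f"{field}:{word}"].add(row_id)

def search(field, words):
    """Rows whose given field contains every word of the query."""
    postings = [index.get(f"{field}:{w}", set()) for w in words.lower().split()]
    return set.intersection(*postings) if postings else set()

# Hypothetical rows: T1 is the chapter text, T2 the editor's commentary.
index_text(1, "T1", "the road goes ever on")
index_text(1, "T2", "notes on the road and on dragons")
index_text(2, "T1", "dragons hoard gold")
index_text(2, "T2", "the editor dislikes gold")

in_t1 = search("T1", "dragons")                           # chapter text only
in_t2 = search("T2", "dragons")                           # commentary only
in_both = search("T1", "road") & search("T2", "road")     # must match in both
in_either = search("T1", "gold") | search("T2", "gold")   # may match in either
print(in_t1, in_t2, in_both, in_either)
```

Because the terms are distinct, a match in T2 never counts as a match in T1, and searching both fields is an explicit combination of two per-field queries.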
I really appreciate your help.
Sebastián.