[Xapian-discuss] splitting words algorithm - indexer vs. queryparser

tata668 tata668 tata668 at gmail.com
Fri Dec 26 23:27:58 GMT 2008


Thanks for the reply!

I'm gonna look at that TermGenerator class.

Julien




On Fri, Dec 26, 2008 at 2:24 PM, Richard Boulton <richard at lemurconsulting.com> wrote:

> On Fri, Dec 26, 2008 at 10:10:04AM -0500, tata 668 wrote:
> > 1) Being able to use the exact same algorithm to split words when adding
> > a text to a document and when parsing a query (with the queryparser).
>
> Not quite the same algorithm (because you don't want to handle things like
> "AND" and "OR" and brackets in a query the same way as in a document), but
> the TermGenerator class does what you want.
>
> http://xapian.org/docs/sourcedoc/html/classXapian_1_1TermGenerator.html
>
> > 2) Is it possible to set the "content" (the postings) of a document by
> > passing the whole text at once, without needing to split the words
> > ourselves and add each word one by one? It would be perfect if Xapian
> > used its internal word-splitting algorithm, the same one that would
> > later be used by the queryparser.
>
> That's what the term generator does for you.
>
> --
> Richard
>
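
To illustrate the point made in the reply above, here is a minimal Python sketch of the general idea: one word splitter shared by the indexing side and the query side, with the query side keeping "AND", "OR" and brackets as syntax instead of indexing them as words. This is not Xapian's code or its actual splitting rules. The names split_words, index_text, parse_query and OPERATORS, and the "letters and digits" word rule, are all assumptions made up for this illustration. In Xapian itself these two roles belong to the TermGenerator and QueryParser classes.

```python
import re
from collections import defaultdict

# Hypothetical shared word splitter: runs of ASCII letters and digits,
# lowercased. Real indexers use more involved rules; this is only a sketch.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def split_words(text):
    """Return the lowercased words found in text, in order."""
    return [w.lower() for w in WORD_RE.findall(text)]


def index_text(text):
    """Index a whole text in one call: map each term to its word positions."""
    postings = defaultdict(list)
    for pos, term in enumerate(split_words(text), start=1):
        postings[term].append(pos)
    return dict(postings)


# Words treated as operators on the query side only (assumed set).
OPERATORS = {"AND", "OR"}


def parse_query(query):
    """Split a query into ('bracket', x), ('op', x) and ('term', x) tokens.

    Uppercase AND/OR and brackets are kept as query syntax. Everything else
    goes through split_words(), so query terms match the indexed terms.
    """
    tokens = []
    for chunk in re.findall(r"[()]|[^\s()]+", query):
        if chunk in ("(", ")"):
            tokens.append(("bracket", chunk))
        elif chunk in OPERATORS:
            tokens.append(("op", chunk))
        else:
            tokens.extend(("term", t) for t in split_words(chunk))
    return tokens
```

With this sketch, index_text("The cat AND the dog") records "and" as an ordinary term at position 3, while parse_query("(Cat OR dog) AND the") keeps "OR" and "AND" as operators and lowercases "Cat" to the same "cat" term the indexer would produce.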

