[Xapian-discuss] splitting words algorithm - indexer vs. queryparser
Richard Boulton
richard at lemurconsulting.com
Fri Dec 26 19:24:30 GMT 2008
On Fri, Dec 26, 2008 at 10:10:04AM -0500, tata 668 wrote:
> 1) Being able to use the exact same algorithm to split words when adding a text to a document and
> when parsing a query (with the queryparser).
Not quite the same algorithm (because you don't want to handle things like
"AND" and "OR" and brackets in a query the same way as in a document), but
the TermGenerator class does what you want.
http://xapian.org/docs/sourcedoc/html/classXapian_1_1TermGenerator.html
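
To make the "not quite the same algorithm" point concrete, here is a toy sketch in plain Python. It is not Xapian's code and does not reflect Xapian's real word-splitting rules; the names split_words, index_text and parse_query are hypothetical, chosen only for this illustration. Both sides share one word splitter, but the query side treats uppercase AND/OR and brackets as syntax rather than as words.

```python
import re

# Assumption for illustration only: a "word" is a run of \w characters.
WORD_RE = re.compile(r"\w+")
QUERY_TOKEN_RE = re.compile(r"\(|\)|\w+")
OPERATORS = {"AND", "OR"}
BRACKETS = {"(", ")"}


def split_words(text):
    """Shared splitter: lowercase every run of word characters."""
    return [word.lower() for word in WORD_RE.findall(text)]


def index_text(text):
    """Document side: every word is a term, including 'AND' and 'OR'."""
    return split_words(text)


def parse_query(query):
    """Query side: same splitter for words, but operators and brackets are kept as syntax."""
    tokens = []
    for tok in QUERY_TOKEN_RE.findall(query):
        if tok in OPERATORS or tok in BRACKETS:
            tokens.append(("op", tok))
        else:
            tokens.extend(("term", word) for word in split_words(tok))
    return tokens


print(index_text("Cats AND dogs"))
print(parse_query("cats AND (dogs OR birds)"))
```

In the document, "AND" is indexed as the ordinary term "and"; in the query it stays an operator, while the remaining words pass through the same splitter on both sides.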
> 2) Is it possible to set the "content" (the postings) of a document by passing the whole text at
> once, without having to split the words ourselves and add each word one by one? It would be
> ideal for Xapian to use its internal word-splitting algorithm, the same one that would later be used
> by the queryparser.
That's what the term generator does for you: attach the document with
set_document() and pass the whole text to index_text().
--
Richard