synonym expansion for boolean prefixes.

Olly Betts olly at survex.com
Sun Jan 10 17:45:51 GMT 2016


On Sat, Jan 09, 2016 at 07:59:09AM -0400, David Bremner wrote:
> Olly Betts <olly at survex.com> writes:
> 
> > On Tue, Jan 05, 2016 at 08:43:13AM -0400, David Bremner wrote:
> >> Well, the configuration needs to be somewhere.  Would it make sense
> >> from a performance point of view to be looking up foo_tag_term in
> >> document metadata?
> >
> > Calling get_metadata() is pretty much exactly equivalent to reading the
> > synonyms for a term - both read one Btree entry, just from different
> > tables.
> 
> Just this morning I realized any lookup would happen during query parsing,
> which seems pretty unlikely to be a bottleneck.

That perhaps depends a bit on what you consider a bottleneck.

Query parsing needs to happen before query execution, so any time you
spend doing that is simply added to the total time taken.

You need to do a synonym lookup for each query term, and you need to
read the postlist for each query term, so if the postlists are short
that's potentially doubling the number of entries to read (ignoring
the reads needed to actually show the results).
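
To put rough numbers on that, here is a minimal sketch of the counting
argument.  It is not Xapian code: count_reads() and ReadCounts are
hypothetical names, and it assumes each query term costs one extra Btree
entry for the synonym (or metadata) lookup, on top of however many
entries hold its postlist.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cost model, not part of Xapian: counts Btree entries
// read for one query, with and without a per-term lookup.
struct ReadCounts {
    std::size_t without_lookup;  // postlist entries only
    std::size_t with_lookup;     // postlist entries plus one lookup per term
};

// Each element of postlist_entries is the assumed number of Btree
// entries holding one query term's postlist (1 for a short postlist).
ReadCounts count_reads(const std::vector<std::size_t>& postlist_entries) {
    ReadCounts counts{0, 0};
    for (std::size_t entries : postlist_entries) {
        counts.without_lookup += entries;
        // Assumption: the synonym or metadata lookup reads one entry.
        counts.with_lookup += entries + 1;
    }
    return counts;
}
```

With three terms whose postlists each fit in a single entry,
count_reads({1, 1, 1}) gives 3 reads without the lookup and 6 with it,
which is the doubling.  With made-up longer postlists such as
count_reads({200, 350}) it is 550 against 552, so the extra lookups
barely register.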

Short posting lists make for a fast query, so it's only really going
to matter in the "cold cache" case and/or for extremely long queries.

But anyway, reading it from the user metadata should be little different
to reading it from the synonym table.  You could also actually just
store them in the synonym table, and read them using
Xapian::Database::synonyms_begin().

Cheers,
    Olly



More information about the Xapian-discuss mailing list