[Xapian-discuss] Search::Xapian really slow compared to C++ Xapian

Olly Betts olly at survex.com
Thu Sep 30 11:41:34 BST 2010


On Thu, Sep 30, 2010 at 09:43:58AM +0200, Hanzz Solo wrote:
> That's not what I was hoping for because now I have two searchers
> that are really slow when doing a complex wildcard search.

It's not so much the complexity, just the number of terms that the
wildcard expands to in your a* OR b* OR ... case.

Richard experimented with storing extra terms to help these cases, which
helps a lot, though at the expense of increasing database size
significantly:

http://trac.xapian.org/ticket/207

> Is there a way to limit the time that Xapian can use for doing the search
> or maybe a way to limit the allowed complexity of a search query direct in
> Xapian?

You can impose a time limit outside of Xapian, e.g. using alarm(), but
it's not an ideal approach: by the time the limit fires you have already
done a lot of the work, and you still don't return any results.
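
As a rough illustration of an external time limit, the Python sketch below
runs the search in a child process and gives up once a deadline passes. It
uses subprocess with a timeout rather than alarm(), and the function name
and the idea of a separate search command that prints its results are
assumptions made for the example, not anything Xapian provides.

```python
import subprocess
import sys


def run_search_with_time_limit(argv, seconds):
    """Run a (hypothetical) search command; return its stdout, or None on timeout."""
    try:
        done = subprocess.run(
            argv,
            capture_output=True,  # collect whatever the child prints
            text=True,
            timeout=seconds,      # the child is killed once this much time passes
        )
    except subprocess.TimeoutExpired:
        # The work done so far is thrown away; there are no partial results.
        return None
    return done.stdout


# Stand-in for a slow search: a child process that just sleeps.
SLOW_SEARCH = [sys.executable, "-c", "import time; time.sleep(30)"]
```

Calling run_search_with_time_limit(SLOW_SEARCH, 0.5) returns None, which is
the drawback in practice: the time is spent and the user still gets nothing.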

You could also call get_description() on the parsed query, and count
how many times OP_SYNONYM occurs (or OP_OR if using Xapian 1.0.x), or
simply how long the result is.
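
A minimal sketch of that check in Python, working on the description string
only. The sample strings are made up for illustration; the exact text
get_description() produces depends on the Xapian version, and the function
name and any threshold you compare against are assumptions.

```python
import re


def count_operator(description, op):
    """Count whole-word occurrences of an operator name in a query description."""
    return len(re.findall(r"\b%s\b" % re.escape(op), description))


# Illustrative description strings, not captured from a real database.
desc_synonym = "Xapian::Query((aa SYNONYM ab SYNONYM abc))"
desc_or = "Xapian::Query((aa OR ab OR abc))"

synonym_count = count_operator(desc_synonym, "SYNONYM")
or_count = count_operator(desc_or, "OR")
```

An application could then refuse to run the query when the count (or just
len(description)) exceeds some limit. Note that a term that happens to be
spelled like the operator name would be counted too.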

I think it would also be useful to be able to specify the minimum
wildcard "stub" length (so 3 would allow 'the*' but not 'th*' or 't*').
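
Until something like that exists in the query parser, an application can
approximate it by inspecting the raw query string before parsing. The
Python sketch below is one way to do that; the function name is
hypothetical and it treats any run of word characters followed by '*' as a
wildcard, which is a simplification of real query syntax.

```python
import re

# A wildcard here is a (possibly empty) run of word characters followed by '*'.
_WILDCARD = re.compile(r"(\w*)\*")


def short_wildcards(query_string, min_stub_len):
    """Return the wildcards whose stub is shorter than min_stub_len."""
    return [
        match.group(0)
        for match in _WILDCARD.finditer(query_string)
        if len(match.group(1)) < min_stub_len
    ]


rejected = short_wildcards("the* th* t* cat", 3)
```

With a minimum of 3, 'the*' passes while 'th*' and 't*' are reported, so the
application can reject the query or ask the user to be more specific.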

Another possibility is limiting the number of terms that expansion can
generate - ticket 350 has a patch, and links to some discussion on the
list which touches on issues with this approach:

http://trac.xapian.org/ticket/350
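
To make the idea concrete, here is a small Python sketch of capping the
expansion. It is independent of the patch in the ticket: it works on a
plain sorted list standing in for the database's term list, and the names
and the choice to raise an error instead of truncating are assumptions.

```python
from bisect import bisect_left


class TooManyTerms(Exception):
    """Raised when a wildcard would expand to more terms than allowed."""


def expand_wildcard(sorted_terms, prefix, max_terms):
    """Return all terms starting with prefix, or raise if there are too many."""
    matches = []
    # Terms sharing a prefix are contiguous in a sorted list.
    start = bisect_left(sorted_terms, prefix)
    for i in range(start, len(sorted_terms)):
        term = sorted_terms[i]
        if not term.startswith(prefix):
            break
        if len(matches) == max_terms:
            raise TooManyTerms(prefix)
        matches.append(term)
    return matches


terms = ["apple", "apply", "apt", "banana"]
expanded = expand_wildcard(terms, "ap", 5)
```

Raising stops the scan as soon as the limit is exceeded; truncating instead
would silently change which documents match.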

Cheers,
    Olly


