[Xapian-discuss] Search::Xapian really slow compared to C++ Xapian
Olly Betts
olly at survex.com
Thu Sep 30 11:41:34 BST 2010
On Thu, Sep 30, 2010 at 09:43:58AM +0200, Hanzz Solo wrote:
> That's not what I was hoping for because now I have two searchers
> that are really slow when doing a complex wildcard search.
It's not so much the complexity, just the number of terms that the
wildcard expands to in your a* OR b* OR ... case.
Richard experimented with storing extra terms to help these cases, which
helps a lot, though at the expense of increasing database size
significantly:
http://trac.xapian.org/ticket/207
> Is there a way to limit the time that Xapian can use for doing the search
> or maybe a way to limit the allowed complexity of a search query directly in
> Xapian?
You can impose a time limit outside of Xapian - e.g. using alarm(), but
it's not an ideal approach as you have to do a lot of the work and still
not return any results.
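As a rough illustration of the alarm() idea, here is a minimal POSIX sketch. It assumes the search is run in a forked child process so that the default SIGALRM action only kills the child. The helper name run_with_time_limit is hypothetical, and the fork is my assumption rather than anything Xapian provides. Passing results back to the parent (e.g. over a pipe) is left out.

```cpp
#include <cerrno>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run work() in a child process with a wall-clock
// limit. Returns true if the child exited normally with status 0, false
// if it was killed by SIGALRM (or anything else went wrong).
// Note: seconds == 0 means "no alarm", i.e. no limit.
bool run_with_time_limit(const std::function<void()>& work, unsigned seconds) {
    pid_t pid = fork();
    if (pid < 0) return false;            // fork failed

    if (pid == 0) {
        // Child: SIGALRM's default action terminates the process.
        alarm(seconds);
        work();                           // e.g. the slow search
        _exit(0);                         // finished inside the limit
    }

    // Parent: wait for the child, retrying if interrupted by a signal.
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return false;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

When the limit is hit, everything the child had computed is thrown away, which is exactly the drawback mentioned above.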
You could also call get_description() on the parsed query, and count
how many times OP_SYNONYM occurs (or OP_OR if using Xapian 1.0.x), or
simply how long the result is.
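A minimal sketch of that check, working only on the description string (which you would get from get_description() on the parsed query). The function names and the thresholds are hypothetical, and the exact text used for each operator in the description depends on the Xapian version, so the operator name is passed in rather than hard-coded.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of needle in text.
std::size_t count_occurrences(const std::string& text,
                              const std::string& needle) {
    if (needle.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t pos = text.find(needle);
         pos != std::string::npos;
         pos = text.find(needle, pos + needle.size())) {
        ++count;
    }
    return count;
}

// Hypothetical policy: reject a query whose description is too long or
// mentions the given operator more than max_ops times.
bool query_too_broad(const std::string& description,
                     const std::string& op_name,
                     std::size_t max_ops,
                     std::size_t max_length) {
    return description.size() > max_length ||
           count_occurrences(description, op_name) > max_ops;
}
```

You would call query_too_broad() before running the search and refuse (or simplify) the query if it returns true.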
I think it would also be useful to be able to specify the minimum
wildcard "stub" length (so 3 would allow 'the*' but not 'th*' or 't*').
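Until something like that exists in Xapian itself, a check along these lines could be applied to the raw query string before parsing. This is only a sketch: the function name is hypothetical, it splits on whitespace, treats a trailing '*' as the wildcard, counts bytes rather than characters, and ignores field prefixes and other query syntax.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Return true if every whitespace-separated word ending in '*' has at
// least min_len bytes before the '*'. Words without a trailing '*' are
// ignored.
bool wildcards_have_min_stub(const std::string& query, std::size_t min_len) {
    std::istringstream words(query);
    std::string word;
    while (words >> word) {               // operator>> never yields ""
        if (word.back() != '*') continue; // not a wildcard
        if (word.size() - 1 < min_len) return false;
    }
    return true;
}
```

With min_len set to 3, "the*" passes and "th*" or "t*" is rejected, matching the example above.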
Another possibility is limiting the number of terms that expansion can
generate - ticket 350 has a patch, and links to some discussion on the
list which touches on issues with this approach:
http://trac.xapian.org/ticket/350
Cheers,
Olly