[Xapian-tickets] [Xapian] #679: Memory and speed issues in wildcard searches
Xapian
nobody at xapian.org
Wed May 6 12:57:39 BST 2015
#679: Memory and speed issues in wildcard searches
---------------------------+------------------
        Reporter:  dk      |      Owner:  olly
            Type:  defect  |     Status:  new
        Priority:  normal  |  Milestone:
       Component:  Other   |    Version:
        Severity:  normal  |   Keywords:
      Blocked By:          |   Blocking:
Operating System:  All     |
---------------------------+------------------
Hello,
I have a problem with some searches that involve wildcards: Xapian uses a
lot of memory and performs slowly. The problem appears when expanding a
query with a trailing * matches too many terms, so the expanded query
contains a large number of individual words.
The simple example attached easily uses 1.6G on my machine and never
releases it. This is a problem in long-running FastCGI processes, which
grind the server down when users fire lots of these searches. I wonder
whether something can be done about it.
What's more interesting is that most of the time I don't even need all of
the expanded words to match, only the first 100 or so documents. However,
setting a cap on the number of wildcard expansions doesn't help: Xapian
throws an exception, "Wildcard expands to more than X terms". Surely there's
a reason behind this (I just don't know what it is), but perhaps a flag
could be added to QueryParser that forces the expansion to be capped?
This would also help the speed of the search.
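To make the request concrete, here is a minimal sketch of the two behaviours in plain Python. It is not Xapian's API or its implementation: the names expand_prefix, WildcardLimitError and the truncate flag are hypothetical, and the sorted list of terms only stands in for the database's term list.

```python
from bisect import bisect_left


class WildcardLimitError(Exception):
    """Hypothetical error: the prefix expands to more terms than the cap."""


def expand_prefix(sorted_terms, prefix, max_terms=0, truncate=False):
    """Return the terms in sorted_terms that start with prefix.

    max_terms == 0 means no cap. When the cap is exceeded, either raise
    (truncate=False, the behaviour described above) or keep only the
    first max_terms terms (truncate=True, the behaviour being requested).
    """
    # In a sorted list, all terms sharing a prefix are contiguous and
    # begin at the insertion point of the prefix itself.
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for i in range(start, len(sorted_terms)):
        term = sorted_terms[i]
        if not term.startswith(prefix):
            break  # walked past the last term with this prefix
        if max_terms and len(matches) == max_terms:
            if truncate:
                break  # stop expanding instead of failing
            raise WildcardLimitError(
                f"Wildcard expands to more than {max_terms} terms")
        matches.append(term)
    return matches
```

With truncate=True the expansion stops after max_terms terms, so the size of the expanded query is bounded by the cap rather than by the number of matching terms in the index.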
Please find attached code examples and the output of memory use.
Sincerely,
Dmitry Karasik
IT System Developer
Novozymes A/S
Krogshoejvej 36
2880 Bagsvaerd Denmark
--
Ticket URL: <http://trac.xapian.org/ticket/679>
Xapian <http://xapian.org/>
Xapian