[Xapian-tickets] [Xapian] #679: Memory and speed issues in wildcard searches

Xapian nobody at xapian.org
Wed May 6 12:57:39 BST 2015


#679: Memory and speed issues in wildcard searches
---------------------------+------------------
        Reporter:  dk      |      Owner:  olly
            Type:  defect  |     Status:  new
        Priority:  normal  |  Milestone:
       Component:  Other   |    Version:
        Severity:  normal  |   Keywords:
      Blocked By:          |   Blocking:
Operating System:  All     |
---------------------------+------------------
 Hello,

 I have a problem with some wildcard searches: Xapian uses a lot of memory
 and performs slowly. The problem appears when a query with a trailing *
 expands to too many terms, so that the expanded query contains a large
 number of individual words.
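
 The expansion described above can be sketched as follows. This is a
 minimal Python illustration, not Xapian code: the term list and the
 function name expand_wildcard are hypothetical, and a real database
 would hold far more terms than this.

```python
from bisect import bisect_left


def expand_wildcard(sorted_terms, prefix):
    """Return every term in the sorted list that starts with prefix."""
    # Binary search for the first term that is >= prefix.
    i = bisect_left(sorted_terms, prefix)
    matches = []
    # All terms sharing the prefix are contiguous in a sorted list.
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


# Hypothetical term list standing in for a database's term index.
terms = sorted(["sea", "search", "searched", "searches", "seat", "set"])

# "sea*" becomes one sub-query per matching term, combined with OR.
expanded = expand_wildcard(terms, "sea")
query = " OR ".join(expanded)
print(query)
```

 With a short prefix and a large index, the list of matches (and hence
 the expanded query) grows with the number of terms sharing that prefix.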

 The attached simple example easily uses 1.6G on my machine and never
 releases it. This is a problem in long-running FastCGI processes, which
 grind the server down when users fire lots of these searches. I wonder
 whether something can be done about it.

 What's more interesting is that most of the time I don't even need all of
 the expanded words to match, only the first 100 or so documents. However,
 setting a cap on the number of wildcard expansions doesn't help: Xapian
 throws an exception, "Wildcard expands to more than X terms". Surely there
 is a reason behind this (I just don't know what it is), but perhaps a flag
 could be added to QueryParser that forces capping of the expansion? This
 would also help the speed of the search.
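
 The difference between the two behaviours can be sketched in Python. This
 is an illustration under stated assumptions, not Xapian's implementation:
 WildcardLimitError, matching_terms, expand_strict and expand_capped are
 hypothetical names, and the term list is made up.

```python
from itertools import islice


class WildcardLimitError(Exception):
    """Hypothetical error for the 'expands to more than X terms' case."""


def matching_terms(sorted_terms, prefix):
    """Lazily yield the terms that start with prefix."""
    for term in sorted_terms:
        if term.startswith(prefix):
            yield term


def expand_strict(sorted_terms, prefix, max_terms):
    """Fail once the expansion exceeds max_terms (the behaviour reported)."""
    # Pull one term more than allowed so that overflow can be detected.
    expanded = list(islice(matching_terms(sorted_terms, prefix), max_terms + 1))
    if len(expanded) > max_terms:
        raise WildcardLimitError(
            f"Wildcard expands to more than {max_terms} terms")
    return expanded


def expand_capped(sorted_terms, prefix, max_terms):
    """Keep the first max_terms matches and stop (the behaviour requested)."""
    return list(islice(matching_terms(sorted_terms, prefix), max_terms))


# Hypothetical index with 1000 terms sharing the prefix "term".
terms = [f"term{i:04d}" for i in range(1000)]

capped = expand_capped(terms, "term", 100)
print(len(capped))
```

 In the capped variant the generator is abandoned after max_terms matches,
 so neither the time spent nor the size of the expanded list depends on
 how many further terms share the prefix.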

 Please find attached code examples and the output of memory use.

 Sincerely,
 Dmitry Karasik
 IT System Developer
 Novozymes A/S
 Krogshoejvej 36
 2880 Bagsvaerd Denmark

--
Ticket URL: <http://trac.xapian.org/ticket/679>
Xapian <http://xapian.org/>