[Xapian-discuss] Search with symbols causes search time to hemorrhage

Olly Betts olly at survex.com
Fri Aug 25 15:48:47 BST 2006


On Sun, Aug 06, 2006 at 07:46:34PM -0800, oscaruser at programmer.net wrote:
> Searching for terms with non-alphanumeric symbols causes great
> delays before search results appear. I am searching 5 M pages (~76
> GB) of shopping site web data for things like "Men's Levi's
> Low Rise Boot Cut 527 Jeans - Downtown", which has symbols " ' ", "-".

Currently << Men's >> is indexed as << Men >> followed by << s >>, and
at query time we generate a phrase query.  This isn't ideal since, as
you've noticed, it sometimes makes the search very slow.

This is bug #22:

http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=22

It's likely I'll be looking at it in the near future.

Just for the record, the hyphen is irrelevant in this case.  If you'd
written << Jeans-Downtown >> you'd get a phrase search, but not with
whitespace around the hyphen.
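
To make the splitting behaviour described above concrete, here is a toy
model in Python.  It is only a sketch of the rule as stated in this
message, not Xapian's actual query parser: the function name toy_parse
and the "TERM"/"PHRASE" labels are made up for illustration, and it
assumes that only ASCII letters and digits count as word characters.

```python
import re

def toy_parse(query):
    """Toy model of the splitting rule described above (NOT Xapian's parser).

    Each whitespace-delimited chunk is broken into runs of letters/digits.
    A chunk that yields several runs becomes a phrase; a chunk that yields
    one run becomes a plain term; a chunk with no runs is dropped.
    """
    parsed = []
    for chunk in query.split():
        # Assumption: only ASCII letters and digits are word characters.
        terms = re.findall(r"[A-Za-z0-9]+", chunk)
        if len(terms) > 1:
            parsed.append(("PHRASE", terms))
        elif terms:
            parsed.append(("TERM", terms[0]))
    return parsed

# An apostrophe inside a word produces a two-term phrase.
print(toy_parse("Men's Levi's"))
# A hyphen surrounded by whitespace is dropped and produces no phrase.
print(toy_parse("Jeans - Downtown"))
# A hyphen with no whitespace joins its neighbours into a phrase.
print(toy_parse("Jeans-Downtown"))
```

In this model << Men's Levi's >> yields two phrases, each ending in the
term << s >>, while << Jeans - Downtown >> yields two plain terms.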

Cheers,
    Olly


