[Xapian-discuss] Xapian performance on gmane.org compared
henka at cityweb.co.za
Thu Aug 27 15:06:06 BST 2009
Using Xapian revision 13300 (chert db).
The test chert database is about 4GB, holding 320,000 docs.
Performance for typical searches of one or more keywords is quick. For
example, a search for [upload site page] yields the query:
Xapian::Query((upload:(pos=1) OR site:(pos=2) OR page:(pos=3)))
Takes a second.
However, searching for something like [co.uk] is mind-numbingly slow and
yields the query:
Xapian::Query((co:(pos=1) PHRASE 2 uk:(pos=2)))
Looks like it interprets this search as a phrase.
Takes over _40_ seconds.
Typical phrase searches, such as ["your email"] take a few seconds
longer than normal keyword searches (as expected), but nowhere near as
slow as 40+s.
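
As a rough illustration of why a phrase query over two very common terms
might cost more than a plain OR query, here is a toy in-memory positional
index in Python. It is only a sketch and not how Xapian works internally:
the names build_index, or_query and phrase_query and the three sample
documents are made up for this example. The point it tries to show is that
an OR query only needs the doc ids for each term, while a phrase query has
to read the position lists of both terms for every doc that contains both.

```python
from collections import defaultdict


def build_index(docs):
    """Map term -> {doc_id: [positions]} for a list of token lists."""
    index = defaultdict(dict)
    for doc_id, tokens in enumerate(docs):
        for pos, term in enumerate(tokens, start=1):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def or_query(index, terms):
    """Doc ids containing any of the terms; positions are never read."""
    hits = set()
    for term in terms:
        hits.update(index.get(term, {}))
    return hits


def phrase_query(index, first, second):
    """Doc ids where `second` occurs immediately after `first`.

    Returns (matching doc ids, number of position lists read).
    """
    a, b = index.get(first, {}), index.get(second, {})
    hits, lists_read = set(), 0
    # Every doc containing both terms is a candidate and must be checked.
    for doc_id in a.keys() & b.keys():
        lists_read += 2
        following = set(b[doc_id])
        if any(pos + 1 in following for pos in a[doc_id]):
            hits.add(doc_id)
    return hits, lists_read


# Hypothetical sample data, already tokenised.
docs = [
    "see example co uk for details".split(),
    "uk office and co founder".split(),
    "upload the site page".split(),
]
index = build_index(docs)

print(sorted(or_query(index, ["upload", "site", "page"])))
print(phrase_query(index, "co", "uk"))
```

In this toy the second document contains both "co" and "uk" but not as a
phrase, yet its position lists still have to be read before it can be
rejected. If "co" and "uk" each occur in a large share of 320,000 docs, that
per-candidate work could plausibly add up, whereas a phrase like
["your email"] may have far fewer candidates.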
I'm trying to get a handle on how best to improve the situation, so
having something to compare against would be informative. I notice
that gmane.org has about 70 million articles, yet the same search
[co.uk] returns in 4s. Yes, these are plain text and relatively small
docs, but still...
I must be doing something wrong.
If I may:
What DB format is gmane.org using (chert/flint)?
What's the DB size on disk?
How many search servers is gmane.org using? Their approx. spec?
Any comments would be appreciated.