[Xapian-discuss] Speed of range queries

Dmitry Karasik dmitry at karasik.eu.org
Tue Oct 23 21:00:07 BST 2012


> > 
> > Xapian::Query((VALUE_RANGE 70 stringA stringB AND KEYvalue:(pos=1)))
> > 
> > takes 0.6 seconds, whereas this one
> > 
> > Xapian::Query((KEYvalue:(pos=1) AND VALUE_RANGE 70 stringA stringB))
> > 
> > takes 20 seconds. I'm in the process of isolating the problem, so that I can
> 
> The order shouldn't make a difference - internally the subqueries of a
> tree of AND-like operators are gathered up and rearranged into a shape
> which is likely to work most efficiently.
> 
> So if that's working properly, the only thing I can think is that the
> two subqueries have the same estimated term frequency.  Does term
> KEYvalue match a lot of documents?

Yes indeed, KEYvalue matches a lot: around 1/10th of the whole database, which
is about 10^5-10^6 documents. Value #70, by contrast, is more or less unique,
and the range captures only about 50 documents. Would those two still end up
with the same estimated term frequency?
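
For illustration only, here is a toy sketch of why it matters which side of
an AND drives the match when the two sides differ this much in size. It is
plain Python, not Xapian code, and does not claim to reflect how Xapian's
matcher works internally. The sizes are hypothetical, loosely modelled on the
numbers above (a term matching 1/10th of 10^6 documents, a range matching 50),
and it assumes every document in the range also contains the term.

```python
from bisect import bisect_left


def intersect(driver, other):
    """Intersect two sorted doc-id lists.

    Walks every entry of `driver` and binary-searches `other` for it.
    Returns (matching doc ids, number of entries of `driver` visited).
    """
    matches = []
    visited = 0
    for docid in driver:
        visited += 1
        i = bisect_left(other, docid)
        if i < len(other) and other[i] == docid:
            matches.append(docid)
    return matches, visited


# Hypothetical data: a database of 10^6 documents.
total_docs = 1_000_000
# The frequent term matches every 10th document (100000 documents).
term_docs = list(range(0, total_docs, 10))
# The value range matches 50 documents, all of which contain the term.
range_docs = list(range(500_000, 500_500, 10))

# Driving from the rare side visits 50 entries.
fast, fast_visited = intersect(range_docs, term_docs)
# Driving from the frequent side visits 100000 entries for the same result.
slow, slow_visited = intersect(term_docs, range_docs)

print(len(fast), fast_visited, slow_visited)
```

Both calls return the same 50 documents; only the amount of work differs,
which is why a good estimate of each subquery's frequency matters when the
subqueries are rearranged.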

-- 
Sincerely,

	Dmitry Karasik





