[Xapian-discuss] xapian performance

Fernando Nemec fernando.nemec at folha.com.br
Tue Nov 21 21:16:52 GMT 2006


Hi Olly,

Sorry for the delay, but here in Brazil we had two holidays in the past
week. :)

After so many patches I opted to get a fresh copy of the source from svn.
As far as I can see, you have committed almost all of the patches you
produced in the last few days.

Sadly, I didn't see any further improvement. I made a simple list
with a variety of queries and all of them return in more or less the
same time (a few tens of seconds).

Is there any information I can supply to help you find out what's
going on with phrase searches?

Thanks,

Nemec



Thursday, November 16, 2006, 2:12:56 PM, you wrote:

> On Thu, Nov 16, 2006 at 01:00:01PM -0200, Fernando Nemec wrote:
>> As I told you, the improvement in search time for queries like "A B C" was
>> great.

> That's good.

>> On the other hand, I tried searching for "A B" (where A and
>> B are very common words) and it took 90 seconds, whereas before the
>> patch it used to take 60 seconds.

> I suspect the regression for "A B" is due to using wdf instead of the
> true positionlist length when deciding which term to check first.  For
> the 2 term case we can use the true statistics though.  Actually, we
> can use the true statistics to order the two terms with the highest
> wdf for any case.  Try this updated patch:

> http://www.oligarchy.co.uk/xapian/patches/xapian-experimental-phrase-optimisation-v2.patch

> If anyone else has a large database with positional information, please
> give this patch a whirl.  I'd be slightly cautious about using it live
> in a production system - correctness shouldn't be an issue, but there
> could be performance regressions for some cases still.

> Cheers,
>     Olly
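
To make the ordering issue Olly describes concrete, here is a minimal
sketch in Python. It is not Xapian's code and not the patch itself; the
function names, the data layout, and the numbers are made up for
illustration. It assumes a phrase check that starts with the term
expected to have the fewest positions, and shows how ordering by wdf can
pick a different first term than ordering by the true positionlist
length when the two statistics disagree.

```python
from typing import Dict, List, Sequence


def check_order(terms: Sequence[str],
                wdf: Dict[str, int],
                positions: Dict[str, List[int]],
                use_true_lengths: bool) -> List[int]:
    """Return phrase-term indices in the order they would be checked.

    The term with the smallest statistic goes first, so it prunes
    candidates as early as possible. The statistic is either wdf (a
    cheap estimate) or the real positionlist length.
    """
    if use_true_lengths:
        key = lambda i: len(positions[terms[i]])
    else:
        key = lambda i: wdf[terms[i]]
    return sorted(range(len(terms)), key=key)


def phrase_matches(terms: Sequence[str],
                   positions: Dict[str, List[int]],
                   order: Sequence[int]) -> bool:
    """True if the terms occur at consecutive positions in one document."""
    first = order[0]
    # Candidate phrase start positions implied by the first-checked term.
    starts = {p - first for p in positions[terms[first]]}
    for i in order[1:]:
        starts &= {p - i for p in positions[terms[i]]}
        if not starts:
            return False  # no candidate survived, stop early
    return bool(starts)


# Hypothetical single document: "a" has more positions than "b", but the
# wdf recorded for "b" is higher (wdf need not equal the position count).
positions = {"a": [1, 4, 7, 10], "b": [2, 8], "c": [3]}
wdf = {"a": 4, "b": 5, "c": 1}

by_wdf = check_order(["a", "b"], wdf, positions, use_true_lengths=False)
by_len = check_order(["a", "b"], wdf, positions, use_true_lengths=True)
print(by_wdf, by_len)
print(phrase_matches(["a", "b"], positions, by_len))
```

The result of the match is the same whichever order is used; only the
amount of work differs, which is why a poor ordering shows up as a
slowdown rather than as wrong results.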

--
[]s
Fernando Nemec
fernando.nemec at folha.com.br
http://www.folha.com.br/




