[Xapian-discuss] xapian performance

Fernando Nemec fernando.nemec at folha.com.br
Thu Nov 16 14:45:27 GMT 2006


Hi Olly,

> I'm not sure I understand the difference between "first try" and "first
> run" here?

I meant the first time I search for an expression.
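
One way to make the "first search vs. repeated search" distinction measurable is to time the same query several times in a row. This is only a sketch under stated assumptions: `run_search` is a hypothetical callable standing in for whatever actually executes the query (it is not a Xapian API), and `fake_search` is a placeholder so the example runs on its own. With a real index, the first timing would include reading data from disk, while later ones would mostly be served from the cache.

```python
import time


def time_runs(run_search, query, repeats=3):
    """Return the elapsed seconds for each of `repeats` consecutive runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_search(query)  # hypothetical: replace with the real search call
        timings.append(time.perf_counter() - start)
    return timings


def fake_search(query):
    # Placeholder only: splits the query instead of searching an index.
    return query.split()


if __name__ == "__main__":
    phrase = "ronaldo jogou somente trinta minutos ontem"
    for i, secs in enumerate(time_runs(fake_search, phrase), 1):
        print(f"run {i}: {secs * 1000:.3f} ms")
```

To see a true cold-cache figure, the operating system's file cache would also need to be empty before the first run, which this sketch does not attempt to do.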

> There have been a couple of other reports of similar very slow search
> times from cold in some cases, and I'm starting to wonder if there's a
> bug which is causing us to read a lot more data than we need to or
> something like that.  Let me take a look.

Actually this search in gmane took 11.5 secs:

http://search.gmane.org/?group=gmane.comp.search.xapian.general&query=%22planned+changes+which+should+improve+this%22

This one took 20 secs:

http://search.gmane.org/?query=%22A+position+list+is+defined+for+a%22&author=&group=gmane.comp.search.xapian.general&sort=relevance&DEFAULTOP=and&xP=planned.changes.which.should.improve.this.&xFILTERS=Gcomp.search.xapian.general---A

And I did one once which took 29 secs.

> If it's phrase searching speed you're particularly trying to improve,
> using flint should help, as should compacting the database before
> searching it.

I have already done both, and they help. In fact, the patch you sent
yesterday helps greatly, at least with my index.

A search which took about 40 seconds now takes 4 secs. A great
improvement indeed!

I'm going to do more tests here and I'll let you know.

Thanks again,

Nemec


Wednesday, November 15, 2006, 1:47:58 AM, you wrote:

> On Tue, Nov 14, 2006 at 02:55:27PM -0200, Fernando Nemec wrote:
>> When I try to do a search with several phrased words like this:
>> "ronaldo jogou somente trinta minutos ontem", the search time goes very
>> high. This example takes about 30 seconds on the first try. The very
>> same search took < 700 ms on the first run and < 100 ms from then on.

> I'm not sure I understand the difference between "first try" and "first
> run" here?

> A phrase search where all the terms occur together in a lot of documents
> but rarely as a phrase will be a slow case when searching from "cold"
> (with none of the database in the cache).
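
To make that slow case concrete, here is a toy sketch of phrase matching over an in-memory positional index. It is an illustration only and assumes nothing about how Xapian actually stores or reads its data; `build_index` and `phrase_search` are hypothetical names. The point it shows is that every document containing all the terms becomes a candidate whose position lists must be examined, even if very few candidates contain the phrase itself.

```python
from collections import defaultdict


def build_index(docs):
    """Map term -> {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return (matching doc ids, number of candidate docs checked)."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return [], 0
    # Step 1: documents that contain every term (a plain AND).
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    # Step 2: for each candidate, look at positions to test adjacency.
    matches = []
    for doc_id in sorted(candidates):
        later = [set(index[t][doc_id]) for t in terms[1:]]
        starts = index[terms[0]][doc_id]
        if any(
            all(start + i + 1 in positions for i, positions in enumerate(later))
            for start in starts
        ):
            matches.append(doc_id)
    return matches, len(candidates)
```

In this toy version the work in step 2 grows with the number of candidates, not with the number of matches. On a real on-disk index, step 2 would correspond to reading positional data, which is where a cold cache would hurt most.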

> 30 seconds is worse than I'd expect though.  Even allowing for seeking
> around, you can get a lot of data off a modern hard drive in 30 seconds!

> There have been a couple of other reports of similar very slow search
> times from cold in some cases, and I'm starting to wonder if there's a
> bug which is causing us to read a lot more data than we need to or
> something like that.  Let me take a look.

> If it's phrase searching speed you're particularly trying to improve,
> using flint should help, as should compacting the database before
> searching it.

> Cheers,
>     Olly

--
[]s
Fernando Nemec
fernando.nemec at folha.com.br
http://www.folha.com.br/




