James Aylett james-xapian at tartarus.org
Fri Jan 11 13:12:05 GMT 2008

On Fri, Jan 11, 2008 at 12:30:45AM +0000, Olly Betts wrote:

> > We'd need to devise a test case (better, several cases) with
> > concurrent queries, using some sort of valid (or validatable)
> > distribution of queries, against a database for which those queries
> > are valid.
> Tweakers.net have kindly supplied some sanitised query logs.  They're
> predominantly Dutch, but could reasonably be run against an index of
> Dutch wikipedia data.
> Otherwise, anyone with a large live system split over several databases
> could run tests and report the results.

It'd be nice if we could have some realistic test runs available
publicly somewhere, but I imagine most people will be unwilling to
give them out. Otherwise I'm worried that if (for instance) I start
playing around with profiling on Solaris, I'll end up optimising for
my usage pattern. (I can ignore things that appear obviously biased,
but at the end of the day any optimisation is going to be biased.)

> > Do you know (or can you look up) the proportion of GMane queries that
> > are restricted to a specific group?
> I could, though I don't really have time for such data-mining at the
> moment.  I'm not sure what you'd hope to learn from that though...

There was something, I'm sure of it. Can't remember now, though :(


  James Aylett                                                  xapian.org

