[Xapian-discuss] Tryout patches for faster chert search: http://trac.xapian.org/ticket/326
Chris
chris at s-4-u.net
Thu Sep 8 11:46:01 BST 2011
On 09/08/2011 11:51 AM, Richard Boulton wrote:
> Sources of realistic
> query data are harder to come across - anyone got any good ideas for
> that?
>
Reminds me of the AOL fuckup a few years ago (they released the
search queries of 650,000 users, by mistake).
Mirror: http://www.gregsadetsky.com/aol-data/
Combined with Wikipedia, Stack Overflow and product data from a few hundred
online shops (affili.net et al.), that could(?) provide a nice and diverse
dataset.
On the other hand, the database should probably be held in memory so that
the benchmark is not limited by disk I/O, and the online shop product data
alone already gives a 40GB index.
Greets, Chris