[Xapian-discuss] Searching in all "fields" by default
Daniel Ménard
Daniel.Menard at ehesp.fr
Mon Nov 24 10:24:47 GMT 2008
Henry wrote:
> I'll give that a try. Have you measured what the performance penalty
> is doing it this way? ie, is it severe enough to add significant
> delays to queries?
>
I don't have any benchmarks; it's just "fast enough for us".
However, I've read on this list that I/O is always the main factor:
indexing every term twice (once with a prefix and once without)
significantly increases the database size, because it doubles the number
of terms in the index. On the other hand, parsing the query and expanding
terms is very fast. With expansion, Xapian has to look up more terms per
query, but the index is more compact and more likely to be kept entirely
in memory.
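
To make the trade-off concrete, here is a minimal sketch in plain Python.
It does not use the Xapian API; the prefixes and the helper names
(index_terms, expand) are made up for illustration. It only counts the
terms each approach would store and shows what an expanded query term
looks like.

```python
# Hypothetical field prefixes, chosen only for this illustration.
PREFIXES = {"title": "S", "author": "A"}

def index_terms(doc, also_unprefixed):
    """Return the set of terms that would be stored for one document."""
    terms = set()
    for field, text in doc.items():
        for word in text.lower().split():
            terms.add(PREFIXES[field] + word)   # prefixed term, always stored
            if also_unprefixed:
                terms.add(word)                 # second, unprefixed copy
    return terms

def expand(word, fields):
    """Query-time expansion: one unprefixed word becomes one term per field."""
    return [PREFIXES[f] + word.lower() for f in fields]

doc = {"title": "Search engines", "author": "Smith"}
prefixed_only = index_terms(doc, also_unprefixed=False)
double_indexed = index_terms(doc, also_unprefixed=True)

print(len(prefixed_only), len(double_indexed))
print(expand("Smith", ["title", "author"]))
```

In this toy document no word occurs in more than one field, so double
indexing stores exactly twice as many terms; the expanded query simply
ORs together one prefixed term per field.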
If you have time, I'm pretty sure the list would be interested in
some benchmarks measuring response times in both scenarios (once the
database is warmed up).
You can also mix the approaches: double indexing for a few frequently used
fields that contain a lot of different terms, and field expansion for the
others. Also, I guess you won't really want to search *every* field when an
unprefixed query is performed, but only those for which it makes sense...
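
A rough sketch of the mixed approach, again in plain Python rather than
the Xapian API. The field names, prefixes and the rewrite helper are
assumptions for illustration: "body" is assumed to be indexed both with
and without a prefix, while only the fields listed in EXPANDED are
searched through expansion.

```python
# Assumption: "body" terms are also indexed unprefixed (double indexing),
# so an unprefixed query term matches them directly.
DOUBLE_INDEXED = {"body"}

# Assumption: only these fields are worth searching by default; fields
# left out (an internal id, say) are never touched by an unprefixed query.
EXPANDED = {"title": "S", "author": "A"}

def rewrite(word):
    """Terms to OR together for one unprefixed query word."""
    word = word.lower()
    return [word] + [prefix + word for prefix in EXPANDED.values()]

print(rewrite("Xapian"))
```

The bare term covers the double-indexed field and each prefixed term
covers one expanded field, so the list of default fields stays under
your control.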
Finally, there are other aspects to take into consideration: a smaller
database is easier to manipulate, reindexing takes less time, and so on.
Best regards,
--
Daniel Ménard