[Xapian-discuss] high update-frequency strategy
Olly Betts
olly at survex.com
Thu Aug 13 12:35:44 BST 2009
On Thu, Aug 13, 2009 at 09:18:40AM +0200, Jan wrote:
> Is there any way to make get_document "lazier" i.e. not do lookups in
> the persistent index - and do the meta-data replace "dirty" i.e. simply
> write the new value in the cache and don't make it persistent until
> flush() ?
This patch helps in many cases (for apt-xapian-index, it improved a
testcase of updating just values from about 40 seconds to less than one):
http://oligarchy.co.uk/xapian/patches/xapian-flint-lazy-update-backport-for-1.0.patch
It's quite likely to be in 1.0.15 (and more success stories would make
that more likely).
It's already in the 1.1.x development releases.
> What are the performance dis-/advantages of modeling meta-data as
> prefix-terms vs. document values ?
It depends how you want to use it really. If you want to select one
or a few of the possible values, a prefixed boolean term is good. But
if you want to select potentially large ranges, or perform more complex
tests than "is a member of" (e.g. geographical distance filtering) then
values are more flexible.
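
To make the trade-off concrete, here is a rough toy model in Python. It is not Xapian's API or its on-disk layout; the class ToyIndex, its methods, and the "XCAT" prefix are all made up for illustration. It only shows the shape of the two approaches: a prefixed boolean term is a direct posting-list lookup, while a value is per-document data you can run any test against.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical in-memory stand-in for an index (not Xapian itself)."""

    def __init__(self):
        # term -> list of docids, like a posting list for a boolean term
        self.postings = defaultdict(list)
        # docid -> number, like a single value slot per document
        self.values = {}

    def add(self, docid, term, value):
        self.postings[term].append(docid)
        self.values[docid] = value

    def filter_by_term(self, term):
        # "Is a member of" test: one lookup, cost proportional to the matches.
        return sorted(self.postings.get(term, []))

    def filter_by_value(self, predicate):
        # Arbitrary test (range, distance, ...): checks each document's value.
        return sorted(d for d, v in self.values.items() if predicate(v))


index = ToyIndex()
index.add(1, "XCATnews", 2005)
index.add(2, "XCATblog", 2007)
index.add(3, "XCATnews", 2009)
index.add(4, "XCATwiki", 2008)

# Selecting one of the possible values: the prefixed term does this directly.
news = index.filter_by_term("XCATnews")

# Selecting a range: easy with values, awkward to express with terms.
recent = index.filter_by_value(lambda year: 2007 <= year <= 2008)

print(news, recent)
```

In Xapian itself the corresponding pieces would be Document::add_term() for the prefixed term and Document::add_value() for the value, with a value range query for the range case. Check the documentation for your release for the exact calls.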
With 1.1.x, you can also use externally stored meta-data and
Xapian::PostingSource.
Cheers,
Olly