[Xapian-discuss] Incremental updates and disk space ...
Marinos Yannikos
mjy at geizhals.at
Wed Aug 31 00:07:31 BST 2011
On 30.08.2011 15:15, Olly Betts wrote:
> Are you deleting a lot of documents?
Yes, there's a lot of deleting going on, and we also use positional
information. As for updates, I don't have an exact number, but it's
in the millions of documents every day.
> Or is there something else which might be unusual about your update
> patterns?
One unusual thing is that we index product names, specifications etc.
rather than natural text, so we (probably) also have an unusually large
number of distinct terms, in case it matters:
number of documents = 21400477
number of distinct terms = 28270299
Updates are done in batches with a large flush threshold (500K), plus
regular explicit flushes when we think we can spare the time (i.e. when
not too many updates are queued).
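
To make the batching policy above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not our actual code: the BatchedUpdater class, the idle_queue_limit knob and the queued_after argument are all hypothetical names. The database object only needs the replace_document(), delete_document() and commit() methods that Xapian's WritableDatabase provides (commit() is the newer name for flush()). As far as I know, XAPIAN_FLUSH_THRESHOLD is read when the writable database is opened, so it is set up front.

```python
import os

# Assumption: the threshold is picked up when the writable database is
# opened, so set it before opening. 500000 mirrors the "500K" above.
os.environ.setdefault("XAPIAN_FLUSH_THRESHOLD", "500000")


class BatchedUpdater:
    """Hypothetical helper: apply queued changes, commit when the queue is short."""

    def __init__(self, db, idle_queue_limit=1000):
        # db: any object with replace_document(), delete_document(), commit()
        self.db = db
        # Assumed tuning knob: commit only if at most this many updates wait.
        self.idle_queue_limit = idle_queue_limit
        # Number of changes applied since the last explicit commit.
        self.pending = 0

    def apply(self, changes, queued_after=0):
        """Apply (id_term, doc) pairs; doc=None means delete that document.

        queued_after is the number of updates still waiting upstream.
        Returns True if an explicit commit was issued.
        """
        for id_term, doc in changes:
            if doc is None:
                self.db.delete_document(id_term)
            else:
                self.db.replace_document(id_term, doc)
            self.pending += 1
        # Explicit flush only when we can spare the time.
        if self.pending and queued_after <= self.idle_queue_limit:
            self.db.commit()
            self.pending = 0
            return True
        return False
```

With the real bindings, db would be something like xapian.WritableDatabase(path, xapian.DB_CREATE_OR_OPEN), and id_term a unique ID term for each product. When the queue is long, nothing is committed explicitly and Xapian's own threshold decides when to flush.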
We haven't tried the brass backend lately. Is it likely to behave
differently for such a use case? (We're quite happily using chert, though.)
Thanks,
Marinos