[Xapian-discuss] Incremental updates and disk space ...

Marinos Yannikos mjy at geizhals.at
Wed Aug 31 00:07:31 BST 2011


On 30.08.2011 15:15, Olly Betts wrote:
> Are you deleting a lot of documents?

Yes, there's a lot of deleting going on, and we also use positional 
information. As for updates, I don't have an exact number, but they run 
into the millions of documents every day.

> Or is there something else which might be unusual about your update
> patterns?

One unusual thing is that our data isn't natural text but product names, 
specifications, etc., so we (probably) also have an unusually large number 
of distinct terms, in case it matters:

number of documents = 21400477
number of distinct terms = 28270299
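
To put the two figures in proportion, here is a small worked calculation of
distinct terms per document. It uses only the numbers quoted above; the
variable names are illustrative.

```python
# Figures quoted above; variable names are illustrative.
num_documents = 21400477
num_distinct_terms = 28270299

# Average number of distinct terms in the database per indexed document.
terms_per_document = num_distinct_terms / num_documents
print(f"{terms_per_document:.2f} distinct terms per document")  # prints 1.32
```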

Updates are done in batches with a large threshold (500K), plus regular 
explicit flushes whenever we think we can spare the time (i.e. when not 
too many updates are queued).
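
As a rough illustration of the policy described above (not the actual
indexer code), the sketch below queues updates, commits automatically once a
threshold is reached, and also commits on request when the upstream queue is
nearly empty. The class BatchedUpdater, its parameters and the idle_limit
cutoff are all hypothetical; the commit callback stands in for whatever
writes the batch, for example code that applies the changes and then calls
Xapian's WritableDatabase commit() (flush() in older releases). In Xapian
itself the automatic threshold is normally set through the
XAPIAN_FLUSH_THRESHOLD environment variable rather than in application code.

```python
class BatchedUpdater:
    """Hypothetical helper: queue updates, commit on a threshold or on demand."""

    def __init__(self, commit, threshold=500_000, idle_limit=1_000):
        self.commit = commit          # callable that receives the pending batch
        self.threshold = threshold    # automatic commit once this many are pending
        self.idle_limit = idle_limit  # assumed cutoff for "not too many queued"
        self.pending = []

    def add(self, update):
        """Queue one update; commit automatically at the threshold."""
        self.pending.append(update)
        if len(self.pending) >= self.threshold:
            self._flush()

    def flush_if_idle(self, queued_upstream):
        """Explicit flush, taken only when few updates are waiting upstream."""
        if self.pending and queued_upstream <= self.idle_limit:
            self._flush()
            return True
        return False

    def _flush(self):
        self.commit(self.pending)
        self.pending = []             # start a fresh batch


# Usage with a tiny threshold so the behaviour is easy to follow.
committed = []
updater = BatchedUpdater(committed.append, threshold=3, idle_limit=0)
for doc_id in range(7):
    updater.add(doc_id)
flushed = updater.flush_if_idle(queued_upstream=0)
print(committed)  # prints [[0, 1, 2], [3, 4, 5], [6]]
```

Keeping the threshold and the idle check separate means the large batches
bound memory use, while the opportunistic flush makes pending changes
visible sooner during quiet periods.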

We haven't tried the brass backend lately. Is it likely to behave 
differently for such a use case? (We're quite happily using chert, though.)

Thanks,
  Marinos


