[Xapian-discuss] Avoiding "stale" Search Results

James Aylett james-xapian at tartarus.org
Thu May 21 00:31:28 BST 2009


On Wed, May 20, 2009 at 04:09:23PM -0700, Miki Tebeka wrote:

> Hello James,

Please keep replies on-list so everyone can help and learn from each
other.

> > If you read all 10 (say) items in the MSet as soon as you grab it, you
> > should only rarely need to reopen the db in the process. 
>
> My system gets about 100000 new documents a day, so this is happening all
> the time.

That's only a little over one per second on average (100000 / 86400),
and if you batch things up you should be able to get through a search
between batches almost all the time.

> > Or (better, and neater) stick the entire thing behind some sort of facade
> > interface that does the reopening for you.
>
> Currently I have something along the lines of:
> def get_data(doc, db):
>     for i in range(10):
>         try:
>             return doc.data.copy()
>         except xapian.DatabaseModifiedError, e:
>             db.reopen()

You probably want to add:

      raise e

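to the end, after the loop (set e to a 'help unexplained!' error at the
start of the function so it is always defined). Otherwise the function
silently returns None once all ten attempts have failed. You may
already have something equivalent, but I thought it worth pointing out.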
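
Putting that together, a minimal sketch of the retry helper might look
like the following. It is written against plain callables so that it
runs without Xapian installed. The name read_with_reopen and its
parameters are made up for illustration, and the call shown in the
trailing comment is an assumption about how you would wire it to the
Python bindings.

```python
def read_with_reopen(read, reopen, retryable, attempts=10):
    """Call read(); on a retryable error, call reopen() and try again.

    Re-raises the last retryable error if every attempt fails, rather
    than falling off the end of the loop and returning None.
    """
    # Placeholder in case attempts is 0 and the loop never runs.
    last_error = RuntimeError("help unexplained!")
    for _ in range(attempts):
        try:
            return read()
        except retryable as e:
            # Keep a reference: Python 3 unbinds `e` after the except block.
            last_error = e
            reopen()
    raise last_error


# Assumed usage with the Xapian Python bindings (not executed here):
#   data = read_with_reopen(doc.get_data, db.reopen,
#                           xapian.DatabaseModifiedError)
```

The caller then sees either the data or a real exception, never a
silent None.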

> I'm still getting these errors when the load is heavy.

Definitely look at batching your writes, then. (I assume the error is
always from there and not elsewhere.) Note that by default the library
will batch writes anyway, so I'm guessing you're creating a new
process (or at least a new WritableDatabase instance) for each new
document?
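
As a rough sketch of explicit batching: keep one long-lived writer and
commit every N documents rather than once per document. BatchingWriter
and batch_size are hypothetical names. The wrapped object is only
assumed to offer add_document() and commit(), which is what
xapian.WritableDatabase provides in recent releases (older ones spell
the second flush()).

```python
class BatchingWriter:
    """Wrap a long-lived writer and commit once per batch of documents."""

    def __init__(self, writer, batch_size=1000):
        self.writer = writer          # assumed: has add_document() and commit()
        self.batch_size = batch_size  # illustrative default, tune for your load
        self.pending = 0

    def add(self, document):
        """Queue one document; commit when the batch is full."""
        self.writer.add_document(document)
        self.pending += 1
        if self.pending >= self.batch_size:
            self.commit()

    def commit(self):
        """Commit pending documents, if any. Call this on shutdown too."""
        if self.pending:
            self.writer.commit()
            self.pending = 0
```

Readers then only see the database change at each commit, so a larger
batch_size means fewer chances of a search being interrupted, at the
cost of new documents taking longer to become searchable.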

If you can't batch the writes cleanly, you could consider writing
updates to a copy of the database, duplicating the entire thing
periodically (zero-cost snapshots in your file system would help;
otherwise you'll want to stop the write process during the copy), and
using a stub database file to switch the readers between the two
databases: at any time one is only read from and the other only
written to.

(I could have sworn we had a document on the latter strategy, but...I
can't find it, so I'm probably wrong.)
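
For the stub-file switch, a minimal sketch of the "flip" step could be
the following. It assumes the stub is a one-line text file of the form
"auto <path>", which is my recollection of the stub format and worth
checking against the documentation for your Xapian version. The
function name point_stub_at and its parameters are made up for
illustration. The new contents go to a temporary file first and are
then renamed into place, so a reader never sees a half-written stub.

```python
import os
import tempfile


def point_stub_at(stub_path, db_path, backend="auto"):
    """Atomically rewrite stub_path so that it points at db_path."""
    directory = os.path.dirname(os.path.abspath(stub_path))
    # Create the temp file in the same directory so the rename stays
    # on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".stub-")
    try:
        with os.fdopen(fd, "w") as f:
            # Assumed stub line format: "<backend> <path>".
            f.write("%s %s\n" % (backend, os.path.abspath(db_path)))
        os.replace(tmp_path, stub_path)  # replaces any existing stub
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers would open the stub path rather than either database
directly, and pick up the new target the next time they open it.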

J

-- 
  James Aylett

  talktorex.co.uk - xapian.org - uncertaintydivision.org


