[Xapian-discuss] allterms iterator dangerous?

Richard Boulton richard at lemurconsulting.com
Thu Apr 24 09:36:11 BST 2008


Francis Irving wrote:
> Is there a safe way of completely deleting a database, but from the
> API, not by wiping it on the filesystem* ? This is for a 'rebuild all'
> function.
> 
> I used to use the all terms iterator to do it but a) that seems slow!
> and b) it is now marked "dangerous" (at least in the Ruby API).
> Why is it dangerous, and is there an alternative?

I've not been following the Ruby bindings, but I think it's just that 
the "allterms_begin()" and "allterms_end()" methods have been deprecated 
and replaced by a Ruby iterator obtained by calling "allterms()".

> The all terms iterator was also useful for the slightly less crazy
> thing of looking for stray stuff that is in the index, but no longer
> in the database.

That seems perfectly sensible, yes.
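
As a rough illustration of that stray-entry check, here is a minimal sketch in Python. It assumes each document was indexed with a unique-ID term carrying a "Q" prefix (a common Xapian convention, but an assumption here) and that the set of IDs still present in the source database is already in hand. The function name and parameters are hypothetical. With the Python bindings the terms would probably come from something like the `.term` of each item yielded by `db.allterms("Q")`, but check that against the bindings you actually use.

```python
def find_stray_terms(index_terms, live_ids, prefix="Q"):
    """Return ID terms found in the index whose ID is not in live_ids.

    index_terms: iterable of terms read from the index (str or bytes).
    live_ids:    set of IDs (str) that still exist in the source database.
    prefix:      assumed unique-ID term prefix; "Q" is only a convention.
    """
    stray = []
    for term in index_terms:
        # Some bindings hand terms back as bytes; normalise to str.
        if isinstance(term, bytes):
            term = term.decode("utf-8")
        # Only ID terms are relevant; skip ordinary text terms.
        if not term.startswith(prefix):
            continue
        if term[len(prefix):] not in live_ids:
            stray.append(term)
    return stray
```

Each returned term could then be handed to whatever delete-by-term call the bindings provide, which removes just the stale documents rather than the whole index.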


> * OK, you can try and convince me I should do this instead if you
> like! It's just there might be things reading the database, so you'd
> likely get errors pulling the database away erratically under their
> feet.

I'd tend to go this way, simply because it saves a lot of effort.

If you're on Unix, you can use the fact that Xapian keeps the database 
file handles open during a search.  This means that if you unlink the 
files while a search is in progress, your search should still complete 
correctly.  There is a brief period where some of the file handles are 
open but others haven't yet been opened, so I'd suggest using a symlink 
to point to the "live database", building a new live database, then 
switching the symlink over, then waiting a few seconds (to ensure no 
searches are in the process of opening the database), and then deleting 
the old database.

This doesn't work on Windows, of course.
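
The symlink switch described above could be sketched roughly as follows, using only the Python standard library. The function name, the ".tmp" suffix and the default grace period are illustrative assumptions, not part of Xapian. The sketch assumes a Unix filesystem and that the new database has already been built in its own directory.

```python
import os
import shutil
import time


def switch_live_database(live_link, new_db_dir, grace_seconds=5.0):
    """Point live_link at new_db_dir, then delete the previous target.

    Returns the path of the old database directory, or None if live_link
    did not exist as a symlink beforehand.
    """
    # Remember where the symlink points now so it can be removed later.
    old_db_dir = None
    if os.path.islink(live_link):
        old_db_dir = os.path.realpath(live_link)

    # Create the new symlink under a temporary name and rename it over
    # the live one. rename() is atomic on POSIX, so a reader resolves
    # either the old target or the new one, never a missing link.
    tmp_link = live_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(os.path.abspath(new_db_dir), tmp_link)
    os.replace(tmp_link, live_link)

    # Let searches that resolved the old path finish opening their files.
    time.sleep(grace_seconds)

    # Searches that already hold open file handles keep working after
    # the unlink; the disk space is freed once they close them.
    if (old_db_dir and os.path.isdir(old_db_dir)
            and old_db_dir != os.path.realpath(new_db_dir)):
        shutil.rmtree(old_db_dir)
    return old_db_dir
```

Readers would always open the database through the symlink path, for example a hypothetical /var/lib/myapp/live, and never through the versioned directory name. The grace period is a guess at how long opening takes; a heavily loaded machine may want a longer one.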

-- 
Richard


