[Xapian-discuss] Re: Fetching a list of doc-id's from a database

Olly Betts olly at survex.com
Sun Oct 30 11:34:31 GMT 2005


On Sun, Oct 30, 2005 at 12:30:13AM +0200, Eric Bus wrote:
> One other question: at the moment, I'm using a queue to send new 
> mutations to scriptindex. A couple of processes are putting data in the 
> queue and another process dequeues them and runs scriptindex. For now, 
> I'm restarting scriptindex for each dequeued command. Would it be 
> possible to keep scriptindex running in the background? The dequeueing 
> process would start up scriptindex once and use the input/output handles 
> to send commands to it.

You could, but scriptindex only forces a flush at the end of each input
file, so you either need to tweak scriptindex, or accept that not all
changes will go live right away.
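
Since the flush happens at the end of each input file, one middle ground is
to have the dequeueing process collect whatever is waiting and run
scriptindex once per batch. Here is a rough sketch of that idea, not a
tested setup: the helper names, the batch size and the temporary-file
handling are all made up for illustration, and it assumes the usual
"scriptindex DATABASE INDEXSCRIPT INPUTFILE" invocation with records
separated by a blank line.

```python
import os
import queue
import subprocess
import tempfile


def drain(q, max_items=100):
    """Take up to max_items records that are already waiting, without blocking."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


def index_batch(records, database, script, run=subprocess.run):
    """Write the records to one temporary file and run scriptindex once over it.

    `run` is injectable so the function can be exercised without scriptindex
    being installed; it defaults to subprocess.run.
    """
    if not records:
        return None
    with tempfile.NamedTemporaryFile("w", suffix=".dump", delete=False) as f:
        # Assumption: each record is a block of field=value lines, and
        # records are separated by a blank line.
        f.write("\n\n".join(records) + "\n")
        path = f.name
    cmd = ["scriptindex", database, script, path]
    try:
        run(cmd, check=True)
    finally:
        os.unlink(path)  # the input file is no longer needed after the run
    return cmd
```

With this shape, changes go live at the end of each batch rather than after
every single mutation, and the startup cost is paid once per batch.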

> It's possible, but how does a running scriptindex affect the searches? 
> Will the database be in some sort of locking state during scriptindex's 
> run? I want to avoid the overhead of starting scriptindex for every new 
> mutation, but if that results in slower searches, I'll pass :)

Searches continue to work during updates.  If you're hammering in
updates at such a rate that two happen during a search, then with
Quartz the search may fail with "DatabaseModifiedError" (which you
would generally respond to by restarting the search).  Flint should
avoid this annoyance, though currently it uses the same Btree manager
as Quartz, so it has the same problem.
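
The "restart on DatabaseModifiedError" response might be sketched like this
on the search side. The function and parameter names are hypothetical. With
the Python bindings you would presumably pass xapian.DatabaseModifiedError
as the exception type and the database's reopen method as the reopen
callback; they are taken as parameters here so the sketch does not depend
on the bindings being installed.

```python
def search_with_retry(run_search, reopen, modified_error, max_attempts=3):
    """Run a search, reopening the database and retrying if it was modified.

    run_search     -- callable that performs the search and returns the results
    reopen         -- callable that reopens the database at its latest revision
    modified_error -- exception type raised when the database changed mid-search
    """
    for attempt in range(max_attempts):
        try:
            return run_search()
        except modified_error:
            if attempt == max_attempts - 1:
                raise  # give up: updates are arriving faster than we can search
            reopen()
```

A small cap on the number of attempts keeps a search from looping forever
if updates really are being flushed that quickly.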

There's no difference for the search process here whether you close and
reopen the database or not.  It's when updates are flushed
(automatically or forced) that's the issue.

Have you actually measured the overhead of starting scriptindex?  It
seems unlikely to be a major factor, unless perhaps you are running it
once for each updated document.
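
One way to get a number is to time a few runs of the command and take the
mean. A minimal sketch (the function name is made up, and which command
line to time is up to you):

```python
import subprocess
import sys
import time


def mean_startup_seconds(cmd, runs=5):
    """Return the mean wall-clock time in seconds of running cmd `runs` times."""
    start = time.perf_counter()
    for _ in range(runs):
        # Discard the command's output so printing does not skew the timing.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Placeholder command: substitute the scriptindex command line you
    # actually run, ideally with a one-record input file.
    print(mean_startup_seconds([sys.executable, "-c", "pass"]))
```

Timing a run over a one-record input file against a run over a larger
batch would show how much of the cost is startup and how much is the
indexing itself.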

Cheers,
    Olly
