[Xapian-discuss] Re: Evaluating Xapian

Olly Betts olly at survex.com
Thu Feb 10 12:43:52 GMT 2005


On Mon, Jan 31, 2005 at 01:44:42PM +0000, Richard Boulton wrote:
> On Fri, 2005-01-28 at 20:56 +0100, Arne Georg Gleditsch wrote:
> > Well, I'm fiddling with using Xapian for a source-code indexing system
> > where I want to index several releases of the same source code base
> > (the Linux kernel, primarily).
> 
> As a side point - you might want to take a look at the "cvssearch"
> application in "xapian-applications/cvssearch", which is aiming at a
> somewhat similar task.  I'm not sure exactly what state it is in - Olly
> has been gradually bringing it up to scratch as a Xapian application.

It still needs work, mostly to ensure all CGI input is sanitised and to
sort out some better documentation.

> > Where the same file exists in several
> > releases in an identical revision (which is true for a lot of files,
> > especially in a stable branch), I'd like to index this [file,revision]
> > only once.  So I'm tagging the indexed documents with the releases
> > they occur in, incrementally adding tags as I index new releases.

If this means you often end up calling replace_document on documents
which were added or modified earlier in the same buffered batch, the
implicit flushes are probably what's making indexing slow.
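
One way to sidestep that, as a rough sketch only: work out the full set of
release tags for each [file,revision] in memory first, then write each
document exactly once, so nothing is replaced within a batch.  The names
below (collect_release_tags, index_once, write_document) are hypothetical
and no Xapian calls are shown; write_document stands in for whatever code
builds and adds the document, and the sketch assumes the per-release file
lists fit in memory.

```python
from collections import defaultdict


def collect_release_tags(releases):
    """Map each (path, revision) to the sorted list of releases containing it.

    `releases` is assumed to be a dict: release name -> iterable of
    (path, revision) pairs.  This layout is an assumption for illustration.
    """
    tags = defaultdict(set)
    for release, files in releases.items():
        for path, revision in files:
            tags[(path, revision)].add(release)
    return {key: sorted(value) for key, value in tags.items()}


def index_once(releases, write_document):
    """Call write_document(path, revision, release_list) once per document.

    Returns the number of documents written.  Because every document is
    written a single time with its complete tag list, no document is
    modified twice in the same batch.
    """
    tags = collect_release_tags(releases)
    for (path, revision), release_list in sorted(tags.items()):
        write_document(path, revision, release_list)
    return len(tags)
```

If the releases have to be indexed incrementally instead, the same idea
applies per run: collect all the tag changes for the run before touching
the database, and apply them one document at a time.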

> replace_document can cause an implicit flush of the database (but won't
> always). Specifically, if the document being modified was added or
> modified in the currently buffered batch, the database is flushed.  This
> is because it's fiddly to handle this case, and for most usage patterns
> it's a fairly uncommon operation.
> [...]
> In the longer term, perhaps it would be worthwhile for us to try and
> remove this constraint.

We probably should.  I suspect it's not actually especially hard to
handle - I added the forced flush because my original code didn't handle
this case properly and I wanted to concentrate on getting the improved
buffering working for the common cases.

Cheers,
    Olly


