filtering by docid range?

Olly Betts olly at survex.com
Fri Feb 12 04:47:55 GMT 2021


On Mon, Feb 08, 2021 at 06:06:38PM +0000, Eric Wong wrote:
> Hey all, is there a way to exclude results by docid?

There's nothing built in currently.

It can be done with a custom PostingSource subclass, but subclassing
PostingSource isn't possible from Search::Xapian.

> I'm using a combined DB and guaranteeing the order for
> ->add_database calls.  All sub DBs have monotonically
> increasing docids, so the combined docid will remain
> monotonically increasing.
> 
> I'll store the maximum docid from a previous search ($OLD_MAX),

I'm not sure I see how this works.

Say there are two databases with docids in use:

A = {1,2}
B = {1,2,3,4}

Then the combined database is:

A+B = {1=A1,2=B1,3=A2,4=B2,   6=B3,   8=B4}

(and 5 and 7 are unused).  This means $OLD_MAX is 8.

Then we add a document to A:

A = {1,2,3}
B = {1,2,3,4}

A3 is 5 in the combined database, which is below $OLD_MAX, so this new
document won't be returned by an incremental search.
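
The numbering in this example follows from how docids are interleaved
across sub-databases: with n sub-databases, docid d in the i-th one
(counting from 0) becomes (d - 1) * n + i + 1 in the combined database.
Here's a minimal sketch of that arithmetic in plain Python (no Xapian
calls; the helper names combined_docid and combined are made up for
illustration):

```python
def combined_docid(sub_docid, db_index, n_dbs):
    """Map a docid in sub-database db_index (0-based) to the combined docid."""
    return (sub_docid - 1) * n_dbs + db_index + 1


def combined(dbs):
    """Return {combined docid: label} for a list of (name, docids) pairs,
    given in the same order as the add_database calls."""
    n = len(dbs)
    return {
        combined_docid(d, i, n): f"{name}{d}"
        for i, (name, docids) in enumerate(dbs)
        for d in docids
    }


# State at the time of the previous search.
before = combined([("A", [1, 2]), ("B", [1, 2, 3, 4])])
old_max = max(before)

# A gains one document (A3) afterwards.
after = combined([("A", [1, 2, 3]), ("B", [1, 2, 3, 4])])

# New combined docids, and those an "only docids above old_max" filter skips.
new_docs = sorted(set(after) - set(before))
missed = [d for d in new_docs if d <= old_max]

print(before)
print("old_max:", old_max)
print("new:", new_docs, "missed:", missed)
```

Any new document whose combined docid lands in one of the unused slots
at or below the stored maximum ends up in "missed".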

I think this would only work if you carefully add documents in a
round-robin fashion, or otherwise take care to avoid this issue.

Cheers,
    Olly


