does Xapian::Enquire hold an MVCC revision?

Olly Betts olly at survex.com
Tue Aug 22 06:26:12 BST 2023


I've pushed a change to eliminate those 8 bytes of extra padding per
result on x86-64, which should be in 1.4.24.

On Sat, Aug 19, 2023 at 10:52:00PM +0000, Eric Wong wrote:
> Olly Betts <olly at survex.com> wrote:
> > Incidentally, if you don't mind the export order and only have single term
> > queries you can just use a PostingIterator to get a stream of document
> > ids matching a particular term (in the order documents were added),
> > which should use at most ~80KB (per shard if you're using a sharded
> > database).
> 
> Thanks for that tip on PostingIterator, I'll keep it in mind;
> but I think there's usually >= 2 terms.

We could probably add a method to expose the internal tree of PostList
objects that is built to run the query as a PostingIterator, which would
allow getting a full stream of results in ascending docid order for an
arbitrary query with a modest memory requirement (or a partial stream as
it evaluates lazily and you can just stop iterating whenever you want).
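
To make the idea concrete, here is a rough standalone sketch of how a tree
of posting lists can yield docids lazily in ascending order.  This is a toy
model and not Xapian's internal PostList classes or any existing Xapian
API: ToyPostList, LeafPostList, AndPostList, OrPostList and take() are all
hypothetical names invented for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using DocId = std::uint32_t;

// Hypothetical toy node: yields docids in ascending order, one at a time.
struct ToyPostList {
    virtual ~ToyPostList() = default;
    virtual bool at_end() const = 0;
    virtual DocId docid() const = 0;  // Only valid when !at_end().
    virtual void next() = 0;
    // Advance to the first docid >= target (naive linear version).
    virtual void skip_to(DocId target) {
        while (!at_end() && docid() < target) next();
    }
};

// Leaf: the postings for a single term (sorted docids, assumed in memory).
struct LeafPostList : ToyPostList {
    std::vector<DocId> ids;
    std::size_t pos = 0;
    explicit LeafPostList(std::vector<DocId> v) : ids(std::move(v)) {}
    bool at_end() const override { return pos >= ids.size(); }
    DocId docid() const override { return ids[pos]; }
    void next() override { ++pos; }
};

// AND: only docids present in both children.
struct AndPostList : ToyPostList {
    std::unique_ptr<ToyPostList> l, r;
    AndPostList(std::unique_ptr<ToyPostList> a, std::unique_ptr<ToyPostList> b)
        : l(std::move(a)), r(std::move(b)) { align(); }
    bool at_end() const override { return l->at_end() || r->at_end(); }
    DocId docid() const override { return l->docid(); }
    void next() override { l->next(); align(); }
private:
    // Move whichever child is behind until both agree or one runs out.
    void align() {
        while (!l->at_end() && !r->at_end() && l->docid() != r->docid()) {
            if (l->docid() < r->docid()) l->skip_to(r->docid());
            else r->skip_to(l->docid());
        }
    }
};

// OR: docids present in either child, without duplicates.
struct OrPostList : ToyPostList {
    std::unique_ptr<ToyPostList> l, r;
    OrPostList(std::unique_ptr<ToyPostList> a, std::unique_ptr<ToyPostList> b)
        : l(std::move(a)), r(std::move(b)) {}
    bool at_end() const override { return l->at_end() && r->at_end(); }
    DocId docid() const override {
        if (l->at_end()) return r->docid();
        if (r->at_end()) return l->docid();
        return std::min(l->docid(), r->docid());
    }
    void next() override {
        DocId d = docid();
        if (!l->at_end() && l->docid() == d) l->next();
        if (!r->at_end() && r->docid() == d) r->next();
    }
};

// Pull at most `limit` docids; the caller can stop whenever it wants.
std::vector<DocId> take(ToyPostList& pl, std::size_t limit) {
    std::vector<DocId> out;
    while (out.size() < limit && !pl.at_end()) {
        out.push_back(pl.docid());
        pl.next();
    }
    return out;
}
```

Each node only holds its current position, so memory use depends on the
shape of the query rather than on the number of matching documents, and
abandoning the iteration early costs nothing extra.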

This could also provide a neat way to allow seeing what the query
optimiser has done which would be handy for writing testcases,
and for debugging and development work on this area of the code -
one could call get_description() on the returned PostingIterator to
find this information.

I haven't looked at the details, but it ought to be easy for a single
local shard, and probably also for multiple local shards.  Not sure
about remote shards - remote matches are done remotely and an MSet
serialised and sent back to be merged with MSets from other remotes
and an MSet from any local shards.  It could just throw
UnimplementedError.  If it was implemented it would likely end up
fetching the full lists of postings non-lazily for remote shards, but I
think that's OK provided it's documented. 

I'll take a look when I get a chance.  It'd probably be git master
only, but we are making progress towards a new release series.

Cheers,
    Olly


