does Xapian::Enquire hold an MVCC revision?
Eric Wong
e at 80x24.org
Thu Aug 17 22:28:26 BST 2023
In other words, is it possible to avoid duplicates if new
documents are inserted into the DB by another process between
->get_mset calls when reusing Xapian::Enquire objects?
I do some expensive processing on each mset window, so I always
cap the window size to bound heap usage, even if I'm planning on
going through a big chunk of the DB:
	$mset = $enq->get_mset(0, 1000);
	do_something_slow_with_mset($mset);
	$mset = $enq->get_mset(1000, 1000);
	do_something_slow_with_mset($mset);
	$mset = $enq->get_mset(2000, 1000);
	do_something_slow_with_mset($mset);
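
Independent of whether Enquire pins a revision, the duplicate half of
the problem could be guarded against on the caller's side by
remembering which docids earlier windows already delivered. The
following is only a rough sketch: each_window_dedup is a hypothetical
helper (not part of Search::Xapian), and the $fetch callback is an
assumed stand-in for whatever turns a get_mset window into a list of
docids. It does nothing about skipped documents, and %seen grows with
the number of results.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Search::Xapian: walk a result set in
# fixed-size windows and hand each window to $process, dropping any
# docid that an earlier window already delivered.
#
# $fetch->($first, $max) must return the list of docids in that window
# (an empty list once the results are exhausted);
# $process->(\@docids) does the slow per-window work.
sub each_window_dedup {
	my ($fetch, $process, $window) = @_;
	my %seen;	# docid => count, for everything delivered so far
	my $first = 0;
	while (1) {
		my @docids = $fetch->($first, $window);
		last unless @docids;	# ran off the end of the results
		# keep only docids not seen in any earlier window
		my @fresh = grep { !$seen{$_}++ } @docids;
		$process->(\@fresh) if @fresh;
		$first += $window;
	}
	return scalar keys %seen;	# number of distinct docids processed
}
```

With Search::Xapian, the fetch callback might look something like
sub { map { $_->get_docid } $enq->get_mset(@_)->items }, though I have
not tested that against a DB with a concurrent writer.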
I'm not reusing Xapian::Enquire objects right now since the
original code was made for HTML pagination and there's no
guarantee subsequent pages would even hit the same HTTP process.
Now with local batch reports and streaming dumps, reusing the
Xapian::Enquire object might make sense if duplicates (or skips)
can be avoided on a DB that another process is writing to.
Neither query parsing nor setting up the Enquire object seems
to take a measurable amount of time compared to the work that
needs to be done with the $mset.
Thanks.