sorting large msets
Olly Betts
olly at survex.com
Mon Apr 9 07:18:37 BST 2018
On Fri, Apr 06, 2018 at 07:24:23PM +0000, Eric Wong wrote:
> > > Olly Betts <olly at survex.com> wrote:
> > > >
> > > > The reverse order (ENQ_ASCENDING) is really fast - about 0.0001 seconds.
> > > > This is because in that case we can just stop once we've found 200
> > > > matches.
>
> With a few million documents, that ENQ_ASCENDING sounds promising :)
>
> So, it looks like if I had ideal ordering, I could do something
> along the lines of:
>
> my $doc_id = $db->get_metadata('last_doc_id') || 0xffffffff;
>
> $db->replace_document($doc_id--, $_) foreach (@doc);
>
> $db->set_metadata('last_doc_id', $doc_id);
>
> And get killer performance.

Yes, though that's likely to be slower to index than this, since
appending a document is handled more efficiently:

    $db->add_document($_) foreach (reverse @doc);

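As a rough illustration of why the two approaches can stand in for each other within one batch, here is a plain-Perl sketch. It makes no Xapian calls: the two counters are hypothetical stand-ins for docid assignment, and the starting values (100 and 1) and the sample @doc are arbitrary assumptions. It only compares a single batch; across separate batches the two schemes number documents in opposite directions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one batch of documents, in @doc order.
my @doc = qw(a b c d);

# Scheme 1: explicit docids counting down from a starting value
# (mirrors replace_document($doc_id--, $_); 100 is arbitrary).
my $doc_id = 100;
my %by_id_explicit;
$by_id_explicit{$doc_id--} = $_ foreach (@doc);

# Scheme 2: append the batch in reverse, with ids counting up
# (mirrors add_document($_), simulated here with a plain counter).
my $next_id = 1;
my %by_id_append;
$by_id_append{$next_id++} = $_ foreach (reverse @doc);

# Read each batch back in ascending docid order.
my @explicit_order =
    map { $by_id_explicit{$_} } sort { $a <=> $b } keys %by_id_explicit;
my @append_order =
    map { $by_id_append{$_} } sort { $a <=> $b } keys %by_id_append;

print "explicit: @explicit_order\n";
print "append:   @append_order\n";
```

Both lines print the batch in the same relative order, so within a batch the appending form gives the same ascending-docid ordering without having to pick docids by hand.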
Cheers,
Olly