[Xapian-discuss] Iterating through all the documents of a db

Olly Betts olly at survex.com
Fri Jun 24 23:28:00 BST 2005


On Fri, Jun 24, 2005 at 05:27:33PM -0400, Marco Tabini wrote:
> Is there a way to iterate through all the documents in a database? I *can*
> just get the last doc id and work my way through sequentially, trapping any
> errors which indicate that a specific document doesn't exist... But that
> seems like such an inefficient way of doing things :)

That's the way to do it at present.  See:

http://svn.xapian.org/trunk/xapian-core/examples/copydatabase.cc?rev=6097&view=markup

That should be OK efficiency-wise.
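
For anyone who wants the shape of that loop without reading the whole example, here is a minimal sketch.  To keep it self-contained it does not link against Xapian: ToyDatabase and DocNotFound are hypothetical stand-ins for Xapian::Database and Xapian::DocNotFoundError, and for_each_document is a helper name made up for illustration.  With the real library you would call get_lastdocid() and get_document() on the database in the same way.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for Xapian::DocNotFoundError (illustration only).
struct DocNotFound : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical in-memory stand-in for a database whose docids may have gaps.
class ToyDatabase {
    std::map<unsigned, std::string> docs_;
public:
    void add(unsigned id, std::string data) {
        assert(id >= 1);  // docids start at 1
        docs_[id] = std::move(data);
    }
    // Highest docid currently stored, or 0 if the database is empty.
    unsigned get_lastdocid() const {
        return docs_.empty() ? 0 : docs_.rbegin()->first;
    }
    // Throws DocNotFound if no document has this id.
    const std::string& get_document(unsigned id) const {
        auto it = docs_.find(id);
        if (it == docs_.end()) throw DocNotFound("document not found");
        return it->second;
    }
};

// Visit every existing document from docid 1 up to the last docid,
// skipping ids that have no document.  Returns the number visited.
template <class Db, class Fn>
unsigned for_each_document(const Db& db, Fn fn) {
    unsigned visited = 0;
    const unsigned last = db.get_lastdocid();
    for (unsigned id = 1; id <= last; ++id) {
        const std::string* data = nullptr;
        try {
            data = &db.get_document(id);
        } catch (const DocNotFound&) {
            continue;  // gap in the docid range: nothing stored here
        }
        fn(id, *data);  // called outside the try so its own errors propagate
        ++visited;
    }
    return visited;
}
```

The callback is invoked outside the try block on purpose, so that an exception thrown by the callback itself is not mistaken for a missing document.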

There's a plan to add a way to iterate directly over all documents:

http://xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=47

The idea is that you'll just be able to iterate over the postlist for
an empty termname, and that'll actually iterate over all documents
in the database.  It'll be as if there's a magic term "" which indexes
all documents.  Currently postlist_begin("") throws an exception.

I got as far as a prototype patch, but there was some obstacle which
made it easier to leave until something else got done.  Sadly I can't
remember what that was now!  When I next have a spare moment, I'll try
applying the patch and see.

I'll attach the (non-usable) patch to the bugzilla entry to make sure
it doesn't get lost.

Cheers,
    Olly


