[Xapian-tickets] [Xapian] #270: More efficient valuerangepostlist iteration

Xapian nobody at xapian.org
Fri May 23 14:49:38 BST 2008


#270: More efficient valuerangepostlist iteration
---------------------+------------------------------------------------------
 Reporter:  richard  |       Owner:  olly 
     Type:  defect   |      Status:  new  
 Priority:  normal   |   Milestone:  1.1.0
Component:  Other    |     Version:       
 Severity:  normal   |    Keywords:       
Blockedby:           |    Platform:  All  
 Blocking:           |  
---------------------+------------------------------------------------------
 Currently, if a pure OP_VALUE_RANGE (or _GE or _LE) search is performed,
 ValueRangePostList::next() is called repeatedly to iterate through the
 documents.  This starts at docid=1 and iterates through all document IDs
 <= lastdocid, checking for suitable values.  If the docids used in the
 database are sparse, this can result in a very slow iteration.  It also
 results in lots of Xapian::DocNotFoundError exceptions being thrown, and
 then caught, while testing whether a particular document ID exists.
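
 To make the cost concrete, here is a minimal self-contained sketch of that
 loop.  It is not the real ValueRangePostList code: ToyDb, fetch_value() and
 dense_scan() are hypothetical names, the database is modelled as a std::map
 from docid to the value in one slot, and std::out_of_range stands in for
 Xapian::DocNotFoundError.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a database: docid -> value stored in one slot.
using ToyDb = std::map<unsigned, std::string>;

struct ScanResult {
    std::vector<unsigned> matches;  // docids whose value lies in [lo, hi]
    std::size_t probes = 0;         // docids tested
    std::size_t misses = 0;         // exceptions thrown and caught
};

// Throws for a missing docid; std::out_of_range plays the role that
// Xapian::DocNotFoundError plays in the description above.
const std::string& fetch_value(const ToyDb& db, unsigned did) {
    auto it = db.find(did);
    if (it == db.end()) throw std::out_of_range("document not found");
    return it->second;
}

// Probe every docid from 1 to lastdocid, whether or not it is in use.
ScanResult dense_scan(const ToyDb& db, unsigned lastdocid,
                      const std::string& lo, const std::string& hi) {
    assert(lo <= hi);
    ScanResult r;
    for (unsigned did = 1; did <= lastdocid; ++did) {
        ++r.probes;
        try {
            const std::string& v = fetch_value(db, did);
            if (lo <= v && v <= hi) r.matches.push_back(did);
        } catch (const std::out_of_range&) {
            ++r.misses;  // unused docid: paid for with a throw and a catch
        }
    }
    return r;
}
```

 With only three documents, at docids 3, 1000 and 5000, this loop makes 5000
 probes and catches 4997 exceptions, so the work scales with lastdocid
 rather than with the number of documents.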

 Instead, it would be better to use a direct iterator across the database.
 One approach is to use an all document postlist to get an iterator across
 the documents in the database.  I'll attach a patch against SVN HEAD which
 implements such an approach to this ticket shortly.  This approach has the
 downside that it usually requires iterating through the termlist table
 (with the current database backends, anyway).  However, this table is
 already checked with the current approach when checking if a document for
 which get_value() has returned the empty string exists in the database, so
 this may not be much of a downside.
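
 As a rough illustration of the idea (not the patch itself), the sketch
 below walks only the documents which exist.  ToyDb and alldocs_walk() are
 hypothetical names, and a std::map from docid to value stands in for the
 all document postlist plus the value lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a database: docid -> value stored in one slot.
using ToyDb = std::map<unsigned, std::string>;

struct WalkResult {
    std::vector<unsigned> matches;  // docids whose value lies in [lo, hi]
    std::size_t visited = 0;        // documents actually looked at
};

// Visit only docids which are in use, in ascending order, the way an
// iterator over all documents would supply them.
WalkResult alldocs_walk(const ToyDb& db,
                        const std::string& lo, const std::string& hi) {
    assert(lo <= hi);
    WalkResult r;
    for (const auto& entry : db) {
        ++r.visited;
        if (lo <= entry.second && entry.second <= hi)
            r.matches.push_back(entry.first);
    }
    return r;
}
```

 For documents at docids 3, 1000 and 5000 this visits three entries and
 throws nothing, however large lastdocid is.  The toy ignores the cost of
 reading the termlist table, which is the downside discussed above.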

 The ideal approach would be to add methods to the database interface to
 iterate through all the values in a particular slot, to use this iterator
 in value range postlists, and to implement such iterators efficiently in
 the database backends.
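
 One possible shape for such an iterator is sketched below.  This is only
 an assumption about what the interface might look like: SlotValues,
 SlotValueIterator and range_matches() are hypothetical names, and the
 std::map stands in for whatever per-slot storage a backend would provide.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-slot storage: holds only the documents which have a
// value in this slot, keyed by docid.
using SlotValues = std::map<unsigned, std::string>;

class SlotValueIterator {
    const SlotValues& values;
    SlotValues::const_iterator pos;

  public:
    explicit SlotValueIterator(const SlotValues& v)
        : values(v), pos(v.begin()) {}

    bool at_end() const { return pos == values.end(); }
    unsigned docid() const { assert(!at_end()); return pos->first; }
    const std::string& value() const { assert(!at_end()); return pos->second; }
    void next() { assert(!at_end()); ++pos; }

    // Advance to the first entry with docid >= target; never moves backwards.
    void skip_to(unsigned target) {
        if (!at_end() && pos->first < target)
            pos = values.lower_bound(target);
    }
};

// A value range check driven purely by the slot iterator.
std::vector<unsigned> range_matches(const SlotValues& slot,
                                    const std::string& lo,
                                    const std::string& hi) {
    assert(lo <= hi);
    std::vector<unsigned> out;
    for (SlotValueIterator it(slot); !it.at_end(); it.next()) {
        if (lo <= it.value() && it.value() <= hi) out.push_back(it.docid());
    }
    return out;
}
```

 Because the iterator only ever sees documents which have a value in the
 slot, no existence check is needed, and skip_to() lets a value range
 postlist jump ahead when it is combined with other postlists.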

-- 
Ticket URL: <http://trac.xapian.org/ticket/270>
Xapian <http://xapian.org/>