[Xapian-devel] How the matcher knows when to prune and decay

Olly Betts olly at survex.com
Thu Mar 4 02:44:47 GMT 2010


On Wed, Mar 03, 2010 at 08:00:39PM -0500, Matt Chaput wrote:
> > On Wed, Mar 03, 2010 at 06:33:10PM -0500, Matt Chaput wrote:
> >> But how can the leaf object reading the posting list know its maximum  
> >> future score?
> > 
> > It asks its weight object (or just knows if it is a PostingSource).
> 
> I'm asking about the on-disk structures that allow a posting source to "just
> know".

A PostingSource is a user-provided source of postings:

http://trac.xapian.org/browser/trunk/xapian-core/docs/postingsource.rst

If it supplies weight information, then it has to know the highest weight it
can return.  It's a user extension, so any on-disk structures it uses to do
this are also up to the user implementing it.
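
To illustrate the idea, here is a minimal sketch in Python.  It is a toy
stand-in, not the Xapian API: the class name ToyPostingSource, its constructor
argument and the in-memory dict are all assumptions made up for this example.
The point is only that the source works out its upper bound from its own data
before the matcher starts reading postings.

```python
class ToyPostingSource:
    """Hypothetical user-supplied posting source (not the real Xapian class)."""

    def __init__(self, weights_by_docid):
        # Postings are held in docid order, as they would be read from disk.
        self.postings = sorted(weights_by_docid.items())
        # The bound is computed once, up front, from the user's own data.
        # A real implementation might instead store it alongside that data.
        self.max_weight = max((w for _, w in self.postings), default=0.0)
        self.pos = -1

    def get_maxweight(self):
        # Upper bound on anything get_weight() will return later.
        return self.max_weight

    def next(self):
        self.pos += 1

    def at_end(self):
        return self.pos >= len(self.postings)

    def get_docid(self):
        return self.postings[self.pos][0]

    def get_weight(self):
        return self.postings[self.pos][1]


source = ToyPostingSource({12: 1.25, 3: 0.5, 7: 2.0})
bound = source.get_maxweight()   # known before any posting is read

seen = []
source.next()
while not source.at_end():
    # Every weight read linearly stays within the bound declared up front.
    assert source.get_weight() <= bound
    seen.append(source.get_docid())
    source.next()
```

A matcher can compare such a bound against the lowest weight it still needs
for the result set and skip the source when the bound is too low.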

> If the postings are written to disk in document order, and the posting
> reader object is reading through them linearly, then I would naively assume
> it can't know what weights are coming up in the list.

The weighting scheme has a formula for the max weight, derived from the
formula for the weight.  The chert and brass backends track statistics such as
upper and lower bounds on the document length, which make it possible to
derive a tighter bound.
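
As a rough sketch of how a bound can be derived from a weight formula, the
Python below uses a textbook BM25-style term weight.  This is an assumption
for illustration and not necessarily the exact formula Xapian uses; the
function names, the parameter values and the use of a wdf upper bound
alongside the document length lower bound are likewise hypothetical.  The
weight rises with wdf and falls with document length, so plugging in the
largest wdf and the shortest length gives an upper bound, and that bound is
tighter than the limit reached as wdf grows without bound.

```python
def bm25_term_weight(wdf, doc_len, avg_len, idf, k1=1.0, b=0.5):
    # Textbook BM25-style weight for one term in one document.
    norm = (1.0 - b) + b * doc_len / avg_len
    return idf * wdf * (k1 + 1.0) / (wdf + k1 * norm)


def loose_max_weight(idf, k1=1.0):
    # Limit of the weight as wdf grows without bound: needs no statistics.
    return idf * (k1 + 1.0)


def tight_max_weight(wdf_upper, len_lower, avg_len, idf, k1=1.0, b=0.5):
    # The weight increases with wdf and decreases with doc_len (for b > 0),
    # so the largest wdf paired with the shortest length bounds it from above.
    return bm25_term_weight(wdf_upper, len_lower, avg_len, idf, k1, b)


# Toy collection: (docid, wdf, doc_len), in docid order.
postings = [(1, 2, 40), (4, 5, 120), (9, 1, 25)]
idf = 1.0
avg_len = sum(length for _, _, length in postings) / len(postings)

# Statistics a backend could track without reading the whole posting list.
wdf_upper = max(wdf for _, wdf, _ in postings)
len_lower = min(length for _, _, length in postings)

loose = loose_max_weight(idf)
tight = tight_max_weight(wdf_upper, len_lower, avg_len, idf)
weights = [bm25_term_weight(wdf, length, avg_len, idf)
           for _, wdf, length in postings]
```

Here every entry in weights is at most tight, and tight is below loose, so
the tracked statistics let the matcher prune earlier than it otherwise could.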

Cheers,
    Olly


