[Xapian-discuss] Unique Term Listings

Olly Betts olly at survex.com
Tue Nov 7 15:17:59 GMT 2006


On Tue, Nov 07, 2006 at 02:10:26PM +0000, richard at lemurconsulting.com wrote:
> You want the "get_termfreq()" methods.

I think Martin is asking how to find the number of occurrences
of a particular term *in documents which match the query*.

This statistic isn't easy for Xapian to calculate exactly, since it
tries hard to avoid checking every single document which matches the
query (while still returning the same results as if it had).  We
could perhaps try and build an estimate for this for each term during
the match based on the documents we do look at, but currently we don't.
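
To make the difference concrete, here is a toy sketch in plain Python.
It is not Xapian code and not how the matcher works internally. The
function names and the "documents as sets of terms" model are made up
for illustration. It only shows why the exact figure needs every
matching document, and what a sample-based estimate might look like.

```python
def exact_term_count(matching_docs, term):
    """Count matching documents containing `term` by checking every one."""
    return sum(1 for terms in matching_docs if term in terms)


def estimated_term_count(examined_docs, total_matches, term):
    """Scale the fraction seen in the examined documents up to all matches.

    Hypothetical estimator: assumes the examined documents are
    representative of the full set of matches.
    """
    if not examined_docs:
        return 0.0
    seen = sum(1 for terms in examined_docs if term in terms)
    return total_matches * seen / len(examined_docs)


# Each matching document is modelled as the set of terms it indexes.
matches = [{"a", "b"}, {"a"}, {"b"}, {"a", "c"}]
print(exact_term_count(matches, "a"))
print(estimated_term_count(matches[:2], len(matches), "a"))
```

The estimate is only as good as the sample, and the documents a matcher
actually examines are biased towards high-scoring ones, so a real
implementation would need more care than this.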

Your best bet for now is probably to rerun the query, combined using
OP_FILTER with the term you want this information for.  You can ask for
an mset size of 0, which just returns estimated statistics and is very
quick, but even a full match should be fairly quick, as most of the
blocks needed will already be cached.
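
Something along these lines should work with the Python bindings. Treat
it as a rough sketch: the helper name matches_containing_term and its
exact flag are made up for illustration, and it assumes you already have
an open database and the original Query object to hand.

```python
def matches_containing_term(db, query, term, exact=False):
    """Return how many documents matching `query` also contain `term`.

    exact=False asks for an MSet of size 0, so only estimated
    statistics come back. exact=True runs a full match instead.
    """
    # Imported here so this helper can be defined even where the
    # Xapian bindings are not installed.
    import xapian

    # OP_FILTER matches documents which match both subqueries, but
    # takes weights only from the left-hand one.
    filtered = xapian.Query(xapian.Query.OP_FILTER, query, xapian.Query(term))

    enquire = xapian.Enquire(db)
    enquire.set_query(filtered)

    if exact:
        # Full match: ask for as many results as there are documents.
        mset = enquire.get_mset(0, db.get_doccount())
        return mset.size()

    # Size-0 MSet: no results are returned, just the statistics.
    mset = enquire.get_mset(0, 0)
    return mset.get_matches_estimated()
```

The estimated figure can be off, so check get_matches_lower_bound() and
get_matches_upper_bound() on the MSet if you need to know how far.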

Cheers,
    Olly


