[Xapian-discuss] Unique Term Listings

Martin Hearn martinhearn at mac.com
Tue Nov 7 20:25:42 GMT 2006


Many thanks for your assistance. The only trouble with the approach you
suggest is that you have to know in advance which terms to search for.
If I wanted to group all my results by date tag, I'd have to run the
OP_FILTER query 365 times to cover a year, which is not efficient.

Thanks

Martin

On 7 Nov 2006, at 15:17, Olly Betts wrote:

> On Tue, Nov 07, 2006 at 02:10:26PM +0000, richard at lemurconsulting.com wrote:
>> You want the "get_termfreq()" methods.
>
> I think Martin is asking how to find out the number of occurrences
> of a particular term *in documents which match the query*.
>
> This statistic isn't easy for Xapian to calculate exactly, since it
> tries hard to avoid checking every single document which matches the
> query (while still returning the same results as if it had).  We
> could perhaps try and build an estimate for this for each term during
> the match based on the documents we do look at, but currently we don't.
>
> Your best bet for now is probably to rerun the query with OP_FILTER
> applied to the term you want this information for.  You can run it
> asking for an mset size of 0, which will just return estimated
> statistics and is very quick, but even a full match should be fairly
> quick as most of the blocks needed will be cached already.
>
> Cheers,
>     Olly
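
To make the trade-off discussed above concrete, here is a minimal sketch in plain Python. It is a toy model, not Xapian code: the in-memory `index`, the helper names `filtered_count` and `counts_by_day`, and the "D" + YYYYMMDD date-term format are all assumptions made for illustration. It computes exact counts by set intersection, whereas the real zero-size mset approach returns estimates.

```python
from datetime import date, timedelta

# Toy inverted index: term -> set of document ids (hypothetical data).
index = {
    "xapian": {1, 2, 3, 4, 5},
    "D20061105": {1, 4},
    "D20061106": {2},
    "D20061107": {3, 5, 6},
}


def filtered_count(query_docs, term):
    """Count documents that match the query and also contain `term`.

    This mimics the effect of filtering the query by one term.
    """
    return len(query_docs & index.get(term, set()))


def counts_by_day(query_docs, start, days):
    """Run one filtered count per candidate date term.

    The caller has to enumerate every possible term up front, so
    covering a year costs 365 separate filtered lookups.
    """
    counts = {}
    for offset in range(days):
        day = start + timedelta(days=offset)
        term = "D" + day.strftime("%Y%m%d")  # assumed date-term format
        n = filtered_count(query_docs, term)
        if n:
            counts[term] = n
    return counts


matches = index["xapian"]  # documents matching the original query
by_day = counts_by_day(matches, date(2006, 1, 1), 365)
```

Document 6 carries the term D20061107 but does not match the query, so it is not counted. The loop in `counts_by_day` is the repeated filtering Martin describes: it only works if the set of candidate terms is known beforehand, and its cost grows with the number of candidates rather than with the number of matching documents.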



