[Xapian-devel] Merging stats from multiple databases for expand

Richard Boulton richard at lemurconsulting.com
Tue Mar 6 09:19:58 GMT 2007


Olly Betts wrote:
> In matcher/expandweight.cc we have:
> 
> OmExpandBits
> operator+(const OmExpandBits &bits1, const OmExpandBits &bits2)
> {
>     OmExpandBits sum(bits1); 
>     sum.multiplier += bits2.multiplier;
>     sum.rtermfreq += bits2.rtermfreq;
>     
>     // FIXME - try to share this information rather than pick half of it
>     if (bits2.dbsize > sum.dbsize) {
>         DEBUGLINE(WTCALC, "OmExpandBits::operator+ using second operand: " <<
>                   bits2.termfreq << "/" << bits2.dbsize << " instead of " <<
>                   bits1.termfreq << "/" << bits1.dbsize);
>         sum.termfreq = bits2.termfreq;
>         sum.dbsize = bits2.dbsize;
>     } else {
>         DEBUGLINE(WTCALC, "OmExpandBits::operator+ using first operand: " <<
>                   bits1.termfreq << "/" << bits1.dbsize << " instead of " <<
>                   bits2.termfreq << "/" << bits2.dbsize);
>         // sum already contains the parts of the first operand
>     }
>     return sum;
> }
> 
> Why don't we "share this information" by just replacing the "if" by:
> 
>     sum.termfreq += bits2.termfreq;
>     sum.dbsize += bits2.dbsize;
> 
> Am I missing some subtlety here?

I don't think so - I think changing the code to do that would be fine.
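
To make the arithmetic concrete, here is a minimal sketch of the summed version. It uses a cut-down, hypothetical ExpandBits struct rather than the real OmExpandBits (the field types are my assumption), and it assumes the sub-databases hold disjoint sets of documents, so that their term frequencies and sizes can simply be added:

```cpp
// Hypothetical stand-in for OmExpandBits, reduced to the four fields that
// operator+ touches in the quoted code. The field types are assumptions.
struct ExpandBits {
    double multiplier;
    unsigned rtermfreq;  // relevant documents containing the term
    unsigned termfreq;   // documents in the database containing the term
    unsigned dbsize;     // documents in the database
};

// Combine stats from two sub-databases by summing every field, instead of
// keeping termfreq/dbsize only from whichever database is larger.
// Assumes the two databases do not share documents.
ExpandBits operator+(const ExpandBits &bits1, const ExpandBits &bits2)
{
    ExpandBits sum(bits1);
    sum.multiplier += bits2.multiplier;
    sum.rtermfreq += bits2.rtermfreq;
    sum.termfreq += bits2.termfreq;
    sum.dbsize += bits2.dbsize;
    return sum;
}
```

With a term occurring in 30 of 1000 documents in one database and 5 of 200 in another, this gives 35/1200 for the merged stats, whereas the existing code would report 30/1000 and ignore the smaller database.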

> I looked at the history of this code, and the essence of this is unchanged
> since it was first checked in back in 1999 (and looking at that commit, it
> doesn't look like this code came from another file):

Going that far back, I think it was probably a case of getting it to 
work for the single-database case first, and worrying about the 
multiple-database case later.  I can't think now why I'd have thought 
that a simple approach like that wouldn't work.

-- 
Richard


