[Xapian-devel] Merging stats from multiple databases for expand
Richard Boulton
richard at lemurconsulting.com
Tue Mar 6 09:19:58 GMT 2007
Olly Betts wrote:
> In matcher/expandweight.cc we have:
>
> OmExpandBits
> operator+(const OmExpandBits &bits1, const OmExpandBits &bits2)
> {
> OmExpandBits sum(bits1);
> sum.multiplier += bits2.multiplier;
> sum.rtermfreq += bits2.rtermfreq;
>
> // FIXME - try to share this information rather than pick half of it
> if (bits2.dbsize > sum.dbsize) {
> DEBUGLINE(WTCALC, "OmExpandBits::operator+ using second operand: " <<
> bits2.termfreq << "/" << bits2.dbsize << " instead of " <<
> bits1.termfreq << "/" << bits1.dbsize);
> sum.termfreq = bits2.termfreq;
> sum.dbsize = bits2.dbsize;
> } else {
> DEBUGLINE(WTCALC, "OmExpandBits::operator+ using first operand: " <<
> bits1.termfreq << "/" << bits1.dbsize << " instead of " <<
> bits2.termfreq << "/" << bits2.dbsize);
> // sum already contains the parts of the first operand
> }
> return sum;
> }
>
> Why don't we "share this information" by just replacing the "if" by:
>
> sum.termfreq += bits2.termfreq;
> sum.dbsize += bits2.dbsize;
>
> Am I missing some subtlety here?
I don't think so - I think changing the code to do that would be fine.
> I looked at the history of this code, and the essence of this is unchanged
> since it was first checked in back in 1999 (and looking at that commit, it
> doesn't look like this code came from another file):
Going that far back, I think it was probably a case of getting it to
work for the single database case, and worrying about the multiple
database case later. I can't think now why I'd have thought that a
simple approach like that wouldn't work.
--
Richard
More information about the Xapian-devel
mailing list