[Xapian-discuss] BM25

Olly Betts olly at survex.com
Sat Nov 27 20:12:56 GMT 2004


On Tue, Nov 09, 2004 at 05:31:20AM +0000, Olly Betts wrote:
> Reading through a few of the papers Stephen Robertson has coauthored
> which talk about BM25, I've found suggested default parameter values of:
> 
> b (which Xapian calls D) "around 0.75" (we default to 0.5)
> k1 (which Xapian calls B) 1.2 (we default to 1)
> k3 (which Xapian calls A) "7 or 1000 (effectively infinite)" (we default to 1)
>     
> I haven't yet seen any suggested default for k2 (in Xapian C = 2 * k2),
> although some papers don't mention the extra term which uses this
> constant and that is equivalent to using k2 = 0 (which is Xapian's current
> default).

I've done some more reading.  It indeed appears that everyone sets k2
to 0, though perhaps that's largely because the original papers did.

Maybe we should consider changing the BM25Weight defaults from
(1, 0, 1, 0.5) to (1.2, 0, 7, 0.75), written here in the order
k1, k2, k3, b.  I'm fairly sure the current defaults were just
arbitrary choices.
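
For anyone wanting to see where each parameter enters, here is a rough
sketch of the textbook Okapi BM25 form.  It is not Xapian's actual code
(which may differ in details such as how the idf factor and document
lengths are handled), and the function and argument names are made up
for illustration.

```python
import math

def bm25_term_weight(tf, qtf, doc_len, avg_doc_len, n_docs, doc_freq,
                     k1=1.2, k3=7.0, b=0.75):
    """Per-term weight in the textbook BM25 form (illustrative only).

    tf       -- within-document frequency of the term
    qtf      -- within-query frequency of the term
    doc_freq -- number of documents containing the term
    """
    # Robertson/Sparck Jones style inverse document frequency.
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b controls length normalisation: 0 ignores length, 1 applies it fully.
    K = k1 * ((1.0 - b) + b * doc_len / avg_doc_len)
    # k1 controls how quickly repeated occurrences in the document saturate.
    doc_part = (k1 + 1.0) * tf / (K + tf)
    # k3 does the same for repeated occurrences in the query.
    query_part = (k3 + 1.0) * qtf / (k3 + qtf)
    return idf * doc_part * query_part

def bm25_length_correction(n_query_terms, doc_len, avg_doc_len, k2=0.0):
    """Per-document correction term; it vanishes when k2 is 0."""
    return k2 * n_query_terms * (avg_doc_len - doc_len) / (avg_doc_len + doc_len)

if __name__ == "__main__":
    # Same term and document scored under the current and proposed values.
    stats = dict(tf=3, qtf=1, doc_len=200, avg_doc_len=100,
                 n_docs=1000, doc_freq=10)
    print("current :", bm25_term_weight(k1=1.0, k3=1.0, b=0.5, **stats))
    print("proposed:", bm25_term_weight(k1=1.2, k3=7.0, b=0.75, **stats))
```

Note that with qtf = 1 the k3 factor is exactly 1 whatever k3 is, so
k3 only matters for queries which repeat a term.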

> Incidentally, I think it's confusing that Xapian has a unique naming scheme
> for the parameters, while most other references are consistent in their
> choice of names.  I think we should use the standard names instead.
> [...]

I've now renamed the parameters to the standard names, removed the
extra factor of two from C (so it is now simply k2), and reordered the
parameters to k1, k2, k3, b.
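
If you have values tuned under the old names, the correspondence given
earlier in this thread can be sketched as below.  The helper is
hypothetical (it is not part of Xapian), and it takes keyword arguments
so that it doesn't depend on the old positional order.

```python
def old_to_standard(*, A, B, C, D):
    """Map old Xapian BM25 parameter names to the standard ones.

    Hypothetical helper for illustration only.  Mapping used:
    k1 = B, k2 = C / 2, k3 = A, b = D.
    """
    return {"k1": B, "k2": C / 2.0, "k3": A, "b": D}

if __name__ == "__main__":
    # The old defaults quoted above: A = 1, B = 1, C = 0, D = 0.5.
    print(old_to_standard(A=1, B=1, C=0, D=0.5))
```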

Cheers,
    Olly


