[Xapian-discuss] Similarity Measures
Olly Betts
olly at survex.com
Wed Jul 12 18:11:29 BST 2006
On Tue, Jun 27, 2006 at 09:21:47AM -0700, Gavin Mendel-Gleason wrote:
> I'm currently trying to implement a similarity measure
> for xapian. Ideally I'd like to be able to calculate
> the following:
I've been thinking about this. Currently Xapian weighting schemes must
be able to express the total weight for document d in the form:
        sum        ( weight(t,d) ) + weight(d)
    t indexes d
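To make that form concrete, here is a minimal Python sketch. It is not Xapian's API: the tables TERM_WEIGHTS and DOC_WEIGHTS and the function total_weight are hypothetical names with made-up numbers, and it assumes the sum runs over the query terms which index d.

```python
# Hypothetical toy data, not taken from Xapian:
# weight(t, d) for each term t indexing document d, and weight(d).
TERM_WEIGHTS = {
    "doc1": {"similarity": 1.5, "measure": 0.5},
    "doc2": {"similarity": 1.0},
}
DOC_WEIGHTS = {"doc1": 0.25, "doc2": 0.5}


def total_weight(query_terms, doc_id):
    """Sum weight(t, d) over query terms t which index d, then add weight(d)."""
    per_term = TERM_WEIGHTS[doc_id]
    # Only terms which actually index the document contribute to the sum.
    term_sum = sum(per_term[t] for t in query_terms if t in per_term)
    return term_sum + DOC_WEIGHTS[doc_id]
```

For the query ["similarity", "measure"], doc1 gets 1.5 + 0.5 + 0.25 and doc2 gets 1.0 + 0.5.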
It wouldn't be too hard to extend that by multiplying the whole thing
by a scaling factor which depends only on the document. If we have
bounds on the scaling factor, we could adapt the optimisations to take
this scaling factor into account.
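A rough sketch of how that might look, in Python rather than the matcher's real code. All names here are hypothetical, and it assumes the weights are non-negative and the scaling factor lies between 0 and a known maximum, so that an upper bound on the scaled total can be computed without looking at the document.

```python
def scaled_weight(term_weights, doc_weight, scale):
    """Total weight for one document: scale(d) * (sum of weight(t,d) + weight(d))."""
    return scale * (sum(term_weights) + doc_weight)


def upper_bound(max_term_weights, max_doc_weight, max_scale):
    """Largest weight any document could get.

    Valid only under the assumption that every weight is non-negative
    and 0 <= scale(d) <= max_scale for all documents.
    """
    return max_scale * (sum(max_term_weights) + max_doc_weight)


def can_skip(max_term_weights, max_doc_weight, max_scale, min_needed):
    """True if no document can reach min_needed, so it is safe to skip scoring."""
    return upper_bound(max_term_weights, max_doc_weight, max_scale) < min_needed
```

With per-term maxima of 2.0 and 1.0, a maximum document weight of 0.5 and a maximum scaling factor of 1.5, the bound is 1.5 * 3.5, so a candidate needing a weight of 6.0 to enter the results can be skipped, while one needing 5.0 cannot.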
This would allow more weighting schemes to be implemented, though it's
likely to further complicate the matcher code.
Cheers,
Olly