[Xapian-discuss] Similarity Measures
plywn at yahoo.com
Tue Jun 27 17:21:47 BST 2006
I'm currently trying to implement a similarity measure
for Xapian. Ideally, I'd like to be able to calculate,
for document i and document j:
s_ij = a_ij / ( L_i + L_j + a_ij)
Where L_i is the number of terms in document i
and a_ij is:
a_ij = Sum[ t_ik t_jk ]
Where t_ik is 1 if term k occurs in document i, and 0 otherwise.
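
To make the definition concrete, here is a minimal sketch in plain Python (no Xapian calls). It assumes each document is just a collection of terms and that L_i counts distinct terms; the function name and the sample documents are made up for illustration.

```python
def similarity(terms_i, terms_j):
    """s_ij = a_ij / (L_i + L_j + a_ij), as written in the formula above."""
    terms_i, terms_j = set(terms_i), set(terms_j)
    # a_ij = Sum_k t_ik * t_jk: the number of terms the two documents share.
    a_ij = len(terms_i & terms_j)
    # L_i and L_j: assumed here to be the number of distinct terms per document.
    denom = len(terms_i) + len(terms_j) + a_ij
    # Two empty documents share nothing; return 0.0 rather than divide by zero.
    return a_ij / denom if denom else 0.0


# Hypothetical sample documents.
doc_i = ["xapian", "weight", "term"]
doc_j = ["xapian", "term", "query", "index"]
print(similarity(doc_i, doc_j))  # computes 2 / (3 + 4 + 2)
```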
From looking at the source code for weights, it appears
that the sum should be cut up into pieces that can be
calculated incrementally. Is it possible to calculate
this value within the current weight framework?
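
For illustration, the incremental part can be sketched outside Xapian like this. It is plain Python, not the Xapian weight API: the posting lists, document lengths, and function name are all hypothetical. Each term adds 1 to a_ij for every document containing it, so the sum itself is built up one piece at a time; the division by (L_i + L_j + a_ij) is a separate per-document step at the end and is not itself a sum of per-term contributions.

```python
from collections import defaultdict


def incremental_similarity(query_terms, postings, doc_lengths, query_length):
    """Accumulate a_ij term by term, then normalise once per document."""
    shared = defaultdict(int)
    for term in set(query_terms):
        # Every document in this term's posting list has t_jk == 1,
        # so the term contributes exactly 1 to a_ij for that document.
        for doc_id in postings.get(term, ()):
            shared[doc_id] += 1
    # Final step: needs the finished a_ij plus both lengths.
    return {
        doc_id: a_ij / (query_length + doc_lengths[doc_id] + a_ij)
        for doc_id, a_ij in shared.items()
    }


# Hypothetical index: term -> ids of the documents containing it.
postings = {"xapian": [1, 2], "weight": [1], "term": [2, 3]}
doc_lengths = {1: 2, 2: 2, 3: 1}  # distinct terms per document

# Document i is represented by its two terms, so L_i = 2.
scores = incremental_similarity(["xapian", "term"], postings, doc_lengths, 2)
print(scores)
```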