[Xapian-discuss] Re: Xapian document matching

Peter Karman peter at peknet.com
Wed May 2 17:23:10 BST 2007



Denis Kuzmenok scribbled on 4/30/07 11:07 AM:
> Denis Kuzmenok <denis.kuzmenok <at> gmail.com> writes:
> 
>> Hi, I'm wondering whether it is possible to do what ABCSok does
>> (http://nyheter.abcsok.no/): show a "Main article" with the "Same articles"
>> collapsed under it.
>> http://news.google.com/?hl=en does the same thing: a "Parent" and the "same
>> article on other sites" (they differ from each other a little bit).
>> Maybe somebody knows how to do this, or where to read the theory behind
>> such things.
>> Thank you
>>
> 
> I found this module on CPAN:
> http://search.cpan.org/~sid/WordNet-Similarity-1.04/lib/WordNet/Similarity.pm
> That's what I mean: find whether there is a similar document in the database
> and collapse follow-ups into a thread. Is there any implementation in Xapian?
> Thanks
> 

I don't believe there's a similarity implementation in Xapian. There was a 
similar thread on the Swish list a couple years ago:

  http://swish-e.org/archive/2005-02/8977.html

which seemed to suggest using the Levenshtein distance algorithm to determine 
similarity before indexing. Maybe storing the result in a 'similarity' field 
(value?) in Xapian could achieve something similar.
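
To make the Levenshtein idea concrete, here is a minimal sketch in Python of comparing texts before indexing. The function names, the 0.8 threshold, and the greedy "compare against the first member of each group" strategy are all assumptions made for illustration; they are not taken from the Swish thread or from Xapian.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the distance to 0.0 (nothing shared) .. 1.0 (identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def assign_groups(texts, threshold=0.8):
    """Give each text a group id; near-duplicates share an id.

    Hypothetical helper: each text is compared only against the first
    text seen in every existing group, which is simple but greedy.
    """
    representatives = []  # (group_id, text) for the first member of each group
    groups = []
    for text in texts:
        for group_id, representative in representatives:
            if similarity(text, representative) >= threshold:
                groups.append(group_id)
                break
        else:
            group_id = len(representatives)
            representatives.append((group_id, text))
            groups.append(group_id)
    return groups


headlines = [
    "Xapian 1.0 released today",
    "Xapian 1.0 released today!",
    "Completely different story",
]
print(assign_groups(headlines))
```

The group id is what could go into the 'similarity' value: Xapian documents can carry values (Document::add_value) and Enquire::set_collapse_key collapses results that share a value, though whether that gives the "main article plus same articles" display is untested here. Note that the distance costs time proportional to the product of the two lengths, so comparing titles or short summaries is likely more practical than comparing whole articles.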

-- 
Peter Karman  .  http://peknet.com/  .  peter at peknet.com


