[Xapian-discuss] Term extraction with Xapian

David Levy dvid.levy at gmail.com
Sun Feb 12 09:42:44 GMT 2006


I've been successfully using Xapian/Omega for several months on my website
to provide product catalog search functionality.
But now I have a new need, and I can't figure out whether Xapian can meet it:
I want to reproduce the term extraction algorithm provided by "Yahoo! Term
extraction WS"
(http://developer.yahoo.net/search/content/V1/termExtraction.html), which is
limited to 5000 queries a day - not enough for me :(
Let's say I have a raw text of 300 words. I want to extract terms
(nouns/phrases) like "ipod nano", "sony z1", "tom cruise", etc.

I wonder how I could do that with Xapian (which provides really good
performance!), using its termlist and maybe some fuzzy logic operators?

Thanks in advance

(Sorry for my English, I am just a French frog!!)

David LEVY {selenium}
Website ~ http://www.davidlevy.org
Wishlist Zlio ~ http://david.zlio.com/wishlist
Blog ~ http://selenium.blogspot.com