[Xapian-discuss] n-gram / cjk serializer

Joss Shaw jossblowing at yahoo.co.uk
Tue Aug 19 20:29:46 BST 2008


Hi all!

I've been trawling through the archives and found a reference to an n-gram query parser plugin that someone wrote. I don't think it has been included in the main Xapian distribution yet, but I would be really interested in such a tokenizer if there are plans to add it!

His tokenizer apparently plugs into Xapian, but I'm not sure how you plug extra query parsing engines in. Could someone shed some light on this for me, please? Additionally, would a plugin be able to take advantage of term prefixes, or would that need to be reimplemented in each query parsing / tokenizing engine?

The author put all the code here: http://code.google.com/p/cjk-tokenizer/
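
To make the prefix question concrete, here is a rough sketch of what I mean by n-gram tokenizing with a term prefix. This is my own illustration in plain Python, not code from the cjk-tokenizer project and not the Xapian API; the function name `ngram_terms` and the prefix "S" are just assumptions for the example.

```python
def ngram_terms(text, n=2, prefix=""):
    """Split text into overlapping character n-grams, each with an optional prefix.

    Hypothetical helper for illustration only; it is not part of Xapian
    or of the cjk-tokenizer project.
    """
    # Ignore whitespace entirely. This is a simplification: n-grams can
    # end up joining characters from either side of a space.
    chars = [c for c in text if not c.isspace()]
    if len(chars) < n:
        # Too short for a full n-gram: emit what there is as one term.
        return [prefix + "".join(chars)] if chars else []
    # Slide a window of width n over the characters, one step at a time.
    return [prefix + "".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


# Bigrams of a four-character string, each tagged with the assumed prefix "S".
print(ngram_terms("東京都庁", prefix="S"))
```

What I'd like to know is whether the prefixing step here would come for free from the query parser, or whether each tokenizer has to add the prefix itself as above.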

>j



(BTW, Xapian is looking really fantastic at the moment. Thanks to all involved: Olly, Richard, James, etc.)



