[Xapian-tickets] [Xapian] #180: Add support for CJK text to queryparser and termgenerator
Xapian
nobody at xapian.org
Thu Apr 9 01:35:59 BST 2009
#180: Add support for CJK text to queryparser and termgenerator
-------------------------+--------------------------------------------------
Reporter: richard | Owner: richard
Type: enhancement | Status: assigned
Priority: normal | Milestone: 1.1.7
Component: QueryParser | Version: SVN trunk
Severity: normal | Resolution:
Keywords: | Blockedby:
Platform: All | Blocking:
-------------------------+--------------------------------------------------
Changes (by olly):
* type: defect => enhancement
* milestone: => 1.1.7
Old description:
> Some code to do this kind of tokenisation is now available at
> http://code.google.com/p/cjk-tokenizer/ which should probably be used as
> the basis for supporting this in Xapian.
New description:
Some code to do this kind of tokenisation is now available at
http://code.google.com/p/cjk-tokenizer/ which should probably be used as
the basis for supporting this in Xapian.
We could add this as a QueryParser/TermGenerator option without breaking
API compatibility. Marking for consideration later in 1.1.x, but it could
probably go in 1.2.x, as it's likely to be ABI compatible too.
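For illustration only, here is a minimal sketch of the kind of tokenisation
being discussed, assuming the overlapping n-gram approach commonly used for
CJK text (bigrams by default). It is not taken from cjk-tokenizer or from
Xapian; the function names is_cjk and cjk_ngrams and the code point ranges
are assumptions chosen for this example, and the ranges are deliberately
incomplete.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    """Rough check for a CJK character (assumed, incomplete ranges)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF       # CJK Unified Ideographs
        or 0x3040 <= cp <= 0x30FF    # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF    # Hangul Syllables
    )


def _ngrams(run: List[str], n: int) -> List[str]:
    """Overlapping n-grams of one CJK run; a short run is kept whole."""
    if len(run) < n:
        return ["".join(run)]
    return ["".join(run[i:i + n]) for i in range(len(run) - n + 1)]


def cjk_ngrams(text: str, n: int = 2) -> List[str]:
    """Return overlapping n-grams for each run of CJK characters in text.

    Non-CJK characters only act as run separators here; a real term
    generator would index them with its usual word-based rules.
    """
    terms: List[str] = []
    run: List[str] = []
    for ch in text + " ":  # trailing space flushes the final run
        if is_cjk(ch):
            run.append(ch)
            continue
        if run:
            terms.extend(_ngrams(run, n))
            run = []
    return terms
```

Under these assumptions, cjk_ngrams("中文分词") yields the three bigrams
"中文", "文分" and "分词". An option of this sort would need to apply the
same splitting at index time and at query time so that the terms match.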
--
Comment:
Fabrice Colin said on xapian-discuss:
Pinot uses a slightly modified version of Yung-Chung Lin's
cjk-tokenizer that can be found at
http://svn.berlios.de/wsvn/dijon/trunk/cjkv/CJKVTokenizer.cc
For an example, see the XapianIndex and TokensIndexer classes at
http://svn.berlios.de/wsvn/pinot/trunk/IndexSearch/Xapian/XapianIndex.cpp
--
Ticket URL: <http://trac.xapian.org/ticket/180#comment:3>
Xapian <http://xapian.org/>
Xapian