[Xapian-tickets] [Xapian] #180: Add support for CJK text to queryparser and termgenerator
Xapian
nobody at xapian.org
Thu Aug 18 03:01:07 BST 2011
#180: Add support for CJK text to queryparser and termgenerator
-------------------------+--------------------------------------------------
Reporter: richard | Owner: richard
Type: enhancement | Status: assigned
Priority: normal | Milestone: 1.3.0
Component: QueryParser | Version: SVN trunk
Severity: normal | Resolution:
Keywords: | Blockedby:
Platform: All | Blocking:
-------------------------+--------------------------------------------------
Comment(by olly):
I've committed the latest patch on a branch in git, cleaned up a few
things, and fixed a bug where an iterator was dereferenced before the
end check:
https://github.com/ojwb/xapian-chinese-segmentation/tree/cjk-from-ticket-180
The plan is to merge in the Chinese segmentation support as well, since
that needs to hook in at very similar places.
I noticed an issue with the term positions: currently the code blindly
assigns a different position to every n-gram it generates, which doesn't
seem like a good approach.
I'm not sure what the best approach is though. The key thing is we want
phrases and the NEAR and ADJ operators to work in a natural way for users.
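To make the position issue concrete, here's a rough sketch comparing two
numbering schemes. This is not the code from the patch: it assumes the
tokenizer emits each character as a unigram plus overlapping bigrams, and
the function names and the Posting type are made up for illustration. The
second scheme is just one possible alternative, not a proposal I'm
committed to.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A generated term plus the position it would be indexed at.
// (Hypothetical type, for illustration only.)
using Posting = std::pair<std::string, unsigned>;

// Scheme A: every n-gram gets a fresh position, as described above.
// Input is one string per CJK character (already split).
std::vector<Posting>
ngrams_sequential(const std::vector<std::string>& chars)
{
    std::vector<Posting> out;
    unsigned pos = 0;
    for (std::size_t i = 0; i < chars.size(); ++i) {
        // Unigram.
        out.emplace_back(chars[i], ++pos);
        // Bigram starting at this character, if there is a next one.
        if (i + 1 < chars.size())
            out.emplace_back(chars[i] + chars[i + 1], ++pos);
    }
    return out;
}

// Scheme B (assumed alternative): an n-gram shares the position of the
// character it starts at, so positions count characters, not n-grams.
std::vector<Posting>
ngrams_by_character(const std::vector<std::string>& chars)
{
    std::vector<Posting> out;
    for (std::size_t i = 0; i < chars.size(); ++i) {
        unsigned pos = static_cast<unsigned>(i) + 1;
        out.emplace_back(chars[i], pos);
        if (i + 1 < chars.size())
            out.emplace_back(chars[i] + chars[i + 1], pos);
    }
    return out;
}
```

For three characters A, B, C, scheme A gives A@1, AB@2, B@3, BC@4, C@5, so
adjacent characters end up two positions apart and position-based
operators see gaps which aren't in the text. Scheme B gives A@1, AB@1,
B@2, BC@2, C@3, which keeps adjacent characters at adjacent positions.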
--
Ticket URL: <http://trac.xapian.org/ticket/180#comment:24>
Xapian <http://xapian.org/>