[Xapian-tickets] [Xapian] #719: Tokenized CJK query terms wrongly combined with respect to prefixes

Xapian nobody at xapian.org
Fri Nov 25 01:07:29 GMT 2016


#719: Tokenized CJK query terms wrongly combined with respect to prefixes
-------------------------+---------------------------
 Reporter:  liweitianux  |             Owner:  olly
     Type:  defect       |            Status:  closed
 Priority:  normal       |         Milestone:  1.2.25
Component:  QueryParser  |           Version:  1.2.23
 Severity:  normal       |        Resolution:  fixed
 Keywords:  CJK, prefix  |        Blocked By:
 Blocking:               |  Operating System:  All
-------------------------+---------------------------

Comment (by liweitianux):

 I recently upgraded to Xapian v1.4.1, and the Chinese query parser works
 as expected.
 Here is the new and **correct** behavior:

 {{{
 #!python
 import xapian

 xapian.version_string()
 # '1.4.1'

 qp = xapian.QueryParser()
 qp.add_prefix("subject", "S")
 qp.add_prefix("s", "S")
 qp.add_prefix("body", "B")
 qp.add_prefix("b", "B")
 qp.add_prefix("", "B")
 qp.add_prefix("", "S")

 qstr1 = "中文"
 q1 = qp.parse_query(qstr1)
 print(q1)
 # Query(((B中@1 AND B中文@1 AND B文@1) OR (S中@1 AND S中文@1 AND S文@1)))
 }}}
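
 The grouping in that output can be reproduced without Xapian as a rough
 sketch. The helpers `cjk_ngrams` and `expected_grouping` are hypothetical
 names used only for illustration, not Xapian APIs, and the term positions
 (`@1`) are left out. The n-gram order simply mirrors the query printed
 above and is not taken from Xapian's source:

```python
def cjk_ngrams(text):
    """Return unigrams and bigrams in the order seen in the query above."""
    grams = []
    for i, ch in enumerate(text):
        grams.append(ch)                   # unigram, e.g. "中"
        if i + 1 < len(text):
            grams.append(text[i:i + 2])    # overlapping bigram, e.g. "中文"
    return grams


def expected_grouping(text, prefixes):
    """AND the n-grams within each prefix, then OR the per-prefix groups."""
    groups = []
    for prefix in prefixes:
        terms = [prefix + gram for gram in cjk_ngrams(text)]
        groups.append("(" + " AND ".join(terms) + ")")
    return " OR ".join(groups)


print(expected_grouping("中文", ["B", "S"]))
# (B中 AND B中文 AND B文) OR (S中 AND S中文 AND S文)
```

 The point of the sketch is that each prefix gets its own complete AND
 group, so terms from `B` and `S` are never mixed inside one group.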

 I also rebuilt a recent `mu` (see issue
 https://github.com/djcb/mu/issues/123 ), and Chinese search now works.

 Thank you!

--
Ticket URL: <https://trac.xapian.org/ticket/719#comment:8>
Xapian <https://xapian.org/>