[Xapian-devel] patch - Some CJK codepoints are also punctuation
Olly Betts
olly at survex.com
Fri Mar 22 00:22:06 GMT 2013
On Thu, Mar 21, 2013 at 05:06:07PM +1100, Greg Banks wrote:
> On Thu, Mar 21, 2013, at 03:20 PM, Olly Betts wrote:
>
> > Are you OK with the patch licensing requirements?
> > (See "Licensing of patches" in xapian-core/HACKING).
>
> Yes, and to the best of my knowledge so is my employer.
Thanks. I've applied the patch, and also added a testcase for the
QueryParser to cover this change.
I think we should backport this for 1.2.15 - the patched code should work
fine with databases built with an older version (they'll just contain some
extra terms), and updating an old database with a new version should
work fine too.
Building a database with a new version and searching with an older
version will work a little less well for queries containing CJK non-word
characters, but such queries don't work well before this patch anyway
(since the non-word characters have to match too), so overall
backporting seems better.
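
To make the compatibility argument concrete, here is a minimal sketch in Python. It is not Xapian's code or API: the `terms()` and `matches()` helpers are hypothetical, terms are plain single-character unigrams, and "punctuation" is assumed to mean any Unicode category starting with "P". It only illustrates why an extra punctuation term in the index is harmless, while one in the query is not.

```python
import unicodedata


def is_punct(ch):
    # Assumption: "non-word character" is approximated by Unicode
    # punctuation categories (Po, Ps, Pe, ...), e.g. U+3002 IDEOGRAPHIC FULL STOP.
    return unicodedata.category(ch).startswith("P")


def terms(text, skip_punct):
    # Hypothetical term generator: one term per non-space character.
    # skip_punct=False models the unpatched behaviour (punctuation becomes
    # a term); skip_punct=True models the patched behaviour.
    return {ch for ch in text
            if not ch.isspace() and not (skip_punct and is_punct(ch))}


def matches(doc_terms, query_terms):
    # Hypothetical AND-style match: every query term must be in the document.
    return query_terms <= doc_terms


doc = "你好。世界"
query = "你好。"

old_index = terms(doc, skip_punct=False)   # database built by an older version
new_index = terms(doc, skip_punct=True)    # database built by a patched version
old_query = terms(query, skip_punct=False)
new_query = terms(query, skip_punct=True)

# Old database, patched searcher: the extra "。" term in the index is ignored.
old_db_new_search = matches(old_index, new_query)

# New database, older searcher: the query still demands "。", which is absent.
new_db_old_search = matches(new_index, old_query)

# Unpatched on both sides: a document without the punctuation is missed,
# because the non-word character has to match too.
unpatched_miss = matches(terms("你好世界", skip_punct=False), old_query)
```

Under these assumptions `old_db_new_search` is true, while `new_db_old_search` and `unpatched_miss` are both false, which is the asymmetry described above.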
Anyone disagree?
Cheers,
Olly