How to set environment variable XAPIAN_CJK_NGRAM?

Peter Zhao peterzhaonj at 163.com
Tue Feb 13 01:32:26 GMT 2018


Olly, thanks a lot!
I installed Xapian 1.2.25 on Ubuntu 14.04. How do I set the environment variable XAPIAN_CJK_NGRAM? I'm a newbie to Xapian.
Best wishes,
Peter

At 2018-02-12 20:00:02, xapian-discuss-request at lists.xapian.org wrote:
>Today's Topics:
>
>   1. Re: How to let Xapian support  Chinese searching (Olly Betts)
>   2. Re: How to ensure thread-safety (Olly Betts)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Sun, 11 Feb 2018 20:34:44 +0000
>From: Olly Betts <olly at survex.com>
>To: Peter Zhao <peterzhaonj at 163.com>
>Cc: xapian-discuss at lists.xapian.org
>Subject: Re: How to let Xapian support  Chinese searching
>Message-ID: <20180211203444.GH12724 at survex.com>
>Content-Type: text/plain; charset=us-ascii
>
>On Sat, Feb 10, 2018 at 08:26:52PM +0800, Peter Zhao wrote:
>> I installed EPrints, but it cannot search Chinese text. EPrints uses
>> Xapian to index data; how can I make Xapian support Chinese searching?
>
>Current releases support indexing ngrams for CJK text - to enable this
>you need to pass FLAG_CJK_NGRAM to TermGenerator when indexing and to
>QueryParser when searching.
>
>You can also activate this flag without code changes by setting
>environment variable XAPIAN_CJK_NGRAM to a non-empty value (don't forget
>to export it if you're setting it via the shell).
>
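From a shell the usual form would be `export XAPIAN_CJK_NGRAM=1` before starting the indexer and the search process. As a rough sketch of the same thing done from inside a program, the helper names below (`cjk_ngram_env_enabled`, `enable_cjk_ngram_env`) are hypothetical and not part of Xapian; they only mirror the "non-empty value" rule described above. This assumes a POSIX system (for `setenv`) and that the variable is set before Xapian first looks at it, which has not been checked against the 1.2.25 source.

```cpp
#include <cassert>
#include <stdlib.h>  // getenv, plus setenv on POSIX systems

// Hypothetical helper: true if XAPIAN_CJK_NGRAM is set to a non-empty
// value, which is the condition described in the message above.
bool cjk_ngram_env_enabled() {
    const char* value = getenv("XAPIAN_CJK_NGRAM");
    return value != nullptr && *value != '\0';
}

// Hypothetical helper: set the variable for this process and for any
// child processes started afterwards. The value "1" is arbitrary; the
// message above only requires it to be non-empty.
bool enable_cjk_ngram_env() {
    return setenv("XAPIAN_CJK_NGRAM", "1", /*overwrite=*/1) == 0;
}

// Usage sketch: call this before any indexing or searching is done.
void cjk_env_usage() {
    bool ok = enable_cjk_ngram_env();
    assert(ok && cjk_ngram_env_enabled());
    (void)ok;  // silence unused-variable warnings when NDEBUG is defined
}
```

Either way, the variable has to be present in the environment of the process that actually runs Xapian (for EPrints, that means both the indexer and the web server process), not just in an interactive shell.
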
>There's also a patch to add support for using libicu to find word
>boundaries:
>
>https://github.com/xapian/xapian/pull/114
>
>That'll get merged soon hopefully (mostly we need to sort out how to
>manage the libicu dependency - do we make it a hard dependency, or an
>option for how to build xapian-core, etc) but if you're happy to build
>xapian-core from source please try it and give feedback on how well
>it works.
>
>An algorithm to identify word boundaries should result in a
>significantly smaller database than indexing ngrams, but it's reliant on
>the algorithm finding the correct boundaries.  If the wrong boundaries
>are identified that can lead to both false positives and false
>negatives.
>
>Cheers,
>    Olly
>
>
>
>------------------------------
>
>Message: 2
>Date: Sun, 11 Feb 2018 20:51:35 +0000
>From: Olly Betts <olly at survex.com>
>To: Kim Walisch <kim.walisch at gmail.com>
>Cc: xapian-discuss at lists.xapian.org
>Subject: Re: How to ensure thread-safety
>Message-ID: <20180211205135.GI12724 at survex.com>
>Content-Type: text/plain; charset=us-ascii
>
>On Thu, Feb 08, 2018 at 04:18:12PM +0100, Kim Walisch wrote:
>> But it is still not clear to me how to ensure thread-safety when using
>> libxapian (C++ API). Usually when doing multi-threading many threads can
>> read the same variable concurrently without locking provided none of the
>> threads modifies the variable.
>
>That's true for simple types, but breaks down for classes because they
>may have mutable members - e.g. for caching values computed lazily:
>
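>class FactorialFactory {
>  private:
>     // Cached result and the argument it was computed for.
>     mutable int r = -1;
>     mutable int n = 0;
>  public:
>     FactorialFactory() {}
>
>     int calc(int v) const {
>         if (r < 0 || n != v) {
>             // These writes to mutable members are unsynchronised, so
>             // concurrent calls from different threads are a data race.
>             n = v;
>             r = 1;
>             for (int i = n; i > 1; --i) {
>                 r *= i;
>             }
>         }
>         return r;
>     }
>};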
>
>It's not safe to concurrently call f.calc() from different threads, even
>though conceptually calc() is a read-only method.
>
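One common way to make a lazily cached value like this safe to share is to guard the cache with a mutex. The sketch below illustrates that general technique only; `LockedFactorial` is a hypothetical name, and nothing here is meant to describe what Xapian does internally.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical variant of the class above: the mutex serialises every
// access to the cached members, so concurrent calc() calls do not race.
class LockedFactorial {
  private:
    mutable std::mutex m;  // protects r and n
    mutable int r = -1;    // cached result, -1 means "nothing cached yet"
    mutable int n = 0;     // argument the cached result belongs to
  public:
    int calc(int v) const {
        std::lock_guard<std::mutex> lock(m);
        if (r < 0 || n != v) {
            n = v;
            r = 1;
            for (int i = n; i > 1; --i) {
                r *= i;
            }
        }
        return r;
    }
};

// Usage sketch: the second call is served from the cache.
void locked_factorial_usage() {
    LockedFactorial f;
    assert(f.calc(5) == 120);
    assert(f.calc(5) == 120);
}
```

The trade-off is that every call now takes the lock, even on a cache hit. Giving each thread its own object avoids the locking cost altogether.
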
>Cheers,
>    Olly
>
>
>

