[Xapian-discuss] UTF8 support plans (without stemming)
Alexandre
Xlex0x835 at rambler.ru
Wed Apr 27 21:09:26 BST 2005
On Apr 27, 2005, at 23:47, rm at fabula.de wrote:
> On Wed, Apr 27, 2005 at 11:32:30PM +0400, Alexandre wrote:
>> Good day,
>>
>> are there any plans to support UTF-8 (I mean the library core, not
>> stemming)?
>
> What exactly do you mean by UTF-8 support? You can pretty much stuff
> anything into a Xapian database (see some recent posts on this list).
> But -- without stemming, statistical information retrieval doesn't really
> work as expected in most western languages :-/
Ralf, do you mean this post
(http://lists.tartarus.org/pipermail/xapian-discuss/2005-April/000821.html)?
If so, "query parser ... currently assume latin1" - that's not very
good, is it?
Hm, and could you please tell me more about how stemming influences IR
in western languages? Does it matter only for probabilistic IR, or for
vector search too?
One more question (not exactly on topic): why does Xapian stick to the
probabilistic approach? Are there perhaps some historical links or docs?
Thank you in advance,
Regards,
/Alexandre.