[Xapian-discuss] TermGenerator incorrectly tokenizes German text which contains special characters

Bjorn Lamers bjorn.lamers at gmail.com
Mon Jun 14 07:57:26 BST 2010


Sorry for my late reply.

I downloaded my binaries from:
http://www.flax.co.uk/xapian_binaries
http://www.flax.co.uk/xapian/120/xapian-1.2.0-bindings-php.zip

Besides that, I think I found my problem; I want to do some extra checks
later today. I think it had to do with HTML entities. The only thing I
don't understand, and want to find out, is why &auml; somehow gets
indexed as auml. Why does it ignore the & and "stop" at the ;?
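
A minimal sketch of what may be going on, in Python rather than the PHP
bindings and without calling Xapian at all. It assumes the text reaches the
indexer with its entities still encoded; simple_terms() is a hypothetical
stand-in that only splits on non-word characters, not Xapian's actual
TermGenerator logic.

```python
import html
import re


def simple_terms(text):
    """Hypothetical stand-in tokenizer: lowercase runs of word characters."""
    return [t.lower() for t in re.findall(r"\w+", text)]


raw = "M&auml;dchen"

# With the entity left in place, '&' and ';' act as separators,
# so the entity name itself comes out as a term.
print(simple_terms(raw))

# Decoding entities first gives the tokenizer the real character.
print(simple_terms(html.unescape(raw)))
```

If this is the cause, decoding the entities before passing the text to the
indexer (in PHP, html_entity_decode() with a UTF-8 charset) should avoid the
stray terms.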

Kind regards,
Bjorn

On Thu, Jun 10, 2010 at 4:07 PM, Olly Betts <olly at survex.com> wrote:

> On Thu, Jun 10, 2010 at 03:00:28PM +0100, Charlie Hull wrote:
> > On 10/06/2010 03:57, Olly Betts wrote:
> >> On Wed, Jun 09, 2010 at 04:48:24PM +0200, Bjorn Lamers wrote:
> >>> Xapian Support enabled Xapian
> >>> Compiled Version @PACKAGE_VERSION@
> >>
> >> Charlie, can you fix that?
> >
> > I could, if I knew where it came from! I've checked all the Windows
> > build files and I'm not sure where this is defined.
>
> xapian-bindings/xapian-version.h.in
>
> Cheers,
>    Olly

