[Xapian-discuss] Not separating words when parsing HTML in Omega

Olly Betts olly at survex.com
Thu Feb 10 02:50:02 GMT 2011


On Wed, Feb 09, 2011 at 03:11:18PM -0600, Crowell, Brian wrote:
> We noticed, when indexing a Word 2007 document, that two words in
> adjacent paragraphs got melded together in the Xapian database. For
> example:

What version of Omega is this with?  I have a feeling I fixed something
to do with running words together fairly recently, but I'm not seeing
it in the ChangeLog.

> I could send a sample document that produces the error, if that helps.

That would be useful if you have something you don't mind making public.
Bonus points if you're happy to license it for use in a testsuite!

Cheers,
    Olly


