[Xapian-discuss] Not separating words when parsing HTML in Omega
Olly Betts
olly at survex.com
Thu Feb 10 02:50:02 GMT 2011
On Wed, Feb 09, 2011 at 03:11:18PM -0600, Crowell, Brian wrote:
> We noticed, when indexing a Word 2007 document, that two words in
> adjacent paragraphs got melded together in the Xapian database. For
> example:

What version of Omega is this with? I have a feeling I fixed something
to do with running words together fairly recently, but I'm not seeing
it in the ChangeLog.

> I could send a sample document that produces the error, if that helps.

That would be useful if you have something you don't mind making public.
Bonus points if you're happy to license it for use in a testsuite!

Cheers,
Olly