[Xapian-discuss] Re: 1.0 news, and a call for testing

Richard Boulton richard at lemurconsulting.com
Sun May 6 13:56:55 BST 2007

Jean-Francois Dockes wrote:
> Is there somewhere a document describing what's new for application
> developers in the 1.0 xapian-core API?

There isn't yet, but we'll certainly have to put such a thing together 
before the release.

> Especially, there is a mention on the wiki of a "new unicode/utf-8 API in
> xapian-core". What's this?
> Do the stemmers now take utf-8?

Yes, they do.  Not only that, but there's a new piece of code which 
parses a UTF-8 string into terms.  Accent handling has changed (Xapian's 
query parser used to normalise accents in a slightly dubious way). 
Also, the English stemmer now understands apostrophes, so there is code 
to normalise the different representations of apostrophes to the 
representation that the stemmer understands, and the term generator will 
generate appropriate terms.
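
To give a rough idea of what the apostrophe normalisation involves, here 
is a minimal sketch in Python.  It is not Xapian's actual code: the 
function name and the particular set of code points mapped below are 
assumptions chosen for illustration only.

```python
# Illustrative sketch only; not taken from xapian-core.
# Assumption: these are the apostrophe-like code points we want to fold
# into the plain ASCII apostrophe before handing a word to a stemmer.
APOSTROPHE_VARIANTS = {
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u201b": "'",  # SINGLE HIGH-REVERSED-9 QUOTATION MARK
    "\u02bc": "'",  # MODIFIER LETTER APOSTROPHE
}

# Precompute a translation table so each call is a single pass over the text.
_APOSTROPHE_TABLE = str.maketrans(APOSTROPHE_VARIANTS)


def normalise_apostrophes(text: str) -> str:
    """Return text with apostrophe variants replaced by ASCII "'"."""
    return text.translate(_APOSTROPHE_TABLE)
```

With something like this applied first, "don’t" (typographic apostrophe) 
and "don't" (ASCII apostrophe) reach the stemmer as the same string, so 
they end up as the same term.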

> I am quite certain that this information passed in messages through my
> mailbox, but a small abstract with pointers to the detailed information
> (include files accepted...) would certainly be appreciated. 

We'll start generating one soon.  Perhaps it would be best if we did 
this on the wiki, so that we can avoid duplicated effort.

