[Xapian-discuss] Python bindings and unicode strings

Deron Meranda deron.meranda at gmail.com
Thu Aug 30 20:02:22 BST 2007


I understand that the Xapian core uses UTF-8, but is there a way to
get the Python bindings to always work with Python's native unicode
string type so that the underlying UTF-8 is not exposed?  It appears
that I can store unicode strings, like this:

>>> document.set_term(u'panach\u00e9')

but when I get them back out they are plain byte strings (UTF-8
encoded) rather than unicode strings:

>>> [t.term for t in document.allterms()]
['panach\xc3\xa9']

I would have expected to get [u'panach\u00e9'] out instead.
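
As a stopgap, the terms can be decoded by hand on the way out. The
following is only a minimal sketch, assuming every term coming back is
valid UTF-8; to_unicode is a hypothetical helper (not part of the
Xapian bindings), and raw_terms stands in for the byte strings shown
above.

```python
def to_unicode(term, encoding="utf-8"):
    """Return term as a unicode string.

    Byte strings are decoded with the given encoding (UTF-8 assumed);
    anything that is already unicode is passed through unchanged.
    """
    if isinstance(term, bytes):
        return term.decode(encoding)
    return term


# Stand-in for the UTF-8 byte strings handed back by the bindings.
raw_terms = [b'panach\xc3\xa9']

# Decode each term so the rest of the program only sees unicode.
decoded = [to_unicode(t) for t in raw_terms]
```

This keeps the UTF-8 detail in one place, but every call site that
reads terms still has to remember to use it, which is why I would
prefer the bindings to do it.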

Deron Meranda


