[Xapian-tickets] [Xapian] #346: Python 3 support
Xapian
nobody at xapian.org
Tue Jul 24 23:35:00 BST 2012
#346: Python 3 support
--------------------------------------+-------------------------------------
Reporter: olly | Owner: richard
Type: defect | Status: assigned
Priority: highest | Milestone: 1.3.2
Component: Xapian-bindings (Python) | Version: SVN trunk
Severity: normal | Keywords:
Blockedby: | Platform: All
Blocking: |
--------------------------------------+-------------------------------------
Comment(by barry):
Here's another fundamental problem. get_description() for any type returns
a std::string, and these can contain arbitrary byte sequences. The SWIG
bindings do this:
%rename(__str__) get_description;
but this seems wrong for Python 3. If get_description() returns a
std::string containing bytes that are not valid UTF-8, as happens with
Query objects in smoketest3.py, then str() of that object will raise a
UnicodeDecodeError.
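The failure can be reproduced without Xapian at all. The following is a
minimal sketch in plain Python 3; raw_description is a made-up stand-in for
the bytes a std::string description might carry, not real output from
smoketest3.py.

```python
# Hypothetical description bytes; 0xff and 0xfe never occur in valid UTF-8.
raw_description = b"Xapian::Query(\xff\xfe)"

try:
    # Roughly what a UTF-8-decoding std::string typemap would attempt.
    raw_description.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    print("decode failed:", exc.reason)
```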
What you really want in Python 3 is for get_description() to be
%rename(__bytes__) so that you can then do bytes(my_query) and be assured
of getting a reasonable answer. In Python 3 you cannot return a bytes
object from __str__().
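To illustrate the difference between the two special methods, here is a
small sketch using pure-Python classes. FakeQuery and BadQuery are
hypothetical stand-ins for a wrapped object, not anything generated by the
SWIG bindings.

```python
class FakeQuery:
    """Hypothetical stand-in for a wrapped object holding raw bytes."""

    def __init__(self, raw):
        self._raw = raw

    def __bytes__(self):
        # bytes(obj) may return arbitrary byte sequences unchanged.
        return self._raw


class BadQuery(FakeQuery):
    def __str__(self):
        # Returning bytes here is what Python 3 refuses to accept.
        return self._raw


round_tripped = bytes(FakeQuery(b"\xff\xfe"))

try:
    str(BadQuery(b"\xff\xfe"))
    str_rejected_bytes = False
except TypeError:
    str_rejected_bytes = True
```

bytes() hands back the raw data untouched, while str() raises a TypeError
as soon as __str__() returns something other than a str.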
So what should __str__() do when the description contains bytes that are
not valid UTF-8 (i.e. arbitrary bytes)? You could argue that raising a
UnicodeDecodeError still makes sense, or, since this is just the str() of
an object, it could decode with the 'replace' error handler, which
substitutes U+FFFD for the bogus bytes instead of failing. Of course, this
means that smoketest3.py is not correct for Python 3 anyway, and we still
need a better %rename for get_description. On top of all that, we still
need a better %typemap for std::string, or we need to raise the issue in
the SWIG tracker.
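One possible shape for the lenient option, sketched in pure Python rather
than as a SWIG typemap. LenientQuery is a hypothetical class used only for
illustration; how the bindings would actually expose this is still open.

```python
class LenientQuery:
    """Hypothetical wrapper: exact bytes via bytes(), lossy text via str()."""

    def __init__(self, raw):
        self._raw = raw

    def __bytes__(self):
        # Exact description, suitable for byte-for-byte comparison in tests.
        return self._raw

    def __str__(self):
        # Invalid bytes become U+FFFD instead of raising UnicodeDecodeError.
        return self._raw.decode("utf-8", errors="replace")


query = LenientQuery(b"Query(\xff)")
text = str(query)
exact = bytes(query)
```

With this split, str() never raises but loses information, so a test that
needs the exact description would have to compare bytes(query) instead.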
--
Ticket URL: <http://trac.xapian.org/ticket/346#comment:48>
Xapian <http://xapian.org/>