[Xapian-tickets] [Xapian] #346: Python 3 support
Xapian
nobody at xapian.org
Mon Jul 23 23:30:09 BST 2012
#346: Python 3 support
--------------------------------------+-------------------------------------
Reporter: olly | Owner: richard
Type: defect | Status: assigned
Priority: highest | Milestone: 1.3.2
Component: Xapian-bindings (Python) | Version: SVN trunk
Severity: normal | Keywords:
Blockedby: | Platform: All
Blocking: |
--------------------------------------+-------------------------------------
Comment(by barry):
I've gotten back to this and spent more time on the swig-based bindings.
It's not going well. One of the big problems I have is that I'm not sure
swig's Python 3 mappings for std::string are entirely right, but maybe I
don't know swig or c++ that well. An example: iiuc, std::string is just a
container for bytes not unicodes, but swig wants to convert between
unicodes and std::strings. E.g. SWIG_AsCharPtrAndSize() only checks its
first arg for PyUnicode-ness, not PyBytes-ness, but I don't see any reason
why it shouldn't accept bytes, or maybe *only* bytes. In Python 3, it
then tries to decode it as a utf8 string, but there's no reason why it
must be utf8 encoded data.
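
To illustrate the behaviour argued for above, here is a minimal Python sketch of a conversion that passes bytes through untouched and only encodes unicode strings. The helper name as_term_bytes and its encoding parameter are hypothetical; this is not swig's actual typemap, only a model of what "accept bytes" could mean at the Python level.

```python
def as_term_bytes(value, encoding="utf-8"):
    """Hypothetical helper: return the bytes a std::string would hold."""
    if isinstance(value, bytes):
        # Bytes are passed through as-is; no assumption about their encoding.
        return value
    if isinstance(value, str):
        # Unicode strings need an explicit encoding; utf-8 is assumed here.
        return value.encode(encoding)
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


raw_term = as_term_bytes(b"bar\xa3")   # arbitrary bytes survive unchanged
text_term = as_term_bytes("bar\xa3")   # the str is encoded, giving different bytes
```

Note that the two calls produce different byte sequences, which is why the choice between accepting bytes and accepting str matters for existing indexed data.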
As another example, look at smoketest3.py. There's code that does the
equivalent of xapian.Query(xapian.Query.OP_OR, ('foo', 'bar\xa3')). A
strict translation of that to Python 3 ought to be
xapian.Query(xapian.Query.OP_OR, (b'foo', b'bar\xa3')) which incidentally
will work exactly the same in Python >= 2.6. With some hackery to accept
the bytes object, I can make this work, but then swig once again gets in
the way because it wants to convert the Query's .get_description()
std::string to a Python 3 unicode using utf-8. It's the moral equivalent
of b'bar\xa3'.decode('utf-8') which fails because it isn't valid utf-8.
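
The decoding failure described above can be reproduced in plain Python 3 without xapian or swig. The sketch below also shows two standard error handlers as possible ways around it; whether either is appropriate for get_description() is an open question, not something the bindings are known to do.

```python
raw = b"bar\xa3"  # the same bytes as in the smoketest example

# Strict utf-8 decoding fails: 0xa3 on its own is not a valid utf-8 sequence.
try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# errors="replace" always succeeds but loses the original byte.
lossy = raw.decode("utf-8", errors="replace")

# errors="surrogateescape" keeps the byte recoverable on re-encoding.
escaped = raw.decode("utf-8", errors="surrogateescape")
restored = escaped.encode("utf-8", errors="surrogateescape")
```

Returning bytes from such methods would avoid the question entirely, at the cost of a less convenient API for callers who only ever use utf-8 text.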
Suggestions welcome.
--
Ticket URL: <http://trac.xapian.org/ticket/346#comment:46>
Xapian <http://xapian.org/>