[Xapian-tickets] [Xapian] #346: Python 3 support

Xapian nobody at xapian.org
Tue Jul 24 23:35:00 BST 2012


#346: Python 3 support
--------------------------------------+-------------------------------------
 Reporter:  olly                      |       Owner:  richard  
     Type:  defect                    |      Status:  assigned 
 Priority:  highest                   |   Milestone:  1.3.2    
Component:  Xapian-bindings (Python)  |     Version:  SVN trunk
 Severity:  normal                    |    Keywords:           
Blockedby:                            |    Platform:  All      
 Blocking:                            |  
--------------------------------------+-------------------------------------

Comment(by barry):

 Here's another fundamental problem.  get_description() for any type
 returns a std::string, and these can contain arbitrary byte sequences.
 The SWIG bindings do this:

 %rename(__str__) get_description;

 but this seems wrong for Python 3.  If get_description() returns a
 std::string containing bytes that are not valid UTF-8, as happens with
 Query objects in smoketest3.py, then str() of that object will raise a
 UnicodeDecodeError.  What you really want in Python 3 is for
 get_description() to be %rename(__bytes__), so that you can do
 bytes(my_query) and be assured of getting a reasonable answer.  In
 Python 3 you cannot return a bytes object from __str__().
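
 To make the failure modes concrete, here is a minimal pure-Python sketch.
 It does not use the real bindings: the three classes and the sample byte
 string are hypothetical stand-ins for a wrapped object whose description
 holds bytes that are not valid UTF-8.

```python
# Hypothetical stand-ins for a SWIG-wrapped object; NOT the real Xapian bindings.
RAW_DESCRIPTION = b"Query(\xfe\xff)"  # assumed sample; 0xfe/0xff never occur in UTF-8


class StrReturnsBytes:
    def __str__(self):
        # Python 3 requires __str__ to return str, so this is rejected.
        return RAW_DESCRIPTION


class StrDecodesStrictly:
    def __str__(self):
        # Strict decoding fails on the invalid bytes.
        return RAW_DESCRIPTION.decode("utf-8")


class HasBytes:
    def __bytes__(self):
        # bytes(obj) calls this and hands the raw bytes back untouched.
        return RAW_DESCRIPTION


def error_name(func, obj):
    """Return the name of the exception func(obj) raises, or None."""
    try:
        func(obj)
    except Exception as exc:
        return type(exc).__name__
    return None


print(error_name(str, StrReturnsBytes()))     # TypeError
print(error_name(str, StrDecodesStrictly()))  # UnicodeDecodeError
print(bytes(HasBytes()) == RAW_DESCRIPTION)   # True
```

 Only the __bytes__ route returns the description without either raising
 or altering it.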

 So what should __str__() do when the query contains bytes that are not
 valid UTF-8 (i.e. arbitrary bytes)?  You could argue that raising a
 UnicodeDecodeError still makes sense, or, since this is just the str() of
 an object, it could decode with the 'replace' error handler, which
 substitutes U+FFFD for the bogus bytes rather than failing.  Of course,
 this means that smoketest3.py is not correct for Python 3 anyway, and we
 still need a better %rename for get_description.  On top of all that, we
 still need a better %typemap for std::string, or to raise the issue in
 the SWIG tracker.
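
 As a rough sketch of the lenient option, assuming a hypothetical wrapper
 class (LenientDescription is not part of the bindings) and a made-up
 sample byte string, __str__() could decode with 'replace' while
 __bytes__() keeps the exact data available:

```python
class LenientDescription:
    """Hypothetical wrapper; not part of the real Xapian bindings."""

    def __init__(self, raw):
        self._raw = raw

    def __bytes__(self):
        # Lossless: the caller gets exactly what the C++ side produced.
        return self._raw

    def __str__(self):
        # Lossy but never raises: invalid bytes become U+FFFD.
        return self._raw.decode("utf-8", "replace")


raw = b"Query(caf\xe9)"  # assumed sample; a lone 0xe9 is not valid UTF-8
desc = LenientDescription(raw)

print(str(desc) == "Query(caf\ufffd)")  # True
print(bytes(desc) == raw)               # True

# Other standard error handlers make different trade-offs:
print(raw.decode("utf-8", "ignore"))            # Query(caf)
print(raw.decode("utf-8", "backslashreplace"))  # Query(caf\xe9)
```

 Note that 'replace' marks each bad byte with U+FFFD, whereas 'ignore'
 drops it silently; 'backslashreplace' keeps the byte value visible, which
 may suit a debugging-oriented description better.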

-- 
Ticket URL: <http://trac.xapian.org/ticket/346#comment:48>
Xapian <http://xapian.org/>