[Xapian-tickets] [Xapian] #346: Python 3 support

Xapian nobody at xapian.org
Sun Jun 17 14:03:15 BST 2012


#346: Python 3 support
--------------------------------------+-------------------------------------
 Reporter:  olly                      |       Owner:  richard  
     Type:  defect                    |      Status:  assigned 
 Priority:  highest                   |   Milestone:  1.3.2    
Component:  Xapian-bindings (Python)  |     Version:  SVN trunk
 Severity:  normal                    |    Keywords:           
Blockedby:                            |    Platform:  All      
 Blocking:                            |  
--------------------------------------+-------------------------------------

Comment(by olly):

 A global "encoding switch" doesn't really seem workable to me.

 For starters, Xapian::sortable_unserialise() '''definitely''' needs to be
 passed bytes.  It takes a binary string as returned by
 Xapian::sortable_serialise(), so passing a Unicode string converted to
 UTF-8 makes no sense at all, and having to convert your binary input to
 Unicode in order to pass it isn't going to work well.  So either there
 will need to be exceptions which the switch doesn't affect, or else some
 settings of the switch will render such functions unusable.
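
 As a rough illustration of why binary values can't be routed through
 Unicode, here is a minimal Python 3 sketch.  It does not use Xapian at
 all: struct.pack() is only a stand-in for a binary serialised value, and
 survives_utf8() is a hypothetical helper invented for this example.

```python
import struct

# Stand-in for a binary serialised value: an 8-byte big-endian double.
# Assumption: this is NOT the format Xapian::sortable_serialise() produces;
# it is just convenient arbitrary binary data.
blob = struct.pack(">d", 1.5)


def survives_utf8(data):
    """Return True if `data` can pass through a Unicode string unchanged."""
    try:
        return data.decode("utf-8").encode("utf-8") == data
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8, so there is no str to pass at all.
        return False


text_ok = survives_utf8("café".encode("utf-8"))  # genuine UTF-8 text
blob_ok = survives_utf8(blob)                    # arbitrary binary data
```

 Here text_ok is True and blob_ok is False: the text round-trips, but the
 binary value can't even be decoded.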

 The problem with "manage everything manually" when a single setting of the
 encoding isn't enough is that the case of wanting both text and binary
 data isn't at all esoteric.  For example, look at omindex, which indexes
 text but adds a (binary) document checksum as a value for collapsing
 identical documents.  Alternatively, you could change the global encoding
 each time you want to pass the other kind of data, but then you're
 flipping it back and forth for every document you index.

 And if you want to write something reusable you'll find yourself having to
 save the encoding state, and then set it to what you want, do your calls
 to Xapian, and then restore the encoding state.  That really seems worse
 than having to specify the encoding at every call site.
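
 To make the save/set/restore dance concrete, here is a sketch in Python 3.
 Everything in it is hypothetical: no such global switch exists in the
 bindings, and set_encoding(), encoding() and add_value() are names
 invented purely for illustration (a dict stands in for a document).

```python
from contextlib import contextmanager

# Hypothetical module-level switch (an assumption for this sketch only).
_encoding = "utf-8"


def set_encoding(name):
    """Set the hypothetical global encoding and return the previous one."""
    global _encoding
    previous = _encoding
    _encoding = name
    return previous


@contextmanager
def encoding(name):
    """Save the global state, set it, and restore it on exit."""
    previous = set_encoding(name)
    try:
        yield
    finally:
        set_encoding(previous)


def add_value(doc, slot, value):
    """Hypothetical wrapped call whose behaviour depends on the switch."""
    if _encoding == "binary":
        if not isinstance(value, bytes):
            raise TypeError("binary mode requires bytes")
        doc[slot] = value
    else:
        doc[slot] = value.encode(_encoding)


# A caller has left the switch set to "binary"; reusable code can't rely
# on that, so it has to wrap each group of calls itself.
set_encoding("binary")
doc = {}
with encoding("utf-8"):
    add_value(doc, 0, "some text")
add_value(doc, 1, b"\x00\xff")
state_after = _encoding
```

 Every reusable function ends up needing a wrapper like this around its
 calls, which is at least as much noise as saying at each call site whether
 the argument is text or binary.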

 In terms of text encodings, xapian-core only really supports UTF-8.  In a
 lot of places, you just get back what you put in, but anything that
 actually looks at the contents as text expects UTF-8.  So the only
 settings of the "encoding" switch which make sense on the C++ side are
 UTF-8 and binary data.
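
 Since those are the only two cases, one way a wrapper could distinguish
 them in Python 3 is by the type of each argument rather than by a global
 setting.  This is only a sketch of that idea, not a description of what
 the bindings do; to_core() is a hypothetical helper.

```python
def to_core(value):
    """Convert a Python 3 argument to the byte string handed to C++.

    Hypothetical helper: bytes are treated as binary data and passed
    through untouched; str is treated as text and encoded as UTF-8.
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


as_text = to_core("naïve")       # text: encoded to UTF-8 bytes
as_binary = to_core(b"\xf8\x00")  # binary: returned unchanged
```

 With this approach the text-plus-checksum case needs no mode changes: the
 text is passed as str and the checksum as bytes.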

 In response to Barry:

 >> I guess if you're trying to get everyone onto Python 3 for Ubuntu,
 >> you've looked at quite a few upstreams already - has a standard pattern
 >> for resolving such situations already emerged?

 >> Well, the only upstream I currently have to support is software-center,
 >> since we're only converting to Python 3 on the standard desktop image
 >> (for 12.10 anyway). So its use case will be my primary driver. We have
 >> maybe a dozen reverse depends on python-xapian in total.

 I meant other upstream Python projects you need to get onto Python 3, not
 reverse dependencies of python-xapian (though getting the reverse
 dependencies ported to "python3-xapian" could take significant work,
 especially if we totally throw out compatibility with the current
 python-xapian API).  I was wondering if there was a standard pattern for
 wrapping an interface like this.

 > One big question is this: what version of Python 2 do you still need to
 > support (please tell me, nothing earlier than 2.6 :), and how should we
 > handle cases where the API has to change for Python 3?

 Adding Python 3 support really shouldn't change which Python 2 versions
 are supported.  If the changes are invasive enough that this is really an
 issue, then I suggest we split the Python 3 bindings into a separate
 subdirectory - we already have different versions of all the tests, though
 currently they're mostly the result of 2to3, plus a few tweaks.
 Especially for 1.2.x, we really don't want to risk breaking Python 2
 support - our general policy is not to break compatibility with a version
 of other software within a Xapian stable release series without a very
 good reason.

 We've not made a final decision on the versions of things we'll aim to
 support in 1.4.x - 2.6 may well be a sane cut-off there, but that doesn't
 really help since you want this support in 1.2.x.

 As for the minimum 3.x to support, I certainly wouldn't worry about 3.0 -
 my impression is that the early adopters who actually tried it will have
 quickly moved on to the new cutting edge, while the conservative types
 will have feared the ".0".

-- 
Ticket URL: <http://trac.xapian.org/ticket/346#comment:35>
Xapian <http://xapian.org/>