Xapian-Haystack is available in Python 3

Jorge Cardoso Leitão jorgecarleitao at gmail.com
Sat Nov 14 12:27:32 GMT 2015


Hi,

I'm the current developer of Xapian-Haystack, and I'm glad to announce that
we have finally been able to install Xapian-Haystack and pass all its tests
with Xapian 1.3.3 on both Python 2 and 3, which means that Xapian-Haystack
now supports Python 3. This naturally would not have been possible without
your efforts to bring the Xapian bindings to Python 3, and I thank you for
that.

Here I report some of the "features" that hindered this task from our
perspective, so that the Xapian devs are aware of the kind of problems a
user may face in this process.

1. The dev version of Xapian uses different names for its tools, namely
xapian-config and delve: xapian-config became xapian-config-1.3, and delve
became xapian-delve-1.3.

Suggestion: make the names independent of the parity of the minor version.
I don't see a compelling reason to force users to write:

import xapian

XAPIAN_VERSION = [int(x) for x in xapian.__version__.split('.')]

if XAPIAN_VERSION[1] <= 2:
    # old versions use 'delve'
    executable = 'delve'
else:
    # new versions use 'xapian-delve'
    executable = 'xapian-delve'

# dev versions (odd minor) use a '-MAJOR.MINOR' suffix
if XAPIAN_VERSION[1] % 2 != 0:
    executable += '-%d.%d' % tuple(XAPIAN_VERSION[:2])

or, in languages where version comparison is trickier, our current (bad)
solution is

if [ "$VERSION" = "1.3.3" ]; then
    XAPIAN_CONFIG="$VIRTUAL_ENV/bin/xapian-config-1.3"
else
    XAPIAN_CONFIG=
fi
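
For illustration, the parity rule from item 1 can be applied to
xapian-config without hard-coding 1.3.3. This is only a minimal sketch: the
function name `xapian_config_name` is hypothetical, and it assumes nothing
beyond the naming described above (stable series keep the plain name, dev
series with an odd minor get a "-MAJOR.MINOR" suffix).

```python
def xapian_config_name(version):
    """Guess the xapian-config executable name for a version like '1.3.3'.

    Hypothetical helper: assumes dev series (odd minor) append a
    '-MAJOR.MINOR' suffix and stable series use the plain name.
    """
    major, minor = (int(part) for part in version.split('.')[:2])
    if minor % 2 != 0:
        # dev series, e.g. 1.3.x -> xapian-config-1.3
        return 'xapian-config-%d.%d' % (major, minor)
    # stable series, e.g. 1.2.x -> xapian-config
    return 'xapian-config'
```

This still leaves every user re-implementing the rule, which is the point of
the suggestion above.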

2. Almost all output of the Xapian bindings is a byte string that can be
converted to unicode via `decode('utf-8')`, which is great. Yet this is
still not perfect because, e.g., the output of
`xapian.sortable_serialise(12.345)` is not decodable as utf-8. Thus,
depending on the type of the field (string, int, float) on the user side,
its value will be either a unicode string or a byte string, which goes
against Python idiom.

Suggestion: make the whole public interface of Xapian in Python return
either unicode or utf-8 decodable strings. IMO, at the current state of
Python development, where unicode is *the* standard, it is the bindings'
responsibility to return unicode. If that is not possible in the Xapian
bindings, at least consider making all output utf-8 decodable, so
a user can be sure that any Xapian public interface allows .decode('utf-8').
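
To make the problem in item 2 concrete, this is roughly the kind of
defensive wrapper a user ends up writing on Python 3. It is a sketch only:
`to_text` is a hypothetical helper, not part of Xapian or Xapian-Haystack,
and it deliberately does not call into the bindings.

```python
def to_text(value, encoding='utf-8'):
    """Decode a bytes value if possible; otherwise return it unchanged.

    Hypothetical helper: the caller still has to cope with two possible
    return types, which is exactly the inconvenience described above.
    """
    if not isinstance(value, bytes):
        # already text (or some other type): nothing to decode
        return value
    try:
        return value.decode(encoding)
    except UnicodeDecodeError:
        # e.g. a serialised number: keep the raw bytes
        return value
```

For example, `to_text(b'caf\xc3\xa9')` gives a text string, while a value
containing a byte such as `\xff`, which never appears in valid utf-8, comes
back as bytes.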

3. In Xapian-Haystack we use Travis CI to build against different Python,
Django and Xapian versions. Installing Xapian takes 95% of the total build
time. Any suggestions on how to reduce this? For concreteness, here is the
installation script we are using:
https://github.com/notanumber/xapian-haystack/blob/master/install_xapian.sh

Again, thanks for your work,
Regards,
Jorge