[Xapian-discuss] acts_as_xapian, pre-release (Ruby on Rails)

Francis Irving francis at flourish.org
Fri Apr 25 11:51:00 BST 2008


Hi all,

I've been using Ruby on Rails, and finally got fed up with Solr/Lucene. So I've
made acts_as_xapian. An early version is available here:

https://secure.mysociety.org/cvstrac/dir?d=mysociety/foi/vendor/plugins/acts_as_xapian

It works, but isn't deployed on a live site yet (it will be on our UK Freedom
of Information request filing/archiving site, www.whatdotheyknow.com, soon).

I've put the parts of the documentation which compare it to acts_as_solr at
the bottom of this email.

Any suggestions as to features it should have that would be easy to add? It's
got sort, date range, collapse, spelling, offline indexing, and integration
with Rails models. Anything else big/obvious that most people will need?
Or anything easy to add and genius looking (like spelling was!)?

If anyone can try it out (patches welcome!), that would be super awesome.
It's really not been used much yet as I only made it the day before yesterday,
so buyer beware.

Francis
mySociety

P.S. Is adding highlighting to QueryParser on the development plan? By that I
mean a function that takes some text, a number of words and a highlighting
prefix/postfix, and returns an extract of the text highlighted for the query.
I really feel it belongs in QueryParser, as doing it well depends
fundamentally on the query format (i.e. quoting, prefixes, operators, and
even ranges), and nearly every search application needs it.
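
To make the request concrete, here is a rough pure-Ruby sketch of the kind of
function meant. The name highlighted_extract and its parameters are made up
for illustration; it is not part of Xapian or acts_as_xapian. It only matches
plain words, which is exactly the limitation that doing this inside
QueryParser would remove.

```ruby
# Hypothetical helper, for illustration only. It treats the query as a bag of
# plain words and knows nothing about quoting, prefixes, operators or ranges.
def highlighted_extract(text, query, max_words, prefix = "<b>", postfix = "</b>")
  terms = query.downcase.scan(/\w+/)
  words = text.split

  # A word matches if, lowercased and stripped of punctuation, it is a query term.
  matches = lambda { |w| terms.include?(w.downcase.gsub(/\W/, "")) }

  # Start the extract roughly half a window before the first match
  # (or at the beginning of the text if nothing matches).
  first_hit = words.index { |w| matches.call(w) } || 0
  start = [first_hit - max_words / 2, 0].max
  window = words[start, max_words] || []

  window.map { |w| matches.call(w) ? "#{prefix}#{w}#{postfix}" : w }.join(" ")
end

puts highlighted_extract("The quick brown fox jumps over the lazy dog", "fox", 3, "[", "]")
# prints: brown [fox] jumps
```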

# Comparison to acts_as_solr (as on 24 April 2008)
# ==========================
#
# * Offline indexing only mode - which is a minus if you want changes
# immediately reflected in the search index, and a plus if you were going to
# have to implement your own offline indexing anyway.
#
# * Collapsing - the equivalent of SQL's "group by". You can specify a field
# to collapse on, and only the most relevant result from each value of that
# field is returned, along with a count of how many there are in total.
# acts_as_solr doesn't have this.
#
# * No highlighting - Xapian can't return you text highlighted with a search query.
# You can try and make do with TextHelper::highlight. I found the highlighting
# in acts_as_solr didn't really understand the query anyway.
#
# * Date range searching - maybe this works in acts_as_solr, but I never found
# out how.
#
# * Spelling correction - "did you mean?" built in and just works.
#
# * Multiple models - acts_as_xapian searches multiple models if you like,
# returning them mixed up together by relevancy. This is like multi_solr_search,
# only it is the default mode of operation and is properly supported.
#
# * No daemons - However, if you have more than one web server, you'll need to
# work out how to use Xapian's remote backend http://xapian.org/docs/remote.html. 
#
# * One layer - full-powered Xapian is called directly from the Ruby, without
# Solr getting in the way whenever you want to use a new feature from Lucene.
#
# * No Java - an advantage if you're more used to working in the rest of the
# open source world. acts_as_xapian is pure Ruby and C++.
#
# * Xapian's awesome email list - the kids over at xapian-discuss are super
# helpful. Useful if you need to extend and improve acts_as_xapian. The
# Ruby bindings are mature and well maintained as part of Xapian.
# http://lists.xapian.org/mailman/listinfo/xapian-discuss
#
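
For anyone unsure what "collapsing" in the list above amounts to, here is a
small pure-Ruby sketch of the behaviour as described: keep the most relevant
result for each value of a field, plus a count per value. It does not call
Xapian at all; the Result struct, its fields and the collapse method are
assumptions made up for the example, and Xapian does this inside the match
rather than over a full result list.

```ruby
# Hypothetical result record: a title, the field we collapse on, and a
# relevance weight. None of these names come from acts_as_xapian.
Result = Struct.new(:title, :public_body, :weight)

# Group results by the given key, keep the highest-weighted result in each
# group, and report how many results shared that key. Most relevant first.
def collapse(results, &key)
  rows = results.group_by(&key).map do |value, group|
    { key: value, best: group.max_by(&:weight), count: group.size }
  end
  rows.sort_by { |row| -row[:best].weight }
end

results = [
  Result.new("Request A", "Home Office", 0.9),
  Result.new("Request B", "Home Office", 0.5),
  Result.new("Request C", "Treasury",    0.7),
]

collapse(results, &:public_body).each do |row|
  puts "#{row[:best].title} (#{row[:key]}, #{row[:count]} in total)"
end
# prints:
#   Request A (Home Office, 2 in total)
#   Request C (Treasury, 1 in total)
```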


