[Xapian-discuss] [ANN] mu-0.2, maildir indexer/searcher with xapian support
djcb
djcb.bulk at gmail.com
Thu Sep 11 18:17:22 BST 2008
On Thu, 11 Sep 2008, Olly Betts wrote:
> > Mu uses Xapian for content searches (and SQLite for metadata search), so
> > you can do something like:
>
> Why not just use Xapian for everything? It seems you must be doing some
> complicated post-processing (or using SVN trunk's Xapian::PostingSource)
> currently...
That is a good question; while adding the Xapian support it crossed my
mind a couple of times. There seems to be a bit of a dichotomy when
querying now, and it seems most things should be doable with just
Xapian. Anyway, there are two reasons for doing it this way:
a) Very practical: I started with SQLite; I did not even know something
like Xapian existed one month ago.
b) In the not-too-distant future I'd like to be able to generate some
aggregate information about queries; so after you search for all
mails containing 'wombat OR unicorn', you could get information like:
- the oldest/newest mail that matched; the average size
- number of messages per sender;
- number of messages per thread;
- average number of To:, Cc: recipients
Not sure if all of these are that useful, but in general SQL seems a
bit better at expressing non-literal search criteria, i.e. searches
that depend on search results -- joins and so on.
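A rough sketch of how such aggregates could be computed over a set of
matched IDs, using Python's sqlite3 module. The `message` table and its
`sender`, `date` and `size` columns are hypothetical and are not mu's
actual schema; the ID list stands in for whatever the content search
returned.

```python
import sqlite3

def aggregate_matches(conn, ids):
    """Summarise the messages whose IDs came back from a content search.

    Assumes a hypothetical table message(id, sender, date, size).
    """
    marks = ",".join("?" * len(ids))  # one placeholder per matched ID
    # Oldest/newest date and average size over all matches.
    overall = conn.execute(
        f"SELECT MIN(date), MAX(date), AVG(size) "
        f"FROM message WHERE id IN ({marks})", ids).fetchone()
    # Number of matched messages per sender.
    per_sender = conn.execute(
        f"SELECT sender, COUNT(*) FROM message WHERE id IN ({marks}) "
        "GROUP BY sender ORDER BY sender", ids).fetchall()
    return overall, per_sender

# Tiny in-memory database with made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message "
             "(id INTEGER PRIMARY KEY, sender TEXT, date INTEGER, size INTEGER)")
conn.executemany("INSERT INTO message VALUES (?, ?, ?, ?)", [
    (1, "alice@example.com", 100, 2000),
    (2, "bob@example.com", 200, 4000),
    (3, "alice@example.com", 300, 6000),
])

# Pretend the content search matched messages 1 and 3.
overall, per_sender = aggregate_matches(conn, [1, 3])
```

Per-thread counts or average recipient counts would follow the same
pattern, given suitable columns or a join against a recipients table.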
Anyhow, I am not really convinced yet what the best way is. Obviously,
only using Xapian has its advantages too. I'll think about it for some
days :)
(BTW: post-processing is pretty easy; I store the SQLite database IDs in
the Xapian DB, and simply add the IDs of the matching docs in a
'WHERE message.id IN (....)')
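The post-processing step could look roughly like the sketch below,
again in Python with sqlite3. The `message` table, its `subject` column
and the `fetch_matched` helper are assumptions made for illustration;
the ID list stands in for the values read back from the Xapian match
set. The IDs are passed as bound parameters in chunks, because SQLite
limits the number of parameters per statement (999 by default in older
releases).

```python
import sqlite3

def fetch_matched(conn, ids, chunk=500):
    """Fetch rows for the given IDs via 'WHERE id IN (...)', in chunks.

    Hypothetical helper; assumes a table message(id, subject).
    """
    rows = []
    for i in range(0, len(ids), chunk):
        part = ids[i:i + chunk]
        marks = ",".join("?" * len(part))  # placeholders, not string-pasted IDs
        rows += conn.execute(
            f"SELECT id, subject FROM message WHERE id IN ({marks})",
            part).fetchall()
    return rows

# Tiny in-memory database with made-up rows, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT)")
db.executemany("INSERT INTO message VALUES (?, ?)",
               [(1, "wombat sighting"), (2, "lunch"), (3, "unicorn news")])

# Pretend Xapian returned documents carrying SQLite IDs 3 and 1.
matched = fetch_matched(db, [3, 1])
```

Note that the rows come back in whatever order SQLite chooses, so any
relevance ranking from the content search has to be reapplied by the
caller.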
Best wishes,
Dirk.
--
-----------------------------------------------
Dirk-Jan C. Binnema <djcb at djcbsoftware.nl>
blog: http://www.djcbsoftware.nl/ChangeLog (NL)
http://djcbflux.blogspot.com (EN)
chat: djcb at jabber.org
-----------------------------------------------