[Xapian-discuss] [ANN] mu-0.2, maildir indexer/searcher with xapian support
Olly Betts
olly at survex.com
Sat Sep 13 05:33:26 BST 2008
On Thu, Sep 11, 2008 at 08:17:22PM +0300, djcb wrote:
> b) In the not-too-distant future I'd like to be able to generate some
> aggregate information about queries; so after you search for all
> mails containing 'wombat OR unicorn', you could get information like:
> - the oldest/newest mail that matched; the average size
> - number of messages per sender;
> - number of messages per thread;
> - average number of To:, Cc: recipients
You can do these with a MatchSpy, but that work hasn't been released yet.
One advantage of this approach is you don't need to actually form the
full result set - you're scanning each match as it is found, and if it
doesn't rank highly enough, it can then simply be discarded.
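As a rough illustration of the idea only (plain Python, not the Xapian
MatchSpy API, which was unreleased at the time of writing; the names
SenderCountSpy and run_match and the document fields are hypothetical),
a spy sees every match as it streams past while only the top few are kept:

```python
import heapq
from collections import Counter


class SenderCountSpy:
    """Hypothetical spy: sees every match, keeps only running aggregates."""

    def __init__(self):
        self.per_sender = Counter()
        self.total_size = 0
        self.seen = 0
        self.oldest = None
        self.newest = None

    def __call__(self, doc, weight):
        # Called once per matching document, whether or not it is kept.
        self.per_sender[doc["sender"]] += 1
        self.total_size += doc["size"]
        self.seen += 1
        date = doc["date"]
        if self.oldest is None or date < self.oldest:
            self.oldest = date
        if self.newest is None or date > self.newest:
            self.newest = date

    def average_size(self):
        return self.total_size / self.seen if self.seen else 0.0


def run_match(scored_docs, spy, top_k):
    """Stream (weight, doc) pairs; show each to the spy, keep only top_k."""
    heap = []  # min-heap of (weight, sequence, doc); sequence breaks ties
    for seq, (weight, doc) in enumerate(scored_docs):
        spy(doc, weight)
        item = (weight, seq, doc)
        if len(heap) < top_k:
            heapq.heappush(heap, item)
        elif weight > heap[0][0]:
            heapq.heapreplace(heap, item)
        # Otherwise the match ranks too low and is simply discarded.
    ranked = sorted(heap, key=lambda item: (-item[0], item[1]))
    return [doc for _weight, _seq, doc in ranked]
```

Memory use is bounded by top_k plus the size of the aggregates, however
many documents match.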
> Not sure if all of these are so useful, but in general SQL seems a
> bit better at expressing non-literal search criteria, ie. searches
> that depend on search results -- joins and so on.
Again unreleased, but PostingSource allows a sort of join-like operation
with an external source.
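A minimal sketch of what "join-like" can mean here, assuming the external
source can hand over document IDs in ascending order. This is plain
Python rather than the PostingSource interface itself, and ListSource and
intersect are hypothetical names:

```python
import bisect


class ListSource:
    """Hypothetical external source yielding ascending document IDs."""

    def __init__(self, docids):
        self._ids = sorted(set(docids))
        self._pos = 0

    def at_end(self):
        return self._pos >= len(self._ids)

    def get_docid(self):
        return self._ids[self._pos]

    def next(self):
        self._pos += 1

    def skip_to(self, target):
        # Advance to the first docid >= target without visiting the rest.
        self._pos = bisect.bisect_left(self._ids, target, self._pos)


def intersect(a, b):
    """Return docids present in both sources, leapfrogging with skip_to."""
    out = []
    while not a.at_end() and not b.at_end():
        x, y = a.get_docid(), b.get_docid()
        if x == y:
            out.append(x)
            a.next()
            b.next()
        elif x < y:
            a.skip_to(y)
        else:
            b.skip_to(x)
    return out
```

One side could be the postings for the text query and the other the IDs
returned by an external lookup; only documents present in both survive.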
> (BTW: post-processing is pretty easy; I store the SQLite database IDs in
> the Xapian DB, and simply add the ID of match docs in a
> 'WHERE message.id IN (....)')
That's going to suck when you have millions of matching documents
though.
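One common workaround, not something proposed in this thread, is to load
the matching IDs into a temporary table and join against it instead of
building a huge IN (...) list. A sketch using Python's sqlite3 module,
assuming a hypothetical message table with id and sender columns:

```python
import sqlite3


def fetch_matching_rows(conn, matched_ids, batch_size=10000):
    """Join the (hypothetical) message table against a temp table of IDs."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS matched (id INTEGER PRIMARY KEY)"
    )
    cur.execute("DELETE FROM matched")  # clear IDs left by an earlier query

    batch = []
    for mid in matched_ids:
        batch.append((mid,))
        if len(batch) >= batch_size:
            cur.executemany(
                "INSERT OR IGNORE INTO matched (id) VALUES (?)", batch
            )
            batch.clear()
    if batch:
        cur.executemany("INSERT OR IGNORE INTO matched (id) VALUES (?)", batch)

    cur.execute(
        "SELECT message.id, message.sender "
        "FROM message JOIN matched ON matched.id = message.id "
        "ORDER BY message.id"
    )
    return cur.fetchall()
```

This keeps the SQL statement small, but every ID still has to be copied
across, so the cost still grows with the number of matches.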
Cheers,
Olly