[Xapian-discuss] using xapian for indexing mails
djcb
djcb.bulk at gmail.com
Sat Aug 30 08:19:39 BST 2008
Dear Xapian,
I am writing a little tool for indexing/searching email messages in
maildirs.
For indexing the message bodies, Xapian looks like an interesting
option, but I have some newbie questions. What I would *like* to do is
add the email bodies to the Xapian database, and then be able to
search them for some words.
I am looking at the Quickstart (http://xapian.org/docs/quickstart.html),
and it seems I have to create a Xapian::Document instance, then (1) add
document data with set_data and (2) add some search terms with
add_posting.
I could use the message path as the document data, but what about the
search terms? Should I split my body text in words, and add every single
one of them as a search term? That does not sound very attractive... It
seems that 'recoll' (which uses Xapian) is doing that though.
Or is there some easier way to simply provide blobs of text, and be
able to search for them later? I have the feeling I am misunderstanding
something...
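To make the "split the body into words" idea concrete, here is a minimal sketch in plain C++ with no Xapian dependency. The helper name split_into_terms and the tokenizing rules (lowercase, ASCII alphanumeric runs only) are assumptions for illustration, not anything the Quickstart prescribes. Each resulting (term, position) pair is the kind of thing that would then be handed to add_posting on the document.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Xapian): split a message body into
// lowercase ASCII-alphanumeric words, pairing each word with its 1-based
// position in the text. Bytes outside ASCII are treated as separators,
// so this is only a rough stand-in for a real tokenizer.
std::vector<std::pair<std::string, unsigned>>
split_into_terms(const std::string& body)
{
    std::vector<std::pair<std::string, unsigned>> terms;
    std::string word;
    unsigned pos = 0;

    for (unsigned char c : body) {
        if (std::isalnum(c)) {
            // Still inside a word: accumulate the lowercased character.
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            // A separator ends the current word; record it with its position.
            terms.emplace_back(word, ++pos);
            word.clear();
        }
    }
    // Flush a trailing word that was not followed by a separator.
    if (!word.empty())
        terms.emplace_back(word, ++pos);

    return terms;
}
```

With a Xapian::Document named doc, the loop over the result would presumably look like `for (const auto& t : split_into_terms(body)) doc.add_posting(t.first, t.second);`, with the message path stored via doc.set_data(path).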
Hope someone can give me some hints.
Thanks in advance!
Dirk.
--
-----------------------------------------------
Dirk-Jan C. Binnema <djcb at djcbsoftware.nl>
blog: http://www.djcbsoftware.nl/ChangeLog (NL)
http://djcbflux.blogspot.com (EN)
-----------------------------------------------