[Xapian-discuss] Parsing .msg files
James Aylett
james-xapian at tartarus.org
Fri Sep 12 17:26:30 BST 2008
On Sat, Sep 13, 2008 at 01:53:26AM +0930, Frank J Bruzzaniti wrote:
> I'm trying to parse .msg files.
>
> I found a patch on trac but it looks like it uses a program called
> outlook2txt which I can't find anywhere.
>
> The other thought was to pipe the file through the utility strings and
> then use the html parser. I do still get a little bit of junk left over
> tho.
>
> Anyone else know of a better way?
If you have access to a Windows machine with Outlook, you can use
Python + COM to programmatically access the Outlook object model. It's
a bit fiddly, and there are bits that aren't exposed (there's another
plugin that is supposed to fix that, but I never got it to work). It
was sufficient for me to export several years of emails to mbox format
a while back.
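
A rough sketch of what that kind of export might look like, not the
script actually used. It assumes the pywin32 package (win32com) is
installed on the Windows machine, reads only the default Inbox, and
copies just the sender, subject and plain-text body. The function names
export_items and outlook_inbox_items are made up for illustration.

```python
import mailbox
from email.message import Message


def export_items(items, mbox_path):
    """Append Outlook-like items to an mbox file and return how many were written.

    Each item only needs SenderName, Subject and Body attributes, which
    Outlook mail items expose; missing attributes fall back to "".
    """
    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    count = 0
    try:
        for item in items:
            msg = Message()
            msg["From"] = getattr(item, "SenderName", "") or "unknown"
            msg["Subject"] = getattr(item, "Subject", "") or ""
            # Store the plain-text body as UTF-8.
            msg.set_payload(getattr(item, "Body", "") or "", charset="utf-8")
            box.add(msg)
            count += 1
        box.flush()
    finally:
        box.close()
    return count


def outlook_inbox_items():
    """Return the items of the default Inbox via COM (Windows + Outlook only)."""
    # Imported here so the rest of the module loads on machines without pywin32.
    import win32com.client

    outlook = win32com.client.Dispatch("Outlook.Application")
    namespace = outlook.GetNamespace("MAPI")
    inbox = namespace.GetDefaultFolder(6)  # 6 is olFolderInbox
    return inbox.Items
```

On the Windows machine this would be called as
export_items(outlook_inbox_items(), "inbox.mbox"). Dates, attachments,
HTML bodies and subfolders are left out here, and the resulting mbox
file can then be fed to whatever mail-aware indexing pipeline you
already have.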
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org