[Xapian-discuss] omindex doesn't check last_mod

Olly Betts olly at survex.com
Fri Aug 25 22:57:16 BST 2006


On Mon, Aug 07, 2006 at 09:10:48PM -0700, Michael Trinkala wrote:
> I recommend storing the last modified time and the document MD5 in the
> value table.

I'd agree using a value makes a lot more sense than using a term, and
is likely to be more efficient than reading the value from the document
data (especially once I get around to rejigging how values are stored
for flint).

Richard's idea of using a separate file has some attractions, but it's
good to keep this information in the database so all changes relating
to a document get committed atomically.  And as you note below, having
the information in a value is useful in other ways.
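
To make the check concrete, here's a rough sketch in Python of how an
indexer might decide whether a file needs reindexing.  It is not what
omindex does: a plain dict stands in for the values read back from the
existing document, and the slot numbers and function names are made up
for illustration.

```python
import hashlib
import os

# Hypothetical slot numbers; omindex does not define these.
SLOT_LAST_MOD = 0
SLOT_MD5 = 1


def file_md5(path):
    """Return the hex MD5 digest of the file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def values_for(path):
    """Values to store with the document when it is (re)indexed."""
    return {
        SLOT_LAST_MOD: str(int(os.stat(path).st_mtime)),
        SLOT_MD5: file_md5(path),
    }


def needs_reindex(path, stored_values):
    """Decide whether path must be reindexed.

    stored_values is a dict {slot: string} standing in for the values of
    the document already in the database, or None if there isn't one.
    """
    if stored_values is None:
        return True
    last_mod = str(int(os.stat(path).st_mtime))
    if stored_values.get(SLOT_LAST_MOD) == last_mod:
        return False  # timestamp unchanged: skip without reading the file
    # Timestamp differs; only reindex if the content really changed.
    return stored_values.get(SLOT_MD5) != file_md5(path)
```

The timestamp comparison is the cheap path; the MD5 is only computed
when the timestamp has moved, which catches files that were touched but
not actually modified.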

> These values also work nicely for sorting results by date and
> collapsing duplicate entries.

Indeed, and it would allow Omega to use a MatchDecider for date range
filtering.
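
Roughly what I have in mind, sketched in Python.  This assumes the date
is stored in the value as a fixed-width "YYYYMMDD" string (my assumption
here, not something settled above), and the slot number and class names
are hypothetical.  It's written as a plain callable so it runs without
Xapian installed; with the real bindings the decider would be a
MatchDecider subclass instead.

```python
SLOT_DATE = 0  # hypothetical slot holding the date as "YYYYMMDD"


class DateRangeDecider:
    """Accept documents whose stored date lies in [start, end]."""

    def __init__(self, start, end, slot=SLOT_DATE):
        self.start = start
        self.end = end
        self.slot = slot

    def __call__(self, doc):
        value = doc.get_value(self.slot)
        if isinstance(value, bytes):
            value = value.decode("ascii", "replace")
        # Fixed-width YYYYMMDD strings compare correctly as plain strings.
        # Documents with no date stored (empty value) are rejected.
        return bool(value) and self.start <= value <= self.end


class FakeDoc:
    """Stand-in for a document, exposing only get_value()."""

    def __init__(self, values):
        self._values = values

    def get_value(self, slot):
        return self._values.get(slot, "")


decider = DateRangeDecider("20060801", "20060831")
in_range = decider(FakeDoc({SLOT_DATE: "20060825"}))
out_of_range = decider(FakeDoc({SLOT_DATE: "20060901"}))
```

Storing the date in a form where string order matches date order is
what keeps the comparison this simple.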

> I will gladly contribute these changes and others if the team is
> interested.  I will get a list up on xapian-devel to figure out what
> should/shouldn't be included.

Sure - show us what you've got!

> As for Excel support, check out xls2csv, and catppt does a nice job with
> PowerPoint:
> http://www.45.free.net/~vitus/software/catdoc/

I've added hooks for these to omindex, though we really need to refactor
that code to leave just a generic mechanism for running some command and
either reading its stdout or a file it produces, and allow commands to
be specified in a config file...
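
As a sketch of the sort of generic mechanism I mean (in Python, and
purely illustrative: the table layout, the "%f"/"%o" placeholders and
the function name are invented here, not an existing omindex config
format):

```python
import os
import subprocess
import tempfile

# Hypothetical config: extension -> (argv template, where the text ends up).
# "%f" is replaced by the input path, "%o" by a temporary output path.
FILTERS = {
    ".xls": (["xls2csv", "%f"], "stdout"),
    ".ppt": (["catppt", "%f"], "stdout"),
}


def extract_text(path, filters=FILTERS):
    """Run the configured command for path and return the text it yields."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in filters:
        raise KeyError("no filter configured for %r" % ext)
    argv_template, mode = filters[ext]

    out_path = None
    if mode == "file":
        # The command is expected to write its output to this file.
        fd, out_path = tempfile.mkstemp()
        os.close(fd)
    try:
        substitutions = {"%f": path, "%o": out_path}
        argv = [substitutions.get(arg, arg) for arg in argv_template]
        result = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
        if mode == "stdout":
            return result.stdout.decode("utf-8", "replace")
        with open(out_path, "rb") as f:
            return f.read().decode("utf-8", "replace")
    finally:
        if out_path is not None:
            os.remove(out_path)
```

With something like this, supporting a new format means adding a line
to the table rather than adding code; in practice the table would be
read from the config file rather than hard-wired.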

Cheers,
    Olly


