[Xapian-discuss] omindex doesn't check last_mod
Olly Betts
olly at survex.com
Fri Aug 25 22:57:16 BST 2006
On Mon, Aug 07, 2006 at 09:10:48PM -0700, Michael Trinkala wrote:
> I recommend storing the last modified time and the document MD5 in the
> value table.
I'd agree using a value makes a lot more sense than using a term, and
is likely to be more efficient than reading the value from the document
data (especially once I get around to rejigging how values are stored
for flint).
Richard's idea of using a separate file has some attractions, but it's
good to keep this information in the database so all changes relating
to a document get committed atomically. And as you note below, having
the information in a value is useful in other ways.
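
To make the check concrete, here is a minimal sketch of the "skip unchanged files" logic. It is not omindex code. A plain dict stands in for the values stored with a document, and the slot numbers and function names (SLOT_LAST_MOD, SLOT_MD5, needs_reindex, values_for) are hypothetical, since the thread does not fix any of them.

```python
import hashlib
import os

# Hypothetical slot numbers; the thread does not say which slots to use.
SLOT_LAST_MOD = 0
SLOT_MD5 = 1


def file_md5(path):
    """Return the hex MD5 digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def values_for(path):
    """Build the values to store alongside a freshly indexed file."""
    return {
        SLOT_LAST_MOD: str(int(os.path.getmtime(path))),
        SLOT_MD5: file_md5(path),
    }


def needs_reindex(path, stored_values):
    """Decide whether a file has to be indexed again.

    stored_values is a dict (slot -> string) standing in for the values
    kept with the document, or None if the file was never indexed.
    """
    if stored_values is None:
        return True
    mtime = str(int(os.path.getmtime(path)))
    if stored_values.get(SLOT_LAST_MOD) == mtime:
        # Timestamp unchanged: assume the content is unchanged too.
        return False
    # Timestamp differs; reindex only if the content really changed.
    return stored_values.get(SLOT_MD5) != file_md5(path)
```

Comparing the timestamp first keeps the common case cheap; the MD5 is only computed when the timestamp has moved.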
> These values also work nicely for sorting results by date and
> collapsing duplicate entries.
Indeed, and it would allow Omega to use a MatchDecider for date range
filtering.
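
As a rough illustration of the idea, the sketch below shows a decider-style callable that accepts a document only if its date value falls inside a range. It deliberately avoids the Xapian bindings: FakeDocument is a made-up stand-in exposing only get_value(), and the "YYYYMMDD" string format and slot number are assumptions rather than anything Omega currently does.

```python
class DateRangeDecider:
    """Accept documents whose date value lies in [start, end].

    Assumes dates are stored as fixed-width "YYYYMMDD" strings, so plain
    string comparison agrees with chronological order.
    """

    def __init__(self, slot, start, end):
        self.slot = slot
        self.start = start
        self.end = end

    def __call__(self, doc):
        value = doc.get_value(self.slot)
        # Documents with no date value are rejected.
        return bool(value) and self.start <= value <= self.end


class FakeDocument:
    """Stand-in for a database document, exposing only get_value()."""

    def __init__(self, values):
        self._values = values

    def get_value(self, slot):
        # Return an empty string for a slot that was never set.
        return self._values.get(slot, "")


if __name__ == "__main__":
    august = DateRangeDecider(0, "20060801", "20060831")
    print(august(FakeDocument({0: "20060825"})))
    print(august(FakeDocument({0: "20060707"})))
```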
> I will gladly contribute these changes and others if the team is
> interested. I will get a list up on xapian-devel to figure out what
> should/shouldn't be included.
Sure - show us what you've got!
> As for Excel support, check out xls2csv; catppt does a nice job with PowerPoint:
> http://www.45.free.net/~vitus/software/catdoc/
I've added hooks for these to omindex, though we really need to refactor
that code to leave just a generic mechanism for running some command and
either reading its stdout or a file it produces, and allow commands to
be specified in a config file...
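
One possible shape for such a generic mechanism is sketched below. This is not how omindex is structured: the "%f" and "%o" placeholders, the FILTERS table (a dict standing in for a config file) and the function names are all invented for illustration, and the xls2csv and catppt entries assume both tools take the input file as an argument and write text to stdout.

```python
import os
import subprocess
import tempfile

# Extension -> (argument list, reads_stdout). A dict stands in for the
# config file; the entries are assumed invocations, not tested ones.
FILTERS = {
    ".xls": (["xls2csv", "%f"], True),
    ".ppt": (["catppt", "%f"], True),
}


def run_filter(command, input_path, reads_stdout=True):
    """Run an external command and return the text it extracts.

    An argument equal to "%f" is replaced by the input path, and one
    equal to "%o" by a temporary output path. The text is taken from
    stdout, or from the output file when reads_stdout is False.
    """
    fd, out_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        substitutions = {"%f": input_path, "%o": out_path}
        argv = [substitutions.get(arg, arg) for arg in command]
        result = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
        if reads_stdout:
            return result.stdout.decode("utf-8", errors="replace")
        with open(out_path, "rb") as f:
            return f.read().decode("utf-8", errors="replace")
    finally:
        os.remove(out_path)


def extract_text(path, filters=FILTERS):
    """Look up a filter by file extension; return None if none matches."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in filters:
        return None
    command, reads_stdout = filters[ext]
    return run_filter(command, path, reads_stdout)
```

With this layout, supporting a new format would mean adding one table entry rather than another special case in the indexer.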
Cheers,
Olly