[Xapian-tickets] [Xapian] #282: Assorted enhancements to omindex
Xapian
nobody at xapian.org
Fri May 13 02:20:56 BST 2011
#282: Assorted enhancements to omindex
-------------------------+--------------------------------------------------
 Reporter:  olly         |       Owner:  olly
     Type:  enhancement  |      Status:  assigned
 Priority:  normal       |   Milestone:  1.2.x
Component:  Omega        |     Version:  SVN trunk
 Severity:  normal       |    Keywords:
Blockedby:               |    Platform:  All
 Blocking:               |
-------------------------+--------------------------------------------------
Old description:
> A patch from Reini Urban at AVL which was pasted into the wiki a while
> back, but a ticket is really a more appropriate way to track it. We
> should look at folding some of these improvements in, though some others
> we probably don't want to include, at least in the form in this patch.
>
> I've updated the patch to compile with latest Omega SVN HEAD, dropping
> parts which Omega now supports anyway, and splitting out some features
> into separate tickets. I've not run-tested it at all.
>
> The remaining features in this patch are:
>
> * Unpacking "container file types" (e.g. archives like .zip, email
> folders like .mbox, email messages with attachments) so we can index the
> sub-parts
> * Logging stderr from filters to a file
> * The seemingly arbitrary addition of more words all starting with "a"
> to the stopword list - stopping some of these seems a bit aggressive to
> me
> * Defaulting to adding the size and lastmod time of the dump file in
> scriptindex. In general, the size of the dump file seems misleading
> (though if you put one document per dump, less so). The lastmod isn't
> particularly helpful in many cases either
> * Some tweaks to installing docs in the .spec file, which I don't know
> the reasons for
New description:
A patch from Reini Urban at AVL which was pasted into the wiki a while
back, but a ticket is really a more appropriate way to track it. We
should look at folding some of these improvements in, though some others
we probably don't want to include, at least in the form in this patch.

I've updated the patch to compile with latest Omega SVN HEAD, dropping
parts which Omega now supports anyway, and splitting out some features
into separate tickets. I've not run-tested it at all.

The remaining features in this patch are:

* Unpacking "container file types" (e.g. archives like .zip, email
folders like .mbox, email messages with attachments) so we can index the
sub-parts
* Logging stderr from filters to a file
* Defaulting to adding the size and lastmod time of the dump file in
scriptindex. In general, the size of the dump file seems misleading
(though if you put one document per dump, less so). The lastmod isn't
particularly helpful in many cases either
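
To make the "logging stderr from filters to a file" item concrete, here is
a minimal sketch of the general idea. It is not taken from the patch and is
not how omindex does it: the function name run_filter, the log_path
parameter and the comment line written before each command are all
hypothetical, chosen only for illustration.

```python
import subprocess


def run_filter(cmd, log_path):
    """Run an external filter command and return its stdout as text.

    Hypothetical helper: whatever the filter writes to stderr is
    appended to the file at log_path rather than reaching the
    indexer's own stderr.
    """
    with open(log_path, "ab") as log:
        # Record which command the following stderr output belongs to.
        log.write(("# " + " ".join(cmd) + "\n").encode("utf-8"))
        log.flush()
        # The child inherits the log file as its stderr; stdout is captured.
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=log)
    if result.returncode != 0:
        raise RuntimeError("filter failed with exit status %d" % result.returncode)
    return result.stdout.decode("utf-8", errors="replace")
```

Opening the log in append mode means the output of successive filters
accumulates in one file, with each command's diagnostics following the
comment line that names it.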
--
Comment(by olly):
Updated the description to reflect the changes in the latest patch too
(dropped the random extra stopwords and the doc-related changes to the
spec file).

The latest patch builds, but its functionality is untested and it probably
isn't right.
--
Ticket URL: <http://trac.xapian.org/ticket/282#comment:9>
Xapian <http://xapian.org/>
Xapian