[Xapian-discuss] Omindex: what are the default numbered indexes?
James Aylett
james-xapian at tartarus.org
Tue Apr 26 13:35:20 BST 2011
On 26 Apr 2011, at 13:12, <xapian at catcons.co.uk> <xapian at catcons.co.uk> wrote:
> How to make Omega CGI remove duplicate documents from its query output?
What you're looking for is called collapsing: when the matcher builds an MSet (the list of matching documents), it includes only one document for each distinct value of a chosen document value.
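To make the idea concrete, here is a rough sketch of what collapsing does to a ranked result list. It is plain Python rather than the Xapian API; the `collapse` function, the `Match` tuple and the sample data are made up for illustration, and the real matcher handles details this ignores.

```python
from typing import Hashable, Iterable, List, Tuple

# Hypothetical stand-in for a ranked match: (docid, weight, collapse value).
Match = Tuple[int, float, Hashable]


def collapse(ranked: Iterable[Match]) -> List[Match]:
    """Keep only the first (best-ranked) match for each distinct value."""
    seen = set()
    kept = []
    for docid, weight, value in ranked:
        if value in seen:
            # A better-ranked document with this value was already kept.
            continue
        seen.add(value)
        kept.append((docid, weight, value))
    return kept


# Sample matches, already sorted best first.
ranked = [
    (7, 0.91, "aaa"),
    (3, 0.85, "bbb"),
    (9, 0.80, "aaa"),  # same value as docid 7, so it is dropped
    (4, 0.42, "ccc"),
]
print(collapse(ranked))
```

Only docids 7, 3 and 4 survive, because docid 9 shares its value with the higher-ranked docid 7.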
> Apparently scriptindex can be used to add numbered indexes via the
> INDEX_SCRIPT as documented at http://xapian.org/docs/omega/scriptindex.html.
They aren't numbered indexes, they're numbered values; you want value= or valuenumeric=. As well as collapsing, values can also be used for sorting and range searches.
> Using Omega to query an index built with omindex suggests there are some
> default numbered indexes. Setting &COLLAPSE=<index number> in the URL
> (where <index number> was 1, 2 or 3) got a listing that seemed to have
> duplicates suppressed, the same number of documents for each of the three
> indexes.
Again, these are values rather than indexes; using the right term avoids confusion.
> Before rolling this out to the users it would be nice to know what these
> default numbered indexes are and which, if any, can be safely used to
> suppress duplicates.
Within omega, values 0, 1 and 2 are reserved (for last modification time, 16-byte MD5 checksum and file size in bytes, respectively). Any other value number is free for you to use. I'm not convinced this is documented anywhere useful; I've added a note to the missing documentation wiki page about this.
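As a rough illustration of why collapsing on value 1 suppresses duplicates, the sketch below assumes the checksum is taken over the file's contents, so byte-identical files share the same 16-byte value. The constant names, the `md5_value` helper and the sample contents are illustrative only, and the query string assumes Omega's usual P parameter for the query text.

```python
import hashlib
from urllib.parse import urlencode

# Value numbers that omega reserves, as listed above (names are illustrative).
VALUE_LASTMOD = 0  # last modification time
VALUE_MD5 = 1      # 16-byte MD5 checksum
VALUE_SIZE = 2     # file size in bytes


def md5_value(contents: bytes) -> bytes:
    """Return the raw 16-byte MD5 digest of some file contents."""
    return hashlib.md5(contents).digest()


# Made-up file contents: two identical copies and one edited version.
original = b"Quarterly report\n"
copy = b"Quarterly report\n"
edited = b"Quarterly report (draft)\n"

# Identical contents give identical values; any change gives a different one.
same = md5_value(original) == md5_value(copy)
different = md5_value(original) != md5_value(edited)
print(len(md5_value(original)), same, different)

# Query string asking Omega to collapse on the MD5 value.
query_string = urlencode({"P": "report", "COLLAPSE": VALUE_MD5})
print(query_string)
```

Collapsing on value 0 or 2 instead would also merge unrelated documents that merely share a modification time or a size, so of the three reserved values the checksum is the natural one for suppressing duplicates.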
> Is there a way to interrogate the index database? Are the default numbered
> indexes described in the documentation?
You can use delve to interrogate Xapian databases, such as:
$ delve -V -r <docid> <path-to-database>
which will display the values (and also the terms) for that document.
J
--
James Aylett
talktorex.co.uk - xapian.org - devfort.com