[Xapian-discuss] Clarification of values, data, fields, and prefixed terms

Deron Meranda deron.meranda at gmail.com
Tue Sep 4 06:15:03 BST 2007


On 9/3/07, James Aylett <james-xapian at tartarus.org> wrote:
> On Mon, Sep 03, 2007 at 12:59:55AM +0100, Olly Betts wrote:
> > > I'm not certain that it is actually true right now, but in theory
> > > you'll get better performance in some cases by using values as they're
> > > intended (to be looked up and used during the match process), and data
> > > as it's intended (to store additional metadata that Xapian doesn't
> > > care about, for display/whatever in your application).
> >
> > Not just in theory.  Currently to read a value for a document, all the
> > other values for that document have to be read, so abusing values as
> > general purpose fields will mean that more data has to be read for each
> > value accessed during the match - that's clearly going to adversely
> > affect performance in most cases.
>
> I was pretty sure it would be the case, but I didn't want to stick my
> neck out without being sure ;)

Okay, now I understand that Xapian core has access to values and can
use them during the matching process, etc.  And it basically ignores
the "data" altogether, but provides it as a courtesy storage for applications
that have no better place to store additional document meta-data.
(Omega is one such application.)

But I don't understand the performance arguments.  Even if looking up
one value on a document means that all the values are retrieved,
how is that different from fields inside the data part?  Doesn't it also
have to retrieve all the fields from the data just to get to one of them
as well?
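
To make the question concrete, here is a toy model in Python.  It is
only a sketch under stated assumptions, not Xapian's real storage
format or API: ToyDocument, match_cost and display_cost are
hypothetical names made up for illustration.  It assumes, as described
in the quoted text above, that reading one value reads all of a
document's values, and it further assumes that values are read for
every candidate document during the match while the data is read only
for the documents that are actually displayed.

```python
# Toy model only: NOT Xapian's on-disk format and NOT the Xapian API.

class ToyDocument:
    def __init__(self, values, data_fields):
        # values: dict mapping slot number -> bytes
        # data_fields: dict mapping field name -> str, kept as one
        # "name=value" blob with one field per line
        self.values = dict(values)
        self.data = "\n".join(
            f"{name}={value}" for name, value in data_fields.items()
        ).encode()

    def read_value(self, slot):
        """Return (value, bytes_read); all values are read to get one."""
        bytes_read = sum(len(v) for v in self.values.values())
        return self.values[slot], bytes_read

    def read_field(self, name):
        """Return (field, bytes_read); the whole data blob is read."""
        bytes_read = len(self.data)
        for line in self.data.decode().split("\n"):
            key, _, value = line.partition("=")
            if key == name:
                return value, bytes_read
        raise KeyError(name)


def match_cost(candidates, slot):
    """Bytes read when one value is consulted for every candidate."""
    return sum(doc.read_value(slot)[1] for doc in candidates)


def display_cost(shown, field):
    """Bytes read when one data field is fetched for each shown result."""
    return sum(doc.read_field(field)[1] for doc in shown)
```

In this model both lookups read a whole blob, so per lookup they cost
about the same.  The difference would come from how often each happens:
match_cost runs over every candidate document, so moving display fields
into values inflates every one of those reads, while display_cost runs
only over the handful of results shown.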

> > I'd like to change how values are stored, but it'll still be a bad idea
> > to misuse them - just for different reasons.
>
> The general point is that Xapian core will be optimised towards the
> expected use of data, values and terms, so it's better to use them as
> designed. Which hopefully is a little clearer now. :)

Yes, that's fine.  I'm just trying to understand better what the
"expected use" or "misuse them" really means.  I think
I'm getting closer.

Thanks for everybody's patience.
-- 
Deron Meranda


