[Xapian-discuss] Re: Xapian query language

Michel Pelletier michel at dialnetwork.com
Thu Mar 30 18:37:32 BST 2006

Olly Betts wrote:
> On Wed, Mar 29, 2006 at 03:19:55PM -0800, Michel Pelletier wrote:
>>selection variables that define which document values you 
>>want to retrieve from the database

You've been very busy this morning.

> If you're using document values to store general purpose fields for use
> for display of matching documents, then you're misusing them, and should
> be prepared to be disappointed by retrieval performance.  The document
> data is where such general purpose fields should go.

I know it doesn't make any sense to argue with the creator, but I'm not 
sure I understand what you mean above.  Maybe our terminology is crossed.

I was under the impression (from the docs) that Xapian stores four kinds 
of things with a document: positional terms (add_posting), non-positional 
terms (add_term), values (add_value), and data (set_data), and that the 
last, data, is the potentially expensive one (I quickly looked for 
that doc just now, but can't find it, grrr), so I'm a little confused.

Also, my performance results are good: fetching values at query time is 
quite fast, and many of our queries execute via a remote Twisted 
client/server in under 20ms, and 10ms of that is Twisted overhead!  I've 
found most of the query time is spent in sorting.  A 10ms text/boolean 
query over 600K documents is pretty good performance in my opinion.

From reading the docs I was under the impression that the whole purpose 
of values is to display information about search results.  Is that not 
true?  What is the purpose of values if not to display result 
information like the date, creator, title of the result, etc.?  Values 
also appear to be the mechanism that triggers sorting, so it wouldn't 
make sense to me if values performed poorly but were also the sorting key.

I imagine I can easily change xaql to store fields in the document data 
instead of the values, but that adds some complexity: I'd have to put 
the fields in some data structure, pickle that, and then store the 
pickle in the document data.  *That* sounds expensive, so hopefully I 
don't have to do that. ;)
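
For reference, a minimal sketch of the pickling approach described above, 
using only the Python standard library.  The helper names (pack_fields, 
unpack_fields) and the sample field names are hypothetical and are not part 
of xaql or Xapian.  The assumption is that the resulting bytes would be 
handed to the document's set_data() at index time and read back with 
get_data() at display time; that step is not shown here.

```python
import pickle


def pack_fields(fields):
    """Serialize a dict of display fields into a single bytes blob.

    Hypothetical helper: the blob is what would be stored as the
    document data instead of one value slot per field.
    """
    return pickle.dumps(fields, protocol=pickle.HIGHEST_PROTOCOL)


def unpack_fields(blob):
    """Restore the dict of display fields from a stored blob.

    Only unpickle data you wrote yourself; pickle is not safe on
    untrusted input.
    """
    return pickle.loads(blob)


# Illustrative field names only.
fields = {"title": "Xapian query language", "creator": "michel", "date": "20060330"}
blob = pack_fields(fields)
restored = unpack_fields(blob)
```

With this layout every display field comes back from a single blob per 
matching document, at the cost of one unpickle per result; whether that is 
cheaper than reading several values would need to be measured.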

