[Xapian-discuss] What kind of data in the datafield

Felix Antonius Wilhelm Ostmann ostmann at websuche.de
Thu Jan 4 13:15:46 GMT 2007


Thanks :)

Richard Boulton wrote:
> Felix Antonius Wilhelm Ostmann wrote:
>> we are building the next google ... you know ;) But, what should we 
>> save in the data-field?
>
> It really depends what you want to do with the data.  In general, you 
> should save what you have a use for, and no more: obviously, the less 
> you save, the smaller the database, and the faster you'll be able to 
> access the data.
What matters is which is faster overall: reading the content from the
data field or from an additional file :( We will have to test that for
each kind of content.

>
> If you have the original data on disk, it's often useful just to save 
> a URL/file path to the data.  But, even in this case, if the data has 
> to pass through an expensive parsing step to extract text, it may be 
> useful to store a sample of the parsed text for display in the result 
> list. You might even want to store the whole parsed text, and generate 
> a summary based on the phrases relevant to the query.
This is what I have in mind: the user sees a short text (400 bytes)
containing the terms from the query.
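
A short, query-dependent text of this kind could be produced roughly as
follows. This is only a sketch in plain Python, not part of Xapian's API:
the function name make_snippet, its parameters, and the choice to count
400 characters rather than bytes are assumptions made for illustration.
It takes the stored parsed text and cuts a window around the first
query term it finds.

```python
def make_snippet(text, terms, max_len=400):
    """Return at most max_len characters of text around the first query term.

    Hypothetical helper: 'text' is the parsed document text stored with the
    document, 'terms' is the list of query terms as the user typed them.
    """
    lowered = text.lower()
    # Collect the position of each term that occurs in the text at all.
    hits = [pos for pos in (lowered.find(t.lower()) for t in terms) if pos >= 0]
    if not hits:
        # No term found: fall back to the beginning of the text.
        return text[:max_len]
    first = min(hits)
    # Centre the window on the earliest hit, then clamp it to the text bounds.
    start = max(0, first - max_len // 2)
    end = min(len(text), start + max_len)
    start = max(0, end - max_len)
    return text[start:end]
```

Matching here is a plain case-insensitive substring search, so stemmed
query terms would need extra handling before being passed in.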

>
>> And the title, the timestamp and other stuff? save in a value or at 
>> the data too? I am confused :(
>
> Save in the data if you want to display them, or use them in some 
> other way, once you've got the document results.
>
> Note that if you're saving something like a timestamp in a value 
> anyway (e.g., for sorting), you can just read the timestamp from the 
> value when displaying the result list, so there's no need to duplicate 
> this.
Hmmm ... and is there any difference in performance? Is data faster
than value? I know that values have special features, and if I need them
I must store the information as a value, but if that is slower, I won't
use all the features I am planning :)




-- 
Kind regards

Felix Antonius Wilhelm Ostmann
--------------------------------------------------
Websuche   Search   Technology   GmbH   &   Co. KG
Martinistraße 3  -  D-49080  Osnabrück  -  Germany
Tel.:   +49 541 40666-0 - Fax:    +49 541 40666-22
Email: info at websuche.de - Website: www.websuche.de
--------------------------------------------------
AG Osnabrück - HRA 200252 - Ust-Ident: DE814737310
Komplementärin:     Websuche   Search   Technology
Verwaltungs GmbH   -  AG Osnabrück  -   HRB 200359
Geschäftsführer:  Diplom Kaufmann Martin Steinkamp
--------------------------------------------------