[Xapian-devel] Some Questions From the beginner of Xapian

Dave Spencer david.spencerian at gmail.com
Wed Sep 17 07:13:40 BST 2008


Richard Boulton <richard <at> lemurconsulting.com> writes:

> 
> liminghit wrote:
> > (1) I see the Xapian::Document has a method
> > 
> > *  void  add_value (Xapian::valueno valueno, const std::string &value)*
> > 
> >   What's the purpose of this method?  A Document is related to its
> > terms, but what is this method for?
> 
> Values are extra pieces of information which can be used during the 
> search to modify the search in some way.  For example, they can be used 
> to add an extra weight to some documents, or to sort the results in a 
> different order, or to collapse results from a single website.


I've been meaning to ask the same basic question.

One other thing, I believe, is that values can be retrieved exactly as they
were stored, so they can serve as additional "structured" metadata associated
with a document. So I believe you can do things like

doc.add_value(0, "URL");     // slot 0: the document's URL
doc.add_value(1, "Title");   // slot 1: its title
doc.add_value(2, "Author");  // slot 2: its author

etc.

It would be nice if there were a page on "concepts" that covered this, or
at least an update to the developer docs:

http://xapian.org/docs/apidoc/html/classXapian_1_1Document.html#f7babb1a6368b95dd327f60b433016ac


I've wondered what the intent of get_data and set_data was, especially why
values are numbered (the number being the first arg to get_value/add_value)
whereas data is just a single value. Why not have multiple "data" values,
or why not get rid of "data" and just let the get_value/add_value calls cover it?

I'm guessing the intent of 'data' is to store some key piece of info
about a document such as the URL of a doc that represents a web page.
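
To make the question concrete, here is a rough sketch of the two storage
shapes as I understand them: numbered value slots next to one data string.
ToyDocument, the Slot names and make_example_document are all hypothetical
stand-ins written for this mail, not Xapian's classes or implementation. I'm
assuming that add_value replaces whatever was already in a slot and that
get_value on an unset slot returns an empty string.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical slot names. The numbers are arbitrary, but the same numbers
// must be used when storing and when reading back.
enum Slot : std::uint32_t { SLOT_URL = 0, SLOT_TITLE = 1, SLOT_AUTHOR = 2 };

// Toy stand-in for a document (NOT Xapian::Document): several numbered
// values plus exactly one data string.
class ToyDocument {
public:
    // Store a value in a numbered slot, replacing any previous value there.
    void add_value(std::uint32_t slot, const std::string& value) {
        values_[slot] = value;
    }

    // Return the value in a slot, or an empty string if the slot is unset.
    std::string get_value(std::uint32_t slot) const {
        auto it = values_.find(slot);
        return it == values_.end() ? std::string() : it->second;
    }

    // The single data string: one per document, no slot number.
    void set_data(const std::string& data) { data_ = data; }
    const std::string& get_data() const { return data_; }

private:
    std::map<std::uint32_t, std::string> values_;
    std::string data_;
};

// Build a document for a made-up web page.
ToyDocument make_example_document() {
    ToyDocument doc;
    doc.add_value(SLOT_URL, "http://example.com/page.html");
    doc.add_value(SLOT_TITLE, "Example title");
    doc.add_value(SLOT_AUTHOR, "A. N. Author");
    doc.set_data("http://example.com/page.html");
    return doc;
}
```

With this, make_example_document().get_value(SLOT_TITLE) hands back the title
exactly as stored, while get_data() returns the one string that has no slot
number at all. That asymmetry is what I'm asking about.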

thx,
 Dave


> 
> > (2) add_posting method will add term to a documents.
> > 
> > *   void add_posting (const std::string &tname, Xapian::termpos tpos, 
> > Xapian::termcount wdfinc=1)*
> > 
> > I noticed that
> > 
> > Xapian::TermGenerator has the following method
> > 
> > *  void  index_text (const Xapian::Utf8Iterator &itor, Xapian::termcount 
> > weight=1, const std::string &prefix="")*
> > 
> >  What are the differences and relationship between these two functions?
> 
> I've just added a FAQ which should answer this.
> http://trac.xapian.org/wiki/FAQ/TermGenerator
> 






