[Xapian-devel] Some Questions From the beginner of Xapian
Dave Spencer
david.spencerian at gmail.com
Wed Sep 17 07:13:40 BST 2008
Richard Boulton <richard <at> lemurconsulting.com> writes:
>
> liminghit wrote:
> > (1) I see the Xapian::Document has a method
> >
> > * void add_value (Xapian::valueno valueno, const std::string &value)*
> >
> > What's the purpose of this method? A Document is related to its
> > terms, but what is a value for?
>
> Values are extra pieces of information which can be used during the
> search to modify the search in some way. For example, they can be used
> to add an extra weight to some documents, or to sort the results in a
> different order, or to collapse results from a single website.
I've been meaning to ask the same basic question.
One other use, I believe, is that values can be retrieved exactly as they were
stored, so they can serve as additional "structured" metadata associated with
a document. So I believe you can do things like
doc.add_value(0, url);     // slot 0: URL
doc.add_value(1, title);   // slot 1: title
doc.add_value(2, author);  // slot 2: author
etc.
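The slot numbers above are arbitrary, so it probably helps to give them names. Here is a minimal sketch of that idea, not anything taken from the Xapian docs. The slot names, `StubValueDoc`, and `store_metadata` are all made up for illustration; only `add_value` and `get_value` are real Xapian::Document methods. The stub exists just so the snippet compiles without Xapian installed, and the template should accept a Xapian::Document unchanged.

```cpp
#include <map>
#include <string>

// Hypothetical slot assignments; Xapian attaches no meaning to the numbers.
enum Slot : unsigned { SLOT_URL = 0, SLOT_TITLE = 1, SLOT_AUTHOR = 2 };

// Stand-in for Xapian::Document (an assumption for this sketch).
// Only the two value methods are mimicked.
struct StubValueDoc {
    std::map<unsigned, std::string> values;

    // Like the real add_value, setting a slot twice replaces the old value.
    void add_value(unsigned slot, const std::string& value) {
        values[slot] = value;
    }

    // Returns an empty string for a slot that was never set.
    std::string get_value(unsigned slot) const {
        auto it = values.find(slot);
        return it == values.end() ? std::string() : it->second;
    }
};

// Works with any type offering add_value(slot, string).
template <class Doc>
void store_metadata(Doc& doc, const std::string& url,
                    const std::string& title, const std::string& author) {
    doc.add_value(SLOT_URL, url);
    doc.add_value(SLOT_TITLE, title);
    doc.add_value(SLOT_AUTHOR, author);
}
```

Reading a field back is then `doc.get_value(SLOT_TITLE)`, which returns the string exactly as it was stored.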
It would be nice if there were some page on "concepts" that covered this, or
at least an update to the developer docs:
http://xapian.org/docs/apidoc/html/classXapian_1_1Document.html#f7babb1a6368b95dd327f60b433016ac
I've also wondered what the intent of get_data and set_data is, especially why
values are indexed by slot (the first argument to get_value/add_value) whereas
data is just a single string. Why not allow multiple "data" values, or drop
"data" and let get_value/add_value cover it?
I'm guessing the intent of "data" is to store some key piece of information
about a document, such as the URL of a document that represents a web page.
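Since the data is a single string, one way to carry several fields in it is to pack them as newline-separated key=value lines. As far as I know, Omega stores fields such as url and caption this way, but treat that as my recollection and not a spec. The sketch below uses only the standard library; `pack_fields` and `unpack_fields` are hypothetical helper names, and the packed string is what you would hand to set_data().

```cpp
#include <map>
#include <sstream>
#include <string>

// Pack fields as "key=value" lines. Assumes keys contain no '=' and
// neither keys nor values contain '\n'.
std::string pack_fields(const std::map<std::string, std::string>& fields) {
    std::string data;
    for (const auto& kv : fields) {
        data += kv.first + "=" + kv.second + "\n";
    }
    return data;
}

// Inverse of pack_fields: split on newlines, then on the first '='
// of each line, so values may themselves contain '='.
std::map<std::string, std::string> unpack_fields(const std::string& data) {
    std::map<std::string, std::string> fields;
    std::istringstream in(data);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;  // skip malformed lines
        fields[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return fields;
}
```

With Xapian this would be used as `doc.set_data(pack_fields(fields))` when indexing and `unpack_fields(doc.get_data())` when displaying results.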
thx,
Dave
>
> > (2) The add_posting method adds a term to a document.
> >
> > * void add_posting (const std::string &tname, Xapian::termpos tpos,
> > Xapian::termcount wdfinc=1)*
> >
> > I noticed that
> >
> > Xapian::TermGenerator has the following method
> >
> > * void index_text (const Xapian::Utf8Iterator &itor, Xapian::termcount
> > weight=1, const std::string &prefix="")*
> >
> > What are the differences and the relationship between these two functions?
>
> I've just added a FAQ which should answer this.
> http://trac.xapian.org/wiki/FAQ/TermGenerator
>
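My rough understanding of the relationship is that add_posting adds one term at one position, while index_text takes a whole string, splits it into words, and adds a posting for each. The toy below illustrates only that layering. `StubPostingDoc` and `naive_index_text` are invented for this sketch, and the real TermGenerator does much more (Unicode-aware word splitting, optional stemming, prefixes), none of which is attempted here.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Stand-in for Xapian::Document (an assumption for this sketch) that
// just records each (term, position) pair it is given.
struct StubPostingDoc {
    std::vector<std::pair<std::string, unsigned>> postings;

    void add_posting(const std::string& term, unsigned pos,
                     unsigned /*wdfinc*/ = 1) {
        postings.emplace_back(term, pos);
    }
};

// Naive ASCII-only indexer: lowercase each alphanumeric run and add it
// as a posting at successive positions, starting from 1.
template <class Doc>
void naive_index_text(Doc& doc, const std::string& text) {
    std::string term;
    unsigned pos = 0;
    auto flush = [&]() {
        if (!term.empty()) {
            doc.add_posting(term, ++pos);
            term.clear();
        }
    };
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            term += static_cast<char>(std::tolower(ch));
        } else {
            flush();  // a separator ends the current word
        }
    }
    flush();  // last word, if the text does not end in a separator
}
```

So add_posting is the low-level call you use when you already have terms, and index_text is the convenience layer for raw text.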