[Xapian-discuss] Is this a correct method of indexing?

Tony Lambiris tonylambiris at gmail.com
Mon Jan 19 06:26:40 GMT 2009


I'm fairly new to Xapian and to search in general. I am working with
Xapian to index documents, and I am becoming a little confused by all
the functions, since from a top-level view many of them appear to
accomplish much the same thing.

What I am trying to do now is index a document while giving more
weight to the document title. After multiple tries with the various
functions (e.g. add_term, add_posting, etc.), this is what I ended up
with:
doc.add_term(doc_title, 100);

The idea being that if the query matches the exact title, I want to
really rank it high. After that I use index_text_without_positions to
index the entire document as I won't be using any phrase or NEAR
queries, and I also read this method takes up less space.
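
To make the two title-weighting options concrete, here is a toy Python
model (standard library only, not the Xapian API) of per-document
term-to-wdf bookkeeping. It assumes that add_term stores its first
argument as one exact term, so a multi-word title becomes a single term
that only an identical query term would match. The alternative shown
boosts each title word instead. The function names, the sample title
and body, and the simple tokenizer are all made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for real tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_toy(title, body, title_boost=100):
    """Build two toy term -> wdf maps for the same document.

    Hypothetical helper: it only mimics the bookkeeping and does not
    call Xapian.
    """
    # Variant A: the whole title is added as ONE term with a large wdf,
    # mirroring doc.add_term(doc_title, 100) from the post.
    whole_title = Counter()
    whole_title[title] += title_boost
    for word in tokenize(body):
        whole_title[word] += 1

    # Variant B: each title word gets the boost, so single-word queries
    # can also benefit from the title weighting.
    per_word = Counter()
    for word in tokenize(title):
        per_word[word] += title_boost
    for word in tokenize(body):
        per_word[word] += 1

    return whole_title, per_word


# Sample data, invented for this sketch.
TITLE = "Xapian Indexing Notes"
BODY = "Notes on indexing text with Xapian. Indexing is fast."

whole_title, per_word = index_toy(TITLE, BODY)

if __name__ == "__main__":
    print("whole title as one term:", dict(whole_title))
    print("per-word title boost:   ", dict(per_word))
```

In Xapian itself, the per-word variant would roughly correspond to
passing a larger wdf increment when indexing the title text through
TermGenerator. Treat that mapping as an assumption to check against
the API documentation for the version in use.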

Does it appear that I am doing everything correctly? I don't know
whether it's overkill to index the entire document, or whether there
are any preferred methods. I had toyed with the idea of indexing only
the first paragraph of the document, but I wanted the input method to
make no assumptions about the format of the text. All I care about is
the title (or file name) and the contents, but I don't know if this is
the best approach: the database grows quite large and indexing slows
down dramatically.

Thanks in advance for your time.


