[Xapian-discuss] Theoretical question
shef
shef31 at yahoo.com
Tue Jan 17 18:51:50 GMT 2006
I tried to post this a couple of times through gmane,
but it failed. Sorry if it ends up duplicated...
*****************************************
I've been reading the docs on the internal
construction of Xapian. There's discussion of
autopruning and operator decay in the Matching
section.
Elsewhere, though, it says that postings lists are
stored in doc_id order, instead of wdf order, which
suggests that there could be high-ranking documents at
the end of a postings list.
How can autoprune and operator decay really have much
effect, then? You would almost always have to go to
the end of every list.
Example: let's say we have 1000 documents, and we need
to return the top 10 for a single-word query. If the
top 10 are scattered uniformly across a postings list
sorted in doc_id order, then the deepest of them will
commonly be found 90% or 95% of the way into the list.
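
To put a rough number on that, here is a minimal Python
sketch. It assumes the top 10 documents occupy 10 distinct
positions drawn uniformly at random from a 1000-entry
postings list; that uniformity is my assumption for the
example, not something the Xapian docs state. The function
names are made up for illustration and nothing here uses
the Xapian API.

```python
import random


def expected_deepest_position(n_docs, k):
    """Exact expected position of the deepest of k hits.

    Assumes the k hits sit at distinct positions chosen uniformly
    at random from 1..n_docs; the expected maximum of such a
    sample is k * (n_docs + 1) / (k + 1).
    """
    return k * (n_docs + 1) / (k + 1)


def simulate_deepest_position(n_docs, k, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    positions = range(1, n_docs + 1)
    total = 0
    for _ in range(trials):
        # Position of the last top-k document in doc_id order.
        total += max(rng.sample(positions, k))
    return total / trials


if __name__ == "__main__":
    exact = expected_deepest_position(1000, 10)
    approx = simulate_deepest_position(1000, 10)
    print(f"exact expectation: {exact:.1f} of 1000")
    print(f"simulated average: {approx:.1f} of 1000")
```

Under that assumption the last of the top 10 sits, on
average, about 91% of the way into the list, which is
where the 90% figure comes from.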