[Xapian-discuss] add_posting(): term position significance - line or offset?

Henry henka at cityweb.co.za
Tue Nov 18 17:18:46 GMT 2008


> The usual use is to store the "word number" at which a word appears,  
> and this is probably what you want.  However, you could store the  
> line number if you wanted: phrase searches (with a window of  
> phrase-size) would then match when the words were fairly spread out  
> (ie, up to one per line).
>
> I recommend using word number, anyway, unless you have a very odd  
> situation I've not thought of.

Thanks - I hadn't even thought of word number.
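
For anyone reading the archive later, here is a rough sketch of the two
numbering schemes discussed above. It is plain Python and does not touch
Xapian itself; the helper names word_positions and line_positions are made
up for illustration, and the tokenisation (lowercased \w+ runs) is just an
assumption. Each (term, position) pair is the kind of value one would hand
to add_posting().

```python
import re


def word_positions(text):
    """Return (term, position) pairs, where position is the 1-based word number."""
    words = re.finditer(r"\w+", text.lower())
    return [(m.group(), pos) for pos, m in enumerate(words, start=1)]


def line_positions(text):
    """Return (term, position) pairs, where position is the 1-based line number."""
    pairs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for m in re.finditer(r"\w+", line.lower()):
            pairs.append((m.group(), line_no))
    return pairs


text = "the candle stick\nwas made of gold"

# Word numbering: every term gets its own position, so adjacency is preserved.
by_word = word_positions(text)

# Line numbering: all terms on one line share a position, so a phrase
# window can only tell lines apart, not words within a line.
by_line = line_positions(text)

# Assumption: with the Xapian Python bindings, each pair would be indexed
# with something like doc.add_posting(term, position).
for term, position in by_word:
    print(term, position)
```

With word numbering, "candle" and "stick" get positions 2 and 3; with line
numbering they both get position 1.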

> Note that Xapian currently doesn't modify the weight of a phrase  
> based on how close together the terms are ...

Sorry, I wasn't very clear: I was thinking in terms of normal
non-phrase searches, i.e., searching for [ +candle +stick ] in:

"...the candle stick was made of gold..."

would score higher (because of the proximity of the words, posting  
weights aside) than:

"...the boy decided to use a stick made of wood to break the candle..."

where 'stick' and 'candle' are further apart.
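
To make the idea concrete, here is a small sketch of the proximity measure
I had in mind, using word numbers as positions. This is only an
illustration in plain Python, not something Xapian does when weighting
results; the function names and the simple \w+ tokenisation are
assumptions of mine.

```python
import re


def term_positions(text, term):
    """Return the 1-based word numbers at which term occurs in text."""
    words = re.findall(r"\w+", text.lower())
    return [pos for pos, word in enumerate(words, start=1) if word == term]


def min_distance(text, term_a, term_b):
    """Return the smallest gap in word positions between the two terms,
    or None if either term is missing from the text."""
    pos_a = term_positions(text, term_a)
    pos_b = term_positions(text, term_b)
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)


near = "the candle stick was made of gold"
far = "the boy decided to use a stick made of wood to break the candle"

print(min_distance(near, "candle", "stick"))  # 1
print(min_distance(far, "candle", "stick"))   # 7
```

A proximity-aware ranking would favour the first text because its minimum
distance is smaller; how such a distance should feed into a score is left
open here.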

Anyway, you've answered my question, thanks!

Regards
Henry


