[Xapian-discuss] How term distance impacts the weight?

Bruce Zhang bruce.zhang at trustgo.com
Tue Aug 2 16:02:36 BST 2011


Thank you, Olly,

One example I can think of:
In Oriental languages there is no separator between the words in a
sentence. After word segmentation, two words may be adjacent in one
document but separated by an adjective in some other documents.
In that situation we can still consider the document to match the search
criteria.

So a short distance means a more accurate match in some situations, but if
the distance is large, the two words are probably unrelated to each other.
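
To make the idea concrete, here is a rough sketch in plain Python of how
the distance between two terms could be turned into a score. It does not
use Xapian and is not how Xapian works (per Olly's reply below, Xapian does
not weight by proximity). The function names, the cutoff of 10 positions
and the 1/distance formula are all assumptions made up for illustration.

```python
from itertools import product


def min_distance(positions_a, positions_b):
    """Smallest gap between any position of term A and any position of term B."""
    return min(abs(a - b) for a, b in product(positions_a, positions_b))


def near_match(positions_a, positions_b, window):
    """True if the two terms occur within `window` positions of each other."""
    return min_distance(positions_a, positions_b) <= window


def proximity_bonus(positions_a, positions_b, cutoff=10):
    """Hypothetical bonus: 1/distance, dropping to 0 beyond `cutoff` positions."""
    d = min_distance(positions_a, positions_b)
    if d > cutoff:
        return 0.0
    # Guard against a zero distance (both terms indexed at the same position).
    return 1.0 / max(d, 1)


# Term positions after word segmentation, one pair of lists per document.
adjacent = ([3], [4])       # the two words are next to each other
one_between = ([3], [5])    # an adjective sits between them
far_apart = ([3], [40])     # the words are probably unrelated

for name, (a, b) in [("adjacent", adjacent),
                     ("one_between", one_between),
                     ("far_apart", far_apart)]:
    print(name, near_match(a, b, window=2), proximity_bonus(a, b))
```

With these made-up positions, the first two documents both pass the window
check, the adjacent one gets the larger bonus, and the third gets no bonus
at all.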

thanks,
Bruce


On Tue, Aug 2, 2011 at 12:28 PM, Olly Betts <olly at survex.com> wrote:

> On Tue, Aug 02, 2011 at 12:12:59AM +0800, Bruce Zhang wrote:
> > I wonder how position is used in a query, and how it impacts the weight
> > of a search. Could anyone shed light on this?
>
> Currently positional data is used for phrase searching, and for proximity
> operators (NEAR and ADJ).
>
> It's very likely someone will implement weighting based on proximity at
> some point, but nobody has yet.
>
> > Am I right that position is more useful for Oriental languages like
> > Chinese, Japanese and Korean than for Western languages,
> > because Oriental languages need word segmentation?
>
> I don't see why it would be inherently more useful there, though I don't
> know any of those languages in great detail.
>
> Cheers,
>     Olly
>

