[Xapian-discuss] Statistical query completion

Mon May 4 15:30:16 BST 2009

Does Xapian have any mechanism to efficiently find the terms that are
statistically likely to follow one or more terms?

For example, consider the corpus where the sentence "I like icecream"
occurs a number of times, and the sentence "I like chicken" also
occurs, but more rarely.

Given the sequence of terms ["I", "like"], I would like to discover the
possible completions ["I", "like", "icecream"] and ["I", "like",
"chicken"], ranked by how often each occurs.

Does Xapian support something like this, or will I have to build my
own Markov chain model?
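
To make the question concrete, here is a rough sketch of the hand-rolled
fallback I have in mind. It is plain Python and does not touch any Xapian
API; the names build_model and complete are made up for illustration, and
splitting on whitespace is a stand-in for real term generation.

```python
from collections import Counter, defaultdict


def build_model(sentences, order=2):
    """Count which term follows each run of `order` consecutive terms."""
    model = defaultdict(Counter)
    for sentence in sentences:
        terms = sentence.split()  # assumption: whitespace tokenisation
        for i in range(len(terms) - order):
            context = tuple(terms[i:i + order])
            model[context][terms[i + order]] += 1
    return model


def complete(model, prefix, order=2):
    """Return the possible completions of `prefix`, most frequent first."""
    context = tuple(prefix[-order:])
    followers = model.get(context, Counter())
    return [list(prefix) + [term] for term, _ in followers.most_common()]


# Toy corpus: "I like icecream" is more frequent than "I like chicken".
corpus = ["I like icecream"] * 3 + ["I like chicken"]
model = build_model(corpus)
print(complete(model, ["I", "like"]))
```

On this toy corpus the "icecream" completion is ranked ahead of the
"chicken" one. Keeping such a table in sync with the index is the part
I would rather not maintain myself.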

A.