[Xapian-devel] some improvements about the latent semantic search

Jianping Wang wangjpzju at gmail.com
Tue Oct 9 16:20:38 BST 2012


-------------------------------------------------------------------------------------------
Dear Olly:
Thanks for your reply. My algorithm does compute the sum of a positive
weight from each matching term, and ranks the documents by that sum, so
maybe adding a weighting scheme is enough to implement it.

But I am new to Xapian and have never read its source code, and I am a
college student busy with my studies all day. I'd like to describe my
algorithm in a document and share it. Would anybody in the community who
is familiar with Xapian like to help me implement it? :-)


Best regards
------------------------------------------------------------------------------------------

2012/10/9 Olly Betts <olly at survex.com>

> On Thu, Oct 04, 2012 at 11:48:13PM +0800, Jianping Wang wrote:
> > Recently I invented a new ranking algorithm inspired by the theory of
> > spreading activation and the probabilistic model. It can find the latent
> > semantic relationships between docs and terms and runs in almost linear
> > time. I took one afternoon to code and implement this algorithm. The
> > test results show that it is much faster than the well-known Latent
> > Semantic Analysis (LSA) algorithm, and its effectiveness is almost as
> > good as that of LSA. I want to share my idea with all of you and add
> > this algorithm to the Xapian project.
>
> Can you express your algorithm as a sum of a positive weight from each
> matching term, optionally plus a per-document component?  That's a
> requirement for it to be implementable within the Xapian matcher
> framework.  If it doesn't fit into this form, you'll need to do a lot
> more work to fit it into Xapian.
>
> If the algorithm is a product of a contribution per term, then taking
> the log may allow you to express it as such a sum.
>
> To implement a new weighting scheme, you need to subclass Xapian::Weight
> and implement several methods:
>
> http://trac.xapian.org/browser/trunk/xapian-core/include/xapian/weight.h
>
> Cheers,
>     Olly
>
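
The additive form Olly describes, and the log trick for scores that are a
product of per-term contributions, can be sketched outside Xapian as
follows. This is a minimal illustration in plain Python, not Xapian code
and not Jianping's algorithm. The function names additive_score and
log_term_weights and the sample factors are made up for the example, and
the requirement that every factor be at least 1 is an assumption made here
so that the resulting log weights stay non-negative.

```python
import math


def additive_score(term_weights, doc_component=0.0):
    """Sum of per-term weights plus an optional per-document component."""
    if any(w < 0 for w in term_weights):
        raise ValueError("per-term weights must be non-negative")
    return sum(term_weights) + doc_component


def log_term_weights(factors):
    """Turn multiplicative per-term factors into additive weights.

    Assumption for this sketch: each factor is >= 1, so log(factor) >= 0.
    """
    if any(f < 1.0 for f in factors):
        raise ValueError("factors must be >= 1 to give non-negative weights")
    return [math.log(f) for f in factors]


# Hypothetical per-term factors for the matching terms of two documents.
docs = {"doc_a": [2.0, 3.0], "doc_b": [1.5, 8.0]}

# Rank by the original product of factors.
by_product = sorted(docs, key=lambda d: math.prod(docs[d]), reverse=True)

# Rank by the sum of log weights; log is monotonic, so the order matches.
by_log_sum = sorted(
    docs,
    key=lambda d: additive_score(log_term_weights(docs[d])),
    reverse=True,
)
```

Because the logarithm is strictly increasing, ranking by the sum of the log
weights gives the same order as ranking by the product. Factors below 1
would give negative weights, so a real scheme would need to rescale them
first.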