GSoC 2017
Richhiey Thomas
richhiey.thomas at gmail.com
Mon Jan 23 12:23:40 GMT 2017
Hello devs,
My name is Richhiey Thomas and I'm studying Computer Engineering under
Mumbai University. I worked with Xapian in GSoC 2016 on Clustering of
Search Results. I would like to continue working on the project and was
wondering whether it would fit the scope of GSoC.
The clustering branch has a clustering API and a KMeans clusterer
implemented, but it hasn't been merged yet because it needed further
optimization and had some smaller outstanding issues. I would like to
finish the work needed to merge this clustering branch and then implement
a hierarchical clusterer.
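To make the hierarchical idea concrete, a bottom-up (agglomerative) clusterer could look roughly like the sketch below. It is plain Python rather than Xapian's C++ API, and everything in it is an assumption made for illustration: the function names, the choice of cosine distance, the average-linkage rule, and the toy term weights. It is not taken from the clustering branch.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors (dict: term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative(docs, k):
    """Repeatedly merge the two closest clusters until only k remain.

    Closeness is average linkage: the mean pairwise distance between
    the members of the two clusters. Returns lists of document indices.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > k:
        best = None  # (distance, index of first cluster, index of second)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    cosine_distance(docs[a], docs[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy document vectors (hypothetical term weights).
docs = [
    {"xapian": 2.0, "search": 1.0},
    {"xapian": 1.0, "search": 2.0},
    {"kmeans": 2.0, "cluster": 1.0},
    {"kmeans": 1.0, "cluster": 2.0},
]
clusters = agglomerative(docs, 2)
```

Every merge in this naive version recomputes all pairwise cluster distances, so it is only practical for small result sets. A real clusterer would probably cache the distance matrix.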
Also, a main reason for the reduced performance on large document
collections was the dimensionality of the document vectors. Some form of
latent semantic analysis to reduce the size of the document vectors may
therefore be necessary.
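As a rough illustration of the dimensionality reduction idea, the sketch below projects document vectors onto the top-k right singular vectors of a small document-term matrix, which is the core step of latent semantic analysis. It is a minimal sketch under stated assumptions: the matrix is dense and tiny, the singular vectors are approximated with plain power iteration plus deflation (a stand-in for a proper sparse SVD routine), and all names and numbers are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_directions(rows, k, iterations=200):
    """Approximate the top-k right singular vectors of the matrix `rows`.

    Uses power iteration on A^T A, removing previously found directions
    (deflation) at every step. Each row of `rows` is one document.
    """
    n_terms = len(rows[0])
    directions = []
    for d in range(k):
        # Deterministic, non-uniform start vector.
        v = [1.0 + 0.1 * ((i + d) % 7) for i in range(n_terms)]
        for _ in range(iterations):
            # Remove the components along directions found so far.
            for u in directions:
                proj = dot(v, u)
                v = [a - proj * b for a, b in zip(v, u)]
            # One step of v <- A^T (A v).
            av = [dot(row, v) for row in rows]
            v = [
                sum(rows[r][c] * av[r] for r in range(len(rows)))
                for c in range(n_terms)
            ]
            norm = math.sqrt(dot(v, v))
            if norm == 0.0:
                break
            v = [x / norm for x in v]
        directions.append(v)
    return directions


def reduce_vectors(rows, k):
    """Project every document onto the top-k directions."""
    directions = top_directions(rows, k)
    return [[dot(row, u) for u in directions] for row in rows]


# Four documents over six terms (hypothetical weights): documents 0 and 1
# share the first three terms, documents 2 and 3 share the last three.
rows = [
    [2.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
]
reduced = reduce_vectors(rows, 2)
```

Here each six-dimensional document vector becomes a two-dimensional one, and documents that share vocabulary end up with nearly identical reduced coordinates. Distance computations in the clusterer would then run over k values per document instead of one per term.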
I would like to have your feedback on the same.
Thanks :)