<div dir="ltr"><p dir="ltr" style="font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Hi Xapian Developers,</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">I am Dhruv, a Mathematics & Computing major at the Indian</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Institute of Technology Guwahati, India. I went through your ideas page</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">and found a few ideas that caught my interest. 
After some research into</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">those ideas, I feel that "</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);font-weight:bold;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Clustering of Search Results</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">" and "</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);font-weight:bold;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Weighting Schemes</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">" are the ones I would most like to contribute to, as they fit my profile well.</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">A little bit about my background: I have strong software engineering skills</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">with 3 years of commercial C++ and C experience. 
I have also worked with MPI</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">and have 2 years of parallel programming experience in CUDA*. I have been studying machine learning for the past 3 years and have implemented several advanced techniques in C++ and CUDA, such as a parallel implementation of Q-learning that uses a convolutional neural network as the Q-value estimator[1]. Last year, I participated in Google Summer of Code 2014 and worked on the project “Real Time Vectorization of Brain Atlases”, mentored by INCF. The project repository is at: </span><a href="https://github.com/INCF/Vectorization-of-brain-atlases" target="_blank" style="text-decoration:none"><span style="font-size:15px;font-family:Arial;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">https://github.com/INCF/Vectorization-of-brain-atlases</span></a></p><br style="font-size:13px"><p dir="ltr" style="font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Based on my understanding of the above projects, I feel that a </span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);font-weight:bold;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">parallel clustering algorithm</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"> like PDBSCAN[2] would be well suited to “Clustering of Search Results” in terms of running time and should provide a significant speedup. 
However, the code need not depend on the availability of multiple processors; instead, we can write general, well-structured code that takes advantage of whatever processors (and even GPUs) are available. <b>What do you think?</b> Also, we need not restrict ourselves to a density-based clustering algorithm: the project could include several other parallel clustering schemes[3], but the one that can feasibly provide the largest speedup should be identified and implemented first. As in my GSoC project last year, where my mentors and I devised a new bitmap vectorization algorithm, we may come up with a new parallel algorithm for faster clustering.</span></p><p dir="ltr" style="font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">For “Weighting Schemes”, I believe the coding itself would not be much of a problem; the main concern would be implementing each scheme correctly according to its mathematical formula.</span></p><p dir="ltr" style="font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">I would love to contribute to the above-mentioned projects with full dedication and would welcome a discussion about them.</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span 
style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Thanks & Best regards,</span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><br></span><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Dhruv</span></p><br style="font-size:13px"><p dir="ltr" style="font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">Link to my github repo: </span><a href="https://github.com/chiggum" target="_blank" style="text-decoration:none"><span style="font-size:15px;font-family:Arial;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">https://github.com/chiggum</span></a></p><br style="font-size:13px"><p dir="ltr" style="font-size:13px;line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;white-space:pre-wrap;background-color:transparent">References:</span></p><ol style="font-size:13px;margin-top:0pt;margin-bottom:0pt"><li dir="ltr" style="margin-left:15px;list-style-type:decimal;font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;background-color:transparent"><p dir="ltr" style="line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><a href="http://www.cs.toronto.edu/~vmnih/docs/dqn.pdf" target="_blank" style="text-decoration:none">http://www.cs.toronto.edu/~vmnih/docs/dqn.pdf</a> </span><span style="vertical-align:baseline;white-space:pre-wrap;background-color:transparent"> (my project repo is private).</span></p></li><li 
dir="ltr" style="margin-left:15px;list-style-type:decimal;font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;background-color:transparent"><p dir="ltr" style="line-height:1.15;margin-top:0pt;margin-bottom:0pt"><a href="http://www.cs.gsu.edu/~wkim/index_files/papers/fastParallel_XU.pdf" target="_blank" style="text-decoration:none"><span style="text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent">http://www.cs.gsu.edu/~wkim/index_files/papers/fastParallel_XU.pdf</span></a></p></li><li dir="ltr" style="margin-left:15px;list-style-type:decimal;font-size:15px;font-family:Arial;color:rgb(0,0,0);vertical-align:baseline;background-color:transparent"><p dir="ltr" style="line-height:1.15;margin-top:0pt;margin-bottom:0pt"><span style="text-decoration:underline;vertical-align:baseline;white-space:pre-wrap;background-color:transparent"><a href="http://www.cs.gsu.edu/~wkim/index_files/SurveyParallelClustering.pdf" target="_blank" style="text-decoration:none">http://www.cs.gsu.edu/~wkim/index_files/SurveyParallelClustering.pdf</a></span></p></li></ol><div style="font-size:13px"><font color="#000000" face="Arial"><span style="font-size:15px;line-height:17.25px"><br></span></font></div><div style="font-size:13px"><font color="#000000" face="Arial"><span style="font-size:15px;line-height:17.25px">*</span></font><span style="color:rgb(0,0,0);font-family:Arial;font-size:15px;line-height:17.25px;white-space:pre-wrap">Our team won the first CUDA challenge in India, organized by NVIDIA in 2014.</span></div></div>