<div dir="ltr">I think I should explain the proposed algorithm in the proposal more clearly. I did not do that because I thought it would make the proposal too long. Is there a word limit for the proposal?<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 23, 2016 at 4:40 PM, MURTUZA BOHRA <span dir="ltr"><<a href="mailto:murtuzabohra88@gmail.com" target="_blank">murtuzabohra88@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>Hello sir,<br><br></div>You have interpreted correctly that clustering will be done by generating a ring around a document (i.e. the basic idea of LSI). However, it does not work by increasing the radius so that each successive shell becomes another cluster. Rather, it picks one document (based on relevance score) and forms a ring around it; the documents inside that ring make up one cluster. Then, from the remaining documents (those in the search results but not yet in a cluster), another document is picked and the next cluster is formed. This continues until all the search results are exhausted.<br><br></div>I have attached a file to geometrically illustrate the algorithm; please have a look at it.<br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 23, 2016 at 12:21 AM, Olly Betts <span dir="ltr"><<a href="mailto:olly@survex.com" target="_blank">olly@survex.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On Tue, Mar 22, 2016 at 02:08:23PM +0530, MURTUZA BOHRA wrote:<br>

</span><span>> How would latent semantic indexing help?<br>

><br>

> In LSI we project the query (treated as a pseudo-document) onto the<br>

> term-document vector space and, based on some threshold, find the relevant<br>

> documents. Similarly, if we use LSI for clustering and, instead of the<br>

> query, take one of our search results and set different thresholds, then<br>

> we can cluster the search results based on each threshold in a single shot.<br>

<br>

</span>So if I follow, you take one document (how do you decide which) and then<br>

generate a set of clusters as (multi-dimensional) rings around it of<br>

increasing radius?<br>

<br>

That doesn't sound like it's going to do a good job of producing useful<br>

clusters.  The group around the "seed" document is probably related,<br>

but once you get beyond that the documents in the cluster are defined<br>

only by distance from the seed.<br>

<br>

In geographical terms, locations which are < 10km from a given point<br>

might be a useful cluster, but locations between 10 and 20km from that<br>

point are much less likely to be.<br>

<br>

Cheers,<br>

    Olly<br>

</blockquote></div><br></div>

</div></div></blockquote></div><br></div>
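<div dir="ltr">The seed-and-ring procedure described in the reply at the top of this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposal's actual implementation: it assumes each search result already has a vector in the reduced LSI space, that the "ring" is a cosine-similarity threshold, and that the seed is always the most relevant remaining result. The names <code>cosine</code>, <code>seed_ring_clusters</code> and <code>threshold</code> are hypothetical.<br></div>

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def seed_ring_clusters(results, threshold):
    """Greedy clustering of search results.

    results:   list of (doc_id, relevance, vector) tuples (hypothetical layout).
    threshold: minimum cosine similarity to the seed to fall inside its ring.
    Returns a list of clusters, each a list of doc_ids with the seed first.
    """
    # Assumption: the seed is the most relevant document not yet clustered.
    remaining = sorted(results, key=lambda r: r[1], reverse=True)
    clusters = []
    while remaining:
        seed = remaining[0]
        inside, outside = [], []
        for r in remaining:
            # The seed always belongs to its own cluster, even if its vector is zero.
            if r is seed or cosine(seed[2], r[2]) >= threshold:
                inside.append(r)
            else:
                outside.append(r)
        clusters.append([r[0] for r in inside])
        remaining = outside  # the next seed is picked from what is left
    return clusters


if __name__ == "__main__":
    demo = [
        ("a", 0.9, [1.0, 0.0]),
        ("b", 0.8, [0.9, 0.1]),
        ("c", 0.7, [0.0, 1.0]),
        ("d", 0.6, [0.1, 0.9]),
    ]
    print(seed_ring_clusters(demo, threshold=0.8))
```

<div dir="ltr">Because every pass removes at least the seed from the remaining results, the loop always terminates, and each result ends up in exactly one cluster.<br></div>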