Hey James, hi :) Thanks for your review of the example. I saw some questions on IRC, in Help Wanted and in the FAQ about indexing single terms and stemming strategies. Is it okay if I rewrite the example to answer those questions? We don't currently have any example that explains doc.add_term() and stemming strategies.<br>
<br>-Regards<br>-Aarsh<br><br><div class="gmail_quote">On Fri, Feb 8, 2013 at 3:11 AM, James Aylett <span dir="ltr"><<a href="mailto:james-xapian@tartarus.org" target="_blank">james-xapian@tartarus.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 27 Jan 2013, at 20:09, aarsh shah <<a href="mailto:aarshkshah1992@gmail.com">aarshkshah1992@gmail.com</a>> wrote:<br>
<br>
> Hey guys, I have added a Python indexer example to the SampleCode page of our wiki. Please do have a look. The code can also be found here:<br>
><br>
> <a href="https://github.com/aarshkshah1992/xapian/blob/efcf443527b74326119bbc0935fc41a002ce60db/xapian-bindings/python/docs/examples/simpleindexgrep.py/" target="_blank">https://github.com/aarshkshah1992/xapian/blob/efcf443527b74326119bbc0935fc41a002ce60db/xapian-bindings/python/docs/examples/simpleindexgrep.py/</a><br>
<br>
</div>Aarsh — what are you actually trying to do here? Because what your comments say you're doing isn't what the code does. Three problems:<br>
<br>
1) English uses capitals at the start of sentences, so you're actually just indexing more or less everything<br>
<br>
2) you're running xapian.TermGenerator.index_text() on single words, which isn't really what it's designed to do (it has its own word-splitting algorithm)<br>
<br>
3) you don't support sentences broken across lines, which doesn't match the majority of use cases — although you may have a particular one in mind<br>
<br>
Does what you're trying to do show how to use an aspect of Xapian that we don't already show in the existing examples? Or at least show it more clearly?<br>
<span class="HOEnZb"><font color="#888888"><br>
J<br>
<br>
--<br>
James Aylett, occasional trouble-maker<br>
<a href="http://xapian.org" target="_blank">xapian.org</a><br>
<br>
</font></span></blockquote></div><br>
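Points 1 and 3 of the review can be illustrated with a small plain-Python sketch. It assumes the example's goal was to pick out capitalised words (the original script is not reproduced in this thread). The helper names `sentences` and `mid_sentence_capitals` are hypothetical and are not part of Xapian. The sentence splitting is a naive regular expression, not Xapian's own word-splitting algorithm.<br>

```python
import re


def sentences(text):
    """Join hard-wrapped lines, then split after '.', '!' or '?' (naive)."""
    # Joining first means a sentence broken across lines is treated as one unit.
    joined = " ".join(line.strip() for line in text.splitlines() if line.strip())
    return [s for s in re.split(r"(?<=[.!?])\s+", joined) if s]


def mid_sentence_capitals(text):
    """Return capitalised words that are not the first word of a sentence."""
    found = []
    for sentence in sentences(text):
        words = re.findall(r"[A-Za-z']+", sentence)
        # Skip words[0]: English capitalises it by convention, so it tells us
        # nothing about whether the word is a name.
        found.extend(w for w in words[1:] if w[0].isupper())
    return found


if __name__ == "__main__":
    sample = "The indexer was reviewed by James\nAylett on the list. He asked about Xapian."
    print(mid_sentence_capitals(sample))
```

Processing the same sample one line at a time would treat "Aylett" as a sentence opener, and a check that ignores sentence position would also pick up "The" and "He".<br>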