Understood your point about caching of results. Will work on the suggestions you gave on how to improve the parser functionality.<div>As you mentioned "<span style>it's probably better to cache after the </span><span style>QueryParser and key the cache on the serialised form of the final Query </span><span style>object plus any parameters you set on Enquire.</span>"</div>
<div>I have a question: does Xapian currently cache results at any level? If not, then I can add result caching to my proposal ("after" the Query has been parsed, since as you rightly explained <span style>it's probably better to cache after the </span><span style>QueryParser</span>).</div>
<div>Currently I am going through the test cases in <span style>queryparsertest.cc and figuring out the different ways in which the parsed form of those queries can be improved.</span></div><div><span style>I will let you know if I have any questions about queryparsertest.cc. </span></div>
<div><span style><br></span></div><div><span style>Cheers,</span></div><div><span style>Sehaj</span></div><div><br><div class="gmail_quote">On Fri, Mar 23, 2012 at 8:10 AM, Olly Betts <span dir="ltr">&lt;<a href="mailto:olly@survex.com">olly@survex.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Thu, Mar 22, 2012 at 07:32:23PM +0530, Sehaj Singh Kalra wrote:<br>
> Maintaining logs will improve parser as the present query can be matched<br>
> against the recent queries. This way, suppose for example, if we find the<br>
> exact query, the time taken by search engine<br>
> can be reduced.<br>
<br>
</div>Caching of results is certainly useful, but is the QueryParser the right<br>
place to do it? In many cases, the query consists of more than just the<br>
string the user types in - there are probably filters, options like<br>
sorting and collapsing, and so on. It's hard to handle these if you<br>
try to do the caching in the QueryParser, because it knows nothing about<br>
them. You could pass all this data in, but having to pass in lots of<br>
data is a warning sign that you've got the module boundaries wrong.<br>
<br>
If you cache results at the application level, you can key the cache off<br>
the parameters you feed to the search (for a web search, you could just<br>
key off the query part of the URL, though you probably want to at<br>
least normalise it). Another benefit of caching here is you can cache<br>
the rendered results (HTML for a web search, JSON or XML for a web API,<br>
etc).<br>
<br>
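As a rough sketch of application-level caching keyed on the normalised query<br>
part of the URL (Python, standard library only; normalise_query_key,<br>
cached_search and the in-memory dict are hypothetical names for illustration,<br>
and sorting the parameters assumes their order doesn't affect the results):<br>
<br>
```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical in-memory cache: normalised key -> rendered results.
_cache = {}


def normalise_query_key(query_string: str) -> str:
    """Turn the query part of a URL into a canonical cache key."""
    # Keep blank values so "sort=" and a missing "sort" stay distinct.
    pairs = parse_qsl(query_string, keep_blank_values=True)
    # Collapse runs of whitespace inside each value.
    cleaned = [(name, " ".join(value.split())) for name, value in pairs]
    # Assumption: parameter order is not significant, so sort it away.
    cleaned.sort()
    return urlencode(cleaned)


def cached_search(query_string: str, run_search):
    """Return cached rendered results, calling run_search only on a miss."""
    key = normalise_query_key(query_string)
    if key not in _cache:
        # run_search is whatever renders the page (HTML, JSON, XML, ...).
        _cache[key] = run_search(query_string)
    return _cache[key]
```
<br>
Case is deliberately left alone in this sketch, since case can be significant<br>
in a query (for example, boolean operators are written in capitals).<br>
<br>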
If you cache inside Xapian, then it's probably better to cache after the<br>
QueryParser and key the cache on the serialised form of the final Query<br>
object plus any parameters you set on Enquire.<br>
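<br>
A minimal sketch of building such a key (Python, standard library only). It<br>
assumes the caller already has the serialised Query as bytes, for example<br>
from Query::serialise(), and has collected the Enquire settings into a dict<br>
itself; result_cache_key and the parameter names are hypothetical:<br>
<br>
```python
import hashlib


def result_cache_key(serialised_query: bytes, enquire_params: dict) -> str:
    """Hash a serialised query plus search settings into a cache key."""
    digest = hashlib.sha256()
    # Length-prefix the query so its bytes can't run into the parameters.
    digest.update(len(serialised_query).to_bytes(8, "big"))
    digest.update(serialised_query)
    # Visit parameters in sorted order so dict ordering doesn't matter.
    for name in sorted(enquire_params):
        digest.update(repr((name, enquire_params[name])).encode("utf-8"))
    return digest.hexdigest()


# Hypothetical usage: the bytes and setting names are placeholders.
key = result_cache_key(
    b"placeholder-serialised-query",
    {"first": 0, "maxitems": 10, "sort_by": "relevance", "collapse_key": 3},
)
```
<br>
Anything which can change the results (sort order, collapsing, the range of<br>
results requested, and so on) needs to be included in the dict, or two<br>
different searches would share a cache entry.<br>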
<div class="im"><br>
> Also even if the exact query can't be found, this will<br>
> help parser in making sane and better Query object trees by matching<br>
> against some logs and using algorithms like longest common sub-sequence<br>
> etc.<br>
<br>
</div>How would this help the parser do this? It's easy to assert that "X<br>
will help Y", but we're looking for supporting evidence in a proposal.<br>
<div class="im"><br>
> This way query can be modified a bit to make more sense from the free<br>
> form text.<br>
<br>
</div>Again, how would this work?<br>
<div class="im"><br>
> These were the plans suggested to improve parser functioning.<br>
> Please guide me, about the other ways in which the parser can be modified<br>
> for better outputs.<br>
<br>
</div>There are a lot of testcases in queryparsertest.cc, some artificial and<br>
some examples of real world queries. The parsed forms of quite a few of<br>
the real world queries could be improved upon. Some have comments<br>
noting this, but not all do.<br>
<br>
Currently some parse errors trigger a fall-back mode which turns off<br>
various flags and reparses the query. Overall this is beneficial, but<br>
it can result in sometimes surprising parses for some queries. Really<br>
it is papering over the real issue.<br>
<br>
We have the "spelling suggestion" feature, which allows us to return<br>
a parsed query, but suggest what the user might have meant. It would<br>
be cool to reuse this mechanism for cases where the query seems<br>
malformed and there are two (or more) reasonable options.<br>
<br>
Cheers,<br>
Olly<br>
</blockquote></div><br></div>