Implementation of substring search in omegascript

James Aylett james-xapian at tartarus.org
Mon Feb 15 11:38:43 GMT 2016


Hi, Ayush — congratulations on getting stuck into this! As a matter of course, can you please keep replies copied to the mailing list so everyone can help and benefit from the discussion? [xapian-devel added back to cc.]

Because you haven’t provided this as a patch or PR (more on that later), it’s harder for me to incorporate into my codebase and test, and I haven’t had time to do that yet. However it looks in principal good to me, with a couple of comments. (You should also try to ensure that you match our indentation style.)

I think you’re making life more complex than you need to in creating the string version of the integer offset. Have a look at how the `$add{}` command does it (ll 1253–1260 	https://github.com/xapian/xapian/blob/master/xapian-applications/omega/query.cc#L1253).

You’ll also need to add documentation to `docs/omegascript.rst` to complete this project; just a short sentence describing what it does will be sufficient.

To move this further forward, you also need to provide a patch in some form rather than a gist that isn’t formatting as a patch. The easiest way of doing this is probably to fork the Xapian project on github, push a commit to a branch there and then open a pull request. (Alternatively you can create a patch by using git locally, avoiding github. However since github is currently so popular among open source projects, it’s a good idea to learn how to use it.)

Github has some documentation on forking and pull requests which may help (https://help.github.com/categories/collaborating-on-projects-using-pull-requests/), but if you have problems, please shout out and someone will try to help!

J

> On 15 Feb 2016, at 10:56, Ayush Gupta <ayushgp10 at gmail.com> wrote:
> 
> Hi James,
> I have implemented the function here: https://gist.github.com/ayushgp/fd0b23a6d214e451edf1
> Please have a look. 
> 
> 
> On Mon, Feb 15, 2016 at 6:11 AM, James Aylett <james-xapian at tartarus.org> wrote:
> On 14 Feb 2016, at 18:52, Ayush Gupta <ayushgp10 at gmail.com> wrote:
> 
> > I tried implementing the bite sized project idea posted here: https://trac.xapian.org/wiki/ProjectIdeas#AddnewOmegaScriptcommandtodoasubstringsearch but could not understand what needs to be returned when this command is called.
> 
> Hi, Ayush. It’s probably worth you creating a short plan as you figure out what’s needed here. Having a plan before you write code is always a good idea, and if you create it as a gist or similar, you can share it with other people on the list when asking questions. It’ll help others know where you’ve got to and what you’ve tried, and is good practice for when you get into larger projects (such as GSoC) which last weeks or months.
> 
> > This line is not so clear to me: expand to the offset of the first occurrence of fish from the start of the string if $query, or to if $query doesn't contain the substring fish.
> 
> Based on the description, here are some examples of what `$contains{fish,$query}` would expand to based on different values of `$query`:
> 
> $query          $contains{fish,$query}
> --------------------------------------
> hello           (empty string)
> fish            0
> hello fish      6
> 
> So you’d be able to do things like `$if{$contains{fish,$query},fish present,no fishes}`.
> 
> Hopefully that makes it more clear.
> 
> > Also in query.cc file in omega dir, when defining function descriptions, what does the evalargs and ensure arguments mean?
> 
> ensure means that the command requires either a query (‘Q’) or a query and a match (‘M’) to have been evaluated in the script up to that point. (See lines 1246–1248 of query.cc.)
> 
> evalargs (ll 1238–1244) controls the number of arguments to the command that are evaluated, ie that themselves can contain OmegaScript commands. (Some commands explicitly evaluate some of their arguments themselves, so this is more convenience where a number of the arguments need evaluating unconditionally — such as for `$def{}` where the macroname is evaluated, but the expansion of the macro is not, and `$if{}` where we only want to evaluate one of the result branches, not both.)
> 
> Ideally they’d both be explained in comments in the source code (around the structure definition and macro). Feel free to make a small patch and pull request to cover that!
> 
> (Note that it looks to me like there’s a bug in the evalargs handling, where minargs == N.)
> 
> J
> 
> --
>  James Aylett, occasional trouble-maker
>  xapian.org
> 
> 

-- 
 James Aylett, occasional trouble-maker
 xapian.org




More information about the Xapian-devel mailing list