Welcome to the "Xapian-discuss" mailing list

James Aylett james at tartarus.org
Thu Jun 21 20:05:42 BST 2018


Please keep replies on the mailing list — more people can help (and benefit) that way :)

So OP_NEAR matches documents where its terms occur close to each other (hence "near"). The window is how far apart they can be, counted in word positions. Probably the easiest way to play with this is using the NEAR syntax in the query parser. So if you had a plain text document:

I am walking, always walking.

And you index it in a very simple fashion (in Python):

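import xapian

# Create (or open) an on-disk database called "testdb".
db = xapian.WritableDatabase("testdb", xapian.DB_CREATE_OR_OPEN)
doc = xapian.Document()
# TermGenerator splits the text into terms and records their positions,
# which is what NEAR needs.
tg = xapian.TermGenerator()
tg.set_document(doc)
tg.index_text("I am walking, always walking.")
db.add_document(doc)
db.commit()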

Then you can run NEAR queries:

import xapian
db = xapian.Database("testdb")
qp = xapian.QueryParser()
qp.set_database(db)

def query(text):
    # Parse the query string and print the docid of each match (top 10).
    enq = xapian.Enquire(db)
    q = qp.parse_query(text)
    enq.set_query(q)
    for match in enq.get_mset(0, 10):
        print(match.docid)

query("I NEAR/1 walking") # prints nothing
query("I NEAR/2 walking") # prints 1

There's no document in the database where "I" is adjacent to "walking". However, there is one where the two are within two positions of each other ("I am walking…"). Likewise:

query("I NEAR/2 always") # prints nothing
query("am NEAR/2 always") # prints 1
query("walking NEAR/2 always") # prints 1 again
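
If it helps to see the idea without Xapian at all, here's a rough model in plain Python of what those NEAR/n queries are checking. This is only a sketch: the positions() and near() helpers are made up for illustration, the tokenisation is a crude regex rather than what TermGenerator does, and it isn't how Xapian implements the operator internally.

```python
import re

def positions(text):
    """Map each lowercased word to the 1-based positions where it occurs."""
    index = {}
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        index.setdefault(word, []).append(pos)
    return index

def near(text, a, b, n):
    """True if some occurrence of a is within n positions of some occurrence of b."""
    index = positions(text)
    return any(
        0 < abs(pa - pb) <= n
        for pa in index.get(a.lower(), [])
        for pb in index.get(b.lower(), [])
    )

doc = "I am walking, always walking."
print(near(doc, "I", "walking", 1))       # False
print(near(doc, "I", "walking", 2))       # True
print(near(doc, "I", "always", 2))        # False
print(near(doc, "am", "always", 2))       # True
print(near(doc, "walking", "always", 2))  # True
```

So yes, in effect it compares the distance between the positions of the two terms against the number you give it, and those five results line up with the five queries above.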

Hope that helps a little!

J

> On 20 Jun 2018, at 21:23, Gaby Goldberg <gaby.goldberg at rivdata.com> wrote:
> 
> I'm a bit confused on how the operator works. Does it find the distance between the two terms?
> 
> On Wed, Jun 20, 2018 at 1:09 PM James Aylett <james at tartarus.org> wrote:
> On 20 Jun 2018, at 20:39, Gaby Goldberg <gaby.goldberg at rivdata.com> wrote:
> 
> > I'm new to Xapian and wanted to know if it has a specific feature. I want
> > to be able to check the relation between two terms on a page based on how
> > close they are together on the page. I want to use a combination of n-gram
> > based labeling and the "slop" feature found in Elasticsearch. Does Xapian
> > have this/a similar feature? I haven't been able to find any programs that
> > have features similar to the "slop" feature on Elasticsearch yet.
> 
> Hi, Gaby — you're probably looking for the window parameter of the NEAR positional operator. I realise as I write this that it isn't terribly well-documented in the API, but there are hints here:
> 
> https://xapian.org/docs/apidoc/html/classXapian_1_1Query.html#adb287c496f72327d1c1411fac0570ea9
> 
> I've added some notes to our missing documentation list [1] that we need to work on this!
> 
> [1] https://trac.xapian.org/wiki/MissingDocumentation
> 
> J
> 
> -- 
>  James Aylett
>  devfort.com — spacelog.org — tartarus.org/james/
> 
> 
> 
> -- 
> Gaby Goldberg
> Data Analysis and Marketing Intern
> p: 805.452.5413 
> w: carpe.io e: gaby.goldberg at rivdata.com
> 
> 		

-- 
 James Aylett
 devfort.com — spacelog.org — tartarus.org/james/



