Welcome to the "Xapian-discuss" mailing list
James Aylett
james at tartarus.org
Thu Jun 21 20:05:42 BST 2018
Please keep replies on the mailing list — more people can help (and benefit) that way :)
So OP_NEAR looks for its terms close to each other (hence "near"). The window is how far apart they can be. Probably the easiest way to play with this is using the NEAR syntax in the query parser. So if you had a plain text document:
I am walking, always walking.
And indexed it in a very simple fashion (in Python):
import xapian
db = xapian.WritableDatabase("testdb")
doc = xapian.Document()
tg = xapian.TermGenerator()
tg.set_document(doc)
tg.index_text("I am walking, always walking.")
db.add_document(doc)
Then you can run NEAR queries:
import xapian
db = xapian.Database("testdb")
qp = xapian.QueryParser()
qp.set_database(db)
def query(query):
    enq = xapian.Enquire(db)
    q = qp.parse_query(query)
    enq.set_query(q)
    for match in enq.get_mset(0, 10):
        print(match.docid)
query("I NEAR/1 walking") # prints nothing
query("I NEAR/2 walking") # prints 1
There's no document in the database where "I" is adjacent to "walking". However, there is one where the two are within two positions of each other ("I am walking…"). Likewise:
query("I NEAR/2 always") # prints nothing
query("am NEAR/2 always") # prints 1
query("walking NEAR/2 always") # prints 1 again
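If it helps to see why those results come out the way they do, here is a toy model in plain Python (no Xapian involved). It assumes each word gets a position counting from 1, and that NEAR/n matches when two terms are at most n positions apart. That is consistent with the results above, but it is a simplification and not a description of how Xapian implements OP_NEAR. The helper names `positions` and `near` are made up for this sketch.

```python
import re


def positions(text):
    """Map each lowercased word to the list of its 1-based positions."""
    index = {}
    words = re.findall(r"\w+", text.lower())
    for pos, word in enumerate(words, start=1):
        index.setdefault(word, []).append(pos)
    return index


def near(text, a, b, n):
    """True if some occurrence of a is at most n positions from some occurrence of b.

    Toy model only: an assumption about what NEAR/n means, not Xapian's code.
    """
    index = positions(text)
    return any(
        abs(pa - pb) <= n
        for pa in index.get(a.lower(), [])
        for pb in index.get(b.lower(), [])
    )


text = "I am walking, always walking."
# Positions in this model: i=1, am=2, walking=3 and 5, always=4

print(near(text, "I", "walking", 1))       # False
print(near(text, "I", "walking", 2))       # True
print(near(text, "I", "always", 2))        # False
print(near(text, "am", "always", 2))       # True
print(near(text, "walking", "always", 2))  # True
```

In this model "walking" matches "always" from either side, since it occurs at positions 3 and 5 with "always" at 4 in between.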
Hope that helps a little!
J
> On 20 Jun 2018, at 21:23, Gaby Goldberg <gaby.goldberg at rivdata.com> wrote:
>
> I'm a bit confused on how the operator works. Does it find the distance between the two terms?
>
> On Wed, Jun 20, 2018 at 1:09 PM James Aylett <james at tartarus.org> wrote:
> On 20 Jun 2018, at 20:39, Gaby Goldberg <gaby.goldberg at rivdata.com> wrote:
>
> > I'm new to Xapian and wanted to know if it has a specific feature. I want
> > to be able to check the relation between two terms on a page based on how
> > close they are together on the page. I want to use a combination of n-gram
> > based labeling and the "slop" feature found in Elasticsearch. Does Xapian
> > have this/a similar feature? I haven't been able to find any programs that
> > have features similar to the "slop" feature on Elasticsearch yet.
>
> Hi, Gaby — you're probably looking for the window parameter of the NEAR positional operator. I realise as I write this that it isn't terribly well-documented in the API, but there are hints here:
>
> https://xapian.org/docs/apidoc/html/classXapian_1_1Query.html#adb287c496f72327d1c1411fac0570ea9
>
> I've added some notes to our missing documentation list [1] that we need to work on this!
>
> [1] https://trac.xapian.org/wiki/MissingDocumentation
>
> J
>
> --
> James Aylett
> devfort.com — spacelog.org — tartarus.org/james/
>
>
>
> --
> Gaby Goldberg
> Data Analysis and Marketing Intern
> p: 805.452.5413
> w: carpe.io e: gaby.goldberg at rivdata.com
>
>
--
James Aylett
devfort.com — spacelog.org — tartarus.org/james/