[Xapian-discuss] range query for terms

张吉财 zjc5415 at 163.com
Sun Mar 29 12:07:25 BST 2015


Thank you, Olly!

I tried to work out a picture of how indexing and querying relate to B-tree block access on disk,
but I think I got it all mixed up and failed.
Now I am trying to index docs in JSON format, and I have a question about mapping prefixes to slots.
A JSON doc looks like:    {"starttime":1111,"endtime":2222}
I am considering two ways of mapping each prefix to a slot number:
1. starttime --> 0, endtime --> 1
2. starttime --> hash(starttime), endtime --> hash(endtime), where hash(key) is an effectively random int, which may be very sparse but is expected to be unique, for example using BKDR hash.

After a simple test, both ways seemed to work well. Can I use the second way (so that I do not have to maintain a mapping)? Are there any performance issues?
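
To make the second option concrete, here is a minimal sketch of the hashed mapping. The seed 131 and the 31-bit mask are the commonly used BKDR choices and are assumptions here, as are the helper names bkdr_hash and build_slot_map. A hash cannot guarantee uniqueness, so the sketch checks for collisions instead of assuming there are none.

```python
def bkdr_hash(key, seed=131):
    """BKDR string hash, folded to a non-negative 31-bit integer.

    Assumption: the 31-bit mask keeps the result well below 0xFFFFFFFF,
    which (as far as I know) Xapian reserves as an invalid slot number.
    """
    h = 0
    for byte in key.encode("utf-8"):
        h = (h * seed + byte) & 0xFFFFFFFF
    return h & 0x7FFFFFFF


def build_slot_map(field_names, hash_fn=bkdr_hash):
    """Map each field name to a slot number, refusing hash collisions."""
    slots = {}
    owner = {}  # slot number -> field name that claimed it
    for name in field_names:
        slot = hash_fn(name)
        other = owner.get(slot)
        if other is not None and other != name:
            raise ValueError(
                "slot %d is shared by %r and %r" % (slot, other, name))
        owner[slot] = name
        slots[name] = slot
    return slots


slot_map = build_slot_map(["starttime", "endtime"])
for field, slot in sorted(slot_map.items()):
    print(field, "->", slot)
```

The same slot_map would have to be used at both index time and query time. The collision check is the only part that still needs the full list of field names up front.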


At 2015-03-16 03:36:48, "Olly Betts" <olly at survex.com> wrote:
>On Sat, Mar 14, 2015 at 09:25:24PM +0800, 张吉财 wrote:
>> then I'd like to ask if it is possible to do a range query on
>> terms(like the range query on values), or if it is just a
>> wildcard(right truncation) match.
>
>Currently only right truncation is supported.
>
>> The case is searching for IP addresses between “10.10.0.0” and “10.10.255.255”.
>> The user wants:
>> 1.   query "10.10.10.10" < ip < "10.10.10.12" gives "10.10.10.11"
>> 2.   query "*.*.*.10" gives all IP addresses ending in 10.
>> 
>> how can I achieve this?
>
>You should really consider using a value for this, as such wildcards can
>expand to an awful lot of terms - "*.*.*.10" potentially matches 16
>million terms.  With a value, there's only one thing to check for every
>candidate document.
>
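
A rough sketch of the value-based approach, assuming each IPv4 address is stored in a value slot as its 4-byte big-endian packed form. With that encoding, bytewise string comparison agrees with numeric order, which is what a range check on values needs. The helper names ip_key, in_open_range and last_octet_is are made up for illustration, and only the encoding and the per-document check are shown, not the Xapian calls themselves.

```python
import ipaddress


def ip_key(address):
    """Return the 4-byte big-endian form of a dotted IPv4 address.

    Comparing these byte strings gives the same order as comparing the
    addresses numerically, which plain dotted strings do not.
    """
    return ipaddress.IPv4Address(address).packed


def in_open_range(address, low, high):
    """True if low < address < high (both bounds exclusive)."""
    return ip_key(low) < ip_key(address) < ip_key(high)


def last_octet_is(address, octet):
    """True if the final octet equals `octet`, as in the "*.*.*.10" case."""
    return ip_key(address)[-1] == octet


candidates = ["10.10.10.10", "10.10.10.11", "10.10.10.12", "10.10.9.1"]
print([ip for ip in candidates
       if in_open_range(ip, "10.10.10.10", "10.10.10.12")])
print([ip for ip in candidates if last_octet_is(ip, 10)])
```

Note that the dotted strings themselves sort wrongly ("10.10.9.1" compares greater than "10.10.10.1"), which is why the packed form is used as the stored value.
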
>But if you only actually have a small number of IP addresses and really
>want to use terms, you can just iterate allterms from the Database
>object and build an OP_SYNONYM query from all the matching terms.  In
>1.2.x, that's exactly how OP_WILDCARD is implemented (in master
>OP_WILDCARD expansion is delayed until we process the Query tree, which
>means we can avoid creating Query objects for every term in the
>wildcard).
>
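
The allterms approach described above might look roughly like this. The "XIP" prefix and the helper names are hypothetical. The build_synonym_query function assumes the Xapian Python bindings are installed and that allterms() yields items whose term attribute is a byte string, as in the Python 3 bindings. The matching logic itself is plain Python and can be tried without Xapian.

```python
def octet_pattern_matches(pattern, address):
    """Match a dotted pattern such as "*.*.*.10" octet by octet."""
    want = pattern.split(".")
    have = address.split(".")
    return (len(want) == 4 and len(have) == 4 and
            all(w == "*" or w == h for w, h in zip(want, have)))


def matching_terms(terms, prefix, pattern):
    """Yield the prefixed terms whose address part matches the pattern."""
    for term in terms:
        if term.startswith(prefix) and \
                octet_pattern_matches(pattern, term[len(prefix):]):
            yield term


def build_synonym_query(db, prefix, pattern):
    """Expand the pattern against the database and OR the matches together.

    Assumption: `db` is a xapian.Database and the bindings are installed.
    """
    import xapian
    all_terms = (item.term.decode("utf-8") for item in db.allterms(prefix))
    matched = list(matching_terms(all_terms, prefix, pattern))
    return xapian.Query(xapian.Query.OP_SYNONYM, matched)


sample = ["XIP10.10.10.10", "XIP10.10.10.11", "XIP192.168.1.10", "XHOSTfoo"]
print(list(matching_terms(sample, "XIP", "*.*.*.10")))
```

This only stays cheap while the number of distinct IP terms is small, since every term under the prefix is visited once per query.
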
>Cheers,
>    Olly

