[Xapian-devel] questions about move_to_chunk_containing

Olly Betts olly at survex.com
Mon Mar 17 09:58:30 GMT 2014


On Mon, Mar 17, 2014 at 02:53:47PM +0800, Hurricane Tong wrote:
> (void)cursor->find_entry(BrassPostListTable::make_key(term, desired_did))
> Does this function make cursor point to the chunk where the first id
> in the chunk is less than desired_did and the first id in next chunk
> is bigger than desired_did ?

It is documented in the header here:

http://trac.xapian.org/browser/trunk/xapian-core/backends/brass/brass_cursor.h#L283

So if there's a chunk with exactly the desired key (which is a chunk
for term starting with desired_did), the cursor will point to that.

Otherwise the cursor points to the last chunk with a key < the desired
key.

You might notice that this isn't actually quite ideal - if we have a
chunk with docids 10-1000 and one with docids 2000-3000, then looking
for 1500 will land us on the first chunk, whereas the next chunk we're
interested in is actually the 2000-3000 one.
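
To make the lookup rule concrete, here is a minimal sketch of it.  This
is not the Brass code: it models one term's chunks as a std::map keyed by
the first docid in each chunk, and the names ChunkMap, add_chunk and
find_chunk are made up for illustration.  The real key also includes the
term, which this model ignores.

```cpp
#include <cassert>
#include <map>
#include <optional>

// Hypothetical model: one term's chunks, keyed by the first docid in each
// chunk; the mapped value is the last docid in that chunk.
using ChunkMap = std::map<unsigned, unsigned>;

void add_chunk(ChunkMap& chunks, unsigned first, unsigned last) {
    assert(first <= last);
    chunks[first] = last;
}

// Returns the first docid of the chunk a find_entry-style lookup lands on:
// the chunk whose key equals desired_did, or else the last chunk with a
// smaller key.  Empty if every chunk in the model starts after desired_did.
std::optional<unsigned> find_chunk(const ChunkMap& chunks,
                                   unsigned desired_did) {
    auto it = chunks.upper_bound(desired_did);  // first key > desired_did
    if (it == chunks.begin()) return std::nullopt;
    --it;                                       // last key <= desired_did
    return it->first;
}
```

With chunks 10-1000 and 2000-3000 in the map, find_chunk for 1500 gives
the chunk starting at 10, which is the less-than-ideal case described
above, while find_chunk for 2000 gives the chunk starting at 2000.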

I can't see how to avoid this without an incompatible change to the
format, and in practice, the loss of efficiency from this is probably
not dramatic (the majority of the time the chunk before will be in the
same block as the chunk we actually want, and seeking to a gap doesn't
happen every time).  But it's something I've had in mind to look at one
day.  I think you'd probably have to make the key use the *last* docid
in the chunk instead of the first, which is a bit awkward as that
changes when we append to a chunk.
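
As a rough sketch of that idea only (nothing like this exists in the
Brass format), the model below keys each chunk by its *last* docid.  It
also assumes a lookup which returns the first key >= the desired one,
which is not what find_entry does.  The names ChunkMapByLast,
find_chunk_by_last and append_docid are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <utility>

// Hypothetical model: chunks keyed by their last docid; the mapped value
// is the first docid in the chunk.
using ChunkMapByLast = std::map<unsigned, unsigned>;

// Returns the first docid of the first chunk whose last docid is
// >= desired_did, i.e. the chunk which contains desired_did or, if it
// falls in a gap, the next chunk after the gap.
std::optional<unsigned> find_chunk_by_last(const ChunkMapByLast& chunks,
                                           unsigned desired_did) {
    auto it = chunks.lower_bound(desired_did);  // first key >= desired_did
    if (it == chunks.end()) return std::nullopt;
    return it->second;
}

// The awkward part: appending a docid changes the chunk's last docid, so
// the entry has to be re-keyed.
void append_docid(ChunkMapByLast& chunks, unsigned old_last,
                  unsigned new_last) {
    assert(new_last > old_last);
    auto node = chunks.extract(old_last);  // C++17
    assert(!node.empty());
    node.key() = new_last;
    chunks.insert(std::move(node));
}
```

In this model, looking for 1500 with chunks 10-1000 and 2000-3000 lands
directly on the 2000-3000 chunk, at the cost of a key change on every
append.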

> If did1 and did2 is in the same chunk,  make_key returns different key.
> But how can find_entry turn to same chunk with different key ?

Since did1 and did2 are in the same chunk, there can't be a chunk which
starts with any docid between did1 and did2, so the cursor must end up
on the same chunk when you search for the keys built for the same term
plus did1 or did2.
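
The same argument as a small sketch, again using a made-up model (a
std::map keyed by each chunk's first docid, with hypothetical names
ChunkMap and same_chunk) rather than the real Brass tables:

```cpp
#include <map>

// Hypothetical model: one term's chunks, keyed by the first docid in each
// chunk; the mapped value is the last docid in that chunk.
using ChunkMap = std::map<unsigned, unsigned>;

// True if "last key <= did" lookups for did1 and did2 land on the same
// chunk.  Both lookups step back one entry from the first key greater
// than the docid, so they agree exactly when no chunk key lies after the
// smaller docid and at or before the larger one.
bool same_chunk(const ChunkMap& chunks, unsigned did1, unsigned did2) {
    return chunks.upper_bound(did1) == chunks.upper_bound(did2);
}
```

With chunks 10-1000 and 2000-3000, same_chunk is true for 10 and 1000
(different keys, same chunk) and false for 1500 and 2000, because a chunk
starts at 2000.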

Cheers,
    Olly


