[Xapian-devel] GSoC xapian node binding thoughts

Olly Betts olly at survex.com
Mon May 28 22:44:48 BST 2012


I was just having a look over the API notes:

https://github.com/mtibeica/node-xapian/blob/master/docs

Some feedback:

I wouldn't bother wrapping WritableDatabase::flush().  It's only there
for compatibility with older code, so for a new binding you can just
wrap commit().

Generally, uint32 isn't necessarily the right type to use everywhere,
and means things will go wrong if someone patches Xapian and rebuilds
it to use (e.g.) 64-bit document ids.  Maybe it's hard to use the
appropriate Xapian::docid, Xapian::doccount, Xapian::termcount, etc
typedefs here though.
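
To make the concern concrete from the JavaScript side, here is a rough
sketch.  It assumes the binding either coerces ids to uint32 or passes
them through as plain JS Numbers; toUint32 and checkDocid are made-up
helper names, not anything in node-xapian or Xapian.

```javascript
// Hypothetical helpers; not part of node-xapian or Xapian.
// Coercing to uint32 (what ">>> 0" does) silently wraps larger ids.
function toUint32(id) {
  return id >>> 0;
}

// A JS Number holds integers exactly only up to 2^53 - 1, so a 64-bit
// docid needs a range check (or BigInt) rather than a blind cast.
// Xapian docids start at 1, so 0 and negatives are rejected too.
function checkDocid(id) {
  if (!Number.isSafeInteger(id) || id <= 0) {
    throw new RangeError("docid not representable: " + id);
  }
  return id;
}

const big = 2 ** 32 + 5;        // a docid that needs more than 32 bits
console.log(toUint32(big));     // 5: wrapped, now names a different document
console.log(checkDocid(big));   // 4294967301: preserved as a Number
```

A range check like this at least fails loudly instead of quietly
addressing the wrong document.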

    A query consisting of two or more subqueries, opp-ed together.
    AND, OR, SYNONYM, NEAR and PHRASE can take any number of subqueries. 
    Other operators take only the first two subqueries.
    {
        op: string,
        queries: [ object_querystructure1, ...]
    }

XOR can also take any number of subqueries.  And on trunk, OP_FILTER,
OP_AND_NOT, and OP_AND_MAYBE can also take any number of subqueries
(with OP(A, B, C) being interpreted as OP(OP(A, B), C)).
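
That left-to-right nesting can be sketched as a fold over the subquery
list, using the { op, queries } structure from the API notes.  foldLeft
is a hypothetical name, the "AND_NOT" spelling of the op string is a
guess, and the plain strings are just placeholders for subqueries.

```javascript
// Sketch: expand OP(A, B, C, ...) into OP(OP(A, B), C) and so on.
// foldLeft is a hypothetical helper, not part of node-xapian.
function foldLeft(op, queries) {
  if (queries.length === 0) {
    throw new Error("need at least one subquery");
  }
  // reduce() without an initial value starts from the first subquery.
  return queries.reduce((acc, q) => ({ op: op, queries: [acc, q] }));
}

const nested = foldLeft("AND_NOT", ["a", "b", "c"]);
console.log(JSON.stringify(nested));
// {"op":"AND_NOT","queries":[{"op":"AND_NOT","queries":["a","b"]},"c"]}
```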

Also, it would be nice to support a mixture of strings and query objects
as the subqueries (like we do in most of the dynamically typed languages).
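
Roughly what I have in mind, as a sketch only: strings get turned into
single-term queries before the structure is handed to the native side.
The { term: ... } shape and the function names are assumptions for
illustration, not what the API notes define.

```javascript
// Sketch: accept a mix of plain strings and query objects as subqueries.
// The { term: ... } shape for a single-term query is an assumption.
function normalizeSubquery(q) {
  if (typeof q === "string") {
    return { term: q };
  }
  if (q !== null && typeof q === "object") {
    // Recurse so strings nested inside compound queries are handled too.
    return Array.isArray(q.queries)
      ? { ...q, queries: q.queries.map(normalizeSubquery) }
      : q;
  }
  throw new TypeError("subquery must be a string or a query object");
}

function makeQuery(op, subqueries) {
  return { op: op, queries: subqueries.map(normalizeSubquery) };
}

const mixed = makeQuery("OR", ["apple", { op: "AND", queries: ["red", "green"] }]);
console.log(JSON.stringify(mixed));
```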

I'm dubious about wrapping the various iterators as methods which read
all the entries from the iterator and return an array.  That's
potentially a huge amount of data to read and store in memory when
the user may only want a small subset, or to be able to process it as
a stream.  Or are these actually implemented like Perl tied arrays?
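
For comparison, a lazy version might look something like the sketch
below, where entries are only produced as the caller asks for them.
fakeTermSource stands in for a native Xapian iterator and take is a
made-up helper; neither is a real API.

```javascript
// Sketch: expose an iterator lazily instead of building a full array.
// fakeTermSource stands in for a native Xapian iterator; not a real API.
function* fakeTermSource(count) {
  for (let i = 0; i < count; i++) {
    yield "term" + i;
  }
}

// Collect only the first n items; the rest are never produced.
function take(iterable, n) {
  const out = [];
  if (n <= 0) return out;
  for (const item of iterable) {
    out.push(item);
    if (out.length >= n) break;
  }
  return out;
}

// Only three entries are ever generated, despite the huge nominal size.
console.log(take(fakeTermSource(1e9), 3));
```

Anyone who really does want everything can still collect the whole lot
into an array themselves.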

Cheers,
    Olly


