[Xapian-discuss] Node.js binding

Richard Boulton richard at tartarus.org
Thu Oct 20 20:06:40 BST 2011


On 20 October 2011 18:11, Liam <xapian at networkimprov.net> wrote:
>> >  MSetIterator:: operator *(), get_percent(), get_document()
>> get_document() is not safe here - the documents can be lazily loaded
>> into the MSet object, so this can hit disk or network.
>
> Document::get_data() does I/O, so what does MSetIterator::get_document() do?

Depends (on the backend, and possibly on other things); there's lazy
loading going on here.  Also, note that Document::get_data() only
returns the "user data" part of the document; it may or may not
(again, depending on the backend) result in the rest of the data
associated with a document (i.e., terms and values) being read.  It's
all quite complicated ;)
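
To illustrate the general shape of the problem, here's a minimal
sketch of lazy loading in TypeScript.  It is purely hypothetical: the
LazyDocument class, its fetch callback and the loads counter are
invented for this example, and it says nothing about how any
particular Xapian backend actually behaves.  The point is only that an
accessor which looks cheap may do the expensive work on first call.

```typescript
// Hypothetical illustration only; not Xapian's implementation.
class LazyDocument {
  private cached: string | undefined;
  // Counts how many times the (potentially slow) fetch actually ran.
  loads = 0;

  // `fetch` stands in for whatever reads from disk or the network.
  constructor(private readonly fetch: () => string) {}

  getData(): string {
    if (this.cached === undefined) {
      // First access: this is where blocking I/O would happen.
      this.cached = this.fetch();
      this.loads++;
    }
    // Later accesses are served from memory.
    return this.cached;
  }
}
```

A caller can't tell from the signature of getData() whether a given
call will block, which is why I'd be wary of treating such methods as
safe to run on the main thread.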

>> One concern about putting some Xapian accesses into a subthread; it is
>> not safe to call methods on Xapian API objects concurrently, so you'll
>> need to protect calls with some locking scheme, or some convention to
>> avoid this.  Seems very tricky to do right, to me, and might therefore
>> be safer to just do everything in a subthread.
>
> While it is possible to "parallelize" I/O functions as below, typically you
> sequence them in nested callbacks as in my prior example code. All
> Javascript code is confined to the main thread -- which makes it possible to
> hang everything with while(true) {} :-P

What I was concerned about was concurrent calls to methods of Xapian
objects, which this doesn't avoid.  For example, if the main thread
has a "db" variable pointing to a Xapian database, and starts a
get_mset() operation, the get_mset() operation will be performed using
the Xapian database in a subthread.  From what you describe, there's
nothing stopping the main thread kicking off another get_mset()
operation (or any other operation which would access the database)
before the subthread finishes, which would cause problems.

Of course, the programmer could just be warned not to do that, but in
such a setup it seems very likely that accidental violation of that
would happen (after all, most node programmers won't be expecting to
have to worry about avoiding threading issues; that's kind of the point
of node as far as I understand it).
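
One possible convention, sketched below in TypeScript, would be for
the binding to queue operations per database so that a second
operation can't start until the first has called back.  This is only a
sketch under assumptions: SerialQueue, Task and the done callback are
names I've made up here, not part of Xapian or of any existing
binding, and a real binding would hand the work to a subthread inside
each task.

```typescript
// Hypothetical helper; not part of Xapian or any existing binding.
// A task receives a `done` callback and calls it when its work is finished.
type Task = (done: () => void) => void;

class SerialQueue {
  private pending: Task[] = [];
  private running = false;

  // Queue a task; it starts right away only if nothing else is in flight.
  push(task: Task): void {
    this.pending.push(task);
    if (!this.running) this.next();
  }

  // Start the next queued task, or mark the queue idle if none remain.
  private next(): void {
    const task = this.pending.shift();
    if (task === undefined) {
      this.running = false;
      return;
    }
    this.running = true;
    let called = false;
    task(() => {
      // Ignore duplicate calls so one task can't start two successors.
      if (called) return;
      called = true;
      this.next();
    });
  }
}
```

With one such queue per database object, a second get_mset() issued
from the main thread would simply wait its turn rather than touching
the database while the first is still running in the subthread.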

-- 
Richard


