[Xapian-devel] Re: Xapian jni

Olly Betts olly at survex.com
Thu Dec 28 23:19:06 GMT 2006


On Thu, Dec 28, 2006 at 12:12:37PM -0500, Eric B. Ridge wrote:
> On Dec 27, 2006, at 8:19 PM, Olly Betts wrote:
> >You couldn't implement Java subclassing of C++ classes if you
> >used this technique either, and this is currently supported for at  
> >least MatchDecider and ExpandDecider.
> 
> Now that's true, but one could argue that it'd be acceptable to  
> implement these in C++.

Possibly.  Or perhaps for them just not to be supported.

> I haven't given it much thought but maybe there'd be a way to extend
> the wire protocol to support client-side Match/ExpandDeciders?

It's possible, but the worst case behaviour doesn't bear thinking about.
If your MatchDecider throws away most documents, and the unfiltered
query matches millions of documents, that's millions of documents you
need to send across for consideration.
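To put rough numbers on that worst case, here is a back-of-envelope sketch
in Python.  The match count, the per-candidate payload size, and the
function name are all assumptions chosen for illustration, not measurements
of the Xapian remote protocol; latency and protocol overhead are ignored.

```python
def transfer_seconds(num_docs, bytes_per_doc, link_bits_per_sec):
    """Time to push num_docs candidates over a link at full line rate."""
    total_bits = num_docs * bytes_per_doc * 8
    return total_bits / link_bits_per_sec

# Assumed figures: 5 million unfiltered matches, and 1 KiB of data sent
# per candidate so that a client-side decider can inspect it.
candidates = 5_000_000
payload = 1024

fast_ethernet = transfer_seconds(candidates, payload, 100e6)  # 100M link
gigabit = transfer_seconds(candidates, payload, 1e9)          # 1G link
print(f"100M: {fast_ethernet:.1f}s, 1G: {gigabit:.1f}s")
```

Under these assumptions the transfer alone takes minutes on a 100M link
and tens of seconds on a 1G link, before any filtering has happened.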

Currently we perform the match remotely, and stream back the MSet.  If
multiple databases are involved, this is merged with other MSets to
form the final MSet.
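The merge step can be sketched roughly as follows.  This is only an
illustration of the idea in Python, not Xapian's actual implementation:
the (weight, docid) tuples, the docid strings, and the merge_msets name
are hypothetical, and each per-database list is assumed to arrive already
sorted by descending weight.

```python
import heapq
from itertools import islice

def merge_msets(msets, maxitems):
    """Merge result lists sorted by descending weight; keep the top maxitems."""
    # heapq.merge expects ascending order, so sort on the negated weight.
    merged = heapq.merge(*msets, key=lambda item: -item[0])
    return list(islice(merged, maxitems))

# Hypothetical (weight, docid) pairs from one local and one remote database.
local = [(9.1, "local:3"), (4.0, "local:7")]
remote = [(7.5, "remote:12"), (6.2, "remote:1"), (1.3, "remote:8")]

top = merge_msets([local, remote], 3)
print(top)
```

Only the top entries of each MSet need to cross the wire for this to
work, which is why the cost stays modest however many documents match.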

I did experiment in the BrightStation days with streaming back the
candidate MSet entries instead, but even across fast ethernet (100M in
those days) it was just too slow.  1G Ethernet's now commonly available,
but everything else is faster too.  Probably by less than a factor of
10, but even so I doubt the balance has shifted enough.

It's also nice that the current protocol works well with a slower,
higher latency connection as well as a fast, low latency one.

> >Plus you'd need to transparently allocate ports in a manner which
> >doesn't interfere with other processes on the box and manage
> >permissions on the tcp backend servers (so other processes can't
> >intercept "your" tcp backend before you manage to open it).
> 
> I dunno about all that.  I think in this scenario you'd have to think  
> of Xapian as a remote database backend a la MySQL or Postgres.  Maybe  
> the Xapian backend would need a master server listener that would  
> fork the client to use the backend database it requested.

I was assuming you had in mind that the remote server would be
started on demand.

Having a "master server" helps avoid some of these problems, though
it adds admin overhead (not requiring a server was a conscious design
choice, incidentally).

> This probably means an entirely new wire protocol and backend  
> infrastructure, but it might be worth considering from a client- 
> accessibility point of view.

It needs some sort of authentication and a way to specify the database
you want to open - neither requires major changes, though currently I
think using xapian-progsrv over ssh is the cleanest way to go if
authentication or encryption is required.

> >A major motivation for moving Java to use SWIG is to reduce the  
> >effort required to update it as the C++ API evolves (currently Java
> >takes several  times as long to update as all the SWIG-using
> >languages together).
> 
> Refresh my memory... what's the hold-up for just switching to SWIG?

Time!

The effort required depends a lot on how faithfully we want to
reimplement the JNI wrapped API.  I don't actually know how important
that is to users.  It could even be that there are features of the
current wrapping which it would be better to change.

I don't have a good feeling for how widely used the current bindings
are, but if there are existing users who are willing to help out with
testing by checking their code works with SWIG-based Java bindings, it 
wouldn't take too long to do I think.  SmokeTest.java covers a number
of features, but it's not complete by any means.

> I meant to also add "with a BSD license".  And about 5 more smilies!

Longer term I'd like to replace all the GPL code whose copyright
holder(s) we can't persuade to relicense.  GPL isn't really the best fit for a
library.  BrightStation's model was going to be to offer a commercial
licence (for a fee) as an alternative to the GPL version, but that's not
an option for someone wanting to use Xapian with non-GPL code since
BrightStation did the "dot com dive".

Cheers,
    Olly


