[Xapian-discuss] Remote database search issues
Ron Kass
ron at pidgintech.com
Sat Oct 27 18:03:34 BST 2007
Hi all.
First, a note about remote database connections from Perl. We actually
found an easy way to work around the unwrapped Remote::open issue: we
use a stub file.
You might say that open_stub is also not wrapped, which is true.
HOWEVER, looking at the code, we realized that Database::open() opts
to use open_stub if the argument is a string pointing to a stub file
rather than a database directory. So instead of
Database::open('/data/ftsdirectory') you can do
Database::open('/stubfile.dat')
Pretty handy trick.. not as nice as proper remote database open (since
with that you can dynamically control via code which servers to connect
to) but still, it works.
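For anyone who wants to try it, here is a rough sketch of generating such a stub file from a list of servers, which gets back some of the dynamic control. It is in Python just for illustration. The write_stub helper and the temporary path are made up for this example, and it assumes the stub file takes one "remote host:port" line per server, which is the only format we have confirmed working.

```python
import os
import tempfile


def write_stub(path, servers):
    """Write one 'remote host:port' line per (host, port) pair.

    Hypothetical helper, not part of Xapian. The line format is the
    one that worked for us with xapian-tcpsrv.
    """
    lines = ["remote {}:{}".format(host, port) for host, port in servers]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return lines


# Illustrative location only; any path the search process can read will do.
stub_path = os.path.join(tempfile.mkdtemp(), "stub.dat")
write_stub(stub_path, [("10.0.0.27", 33333)])

# stub_path is then passed wherever the database directory would
# normally go, as with Database::open('/stubfile.dat') above.
```

Rewriting the file before opening the database is a workaround, not a substitute for a wrapped remote open.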
So we then tested remote search... We faced several problems...
1) Only xapian-tcpsrv worked. We couldn't figure out how to use
xapian-progsrv. The problem was the stub file format.
This works:
remote 10.0.0.27:33333
But these don't work
remote ssh ftsuser@10.0.0.27 xapian-progsrv /data/fts/Database2/
or
remote ssh 10.0.0.27 xapian-progsrv /data/fts/Database2/
The error we get is
Error creating DB with stub file: Exception: Bad line 1 in stub
database file `/fts/stub.dat' at ...
Can anyone shed some light on this one? ssh is configured properly. ftsuser
is allowed to ssh without a prompt (using proper key files). The database is
in the right location. The error seems to come from the parsing of the line.
What are we doing wrong? What's the right format for remote over
xapian-progsrv?
2) We tried remote search over tcpsrv, which we cannot really use
for anything besides testing until it supports parallel searches, which is
something xapian-progsrv does support as far as we understand.
Search speed was bad. A search for a single word (like gift) takes well
over a second for the first search, something that is fast when running
locally. Even when done on the same machine (with localhost) it's not
that fast.
Furthermore, fetching the documents also takes a long time, and fetching
the matching words is even worse. Even on localhost.
TCP overhead shouldn't be that bad, should it? Maybe it's tcpsrv
performance in general?
It probably doesn't help that search speed is slow in general in our
searches (this issue is being discussed in another thread on this
mailing list), but nonetheless, it's much slower than the regular slow
search.
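To pin down where the time goes, the three steps mentioned above (the search itself, fetching documents, fetching matching words) could be timed separately, and the first run compared with later ones. A rough Python sketch follows. The time_phases helper and the do-nothing placeholder callables are made up for illustration; the real calls would go through the Xapian bindings.

```python
import time


def time_phases(phases, repeats=3):
    """Run each (name, callable) pair `repeats` times.

    Returns {name: [seconds_run_1, seconds_run_2, ...]} so that the
    first (cold) run can be compared with the later ones.
    """
    timings = {}
    for name, fn in phases:
        runs = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            runs.append(time.perf_counter() - start)
        timings[name] = runs
    return timings


# Placeholders only: swap in the real search, document fetch and
# matching-terms fetch against the local or remote database.
phases = [
    ("search", lambda: None),
    ("fetch documents", lambda: None),
    ("fetch matching words", lambda: None),
]

for name, runs in time_phases(phases).items():
    print(name, ["%.6f" % seconds for seconds in runs])
```

Running the same phases once against the local database and once against the stub file would show how much of the delay is specific to the remote path.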
Any tips, ideas, thoughts on these two issues? Has anyone managed to use
multiple remote databases effectively?
Best regards,
Ron