[Xapian-discuss] flush problem
Michael A. Lewis
MAL at ICGINC.COM
Sun Jan 20 20:30:04 GMT 2008
________________________________
From: xapian-discuss-bounces at lists.xapian.org on behalf of James Aylett
Sent: Sun 1/20/2008 3:12 PM
To: xapian-discuss at lists.xapian.org
Subject: Re: [Xapian-discuss] flush problem
The main thing the system seems to be doing is iowait (roughly 17-68% over the few minutes I have been watching). The only application running on this system is the Xapian code. At the time of the insert, no other process is searching or inserting; basically, only the flush code is running. The getDatabase code is as follows:
static vector<string> dbNames;
static vector<string> dbErrors;
static map<char*, Xapian::WritableDatabase*> dbHash;
//
// Gets a database handle or creates it if necessary
//
Xapian::WritableDatabase* getDatabase () {
    map<char*, Xapian::WritableDatabase*>::iterator iter;
    iter = dbHash.find(dbname);
    if (iter != dbHash.end()) {
        return iter->second;
    }
    else {
        dbHash[dbname] = new Xapian::WritableDatabase (dbname, DB_CREATE_OR_OPEN);
        return dbHash[dbname];
    }
}
It's pretty simple. The average document length is about 300k of standard English text, with nothing remarkable or esoteric apart from a number of email addresses. In my previous posting I sent the output from a top command taken while it was flushing (which it currently is doing). It appears to be using 2.8 GB of memory, with 1.1 GB free.
--Michael
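A note on the getDatabase() code above: a std::map keyed on char* compares pointer addresses, not the characters they point to, so two buffers holding the same database name count as different keys. Whether that matters here depends on how dbname is managed, which the snippet does not show. The following is only a minimal sketch of a handle cache keyed on std::string. FakeDatabase, handles and get_database are hypothetical names, and FakeDatabase is a stand-in so the sketch compiles without Xapian.

```cpp
#include <map>
#include <memory>
#include <string>

// Stand-in for Xapian::WritableDatabase (assumption, for illustration only).
struct FakeDatabase {
    explicit FakeDatabase(const std::string& p) : path(p) {}
    std::string path;
};

// Keyed on std::string, so lookups compare the name's contents
// rather than the address of whichever buffer holds it.
static std::map<std::string, std::unique_ptr<FakeDatabase>> handles;

// Returns the cached handle for `name`, creating it on first use.
FakeDatabase* get_database(const std::string& name) {
    auto it = handles.find(name);
    if (it == handles.end()) {
        std::unique_ptr<FakeDatabase> db(new FakeDatabase(name));
        it = handles.emplace(name, std::move(db)).first;
    }
    return it->second.get();
}
```

With this shape, calling get_database() twice with equal names from different buffers returns the same handle, so a later flush() would act on the object that received the additions.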
On Sun, Jan 20, 2008 at 01:32:19PM -0500, Michael A. Lewis wrote:
> I am having a problem with flushing a database. I am adding N
> records to the DB (which amounts to 1 - 2000). At the end of the
> run, I issue a flush() call. The problem is that the flush call
> never seems to do anything. Every 10000 additions to the database,
> the library performs a flush (which can take up to 3 hours on a
> 560,000 document database) as if my flush call was never performed.
>
> 1) This seems entirely too long, is it?
Sounds high to me, but it depends on so many factors: number of terms,
size of document data, available memory, how much memory is used by
Xapian to hold the 10k documents before flushing, logical to physical
volume layout, file systems involved...
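On the 10k figure: to the best of my knowledge the automatic flush interval is a default, and the flint backend reads the XAPIAN_FLUSH_THRESHOLD environment variable when the WritableDatabase is opened. A rough sketch of setting it from inside the process follows. set_flush_threshold is a hypothetical helper, setenv is POSIX, and the call has to happen before the database object is constructed to have any effect.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: sets XAPIAN_FLUSH_THRESHOLD for this process.
// Assumption: the backend reads the variable when the writable
// database is opened, so call this before constructing it.
bool set_flush_threshold(unsigned documents) {
    const std::string value = std::to_string(documents);
    // setenv returns 0 on success; the final 1 overwrites any existing value.
    return setenv("XAPIAN_FLUSH_THRESHOLD", value.c_str(), 1) == 0;
}
```

A lower threshold means less is buffered in memory between flushes, at the cost of flushing more often; which direction helps here would need measuring.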
What are you seeing as the main activity during flush? If you're on a
Unix machine it'll probably be one of system, user or iowait.
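One way to get a rough answer from inside the program is to compare wall-clock time with CPU time around the flush: if the call takes a long time while the process uses little CPU, it is mostly waiting, which is consistent with iowait. This is only a sketch; CallTiming and time_call are hypothetical names, and the callable would wrap whatever performs the flush.

```cpp
#include <chrono>
#include <ctime>
#include <utility>

// Elapsed wall-clock time and process CPU time for one call, in seconds.
struct CallTiming {
    double wall_seconds;
    double cpu_seconds;
};

// Runs fn() once and reports how long it took in both clocks.
template <typename Fn>
CallTiming time_call(Fn&& fn) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    std::forward<Fn>(fn)();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    CallTiming t;
    t.wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    t.cpu_seconds =
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

For example, `time_call([&] { database->flush(); })` would also show whether the explicit flush returns almost immediately (suggesting nothing was pending on that handle) or does real work.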
> 2) Why would my flush be ignored (no transactions being used, just
> straight add using the term generator).
>
> This is my flush code:
>
> try {
>     Xapian::WritableDatabase* database = getDatabase();
>     database->flush();
> } catch (const Xapian::Error & err) {
>     s="ERROR:"+err.get_msg();
>     log_it(s.c_str());
>     write(c_id,"ERROR:-3",8);
> }
> return;
Assuming that getDatabase() implements the Singleton pattern
correctly, that you aren't clearing its instance, and that you aren't
using threading (or if you are you know what you're doing with
Singleton), this is odd.
I've had a quick look over the flint code, and I can't see how it
could not be working for you. If you compile with --enable-log and
then run with XAPIAN_DEBUG_LOG set to a file, and XAPIAN_DEBUG_FLAGS
set to -1, you'll get lots (!) of messages. In particular, you should
see an apply call from flint after your flush; if you don't, it's not
working for some reason.
Everything will be slower with debugging on, of course.
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org