[Xapian-discuss] Re: BUG IN XAPIAN_FLUSH_THRESHOLD
Mark Clarkson
mark.clarkson at smorg.co.uk
Wed Jul 18 09:35:48 BST 2007
Probably unnecessary to suggest, but if you are using bash, the variable must be exported for your program to see it:
$ export XAPIAN_FLUSH_THRESHOLD=2000000
$ your_program
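
To illustrate why the export matters, here is a minimal sketch, assuming a POSIX-style shell. The `sh -c` child process is only a hypothetical stand-in for your_program; it prints whatever value it finds in its own environment, or "unset" if there is none.

```shell
# Start from a clean state so an earlier export does not affect the demo.
unset XAPIAN_FLUSH_THRESHOLD

# Plain assignment: the variable exists only inside the current shell.
XAPIAN_FLUSH_THRESHOLD=2000000
not_exported=$(sh -c 'echo "${XAPIAN_FLUSH_THRESHOLD:-unset}"')

# After export, child processes inherit the variable.
export XAPIAN_FLUSH_THRESHOLD
exported=$(sh -c 'echo "${XAPIAN_FLUSH_THRESHOLD:-unset}"')

echo "without export: $not_exported"
echo "with export:    $exported"
```

The child only sees the value in the second case, so a program that reads the variable with getenv() would fall back to its default in the first. For a one-off run, `XAPIAN_FLUSH_THRESHOLD=2000000 your_program` on a single line also places the variable in that command's environment without exporting it for the rest of the session.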
On Tue, 2007-07-17 at 12:51 -0700, Kevin Duraj wrote:
> Okay XAPIANS I found the Bug!
>
> flint_database.cc, for whatever reason, is not picking up the
> environment variable XAPIAN_FLUSH_THRESHOLD, which makes indexing
> VERY SLOW because it defaults to flushing every 10000 documents. I was
> going crazy for the past month after we switched to FLINT, unable to
> figure out why indexing was so slow. So I hard-coded my own
> flush_threshold directly in flint_database.cc, and now indexing is as
> fast as before!
>
> PS: Sometimes you just got to hack it yourself ... welcome to open
> source ... *hahaha*
>
>
> -= MY HACK =-
> vi flint_database.cc
>
> size_t FlintWritableDatabase::flush_threshold = 20000000;
>
> FlintWritableDatabase::FlintWritableDatabase(const string &dir, int action,
>                                              int block_size)
>     : freq_deltas(),
>       doclens(),
>       mod_plists(),
>       database_ro(dir, action, block_size),
>       total_length(database_ro.postlist_table.get_total_length()),
>       lastdocid(database_ro.get_lastdocid()),
>       changes_made(0)
> {
>     DEBUGCALL(DB, void, "FlintWritableDatabase", dir << ", " << action << ", "
>               << block_size);
>     //if (flush_threshold == 0)
>     //{
>     //    const char *p = getenv("XAPIAN_FLUSH_THRESHOLD");
>     //    if (p) flush_threshold = atoi(p);
>     //}
>     //if (flush_threshold == 0) flush_threshold = 10000;
>     flush_threshold = 20000000;
> }
>
>
>
>
> On 7/17/07, Kevin Duraj <kevin.softdev at gmail.com> wrote:
> > There is a bug when setting XAPIAN_FLUSH_THRESHOLD=20000000
> >
> > When trying to force Xapian to flush after 20 million documents,
> > Xapian ignores the setting and flushes after only 10,000
> > documents.
> >
> > Output captured from delve at 60-second intervals with the following setting:
> > XAPIAN_FLUSH_THRESHOLD=20000000
> >
> > perl -e ' while(1) { system("delve ."); sleep(60); } '
> >
> > number of documents = 8510000
> > average document length = 13.5538
> > number of documents = 8520000
> > average document length = 13.5537
> > number of documents = 8530000
> > average document length = 13.5543
> > number of documents = 8530000
> > average document length = 13.5543
> > number of documents = 8540000
> > average document length = 13.5548
> > number of documents = 8550000
> > average document length = 13.5548
> > number of documents = 8550000
> > average document length = 13.5548
> > number of documents = 8560000
> > average document length = 13.5545
> > number of documents = 8570000
> > average document length = 13.5549
> > number of documents = 8570000
> > average document length = 13.5549
> > number of documents = 8580000
> > average document length = 13.5563
> > number of documents = 8590000
> > average document length = 13.5568
> >
> > PS: Please do not ask me to create smaller indexes and then merge them.
> > I am indexing 500 million documents; 20 million is my small index.
> >
> > --
> > Cheers,
> > Kevin Duraj
> >
>
>