[Xapian-discuss] How to speed up indexing ?
cel tix44
celtix44 at gmail.com
Mon Aug 25 00:21:36 BST 2008
There was a logic bug in my indexing function (below): I was
reusing the same document instance without clearing its terms, so each
document carried over the terms of every record indexed before it. After
adding doc.clear_terms() before indexer.index_text(), I see the
expected throughput of ~4000 docs/sec.
Thanks everyone for your time & advice.
Regards
Celto
/////////////////////////////////////////////
void XXIndexRecord(char* text)
/////////////////////////////////////////////
{
    /* Clear the terms left over from the previous record. doc is reused
       across calls, so without this each document accumulates the terms
       of all earlier records. */
    doc.clear_terms();

    indexer.index_text(text);   // index text

    // Add the document to the database
    xdb->add_document(doc);

    // Commit and start a new transaction about every 200000 documents
    xrn++;
    if (xrn > 200000) {
        //MessageBox(NULL, "committing transaction", "msg", 0);
        xrn = 0;
        xdb->commit_transaction();
        xdb->begin_transaction(true);
    }
}
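
For anyone who hits the same slowdown, here is a rough toy model of why the missing clear_terms() call hurts. It does not use Xapian at all: ToyDocument, toy_index_text() and total_terms_written() are hypothetical stand-ins made up for illustration, and the cost model (one write per term held by the document at add time) is an assumption, not a measurement of Xapian internals.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a document: just a set of unique terms.
struct ToyDocument {
    std::set<std::string> terms;
    void clear_terms() { terms.clear(); }
};

// Hypothetical stand-in for a term generator: split on whitespace
// and add each word to the document as a term.
void toy_index_text(ToyDocument& doc, const std::string& text) {
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        doc.terms.insert(word);
    }
}

// Assumed cost model: adding a document costs one write per term it
// holds at that moment. Returns the total over all records.
std::size_t total_terms_written(const std::vector<std::string>& records,
                                bool clear_first) {
    ToyDocument doc;  // one instance reused for every record
    std::size_t written = 0;
    for (const std::string& text : records) {
        if (clear_first) {
            doc.clear_terms();  // drop terms from the previous record
        }
        toy_index_text(doc, text);
        written += doc.terms.size();  // "add_document" cost
    }
    return written;
}
```

With clear_first set to false, each record drags along the terms of all earlier records, so the per-record cost keeps growing as indexing proceeds. With clear_first set to true, the cost per record depends only on that record's own text.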
On Sun, Aug 24, 2008 at 11:24 PM, Olly Betts <olly at survex.com> wrote:
> On Fri, Aug 22, 2008 at 07:16:57PM -0700, mark wrote:
>> I have the exact same problem in x86_64 fedora core 9 linux, 16GB
>> RAM, dual quad core, using python xappy library.
>
> There's a separate mailing list for xappy, which is a better place to
> bring up issues you have when using xappy unless you can also reproduce
> them directly with Xapian.
>
> Cheers,
> Olly