[Xapian-discuss] improve indexing performance

Graham Jones your-name-here at grahamjones.org
Thu Sep 6 22:41:15 BST 2012


Kevin, are you replying to me or Elod?
I'm a bit confused about your MySQL comment.

You do get that I said 10,000 records a second, not 10,000 total? We have over 200 billion records indexed.

On 07/09/2012, at 2:05, Kevin Duraj <kevinduraj at gmail.com> wrote:

> For such a small number of documents as 10,000 you can use any search engine, such as MySQL, Lucene or even MSSQL.
> 
> If you want to index 300 million documents as I do at
> http://myhealthcare.com, nothing else works, only Xapian.
> 
> PS: Hopefully one day you will become a professional search engine developer, and you will know it when you start counting your documents in millions, not thousands.
> 
> Kevin Duraj
> MyHealthcare.com
> 
> On Sep 6, 2012, at 3:51 AM, Graham Jones <your-name-here at grahamjones.org> wrote:
> 
>> 1) Index in RAM, i.e. put your files on a ramdisk.
>> 2) Index in parallel and merge with xapian-compact afterwards.
>> 3) Just use the Xapian API as documented; you don't need to do anything special.
>> It's good for over 10,000 documents a second with a modest number of parallel processes (say 10-20) on a typical enterprise server.
>> 
>> 
>> On 06/09/2012, at 7:30 PM, Előd Biszak <biszakelod at gmail.com> wrote:
>> 
>>> Hi!
>>> 
>>> I'm indexing a huge number of documents. I'm adding the documents one by
>>> one to the database. Is there a way of improving indexing performance? I'm
>>> interested in both programmatic and hardware-wise suggestions.
>>> 
>>> Thanks in advance,
>>> Biszak Előd
>>> _______________________________________________
>>> Xapian-discuss mailing list
>>> Xapian-discuss at lists.xapian.org
>>> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>> 
>> 
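
The three suggestions quoted above (a ramdisk, parallel indexing, and a final merge with xapian-compact) might be sketched roughly as follows in Python. This is only an illustration under assumptions, not the setup anyone in this thread actually runs: the helper names (partition, index_shard, compact_command, build_index), the shard directory layout and the default of 10 workers are all made up for the example, and it assumes the Xapian Python bindings are installed and that workdir points at a ramdisk (for instance somewhere under /dev/shm on Linux).

```python
import os
from multiprocessing import Pool


def partition(items, n_shards):
    """Split items round-robin into n_shards lists, one per worker."""
    return [items[i::n_shards] for i in range(n_shards)]


def index_shard(args):
    """Index one batch of text documents into its own Xapian database."""
    shard_path, batch = args
    import xapian  # imported here so only the workers need the bindings

    db = xapian.WritableDatabase(shard_path, xapian.DB_CREATE_OR_OPEN)
    indexer = xapian.TermGenerator()
    for text in batch:
        doc = xapian.Document()
        doc.set_data(text)
        indexer.set_document(doc)
        indexer.index_text(text)
        db.add_document(doc)
    db.commit()  # one commit per shard rather than one per document
    db.close()
    return shard_path


def compact_command(shard_paths, dest):
    """Build the xapian-compact call: source databases first, destination last."""
    return ["xapian-compact", *shard_paths, dest]


def build_index(docs, workdir, n_shards=10):
    """Index docs in parallel under workdir (hypothetical layout: shard00, shard01, ...)."""
    shard_paths = [os.path.join(workdir, "shard%02d" % i) for i in range(n_shards)]
    with Pool(n_shards) as pool:
        pool.map(index_shard, zip(shard_paths, partition(docs, n_shards)))
    return compact_command(shard_paths, os.path.join(workdir, "merged"))
```

build_index returns the merge command instead of running it, so it can be inspected first and then executed with subprocess.run(cmd, check=True). Each worker writes to its own database because a Xapian database only allows one writer at a time.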


