[Xapian-discuss] improve indexing performance

Kevin Duraj kevinduraj at gmail.com
Thu Sep 6 17:05:51 BST 2012


For such a small number of documents (10,000), you can use any search engine, such as MySQL, Lucene or even MSSQL.

If you want to index 300 million documents, as I do at
http://myhealthcare.com, nothing else works, only Xapian.

PS: Hopefully, one day you will become a professional search engine developer, and you will know it when you start counting your documents in millions, not thousands.

Kevin Duraj
MyHealthcare.com

On Sep 6, 2012, at 3:51 AM, Graham Jones <your-name-here at grahamjones.org> wrote:

> 1) Index in RAM, i.e. put your files on a ramdisk
> 2) Index in parallel and merge with xapian-compact afterwards
> 3) Just use the Xapian API as documented - you don't need to do anything special.
> It's good for over 10,000 documents a second with a modest number of parallel processes (say 10-20) on a typical enterprise server.
> 
> 
> On 06/09/2012, at 7:30 PM, Előd Biszak <biszakelod at gmail.com> wrote:
> 
>> Hi!
>> 
>> I'm indexing a huge number of documents. I'm adding the documents one by
>> one to the database. Is there a way of improving indexing performance? I'm
>> interested in both programming and hardware suggestions.
>> 
>> Thanks in advance,
>> Biszak Előd
>> _______________________________________________
>> Xapian-discuss mailing list
>> Xapian-discuss at lists.xapian.org
>> http://lists.xapian.org/mailman/listinfo/xapian-discuss
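
Graham's second suggestion (index in parallel, then merge with xapian-compact) could be sketched roughly as follows in Python. This is only an illustration, not code from the thread: the helper names partition, index_shard and compact_command are hypothetical, the English stemmer is an arbitrary choice, and index_shard assumes the Xapian Python bindings are installed. Each worker writes to its own database because a Xapian database allows only one writer at a time.

```python
from typing import List, Sequence


def partition(items: Sequence[str], workers: int) -> List[List[str]]:
    """Split items round-robin into `workers` roughly equal shards."""
    return [list(items[i::workers]) for i in range(workers)]


def index_shard(db_path: str, texts: Sequence[str]) -> None:
    """Index one shard into its own database (needs the xapian bindings)."""
    import xapian  # imported lazily so the other helpers work without it

    db = xapian.WritableDatabase(db_path, xapian.DB_CREATE_OR_OPEN)
    termgen = xapian.TermGenerator()
    termgen.set_stemmer(xapian.Stem("english"))  # assumption: English text
    for text in texts:
        doc = xapian.Document()
        doc.set_data(text)            # store the raw text for display
        termgen.set_document(doc)
        termgen.index_text(text)      # generate the index terms
        db.add_document(doc)
    db.commit()  # one explicit commit per shard, not one per document
    db.close()


def compact_command(shard_paths: Sequence[str], dest: str) -> List[str]:
    """Build the xapian-compact call: source databases first, destination last."""
    return ["xapian-compact", *shard_paths, dest]
```

One way to use it: run index_shard for each shard in a separate process (for example with multiprocessing.Pool), then pass the result of compact_command to subprocess.check_call to merge the shards into a single database. Xapian also commits automatically after a batch of documents (10,000 by default); setting the XAPIAN_FLUSH_THRESHOLD environment variable higher may help if memory allows.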