[Xapian-discuss] improve indexing performance
Graham Jones
your-name-here at grahamjones.org
Thu Sep 6 22:41:15 BST 2012
Kevin, are you replying to me or Elod?
I'm a bit confused about your MySQL comment.
You do get that I said 10,000 records a second; not 10,000 total? We have over 200 billion records indexed.
On 07/09/2012, at 2:05, Kevin Duraj <kevinduraj at gmail.com> wrote:
> For such a small number of documents as 10,000 you can use any search engine, such as MySQL, Lucene or even MSSQL.
>
> If you want to index 300 million documents as I do at
> http://myhealthcare.com nothing else works, only Xapian.
>
> PS: Hopefully, one day you will become a professional search engine developer, and you will know it when you start counting your documents in millions, not thousands.
>
> Kevin Duraj
> MyHealthcare.com
>
> On Sep 6, 2012, at 3:51 AM, Graham Jones <your-name-here at grahamjones.org> wrote:
>
>> 1) Index in RAM, i.e. put your files on a ramdisk
>> 2) Index in parallel and merge with xapian-compact afterwards
>> 3) Just use the Xapian API as documented - you don't need to do anything special.
>> It's good for over 10,000 documents a second with a modest number of parallel processes (say 10-20) on a typical enterprise server.
>>
>>
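
The first two suggestions above (index in RAM, index in parallel and merge with xapian-compact) could be sketched roughly as follows. This is only an illustration, assuming the Xapian Python bindings and the xapian-compact tool are installed. The helper names (partition, index_shard, compact_command, build_index), the shard directory layout and the idea of passing a ramdisk path as `base` are hypothetical choices for this sketch, not something taken from the thread.

```python
import os
import subprocess
from multiprocessing import Pool


def partition(items, n):
    """Split items into n round-robin slices, one per worker process."""
    return [items[i::n] for i in range(n)]


def index_shard(job):
    """Index one slice of texts into its own Xapian database (one shard)."""
    path, texts = job
    import xapian  # imported here so only the workers need the bindings

    db = xapian.WritableDatabase(path, xapian.DB_CREATE_OR_OPEN)
    termgen = xapian.TermGenerator()
    for text in texts:
        doc = xapian.Document()
        doc.set_data(text)
        termgen.set_document(doc)
        termgen.index_text(text)
        db.add_document(doc)
    db.commit()
    db.close()
    return path


def compact_command(shards, dest):
    """Build the xapian-compact call: source databases first, destination last."""
    return ["xapian-compact", *shards, dest]


def build_index(texts, base, dest, workers=10):
    """Index texts in parallel under base, then merge the shards into dest.

    base is assumed to be a scratch directory, for example on a ramdisk.
    """
    os.makedirs(base, exist_ok=True)
    shards = [os.path.join(base, "shard%d" % i) for i in range(workers)]
    jobs = list(zip(shards, partition(texts, workers)))
    with Pool(workers) as pool:
        pool.map(index_shard, jobs)
    subprocess.run(compact_command(shards, dest), check=True)
```

A call such as build_index(texts, "/dev/shm/xapian-shards", "/data/index", workers=10) would, under those assumptions, write ten shards to a RAM-backed directory and merge them into a single database on disk. On platforms where multiprocessing spawns rather than forks, the call belongs under an `if __name__ == "__main__":` guard.
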
>> On 06/09/2012, at 7:30 PM, Előd Biszak <biszakelod at gmail.com> wrote:
>>
>>> Hi!
>>>
>>> I'm indexing a huge number of documents. I'm adding the documents one by
>>> one to the database. Is there a way of improving indexing performance? I'm
>>> interested in both programmatic and hardware suggestions.
>>>
>>> Thanks in advance,
>>> Biszak Előd
>>> _______________________________________________
>>> Xapian-discuss mailing list
>>> Xapian-discuss at lists.xapian.org
>>> http://lists.xapian.org/mailman/listinfo/xapian-discuss