[Xapian-devel] Re: writabledatabase_delete_document()

Alexander Lind malte at webstay.org
Fri Dec 1 20:44:37 GMT 2006


Is the answer to my question here to split the data into multiple databases?
I have tried to read as much as I can find on this subject on xapian.org
and elsewhere, but can't really seem to find the answer to how it should
be done.
Technically I know how to do it, but not logically. How many databases
should I aim for, i.e. should I keep each one under a certain size, or
under a certain number of documents, or something else?
Should I distribute documents sequentially into them, or randomly, or
use some other scheme?
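
To make "some other scheme" concrete, here is a minimal sketch of one possible option: assigning each product to a database by a stable hash of its ID. Everything in it is an assumption for illustration only (the names shard_for and shard_path, the directory layout, and the choice of CRC32), and nothing here says whether splitting is actually the right answer.

```python
import zlib


def shard_for(product_id, num_shards):
    """Map a product ID to a shard index in [0, num_shards).

    CRC32 is used only because it is stable across runs and machines;
    any stable hash would do for this sketch.
    """
    checksum = zlib.crc32(str(product_id).encode("utf-8"))
    return checksum % num_shards


def shard_path(base_dir, product_id, num_shards):
    """Build the (hypothetical) directory path of the shard for a product."""
    return "%s/shard_%02d" % (base_dir, shard_for(product_id, num_shards))
```

With a scheme like this, an update or delete for a given product always goes to the same database, so no lookup table is needed. The catch is that changing the number of databases later moves most products to a different one.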

The data in the documents are products, and I add, delete and update
them continuously. There is no first-in-first-out rule, any product can
be updated or removed at any time, so putting them in the order they
arrive is of no particular use.
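
For reference, this is roughly what the term-based deletion mentioned in my earlier message (quoted below) would look like, as a minimal sketch only. It assumes a writable database object that exposes delete_document(term) and flush(), as the Xapian bindings do as far as I know. The "Q" prefix, the helper names and the batch size are assumptions for illustration, and I have not measured whether this is faster or whether flushing in batches helps with the freezing.

```python
def unique_term(product_id, prefix="Q"):
    """Build the unique ID term for a product (the "Q" prefix is an assumption)."""
    return prefix + str(product_id)


def delete_products(db, product_ids, batch_size=1000):
    """Delete products by unique term, flushing after every batch_size deletes.

    `db` is any object with delete_document(term) and flush() methods.
    Returns the number of delete calls that were issued.
    """
    pending = 0
    total = 0
    for product_id in product_ids:
        db.delete_document(unique_term(product_id))
        pending += 1
        total += 1
        if pending >= batch_size:
            db.flush()  # write this batch to disk before continuing
            pending = 0
    if pending:
        db.flush()  # write the final partial batch
    return total
```

The point of the unique term is that the SQL side would only need the product's own ID, rather than a stored Xapian document_id.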

Starting to feel a little lonely in my thread here :p

Alec

Alexander Lind wrote:
> I should add that I use Xapian's own document_id as the argument to
> writabledatabase_delete_document().
> The indexing part of my script saves the Xapian document_id in the
> SQL db.
>
> I just read a post somewhere on the net about how you can use term names
> to identify and delete items from a Xapian index instead of using a
> document_id. Is this faster? (It would seem strange if it were.)
>
> Alec
>
> Alexander Lind wrote:
>   
>> Hi guys
>>
>> I have implemented Xapian on a website, and it currently has about 2M
>> items in its index.
>>
>> It's all been working quite nicely so far, until I tried removing some
>> old items from the index (removing items when the index was smaller was
>> no problem at all).
>>
>> When I try to remove them now (using writabledatabase_delete_document()
>> via PHP), it half freezes the machine, and the Apache httpd runs
>> amok spawning more and more children, until I kill the PHP script that
>> is trying to remove documents from Xapian.
>>
>> From my layman's point of view, it seems that the Xapian delete document
>> function freezes up the OS at the filesystem level. Is this a correct
>> assessment?
>>
>> Any ideas on how I can get Xapian _not_ to freeze up the system like
>> this when deleting documents?
>>
>> I'm using PHP 5.2.0 and Xapian 0.9.9.
>>
>> This is a 'ls -lha' of the xapian index dir:
>> total 1.3G
>> drwxr-xr-x    2 malte    users         408 Dec  1 17:22 .
>> drwxr-xr-x   16 malte    users         400 Nov 15 20:21 ..
>> -rw-------    1 malte    users           0 Dec  1 17:22 db_lock
>> -rw-r--r--    1 malte    users          10 Nov 24 00:39 meta
>> -rw-r--r--    1 malte    users        468M Dec  1 17:22 position_DB
>> -rw-r--r--    1 malte    users        7.4K Dec  1 16:46 position_baseB
>> -rw-r--r--    1 malte    users        324M Dec  1 17:23 postlist_DB
>> -rw-r--r--    1 malte    users        5.1K Dec  1 16:46 postlist_baseB
>> -rw-r--r--    1 malte    users         61M Dec  1 17:22 record_DB
>> -rw-r--r--    1 malte    users         956 Dec  1 16:46 record_baseB
>> -rw-r--r--    1 malte    users        339M Dec  1 17:22 termlist_DB
>> -rw-r--r--    1 malte    users        5.4K Dec  1 16:46 termlist_baseB
>> -rw-r--r--    1 malte    users         99M Dec  1 17:22 value_DB
>> -rw-r--r--    1 malte    users        1.6K Dec  1 16:46 value_baseB
>>
>>
>> Thanks for your help.
>> Alec