[Xapian-devel] Re: writabledatabase_delete_document()
Alexander Lind
malte at webstay.org
Fri Dec 1 20:44:37 GMT 2006
Is the answer to my question here to split the data into multiple databases?
I have tried to read as much as I can find on this subject on xapian.org
and elsewhere, but can't really seem to find the answer to how it should
be done.
Technically I know how to do it, but not logically. How many databases
should I aim for, i.e. should I aim to keep each one under a certain
size, or under a certain number of documents, or something else?
Should I distribute documents sequentially into them, or randomly, or
use some other scheme?
The documents are products, and I add, delete and update them
continuously. There is no first-in-first-out rule: any product can be
updated or removed at any time, so putting them in the order they
arrive is of no particular use.
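
To make the "some other scheme" question concrete: one option would be to
pick the database from a stable hash of the product ID, so that a given
product always lands in the same database and can be found again when it
is updated or deleted. This is only a sketch in Python to show the idea.
The names shard_for, shard_path and NUM_SHARDS, the "index" base path and
the shard count of 4 are all made up for illustration, and nothing here
is Xapian API or a recommendation on how many databases to use.

```python
import zlib

NUM_SHARDS = 4  # arbitrary example value, not a recommendation


def shard_for(product_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Return the index of the database this product would be stored in."""
    # crc32 gives the same value on every run and every machine, which the
    # built-in hash() does not guarantee for strings.
    return zlib.crc32(product_id.encode("utf-8")) % num_shards


def shard_path(product_id: str, base: str = "index") -> str:
    """Return a hypothetical directory name for that product's database."""
    return f"{base}/shard_{shard_for(product_id)}"


if __name__ == "__main__":
    for pid in ("sku-1001", "sku-1002", "sku-1003"):
        print(pid, "->", shard_path(pid))
```

The point of hashing rather than assigning sequentially is that the same
ID maps to the same database every time, so no separate lookup is needed
to locate a product later. The catch is that changing the number of
databases moves most products to a different one.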
Starting to feel a little lonely in my thread here :p
Alec
Alexander Lind wrote:
> I should add that I use Xapian's own document_id as the argument to
> writabledatabase_delete_document().
> The indexing part of my script saves the Xapian document_id in the
> SQL db.
>
> I just read a post somewhere on the net about how you can use term names
> to identify and delete items from a Xapian index instead of using a
> document_id. Is this faster? (It would seem strange if it was.)
>
> Alec
>
> Alexander Lind wrote:
>
>> Hi guys
>>
>> I have implemented Xapian on a website, and it currently has about 2M
>> items in its index.
>>
>> It's all been working quite nicely so far, until I tried removing some
>> old items from the index (removing items when the index was smaller was
>> no problem at all).
>>
>> When I try to remove them now (using writabledatabase_delete_document()
>> via PHP), it half freezes the machine, and the Apache httpd runs
>> amok spawning more and more children, until I interrupt the PHP script
>> that is trying to remove documents from Xapian.
>>
>> From my layman's point of view, it seems that the Xapian delete document
>> function freezes up the OS at the filesystem level. Is this a correct
>> assessment?
>>
>> Any ideas on how I can get Xapian _not_ to freeze up the system like
>> this when deleting documents?
>>
>> I'm using PHP 5.2.0 and Xapian 0.9.9.
>>
>> This is a 'ls -lha' of the xapian index dir:
>> total 1.3G
>> drwxr-xr-x 2 malte users 408 Dec 1 17:22 .
>> drwxr-xr-x 16 malte users 400 Nov 15 20:21 ..
>> -rw------- 1 malte users 0 Dec 1 17:22 db_lock
>> -rw-r--r-- 1 malte users 10 Nov 24 00:39 meta
>> -rw-r--r-- 1 malte users 468M Dec 1 17:22 position_DB
>> -rw-r--r-- 1 malte users 7.4K Dec 1 16:46 position_baseB
>> -rw-r--r-- 1 malte users 324M Dec 1 17:23 postlist_DB
>> -rw-r--r-- 1 malte users 5.1K Dec 1 16:46 postlist_baseB
>> -rw-r--r-- 1 malte users 61M Dec 1 17:22 record_DB
>> -rw-r--r-- 1 malte users 956 Dec 1 16:46 record_baseB
>> -rw-r--r-- 1 malte users 339M Dec 1 17:22 termlist_DB
>> -rw-r--r-- 1 malte users 5.4K Dec 1 16:46 termlist_baseB
>> -rw-r--r-- 1 malte users 99M Dec 1 17:22 value_DB
>> -rw-r--r-- 1 malte users 1.6K Dec 1 16:46 value_baseB
>>
>>
>> Thanks for your help.
>> Alec