[Xapian-discuss] Re: Re: get_docid over multi-database search

Andrey alpha04 at netvigator.com
Fri Dec 14 19:18:12 GMT 2007


Kevin

Unfortunately, I haven't had a chance to compare the two, since I split the 
db up from the beginning, in my Xapian writer.

Searching my 40M docs across 2 dbs (one of them flushing every 30 mins) is 
still very fast, at under 1 sec, even though my queries are very complicated, 
with lots of (a b c d)OR(f g h i g)AND_MAYBE_AND 4*(x x ). I don't think much 
performance is lost by breaking up the db.

I will try to combine them, print out the results for both scenarios, and 
post them here when I am able to.
But I personally think breaking the data into multiple dbs has more gain 
than loss:
- easier to handle / back up
- in case one db gets corrupted, you still have something to serve
- base db (non-flushing) vs. updating db (which flushes): the cache (warm-up) 
  of the base db stays when flushing the 2nd db. (I am not sure about this, 
  just a guess :P)

From my own experience, breaking up into several dbs will not cause a big 
performance loss (like going from 1 sec to 2 secs); once the caches are 
warm, it works just like querying 1 db.
Maybe you can try making another copy of your db and searching over both 
together. It's very easy, with just 1 extra line:
db.add_database(xapian.Database("db"))
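
For the delete case from earlier in this thread, here is a minimal sketch in 
plain Python (no xapian import needed) of how a docid returned by a combined 
search could be mapped back to a sub-database and its local docid. It assumes 
Xapian interleaves docids across the databases in the order they were added, 
with a zero-based database index; the function names are made up for 
illustration and are not part of the Xapian API.

```python
def split_merged_docid(did_merged, number_of_databases):
    """Map a docid from a combined search to (db_index, did_raw).

    db_index is zero-based, in the order the databases were added.
    did_raw is the docid inside that single database.
    Assumes docids are interleaved across the sub-databases.
    """
    if did_merged < 1 or number_of_databases < 1:
        raise ValueError("docids and database counts start at 1")
    db_index = (did_merged - 1) % number_of_databases
    did_raw = (did_merged - 1) // number_of_databases + 1
    return db_index, did_raw


def merge_docid(db_index, did_raw, number_of_databases):
    """Inverse mapping: rebuild the combined docid (hypothetical helper)."""
    return (did_raw - 1) * number_of_databases + db_index + 1


if __name__ == "__main__":
    # With 2 databases, combined docids 1..4 alternate between db 0 and db 1.
    for did in range(1, 5):
        print(did, split_merged_docid(did, 2))
```

With db_index in hand, you would open only that one database as a 
WritableDatabase and call delete_document(did_raw) on it.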

Andrey


"Kevin Duraj" <kevin.softdev at gmail.com> wrote in message 
news:562be3af0712132340nb216e26re53fc70f4276bfb0 at mail.gmail.com...
> Andrey,
>
> Did you measure the performance loss by searching two databases
> instead of one database?
> And if so, how much slower is it to search two databases compared to one 
> database?
>
>
> _________________________________
>  Kevin Duraj
>  http://UncensoredWebSearch.com
>
>
> On Nov 21, 2007 4:43 PM, Andrey <alpha04 at netvigator.com> wrote:
>> Very Nice, thanks
>>
>> did_raw = (did_merged - 1) / number_of_databases + 1
>>
>> offset = (did_merged - 1) % number_of_databases
>>
>> Cheers
>> Andrey
>>
>>
>>
>>
>> "Olly Betts" <olly at survex.com> wrote in message
>> news:20071122001120.GJ3839 at survex.com...
>>
>> > On Wed, Nov 21, 2007 at 01:44:11PM -0800, Andrey wrote:
>> >> say I have 2 databases, DB1 and DB2
>> >>
>> >> after I performed a search over these 2 DBs, I have 1 result and I
>> >> want to delete this resulting doc. How do I identify which database
>> >> (DB1 / DB2) this document resides in, and how do I get its docid,
>> >> which is needed during the delete process?
>> >
>> >
>> > http://article.gmane.org/gmane.comp.search.xapian.general/1375
>> >
>> >> (the delete process must go through a single WritableDatabase, right?)
>> >
>> > Yes, WritableDatabase doesn't currently allow multiple subdatabases.
>> >
>> > Cheers,
>> >    Olly
>> >
>>
>>
>>
>>
>> _______________________________________________
>> Xapian-discuss mailing list
>> Xapian-discuss at lists.xapian.org
>> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>>
>
>
>
> -- 
> Cheers
> __________________________________
>  Kevin Duraj
>  http://UncensoredWebSearch.com 





