[Xapian-discuss] Two questions

roki roki rokiroki at gmx.net
Tue May 3 23:19:45 BST 2005


> Do you mean "reduced by 60%" (rather than "reduced to 60%")?

Reduced by 65%: the database went from ~9GB to 3.8GB.

I want to mention that I have compiled Xapian with BM25Weight() : k1(10000),
k2(0), k3(0), b(0), min_normlen(0.5) because I need the documents with the
highest term frequencies to come out at the top.
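
To spell out why these settings favour raw frequency, here is a rough worked
calculation. It assumes the textbook BM25 form; Xapian's own implementation
may differ in detail, and the symbols f, q, L and wqf (within-query
frequency) are only notation for this sketch. With b = 0 the document length
L drops out, with k3 = 0 the query factor is 1, and with a very large k1 the
per-term factor is close to linear in the wdf instead of saturating:

```latex
% Textbook BM25 per-term frequency factor (assumed form, not taken from the Xapian source)
f(\mathrm{wdf}) = \frac{(k_1 + 1)\,\mathrm{wdf}}{k_1\bigl((1 - b) + b\,L\bigr) + \mathrm{wdf}}

% With b = 0 the normalised document length L no longer matters:
f(\mathrm{wdf}) = \frac{(k_1 + 1)\,\mathrm{wdf}}{k_1 + \mathrm{wdf}}

% k_1 = 10000: nearly linear in wdf
f(1) = \frac{10001}{10001} = 1, \qquad
f(500) = \frac{10001 \cdot 500}{10500} \approx 476.2

% For comparison, k_1 = 1 saturates quickly:
f(1) = \frac{2}{2} = 1, \qquad
f(500) = \frac{2 \cdot 500}{501} \approx 2.0

% Query-side factor with k_3 = 0 and wqf \ge 1:
q(\mathrm{wqf}) = \frac{(k_3 + 1)\,\mathrm{wqf}}{k_3 + \mathrm{wqf}} = 1
```

So under these assumptions a term added with a frequency of 500 counts
roughly 476 times as much as the same term added with a frequency of 1.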

I also use only add_term, with large frequencies (based on "pagerank" and
HTML formatting), and for adding/replacing documents I use the following
procedure:

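# $code = unique id from my mysql database, reused as the Xapian document id

$doc = Search::Xapian::Document->new();

# store the id as the document data
$doc->set_data("$code");

# add a term with a within-document frequency of 500
$doc->add_term("blablabla", 500);

# add the document, or replace the existing one with the same id
$database->replace_document($code, $doc);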


I can create a Xapian database with a few hundred documents and send it to
you if you want.

Thanks!
Roki

> "By 60%" would be quite an extreme reduction (even by 40% is rather more
> than I've generally seen), but lots of replacement can leave blocks less
> full.  If you can provide a program and datafiles I can use to reproduce
> this, I'll take a look and see if this can be improved.
> 
> Cheers,
>     Olly




