Hi, all:<div><br></div><div>I have written a demo patch that adds a backend for Lucene-format indexes. The Lucene version is 3.6.2.</div><div><a href="http://lucene.apache.org/core/3_6_2/fileformats.html">http://lucene.apache.org/core/3_6_2/fileformats.html</a></div>
<div><br></div><div>For now, this demo patch supports only the basic features of Lucene. Compound files (.cfs/.cfe), term vectors (.tvx/.tvd/.tvf), and deleted documents (.del) are not supported, and the skip list in .fdx is not supported either.</div>
<div><br></div><div>example/quest.cc is used to test this demo. Queries look like this: field_name:term, or field_name:term1 AND field_name:term2</div><div><br></div><div>So far, I have found that some data Xapian needs for BM25 does not exist in Lucene:</div>
<div>1. doclength_lower_bound, doclength_upper_bound</div><div>2. wdf_lower_bound, wdf_upper_bound</div><div>3. total_length</div><div>4. doclength (for each document)</div><div>Items 1-3 are statistics; they can be calculated when running copydatabase and stored somewhere. But doclength is hard to handle this way.</div>
<div><br></div><div>1. Could some other data be used instead of doclength?</div><div>2. Does Xapian support another ranking algorithm that does not need doclength?</div><div>Are there any suggestions for solving this problem?</div>
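<div><br></div><div>To make the statistics in items 1-3 concrete, here is a minimal sketch of how they might be accumulated in one pass over the postings. It is not code from the patch: Posting, CollectionStats and compute_stats are hypothetical names, the postings are assumed to be already decoded into memory as (docid, wdf) pairs per term, and a document's length is taken as the sum of the wdf values of its terms. The per-document lengths appear only as an intermediate array; where to store them persistently is the open question above.</div>

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical in-memory posting: one (document, term) occurrence count.
struct Posting {
    std::uint32_t docid;  // 0-based document number (assumption)
    std::uint32_t wdf;    // within-document frequency of the term
};

// Hypothetical holder for the statistics listed in items 1-3.
struct CollectionStats {
    std::uint32_t doclength_lower_bound = 0;
    std::uint32_t doclength_upper_bound = 0;
    std::uint32_t wdf_lower_bound = 0;
    std::uint32_t wdf_upper_bound = 0;
    std::uint64_t total_length = 0;
    std::vector<std::uint32_t> doclengths;  // intermediate, one per document
};

// postings_by_term[i] holds all postings of the i-th term.
CollectionStats compute_stats(
    const std::vector<std::vector<Posting>>& postings_by_term,
    std::size_t num_docs) {
    CollectionStats s;
    s.doclengths.assign(num_docs, 0);

    std::uint32_t wdf_min = std::numeric_limits<std::uint32_t>::max();
    bool saw_posting = false;

    for (const auto& postings : postings_by_term) {
        for (const Posting& p : postings) {
            // at() throws std::out_of_range on a bad docid.
            s.doclengths.at(p.docid) += p.wdf;
            wdf_min = std::min(wdf_min, p.wdf);
            s.wdf_upper_bound = std::max(s.wdf_upper_bound, p.wdf);
            saw_posting = true;
        }
    }
    if (saw_posting) s.wdf_lower_bound = wdf_min;

    if (num_docs > 0) {
        // Bounds are taken over all documents here; whether empty
        // documents should be excluded is left open in this sketch.
        auto mm = std::minmax_element(s.doclengths.begin(), s.doclengths.end());
        s.doclength_lower_bound = *mm.first;
        s.doclength_upper_bound = *mm.second;
    }
    for (std::uint32_t len : s.doclengths) s.total_length += len;
    return s;
}
```

<div>The sketch keeps one counter per document in memory, so it only shows that the aggregate values fall out of a single scan; it does not address how a per-document doclength would be looked up at query time.</div>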
<div><br></div><div>And the demo patch is here:</div><div><a href="https://github.com/white127/xapian-patch/blob/master/xapian_lucene_demo.patch">https://github.com/white127/xapian-patch/blob/master/xapian_lucene_demo.patch</a></div>
<div><br></div><div>Regards</div><div>Jiang</div>