<div>Additionally, to make this demo runnable, I set fixed default values for the data that does not exist in Lucene.<div>The demo is not fully tested.</div><br><div class="gmail_quote">2013/6/16 jiangwen jiang <span dir="ltr"><<a href="mailto:jiangwen127@gmail.com" target="_blank">jiangwen127@gmail.com</a>></span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi, all:<div><br></div><div>I have written a demo patch that adds a backend for Lucene-format indexes. The Lucene version is 3.6.2.</div>
<div><a href="http://lucene.apache.org/core/3_6_2/fileformats.html" target="_blank">http://lucene.apache.org/core/3_6_2/fileformats.html</a></div>
<div><br></div><div>For now, this demo patch supports only the basic features of Lucene. Compound files (.cfs/.cfe), term vectors (.tvx/.tvd/.tvf), and</div><div>deleted documents (.del) are not supported, and the skip list in .fdx is not supported either.</div>
<div><br></div><div>example/quest.cc is used to test this demo. Queries look like this: field_name:term, or field_name:term1 AND field_name:term2</div><div><br></div><div>So far, I have found that some data Xapian needs for BM25 does not exist in Lucene:</div>
<div>1. doclength_lower_bound, doclength_upper_bound</div><div>2. wdf_lower_bound, wdf_upper_bound</div><div>3. total_length</div><div>4. doclength (for each document)</div>
<div>Items 1-3 are statistics that can be calculated while running copydatabase and then stored somewhere. But doclength is hard to handle this way.</div><div><br></div><div>1. Could some other data be used instead of doclength?</div><div>2. Does Xapian support another ranking algorithm that does not need doclength? </div><div>Are there any suggestions for solving this problem?</div>
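<div><br></div><div>To illustrate how the statistics in items 1-3 might be gathered in a single pass during a copy, here is a rough Python sketch. It is not part of the patch. It assumes the postings are available as (term, docid, wdf) tuples, that a document's length is the sum of the wdf values of its terms, and that the wdf bounds are taken over all postings. The function name collect_stats and the sample data are hypothetical.</div>

```python
from collections import defaultdict


def collect_stats(postings):
    """Gather collection statistics from (term, docid, wdf) tuples.

    Assumptions (not taken from the patch): a document's length is the
    sum of the wdf values of its terms, and the wdf bounds are global,
    i.e. computed over every posting in the index.
    """
    doc_lengths = defaultdict(int)  # docid -> accumulated length
    wdf_lo = None
    wdf_hi = None
    for _term, docid, wdf in postings:
        doc_lengths[docid] += wdf
        wdf_lo = wdf if wdf_lo is None else min(wdf_lo, wdf)
        wdf_hi = wdf if wdf_hi is None else max(wdf_hi, wdf)
    if wdf_lo is None:
        raise ValueError("no postings")
    lengths = doc_lengths.values()
    return {
        "doclength_lower_bound": min(lengths),
        "doclength_upper_bound": max(lengths),
        "wdf_lower_bound": wdf_lo,
        "wdf_upper_bound": wdf_hi,
        "total_length": sum(lengths),
    }


# Hypothetical postings: two documents, three distinct terms.
example = [
    ("lucene", 1, 2),
    ("xapian", 1, 1),
    ("lucene", 2, 4),
    ("index", 2, 3),
]
stats = collect_stats(example)
```

<div>The per-document lengths appear here only as an in-memory intermediate. Where to store them so they can be looked up per document is still the open question.</div>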
<div><br></div><div>And the demo patch is here:</div><div><a href="https://github.com/white127/xapian-patch/blob/master/xapian_lucene_demo.patch" target="_blank">https://github.com/white127/xapian-patch/blob/master/xapian_lucene_demo.patch</a></div>
<div><br></div><div>Regards</div><span><font color="#888888"><div>Jiang</div>
</font></span></blockquote></div><br></div>