Hi all,<div>I wrote a program that implements variable-length encoding and fixed-length encoding of posting lists, and compares their index size and the speed of looking up a doc length.</div><div>You can see the comparison results in the attached snapshot.</div>
<div><br></div><div>1. The posting list is held entirely in memory.</div><div>2. The search strategy for fixed-length encoding is skipping with exponentially growing steps (1, 2, 4, 8, ...). Once a skip exceeds the desired doc id, the search goes back to the previous position and advances with step 1.</div>
<div>3. The fixed-length encoding uses 4 bytes per value. This is not optimal and can be further optimized in PFD. </div><div><div>4. The program generates uniformly random doc id gaps and doc lengths to build the posting list.</div>
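<div><br></div><div>The lookup described in point 2 can be sketched roughly as follows. This is a minimal Python sketch, not the code from the repository: it assumes each posting is a (doc id, doc length) pair stored as two 4-byte unsigned integers with doc ids in increasing order, and the names pack_postings, doc_id_at and find_doc_len are hypothetical.</div>
<pre>
```python
import struct

# Assumption: one posting = doc id + doc length, each a 4-byte unsigned int.
ENTRY = struct.Struct("!II")


def pack_postings(postings):
    """Pack (doc_id, doc_len) pairs, sorted by doc_id, into fixed-width bytes."""
    buf = bytearray()
    for doc_id, doc_len in postings:
        buf += ENTRY.pack(doc_id, doc_len)
    return bytes(buf)


def doc_id_at(buf, i):
    """Read the doc id of the i-th posting without decoding its neighbours."""
    return ENTRY.unpack_from(buf, i * ENTRY.size)[0]


def find_doc_len(buf, target):
    """Return the doc length stored for target, or None if it is absent."""
    n = len(buf) // ENTRY.size
    pos, step = 0, 1
    # Skip ahead with steps 1, 2, 4, 8, ... while the probed doc id
    # does not exceed the target; pos stays at the last safe position.
    while n > pos + step and target >= doc_id_at(buf, pos + step):
        pos += step
        step *= 2
    # The last probe overshot (or ran off the end): scan with step 1.
    for i in range(pos, n):
        doc_id, doc_len = ENTRY.unpack_from(buf, i * ENTRY.size)
        if doc_id == target:
            return doc_len
        if doc_id > target:
            break
    return None
```
</pre>
<div>Fixed-width entries are what make the skips cheap here: the i-th posting sits at byte offset i * 8, so a probe is a single read. With a variable-length encoding the offset of the i-th posting is not known without decoding the entries before it.</div>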
<div><br></div><div><b>The code is available on my GitHub: <a href="https://github.com/zwxxx/pfd_simple_test">https://github.com/zwxxx/pfd_simple_test</a></b></div>-- <br>Weixian Zhou<br>Department of Computer Science and Engineering<br>
University at Buffalo, SUNY<br><br>
</div>