<div>I am reading the Xapian source code while implementing DPH, and I have come across some questions. Could anyone help? Thanks.</div><div><br></div><div>Questions about the "Weighting Schema" source code.</div><div>The following code is from "\xapian-core-1.2.4\include\xapian\weight.h":</div>
<div><i><br></i></div><div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
/// A lower bound on the minimum length of any document in the database.</blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
Xapian::termcount doclength_lower_bound_;</blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
<br></blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
/// An upper bound on the maximum length of any document in the database.</blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
Xapian::termcount doclength_upper_bound_;</blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
<br></blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
/// An upper bound on the wdf of this term.</blockquote><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; font-style: italic; ">
Xapian::termcount wdf_upper_bound_;</blockquote><div style="font-style: italic; "><br></div><div style="font-style: italic; ">.........................</div><div style="font-style: italic; ">..........................</div>
<div style="font-style: italic; "> </div><div style="font-style: italic; "><div> /** Allow the subclass to perform any initialisation it needs to.</div><div> *</div><div> * @param factor<span class="Apple-tab-span" style="white-space:pre">        </span> Any scaling factor (e.g. from OP_SCALE_WEIGHT).</div>
<div> */</div><div> virtual void init(double factor) = 0;</div><div> </div><div> /** Calculate the weight contribution for this object's term to a document.</div><div> *</div><div> * The parameters give information about the document which may be used</div>
<div> * in the calculations:</div><div> *</div><div> * @param wdf The within document frequency of the term in the document.</div><div> * @param doclen The document's length (unnormalised).</div>
<div> */</div><div> virtual Xapian::weight get_sumpart(Xapian::termcount wdf,</div><div><span class="Apple-tab-span" style="white-space:pre">                                </span> Xapian::termcount doclen) const = 0;</div><div><br></div>
<div> /** Return an upper bound on what get_sumpart() can return for any document.</div><div> *</div><div> * This information is used by the matcher to perform various</div><div> * optimisations, so strive to make the bound as tight as possible.</div>
<div> */</div><div> virtual Xapian::weight get_maxpart() const = 0;</div><div><br></div><div> /** Calculate the term-independent weight component for a document.</div><div> *</div><div> * The parameter gives information about the document which may be used</div>
<div> * in the calculations:</div><div> *</div><div> * @param doclen The document's length (unnormalised).</div><div> */</div><div> virtual Xapian::weight get_sumextra(Xapian::termcount doclen) const = 0;</div>
<div><br></div><div> /** Return an upper bound on what get_sumextra() can return for any</div><div> * document.</div><div> *</div><div> * This information is used by the matcher to perform various</div><div>
* optimisations, so strive to make the bound as tight as possible.</div><div> */</div><i><div style="display: inline !important; "><div style="display: inline !important; "> virtual Xapian::weight get_maxextra() const = 0;</div>
</div></i> </div><div style="font-style: italic; "><br></div><div style="font-style: italic; "><br></div><div><b>Q1:</b> What is the purpose of "<div style="display: inline !important; "><div style="display: inline !important; ">
<div style="display: inline !important; ">virtual Xapian::weight get_maxpart() const = 0;</div></div></div><div style="display: inline !important; ">" and "</div><div style="display: inline !important; "><div style="display: inline !important; ">
<div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; "> virtual Xapian::weight get_maxextra() const = 0;</div></div> </div></div></div><div style="display: inline !important; ">
<div style="display: inline !important; ">"? When are these methods called?</div></div></div><div><div style="display: inline !important; "><div style="display: inline !important; "><br></div></div></div><div><div style="display: inline !important; ">
<div style="display: inline !important; "><b>Q2:</b> In Xapian, BM25Weight is the default weighting scheme. I would like to know when, where, and how </div></div><div style="display: inline !important; "><div style="display: inline !important; ">
<div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; ">BM25Weight is used in Xapian's source code. This question may involve a lot of code. I think that weighting happens after the query terms are submitted, during the match, for example in "multimatch.cc </div>
</div></div></div></div><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; ">
<div style="display: inline !important; "><div style="display: inline !important; ">void </div></div></div></div></div></div></div><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; ">
<div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; ">
<div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; "><div style="display: inline !important; ">MultiMatch::get_mset(...)", but this method is quite complex and I am not sure I follow it. </div>
</div></div></div></div></div></div></div></div></div></div></div></div></div><div style="font-style: italic; "><br></div></div><i><div><i><br></i></div><div><i><br></i></div>Wenjin Wu</i><div><br></div><br>
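<div>To make Q1 concrete, here is a minimal sketch of one way a matcher could use an upper bound such as get_maxpart() to skip work. It assumes a two-term query and a search for the single best document. ToyWeight, Doc, Result and find_best are hypothetical names invented for this illustration; they are not Xapian classes, the formula is not BM25 or DPH, and this is not the logic in MultiMatch::get_mset().</div>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-term weighting object (NOT Xapian::Weight).
struct ToyWeight {
    double idf;                // assumed per-term constant
    unsigned wdf_upper_bound;  // upper bound on the wdf of this term

    // Weight contribution of this term to one document.
    // The toy formula increases with wdf and ignores doclen.
    double get_sumpart(unsigned wdf, unsigned /*doclen*/) const {
        return idf * wdf / (wdf + 1.0);
    }

    // Upper bound on get_sumpart() for any document. The formula is
    // monotonic in wdf, so evaluating it at wdf_upper_bound is a tight bound.
    double get_maxpart() const {
        return idf * wdf_upper_bound / (wdf_upper_bound + 1.0);
    }
};

// Hypothetical document record: wdf of terms A and B, plus document length.
struct Doc {
    unsigned wdf_a;
    unsigned wdf_b;
    unsigned doclen;
};

struct Result {
    std::size_t best_index;     // index of the highest-weighted document
    double best_weight;         // its total weight
    std::size_t b_evaluations;  // how often term B was actually weighted
};

// Find the best document for the query "A B". Term B is only weighted when
// the document could still beat the current best, judged by B's upper bound.
Result find_best(const std::vector<Doc>& docs,
                 const ToyWeight& a, const ToyWeight& b) {
    Result r{0, -1.0, 0};
    const double max_b = b.get_maxpart();
    for (std::size_t i = 0; i < docs.size(); ++i) {
        double w = a.get_sumpart(docs[i].wdf_a, docs[i].doclen);
        if (w + max_b <= r.best_weight) {
            continue;  // even the maximum from B cannot beat the current best
        }
        const double part_b = b.get_sumpart(docs[i].wdf_b, docs[i].doclen);
        assert(part_b <= max_b);  // holds while wdf_b <= b.wdf_upper_bound
        ++r.b_evaluations;
        w += part_b;
        if (w > r.best_weight) {
            r.best_weight = w;
            r.best_index = i;
        }
    }
    return r;
}
```

<div>In this sketch, a tighter bound from get_maxpart() lets more documents be skipped, which fits the header comment's advice to "strive to make the bound as tight as possible". A bound that is too low would be incorrect, because a document that should win could be skipped.</div>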
<br><br><div class="gmail_quote">2011/3/29 wuwenjin <span dir="ltr">&lt;<a href="mailto:kevin.wu86@gmail.com">kevin.wu86@gmail.com</a>&gt;</span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div>Hi Olly,</div><div>I have submitted my proposal for "Weighting Schema". If you have time to read it, I would appreciate your suggestions. </div><a href="http://socghop.appspot.com/gsoc/proposal/review/google/gsoc2011/kevinking/1001#" target="_blank">http://socghop.appspot.com/gsoc/proposal/review/google/gsoc2011/kevinking/1001#</a><div>
<br></div><div>Regards<br clear="all"><div><i><br></i></div><i>Wenjin Wu</i>
</div>
</blockquote></div><br>