[Xapian-tickets] [Xapian] #280: Review storage of parameters in Query

Xapian nobody at xapian.org
Sun Jul 19 18:31:26 BST 2009


#280: Review storage of parameters in Query
-------------------------+--------------------------------------------------
 Reporter:  richard      |       Owner:  olly     
     Type:  enhancement  |      Status:  assigned 
 Priority:  low          |   Milestone:  1.1.7    
Component:  Library API  |     Version:  SVN trunk
 Severity:  normal       |    Keywords:           
Blockedby:               |    Platform:  All      
 Blocking:               |  
-------------------------+--------------------------------------------------

Old description:

> Currently, Xapian::Query::Internal stores any "double" parameter value as
> a string using serialise_double().  There is a FIXME in the code for
> set_dbl_parameter() and get_dbl_parameter() (around line 976 of
> api/omqueryinternal.cc) saying: "FIXME: rework for 1.1.0".
>
> This hasn't been changed until now due to fear of breaking ABI
> compatibility.
>
> Instead, we should store double parameters as doubles in Query::Internal.
>
> While reorganising this, it might be worth making parameter storage a bit
> more general, and tidying it up.  We currently have the following
> parameters stored in Query::Internal:
>
>  - op: The operation to perform
>  - subqs: A list of subqueries
>  - parameter: A "termcount" - used by NEAR and PHRASE to be the window
> size, used by ELITE_SET to be the number of terms, and used by RANGE to
> be the value number to apply the range to.  For the last of these, a
> "termcount" type isn't really appropriate (though it is probably the same
> storage size as "valueno" at present, so it probably works correctly).
>  - tname: A string holding the term, for a leaf query.  The start of the
> range, for a range query.
>  - str_parameter: The end of the range for a range query.  The result of
> serialise_double() on the multiplier for OP_SCALE_WEIGHT queries.
>  - term_pos: The position of the term for leaf queries.
>  - wqf: The within query frequency, for leaf queries.
>  - external_source: The external source, for external source queries.
>
> Two approaches seem plausible to me - firstly, we could define a union
> with the possible parameter types, and store the parameters in a list of
> these unions.  Alternatively, we could subclass Query::Internal for each
> of the possible query types, and just store the appropriate parameters
> for each.
>
> The latter approach seems cleaner to me, and more likely to be flexible
> for future expansion of the available query operators, but I've not
> thought about this much yet.

New description:

 Currently, Xapian::Query::Internal stores any "double" parameter value as
 a string using serialise_double().  There is a FIXME in the code for
 set_dbl_parameter() and get_dbl_parameter() (around line 976 of
 api/omqueryinternal.cc) saying: "FIXME: rework for 1.1.0".

 This hasn't been changed until now due to fear of breaking ABI
 compatibility.

 Instead, we should store double parameters as doubles in Query::Internal.

 While reorganising this, it might be worth making parameter storage a bit
 more general, and tidying it up.  We currently have the following
 parameters stored in Query::Internal:

  - op: The operation to perform
  - subqs: A list of subqueries
  - parameter: A "termcount" - used by leaf queries to be the wqf, used by
 NEAR and PHRASE to be the window size, used by ELITE_SET to be the number
 of terms, and used by RANGE to be the value number to apply the range to.
 For the last of these, a "termcount" type isn't really appropriate (though
 it is probably the same storage size as "valueno" at present, so it
 probably works correctly).
  - tname: A string holding the term, for a leaf query.  The start of the
 range, for a range query.
  - str_parameter: The end of the range for a range query.  The result of
 serialise_double() on the multiplier for OP_SCALE_WEIGHT queries.
  - term_pos: The position of the term for leaf queries.
  - external_source: The external source, for external source queries.

 Two approaches seem plausible to me - firstly, we could define a union
 with the possible parameter types, and store the parameters in a list of
 these unions.  Alternatively, we could subclass Query::Internal for each
 of the possible query types, and just store the appropriate parameters for
 each.
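
 As a rough illustration of the first approach, the sketch below stores
 numeric parameters in a list of tagged unions.  Every name in it (Param,
 ParamType, QueryParams, get_real, get_count) is hypothetical and not taken
 from the Xapian source; strings are kept outside the union as an assumed
 simplification, since std::string is not a trivial type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: none of these names exist in Xapian.
namespace sketch {

// Records which member of the union is currently valid.
enum class ParamType { COUNT, REAL };

// One parameter slot: a tag plus a union of the plain numeric types.
struct Param {
    ParamType type;
    union {
        unsigned count;  // window size, number of terms, wqf or value number
        double real;     // e.g. the OP_SCALE_WEIGHT multiplier, held as a double
    };

    static Param from_count(unsigned c) {
        Param p;
        p.type = ParamType::COUNT;
        p.count = c;
        return p;
    }

    static Param from_real(double d) {
        Param p;
        p.type = ParamType::REAL;
        p.real = d;
        return p;
    }
};

// Checked accessors: reading the wrong union member is a bug, so assert.
inline unsigned get_count(const Param& p) {
    assert(p.type == ParamType::COUNT);
    return p.count;
}

inline double get_real(const Param& p) {
    assert(p.type == ParamType::REAL);
    return p.real;
}

// Strings (the term name, the range bounds) live in a separate list here
// to keep the union trivially copyable.
struct QueryParams {
    std::vector<Param> params;
    std::vector<std::string> strings;
};

}  // namespace sketch
```

 With this layout every reader of a parameter has to know which slot holds
 which value for a given operator, which is the main cost of this approach.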

 The latter approach seems cleaner to me, and more likely to be flexible
 for future expansion of the available query operators, but I've not
 thought about this much yet.
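
 A minimal sketch of the subclassing approach might look like the
 following.  The class names (QueryNode, TermNode, ScaleWeightNode,
 ValueRangeNode) and their methods are hypothetical and chosen only for
 illustration; they are not Xapian's actual internals.  Each subclass holds
 just the parameters its query type needs, with the scale factor kept as a
 double and the value number kept separate from any term count.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sketch: these classes are illustrative, not Xapian's internals.
namespace sketch {

// Common base class: one subclass per kind of query.
class QueryNode {
  public:
    virtual ~QueryNode() = default;
    // Human-readable form of the query, for debugging.
    virtual std::string describe() const = 0;
};

// Leaf query: the term, its within-query frequency and its position.
class TermNode : public QueryNode {
    std::string tname;
    unsigned wqf;
    unsigned term_pos;
  public:
    TermNode(std::string tname_, unsigned wqf_, unsigned term_pos_)
        : tname(std::move(tname_)), wqf(wqf_), term_pos(term_pos_) {}
    unsigned get_wqf() const { return wqf; }
    unsigned get_term_pos() const { return term_pos; }
    std::string describe() const override { return tname; }
};

// Scale-weight query: the multiplier is stored directly as a double.
class ScaleWeightNode : public QueryNode {
    std::unique_ptr<QueryNode> subq;
    double factor;
  public:
    ScaleWeightNode(std::unique_ptr<QueryNode> subq_, double factor_)
        : subq(std::move(subq_)), factor(factor_) {
        assert(subq);  // a scaled query needs a sub-query to scale
    }
    double get_factor() const { return factor; }
    std::string describe() const override {
        return "scaled(" + subq->describe() + ")";
    }
};

// Range query: the value number has its own member, as do both bounds.
class ValueRangeNode : public QueryNode {
    unsigned slot;
    std::string range_begin;
    std::string range_end;
  public:
    ValueRangeNode(unsigned slot_, std::string begin_, std::string end_)
        : slot(slot_), range_begin(std::move(begin_)),
          range_end(std::move(end_)) {}
    unsigned get_slot() const { return slot; }
    std::string describe() const override {
        return "range(" + range_begin + ".." + range_end + ")";
    }
};

}  // namespace sketch
```

 Adding a new operator under this scheme would mean adding a subclass
 rather than finding another use for an existing member.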

--

Comment(by olly):

 In trunk r13095, I've removed wqf - we now store the wqf in the parameter
 member.  Updated the description to match this new scheme.  This should
 save us 4 bytes per Query object.

-- 
Ticket URL: <http://trac.xapian.org/ticket/280#comment:7>
Xapian <http://xapian.org/>