[Xapian-tickets] [Xapian] #557: Allow subqueries to use separate weighting schemes
Xapian
nobody at xapian.org
Thu Jul 28 13:51:17 BST 2011
#557: Allow subqueries to use separate weighting schemes
-------------------------+--------------------------------------------------
 Reporter:  richard      |       Owner:  olly
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:
Component:  Library API  |     Version:
 Severity:  normal       |   Blockedby:
 Platform:  All          |    Blocking:
-------------------------+--------------------------------------------------
When using multiple sources of weighting information, it would be very
handy to be able to use separate weighting schemes for some subqueries.
This could be implemented by adding a Query operator which takes a
subquery and a Weight object, and causes the Weight object to be used for
all posting lists generated by the subquery.
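As a rough sketch of the idea only (this is a toy model in Python, not
Xapian API; the names Term, Or, WithWeight and assign_weights are all
hypothetical), an operator node could carry a weight and override whichever
weight is active for everything beneath it:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union

# In this toy model a weighting scheme is just a function from term to score.
WeightFn = Callable[[str], float]


@dataclass
class Term:
    name: str


@dataclass
class Or:
    children: List["Node"]


@dataclass
class WithWeight:
    """Hypothetical operator: use `weight` for every leaf under `subquery`."""
    subquery: "Node"
    weight: WeightFn


Node = Union[Term, Or, WithWeight]


def assign_weights(node: Node, active: WeightFn) -> List[Tuple[str, WeightFn]]:
    """Return (term, weight function) pairs for every leaf in the tree."""
    if isinstance(node, Term):
        return [(node.name, active)]
    if isinstance(node, WithWeight):
        # The override replaces the active weight for the whole subtree.
        return assign_weights(node.subquery, node.weight)
    pairs: List[Tuple[str, WeightFn]] = []
    for child in node.children:
        pairs.extend(assign_weights(child, active))
    return pairs
```

Leaves outside any WithWeight node keep the weight passed in at the top
level, which corresponds to the default scheme for the whole query.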
An example of a situation where this could be useful: imagine a database of
documents tagged with people IDs. Suppose that people who are tagged in
more events are considered more important, since they may represent
"authorities" in the social network. When searching for documents that match
a text query and also match a set of IDs, the text part may want to use
standard BM25 weighting, but the ID part may want to use a weighting
scheme which applies a higher weight to IDs with a higher term frequency,
rather than a lower weight.
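To make the contrast concrete, here is a small Python sketch of the two
behaviours. Both formulas are illustrative assumptions: the first is a
smoothed IDF-style factor of the kind commonly used with BM25 (Xapian's own
BM25 implementation may differ in detail), and the second, authority_weight,
is a made-up scheme that simply grows with term frequency. Here termfreq
means the number of documents the term occurs in.

```python
import math


def idf_weight(termfreq: int, doc_count: int) -> float:
    """IDF-style factor: rarer terms score higher."""
    return math.log((doc_count - termfreq + 0.5) / (termfreq + 0.5) + 1.0)


def authority_weight(termfreq: int, doc_count: int) -> float:
    """Hypothetical scheme: terms found in more documents score higher."""
    return math.log(1.0 + termfreq) / math.log(1.0 + doc_count)
```

With 1000 documents, an ID tagged in 500 of them gets a lower idf_weight
than one tagged in 10, but a higher authority_weight.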
Things to think about:
- What to do about the term-independent part of the weight (probably we'd
just use the term-independent part of the top-level weight).
- How does this interact with OP_SYNONYM?
- Should the query length each weight object sees be the global query
length, or just the length of the part of the query with the adjusted
weight? If the latter, should the query length seen by the parts of the
query without the adjusted weight be reduced accordingly?
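The query length options in the last point can be spelled out with a small
Python sketch. It assumes, for simplicity, that query length is just a count
of terms; the function name and its flags are hypothetical and only
enumerate the choices listed above.

```python
from typing import Tuple


def query_lengths(n_default: int, n_override: int,
                  scoped: bool, reduce_rest: bool) -> Tuple[int, int]:
    """Return (length seen by the default weight, length seen by the override).

    n_default   -- number of terms using the top-level weight
    n_override  -- number of terms under the adjusted weight
    scoped      -- the adjusted weight sees only its own part of the query
    reduce_rest -- the rest of the query no longer counts the adjusted part
    """
    total = n_default + n_override
    if not scoped:
        # Every weight object sees the global query length.
        return total, total
    if reduce_rest:
        return n_default, n_override
    return total, n_override
```

For a query with 3 ordinary terms and 2 ID terms, the three possibilities
are (5, 5), (5, 2) and (3, 2).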
--
Ticket URL: <http://trac.xapian.org/ticket/557>
Xapian <http://xapian.org/>