[Xapian-tickets] [Xapian] #50: SynonymPostList

Xapian nobody at xapian.org
Mon Apr 20 14:32:06 BST 2009


#50: SynonymPostList
-------------------------+--------------------------------------------------
 Reporter:  olly         |        Owner:  richard  
     Type:  enhancement  |       Status:  assigned 
 Priority:  low          |    Milestone:  1.1.1    
Component:  Library API  |      Version:  SVN trunk
 Severity:  minor        |   Resolution:           
 Keywords:               |    Blockedby:           
 Platform:  All          |     Blocking:  104, 307 
-------------------------+--------------------------------------------------

Comment(by richard):

 Tempted as I am to merge this into 1.1.0, let's focus on releasing 1.1.0
 instead.  We should merge it shortly after the release, though.

 Bits so far committed are fine.

 I believe denom == 0 happens when the only terms in the query are synonym
 terms.  Though reading the code, it looks like in that case
 termfreqandwts.size() will be 0, so maybe I'm wrong about that.  (Or it
 may have been true in an earlier version of the patch, but is no longer.)

 In queryparser.lemony, the intention is to build a query which matches all
 the possible completions of the word, but matches the exact word with a
 higher weight.  This is desirable because the user hasn't explicitly
 marked the query as being a wildcard query (unlike in the wildcard case
 where the user has added a * to the end of the word).  That said, I think
 it would be nicer if the resulting queries applied OP_SYNONYM to all the
 partial terms (across prefixes), and to all the non-prefixed terms: so, if
 the prefixes are "A" and "B" and we have the terms "foo" and "foot" in the
 database with each prefix, we'd currently get, for a search for "foo":

  (Afoo SYNONYM Afoot) OR (Afoo) OR (Bfoo SYNONYM Bfoot) OR (Bfoo)

 whereas it would be better to get:

  (Afoo SYNONYM Afoot SYNONYM Bfoo SYNONYM Bfoot) OR (Afoo SYNONYM Bfoo)
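
 To make the difference between the two shapes concrete, here is a rough
 sketch that builds both as plain strings.  It does not use the Xapian API
 or the queryparser code at all; the helper names (join, current_shape,
 proposed_shape) are made up for illustration, and the list of completions
 is assumed to have been expanded from the database already.

```python
def join(op, terms):
    """Join terms with an operator and wrap the group in parentheses."""
    return "(" + f" {op} ".join(terms) + ")"


def current_shape(prefixes, exact, completions):
    """Per prefix: a SYNONYM group of completions, then the exact term."""
    groups = []
    for prefix in prefixes:
        groups.append(join("SYNONYM", [prefix + t for t in completions]))
        groups.append(join("SYNONYM", [prefix + exact]))
    return " OR ".join(groups)


def proposed_shape(prefixes, exact, completions):
    """One SYNONYM group of all completions across prefixes, OR'd with
    one SYNONYM group of the exact term across prefixes."""
    partial = [prefix + t for prefix in prefixes for t in completions]
    exact_terms = [prefix + exact for prefix in prefixes]
    return " OR ".join([join("SYNONYM", partial),
                        join("SYNONYM", exact_terms)])


if __name__ == "__main__":
    # Hypothetical data matching the example: prefixes "A" and "B",
    # with "foo" and "foot" present in the database under each prefix.
    print(current_shape(["A", "B"], "foo", ["foo", "foot"]))
    print(proposed_shape(["A", "B"], "foo", ["foo", "foot"]))
```

 With the example data this prints the two query descriptions shown above,
 in the same order.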

 It's possible that Term::get_query() should be joining the terms across
 prefixes with OP_SYNONYM, too.  That's a change which is far more likely
 to affect existing queries than the others, though, so needs some thought.

 The Weight::init_() thing is probably just there for historical reasons.

 I would think it's easy to add test coverage for the missing line.

-- 
Ticket URL: <http://trac.xapian.org/ticket/50#comment:25>
Xapian <http://xapian.org/>