[Xapian-tickets] [Xapian] #245: All-stopword queries with two or more terms should ignore stopword list

Xapian nobody at xapian.org
Tue Apr 20 12:15:42 BST 2010


#245: All-stopword queries with two or more terms should ignore stopword list
-------------------------+--------------------------------------------------
 Reporter:  richard      |        Owner:  olly     
     Type:  defect       |       Status:  assigned 
 Priority:  normal       |    Milestone:  1.2.x    
Component:  QueryParser  |      Version:  SVN trunk
 Severity:  normal       |   Resolution:           
 Keywords:               |    Blockedby:           
 Platform:  All          |     Blocking:           
-------------------------+--------------------------------------------------
Changes (by olly):

  * milestone:  => 1.2.x


New description:

 Currently, if a single word query is parsed, and that word is a stopword,
 the stopwording is ignored.  However, if a multiple word query is parsed,
 and all words are stopwords, the stopwording is applied (resulting in an
 empty query).

 If all the words in the query are stopwords, I think it may make sense to
 ignore the stopwording.  However, even if we decide to apply the
 stopwording in this case, we should be consistent in our behaviour.

 Some examples, in Python:

 >>> import xapian
 >>> s=xapian.SimpleStopper()
 >>> s.add('foo')
 >>> s.add('bar')
 >>> qp=xapian.QueryParser()
 >>> qp.set_stopper(s)
 >>> str(qp.parse_query('foo'))
 'Xapian::Query(foo:(pos=1))'
 >>> str(qp.parse_query('foo foo'))
 'Xapian::Query()'
 >>> str(qp.parse_query('foo bar'))
 'Xapian::Query()'

 Either the first parse_query() call should return Xapian::Query(), or the
 later ones should return non-empty queries.
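
 As a rough illustration of the second option (ignore the stopword list
 when every term is a stopword), here is a minimal pure-Python sketch.  It
 does not use or describe the QueryParser internals; the function name
 apply_stopwords() and the plain list/set representation of terms are
 assumptions made only for this example.

```python
def apply_stopwords(terms, stopwords):
    """Drop stopwords from terms, unless that would leave nothing.

    Hypothetical helper for illustration only; not part of Xapian.
    """
    kept = [t for t in terms if t not in stopwords]
    # If every term was a stopword, fall back to the unfiltered terms,
    # so stopwording alone never produces an empty query.
    return kept if kept else list(terms)


stopwords = {"foo", "bar"}
print(apply_stopwords(["foo"], stopwords))         # ['foo']
print(apply_stopwords(["foo", "bar"], stopwords))  # ['foo', 'bar']
print(apply_stopwords(["foo", "baz"], stopwords))  # ['baz']
```

 Under this policy the one-term and multi-term all-stopword cases behave
 the same way, which is the consistency asked for above.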

--

Comment:

 Marking for 1.2.x.

-- 
Ticket URL: <http://trac.xapian.org/ticket/245#comment:5>
Xapian <http://xapian.org/>