[Xapian-discuss] Stopping wildcard expansion at some point
Adam Sjøgren
asjo at koldfront.dk
Fri Mar 20 12:47:59 GMT 2009
On Thu, 19 Mar 2009 23:31:42 +0000, Olly wrote:
>> I may have a hundred thousand terms starting with MM, but only 20
>> starting with A, and it would be sad for me if the user couldn't search
>> for A*.
> That's probably extreme, but it's likely to be true for English text
> that e* might be undesirable while z* is fine. I'm not sure if either
> is actually useful for English though.
Yes, it sounds less plausible for "plain text", but when you mix various
kinds of codes and identifiers in there, it can become a problem.
> On Fri, Mar 06, 2009 at 12:06:52PM +0100, Adam Sjøgren wrote:
>> Attached is a patch updated from the feedback (Xapian::termcount,
>> QueryParserError, error message) for further consideration.
> I'm still wondering what to do about this if we don't want to prevent
> ourselves being able to push the wildcard expanding into the database
> backends. We could perhaps push this check with it, but then the
> rejection potentially happens rather late on. Or the check stays and
> we end up counting the matches up front if this option is on.
That is a little over my head architecturally; I appreciate that it
isn't straightforward.
> Can you attach the patch to a ticket in trac for now, so that it doesn't
> get forgotten about?
Sure - I have created a ticket now: http://trac.xapian.org/ticket/350
>> I wasn't quite sure how, in the error message, to display the term
>> exactly as the user entered it; the closest I found was "unstemmed",
>> which hasn't got the '*'.
> Yeah, that's probably the best choice (and just append a "*" to it).
Ah, I forgot to do that; I will update the patch in trac.
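
For illustration only (this is not the patch attached to the ticket, and
it does not use the Xapian API): a minimal Python sketch of the idea
discussed above, assuming the terms are available as a sorted list.
Expansion stops as soon as the limit is exceeded, and the error message
shows the unstemmed term with a "*" appended. The names expand_wildcard,
WildcardLimitError and max_expansion are hypothetical.

```python
from bisect import bisect_left


class WildcardLimitError(Exception):
    """Hypothetical stand-in for the parser error discussed in the thread."""


def expand_wildcard(sorted_terms, unstemmed, max_expansion):
    """Return all terms starting with `unstemmed`.

    Raises WildcardLimitError as soon as more than `max_expansion`
    terms have matched, so a huge prefix is rejected early instead of
    being expanded in full.
    """
    matches = []
    # In a sorted list, all terms sharing a prefix are contiguous and
    # begin at the first position where the prefix itself would sort.
    pos = bisect_left(sorted_terms, unstemmed)
    while pos < len(sorted_terms) and sorted_terms[pos].startswith(unstemmed):
        matches.append(sorted_terms[pos])
        if len(matches) > max_expansion:
            # Show the term as the user typed it: unstemmed form plus "*".
            raise WildcardLimitError(
                f"Wildcard {unstemmed}* expands to more than "
                f"{max_expansion} terms"
            )
        pos += 1
    return matches
```

With such a check, a limit of a few hundred would still let A* through
in the example above (20 terms) while rejecting MM* (a hundred thousand).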
Thanks!
Adam
--
"We get our thursdays from a banana." Adam Sjøgren
asjo at koldfront.dk