Improving partial lookup results

Olly Betts olly at survex.com
Wed Sep 18 22:09:51 BST 2019


On Tue, Sep 17, 2019 at 01:27:08PM +0100, Peter Bowyer wrote:
> It handles partial phrases well, so long as the first part is complete
> (e.g. "Peter Bow" expands well). If instead I type "P Bow" it fails to
> work, as the expansion is done at the end.

QueryParser::FLAG_PARTIAL is meant to support "search as you type", so
it only expands a potentially incomplete word at the end of the query
(and the expansion doesn't happen if a space has been entered after that
word, so e.g. `Peter Bow ` is left alone).

> Is there a good way to handle this? I tried to add a wildcard in the
> string and skip the query parser, but ended up with zero results.

If you mean something like Xapian::Query("Peter Bow*") that will try to
search for the single literal term `Peter Bow*`, which indeed wouldn't
match anything in most databases.

If you really wanted to wildcard expand all words in a query string,
you'd have to parse it yourself, turn each word into an OP_WILDCARD
query and combine those.

I'd think that's likely to create a lot of false matches though, and
wildcards are relatively expensive so you might want to limit how many
words get wildcarded in a single query to avoid problems.
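
As a rough illustration of the "parse it yourself" approach, here is a
minimal sketch using the Python bindings. It makes several assumptions:
that terms were indexed lowercased, unstemmed and without prefixes, that
splitting on word characters is good enough for your data, and that
AND is the combiner you want. plan_terms(), build_wildcard_query() and
the max_wildcards parameter are names made up for this example, not part
of Xapian, and wildcarding only the first few words is an arbitrary way
to apply the limit.

```python
import re


def plan_terms(query_string, max_wildcards=3):
    """Return (word, use_wildcard) pairs for each word in query_string.

    Splitting on runs of word characters and lowercasing is a deliberately
    naive stand-in for real query parsing. Only the first max_wildcards
    words are marked for wildcard expansion; the rest stay exact terms.
    """
    words = [w.lower() for w in re.findall(r"\w+", query_string)]
    return [(word, i < max_wildcards) for i, word in enumerate(words)]


def build_wildcard_query(query_string, max_wildcards=3):
    """Combine one subquery per word with OP_AND (hypothetical helper)."""
    # Imported here so plan_terms() can be tried without the bindings.
    import xapian

    subqueries = []
    for word, use_wildcard in plan_terms(query_string, max_wildcards):
        if use_wildcard:
            # Assumption: the pattern is treated as a prefix to expand,
            # which is how OP_WILDCARD behaves in the 1.4 series.
            subqueries.append(xapian.Query(xapian.Query.OP_WILDCARD, word))
        else:
            subqueries.append(xapian.Query(word))
    return xapian.Query(xapian.Query.OP_AND, subqueries)
```

With this, plan_terms("P Bow") marks both `p` and `bow` for expansion,
and the result of build_wildcard_query("P Bow") can be passed to
Enquire.set_query() as usual. A one-letter prefix like `p` can expand to
a very large number of terms, which is exactly the cost concern above.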

> Also sometimes (though not always) substring matches would help - the Ann
> examples in the notebook illustrate this.

There's expanded support for wildcards on git master, so you could
create an OP_WILDCARD query for `*ann*`, though that seems even more
likely to result in a lot of false matches and will tend to be more
expensive too.

Cheers,
    Olly