[Xapian-discuss] Excessive memory use when using FLAG_PARTIAL?

Olly Betts olly at survex.com
Tue Jan 11 12:59:32 GMT 2011


On Tue, Jan 11, 2011 at 12:45:05PM +0000, Richard Boulton wrote:
> On 11 January 2011 12:41, Olly Betts <olly at survex.com> wrote:
> > The memory overhead per term could probably be reduced, but actually
> > it's probably not useful to expand such short partial terms - a search
> > for all words starting with the same letter is just going to be too
> > noisy to be useful, regardless of the resources it would need.  So
> > my thought would be to add a minimum length for the partial words
> > which will be expanded under FLAG_PARTIAL, and probably a way to
> > specify this via the API.
> 
> Agreed - though perhaps setting a limit on the number of terms it
> expands to would be more useful (i.e., it can try to expand, and if it
> finds more than N terms, it gives up and doesn't generate a query with
> extra terms at all).

The problem there is that you do significant work before deciding that
you aren't going to expand after all, whereas a minimum length check can
reject the term before any of that work starts.  So both limits are
probably useful.
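
As a rough illustration of how the two limits might work together, here
is a minimal Python sketch.  It is not Xapian code: expand_partial,
min_prefix_len and max_terms are hypothetical names, and a sorted Python
list stands in for the database's term list.

```python
from bisect import bisect_left


def expand_partial(sorted_terms, prefix, min_prefix_len=2, max_terms=100):
    """Return the terms starting with `prefix`, or [] if a limit is hit.

    `sorted_terms` must be sorted; all names here are illustrative only.
    """
    # Limit 1: minimum prefix length.  This is checked first and costs
    # nothing, since the term list is never touched.
    if len(prefix) < min_prefix_len:
        return []

    matches = []
    # In a sorted list, terms sharing a prefix are contiguous, so jump to
    # the first candidate and walk forward.
    start = bisect_left(sorted_terms, prefix)
    for i in range(start, len(sorted_terms)):
        term = sorted_terms[i]
        if not term.startswith(prefix):
            break
        matches.append(term)
        # Limit 2: cap on the number of expanded terms.  Work has already
        # been done by the time this fires, which is the cost noted above.
        if len(matches) > max_terms:
            return []
    return matches


if __name__ == "__main__":
    terms = sorted(["apple", "apply", "apt", "banana", "band", "bandit"])
    print(expand_partial(terms, "a"))                 # too short to expand
    print(expand_partial(terms, "ap"))                # within both limits
    print(expand_partial(terms, "ap", max_terms=2))   # too many terms
```

The length check guards against the cheap-to-detect case (a single
letter), while the term cap catches longer prefixes that still happen to
match a very large number of terms.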

Incidentally, there's already a ticket for the "term limit" feature for
FLAG_WILDCARD:

http://trac.xapian.org/ticket/350

Cheers,
    Olly


