[Xapian-discuss] Phrase Query vs AND Query? Why don't these find the same things?

Olly Betts olly at survex.com
Thu Jun 1 19:05:16 BST 2006


On Thu, Jun 01, 2006 at 11:46:59AM -0400, jarrod roberson wrote:
> I assume this is not the best way to do this, so I guess I need to know what
> the best idiom is to get back ALL matches to a query.

We don't attempt to preallocate slots for the number of results
requested or anything daft like that, so if you want all the matches,
just use something like:

    enquire.get_mset(0, db.get_lastdocid());

Or even:

    enquire.get_mset(0, UINT_MAX);

Although you'll run out of memory well before you manage to actually
create an MSet with 4 billion entries!

> I know it is not common, but I will always be wanting all results back for
> this particular project.

It's probably more sensible to process the results in batches if there
are lots, e.g.:

    Xapian::doccount chunk = 10000;

    Xapian::doccount start = 0;
    while (true) {
        Xapian::MSet mset = enquire.get_mset(start, chunk);
        if (mset.empty()) break;
        process_mset(mset);
        start += chunk;
    }

That way you can limit memory use.  Each get_mset() call reruns the
search, but disk caching means that searches after the first will be
very fast.
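
The batching pattern can also be sketched without Xapian, which makes it
easy to test in isolation.  This is only an illustration under stated
assumptions: fetch_page() is a hypothetical stand-in for
enquire.get_mset(start, chunk), and for_each_batch() is a made-up helper
name, not part of any library.  The one change from the loop above is that
it also stops when a batch comes back short, which saves the final empty
fetch whenever the total is not an exact multiple of the chunk size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for enquire.get_mset(start, chunk): returns up to
// `chunk` items of `all`, beginning at offset `start`.
std::vector<int> fetch_page(const std::vector<int>& all,
                            std::size_t start, std::size_t chunk) {
    std::vector<int> page;
    for (std::size_t i = start; i < all.size() && page.size() < chunk; ++i)
        page.push_back(all[i]);
    return page;
}

// Walks `all` in batches of `chunk`, passing each non-empty batch to
// `process`.  Returns the number of fetches issued.
template <typename Process>
std::size_t for_each_batch(const std::vector<int>& all, std::size_t chunk,
                           Process process) {
    assert(chunk > 0);  // a zero chunk size would never make progress
    std::size_t fetches = 0;
    std::size_t start = 0;
    while (true) {
        std::vector<int> page = fetch_page(all, start, chunk);
        ++fetches;
        if (page.empty()) break;
        process(page);
        // A short batch means nothing is left, so skip the extra fetch.
        if (page.size() < chunk) break;
        start += chunk;
    }
    return fetches;
}
```

Only one batch is held in memory at a time, so peak memory use is bounded
by the chunk size rather than by the total number of matches.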

Cheers,
    Olly


