[Xapian-discuss] Always returning ALL the documents matching a query

tata 668 tata668 at gmail.com
Mon Dec 29 09:35:43 GMT 2008


I still have a question about whether Xapian should return all matching documents. Of course it
could use a big chunk of memory if a lot of documents are found, but I'm not sure how I could do it
otherwise. Let me explain.

This is for a forum search page. In Xapian, I store only the ids of the indexed posts as the
documents' "data", nothing more.

I use Xapian for full-text search of the posts' text, but a search can include more criteria.
For example, users may want to search only for posts by a specific member, posts added on a specific
date, etc. All those criteria can be combined.

Xapian's job is currently to return the ids of the posts whose body text matches the main search
query. I would then combine those ids with the other criteria specified by the user in a SQL
query, using something like "WHERE id IN (1,2,3) AND otherCriteria", where "1,2,3" would be the ids
returned by Xapian. This could result in a big query (a lot of ids in the "IN" part of the query),
but at least it works.
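
To make the join described above concrete, here is a minimal sketch in Python using the standard
sqlite3 module. It is only an illustration under assumptions: the "posts" table, its "author" and
"posted_on" columns, and the filter_posts() helper are hypothetical names, and the list of ids
stands in for whatever Xapian returned. It builds the "IN" part from bound placeholders rather
than pasting the ids into the SQL string.

```python
import sqlite3


def filter_posts(conn, xapian_ids, author=None, posted_on=None):
    """Return the ids from xapian_ids that also match the optional SQL criteria.

    Hypothetical helper: table and column names are assumptions for this sketch.
    """
    if not xapian_ids:
        # "IN ()" is not valid SQL, and no ids means no results anyway.
        return []

    # One "?" placeholder per id, so the values are bound rather than interpolated.
    placeholders = ",".join("?" for _ in xapian_ids)
    sql = f"SELECT id FROM posts WHERE id IN ({placeholders})"
    params = list(xapian_ids)

    # Append each extra criterion only when the user supplied it.
    if author is not None:
        sql += " AND author = ?"
        params.append(author)
    if posted_on is not None:
        sql += " AND posted_on = ?"
        params.append(posted_on)

    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


# Small in-memory database standing in for the forum's real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, posted_on TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, "julien", "2008-12-28"),
        (2, "james", "2008-12-28"),
        (3, "julien", "2008-12-29"),
        (4, "julien", "2008-12-29"),
    ],
)

# Pretend Xapian matched posts 1, 2 and 3; keep only those written by "julien".
print(filter_posts(conn, [1, 2, 3], author="julien"))  # [1, 3]
```

The number of placeholders grows with the number of ids, so this has the same "big query" concern
mentioned above; databases generally cap the number of bound parameters per statement.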

Since there are more criteria than just the indexed text, how could I use Xapian without asking it
to return ALL documents matching the main search query?

Thanks in advance!



tata 668 wrote:
> Many thanks!
> 
> Julien
> 
> 
> James Aylett wrote:
>> On Sun, Dec 28, 2008 at 05:52:57PM -0500, tata 668 wrote:
>>
>>> Let's say I always want Xapian to return me ALL the documents
>>> matching a query, not a subset. What is the best way to achieve
>>> that?
>>
>> You could request an MSet of size the number of documents in your
>> database -- $db->get_doccount() in PHP will give you this.
>>
>> Of course, if you end up getting lots of matching documents, you could
>> well bomb out of a PHP request by running out of memory or something,
>> but that's an entirely different issue.
>>
>> J
>>
> 