[Xapian-discuss] Re: searching and sorting by date

Michel Pelletier michel at dialnetwork.com
Wed Mar 22 20:19:26 GMT 2006


Sungsoo Kim wrote:
>> SELECT ad.id FROM ad WHERE ad.subcategory_id=41 ORDER BY last_posted 
>> DESC LIMIT 0,51;
>>
>> Is this essentially the same in Xapian as querying for subcategory_id 
>> = 41 and then using last_posted as the sortKey?  And, like this SQL 
>> query, can Xapian return the 50 most recent documents whose 
>> subcategory_id == 41?
> 
> 
> Yes. You need to index subcategory as a field and set a boolean filter
> in the query to restrict by subcategory_id.

Ah, I see.  Yes, thank you for the clear example.
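
To make the idea concrete, here is a minimal sketch of "filter by a
boolean term, then sort by a key and keep the newest N". It is a toy
model in plain Python, not Xapian or Xapwrap API: the names postings,
last_posted, index_ad and most_recent are all hypothetical, and the
timestamps are made-up sample values.

```python
from collections import defaultdict

# Toy model only: plain Python structures standing in for a search index.
# None of these names are Xapian API.
postings = defaultdict(set)   # term -> ids of documents carrying that term
last_posted = {}              # doc id -> sort key (e.g. a Unix timestamp)


def index_ad(doc_id, subcategory_id, posted_at):
    """Store the subcategory as a boolean term and the date as a sort key."""
    postings["SUBCATEGORY%d" % subcategory_id].add(doc_id)
    last_posted[doc_id] = posted_at


def most_recent(subcategory_id, limit=50):
    """Filter by the boolean term, then order by sort key, newest first."""
    matches = postings.get("SUBCATEGORY%d" % subcategory_id, set())
    ranked = sorted(matches, key=lambda d: last_posted[d], reverse=True)
    return ranked[:limit]


# Made-up sample data: three ads in subcategory 41, one in subcategory 7.
index_ad(1, 41, 1143000000)
index_ad(2, 41, 1143050000)
index_ad(3, 7, 1143060000)
index_ad(4, 41, 1143020000)

print(most_recent(41, limit=2))   # ids of the two newest ads in subcategory 41
```

The filter decides which documents are eligible and the sort key alone
decides their order, which is the same split as WHERE and ORDER BY in
the SQL query.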

> You can see an example of how to set a boolean filter at 
> http://article.gmane.org/gmane.comp.search.xapian.general/1474
> 
> BTW, I think you would do better to use a MySQL query in the above 
> case because it does not include a full-text search condition. (You 
> might have omitted it for simplicity.)

I did exclude it for simplicity.  I have one question about the 
technique described in the above link.  If I have subcategory=41, that 
will be encoded as the term SUBCATEGORY41.  All that makes sense, and I 
have been able to confirm that it works.

But what if someone authoring the document text used the term 
SUBCATEGORY41, either inadvertently or on purpose?  Would that skew 
their document into more search results than would normally be the case? 
Is there a way to prevent that?  Do I have to keep my 
application-specific terms and searchable text in two different 
databases?
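
One way this kind of collision is commonly avoided, sketched here as an
assumption and not as a statement of what Xapwrap actually does, is to
lowercase every free-text term at index time and reserve uppercase
prefixes for application terms. The helpers text_terms and filter_term
below are hypothetical names in a plain-Python model.

```python
import re


def text_terms(text):
    """Free-text terms: split on non-alphanumerics and lowercase everything."""
    return {word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)}


def filter_term(subcategory_id):
    """Application term: uppercase prefix, never produced by text_terms()."""
    return "SUBCATEGORY%d" % subcategory_id


# An author deliberately stuffs the filter term into the ad text.
body = "Great bike for sale. SUBCATEGORY41 SUBCATEGORY41"
terms = text_terms(body)

# The text only yields the lowercased form, so it cannot forge the filter term.
print(filter_term(41) in terms)
print("subcategory41" in terms)
```

Under that assumption the two kinds of terms can share one database,
because no text term ever starts with an uppercase letter.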

>> But I haven't been able to find any further explanation, particularly 
>> of what "indexing with prefixes on probabilistic terms from certain 
>> fields" means.  I tried querying for "subcategory_id:41" but had no 
>> success.
> 
> 
> As far as I know, we cannot use an underscore in the field name 
> because the query parser does not recognize it as part of a field name.

Thanks for that tip!
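
The behaviour described above can be modelled with a toy field parser.
This is only an illustration of the reported behaviour, not the real
query parser: the PREFIXES table, the FIELD_RE pattern and the to_term
function are hypothetical, and the letters-only rule is an assumption
taken from the tip above.

```python
import re

# Hypothetical mapping from a user-facing field name to a term prefix.
PREFIXES = {"subcategory": "SUBCATEGORY"}

# Toy rule: a field name is letters only, so "subcategory_id" never matches.
FIELD_RE = re.compile(r"^([A-Za-z]+):(\w+)$")


def to_term(token):
    """Turn 'field:value' into a prefixed term; otherwise treat it as text."""
    match = FIELD_RE.match(token)
    if match and match.group(1) in PREFIXES:
        return PREFIXES[match.group(1)] + match.group(2)
    return token.lower()


print(to_term("subcategory:41"))      # recognised as a field query
print(to_term("subcategory_id:41"))   # underscore breaks the field match
```

In this model the underscore version falls through to plain text, which
would explain a query that silently matches nothing.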

> And I have difficulty helping you because I don't know how Xapwrap 
> indexes documents.

Yes, it is a bit tricky.  The API is simple, but as Jarrod pointed out, 
there are great benefits to going straight to the SWIG wrapper.  I'll 
probably end up taking his advice once I figure this all out. ;)

-Michel



