[Xapian-discuss] a strange type of alias/expanded term

Andreas Marienborg andreas at startsiden.no
Thu Oct 16 09:58:56 BST 2008


On Oct 16, 2008, at 5:32 AM, Olly Betts wrote:

> On Mon, Oct 13, 2008 at 03:56:23PM +0200, Andreas Marienborg wrote:
>> I was wondering if there is any way I can coach queryparser into
>> something like this, so I don't have to pre-parse the query myself:
>> (pseudo code)
>>
>> my $query_string = 'jazz oslo today';
>>
>> $qp->add_alias('today' => 'D20081013');
>>
>> my $q = $qp->parse($query_string);
>>
>> is($q->get_description, '(jazz AND oslo AND D20081013)');
>
> This version is arguably slightly better since the date should act as a
> boolean filter term:
>
> ((jazz AND oslo) FILTER D20081013)
>

True, the D term will be a boolean filter, so that would most likely be
the end result?

> Both will match the same documents, but the weightings will be slightly
> different.
>
> Not sure about the FILTER version, but the AND version can probably be
> achieved using synonyms:
>
> http://xapian.org/docs/synonyms.html
>
> Untested, but try something like:
>
>    # Only need to do this once per day...
>    $db->clear_synonyms("today");
>    $db->add_synonym("today", "D20081013");
>
>    $qp->set_database($db);
>    my $q = $qp->parse_query($query_string,
> 	    FLAG_PHRASE|FLAG_BOOLEAN|FLAG_LOVEHATE|FLAG_AUTO_SYNONYMS);
>


Yes, this hit me last night as well: I can just keep changing the
synonyms each day. Nice to get your input that this might indeed be the
best way. I'll definitely try that route now.
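
A minimal sketch of that daily refresh, written in Python rather than the Perl of the snippet above. It assumes only that the database object offers clear_synonyms() and add_synonym() as used in the quoted code. The function name refresh_today_synonym and the RecordingDB stand-in are hypothetical, there only so the sketch runs without Xapian installed.

```python
from datetime import date


class RecordingDB:
    """Hypothetical stand-in for a writable database: it only records calls."""

    def __init__(self):
        self.calls = []

    def clear_synonyms(self, term):
        self.calls.append(("clear_synonyms", term))

    def add_synonym(self, term, synonym):
        self.calls.append(("add_synonym", term, synonym))


def refresh_today_synonym(db, today):
    """Point the synonym 'today' at the D-prefixed term for the given date."""
    term = "D" + today.strftime("%Y%m%d")
    # Drop yesterday's mapping first so the old date term is not ORed in.
    db.clear_synonyms("today")
    db.add_synonym("today", term)
    return term


db = RecordingDB()
refresh_today_synonym(db, date(2008, 10, 13))
```

With a real writable database the change would presumably also need to be flushed or committed before the QueryParser sees it; that step is left out here.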


>> Basically I want to somehow expand today to today's date, this week to
>> a range, tomorrow to something etc, but I'm not sure how I might best
>> do it?
>
> If you define multiple synonyms for the same word (by calling
> add_synonym() multiple times with the same first argument), they're
> ORed, and multi-word synonyms are supported (with
> FLAG_AUTO_MULTIWORD_SYNONYMS), so `this week' is doable by defining it
> as a synonym for 7 D-prefix terms.  For `this year' you probably want to
> add Y-prefix terms with just the year to avoid an OR of 365 or 366 date
> terms...
>

Yeah, I usually add Y, M and D terms to all documents, so that wouldn't
be too hard. I guess I could also add a W term, for instance, but that
might be harder conceptually, so seven D terms might be just as good.
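
A rough sketch of how the term lists for these aliases could be generated, in Python with only the standard library. The helper name date_alias_terms is hypothetical, and treating "this week" as Monday through Sunday of the current week is an assumption; a rolling seven-day window would work just as well. Each list would then be fed to add_synonym() one term at a time.

```python
from datetime import date, timedelta


def date_alias_terms(today):
    """Map each alias phrase to the prefixed date terms it should expand to."""

    def d_term(day):
        # D-prefix term in the same shape as the thread's D20081013.
        return "D" + day.strftime("%Y%m%d")

    # Assumption: the week runs Monday..Sunday around the given date.
    monday = today - timedelta(days=today.weekday())
    return {
        "today": [d_term(today)],
        "tomorrow": [d_term(today + timedelta(days=1))],
        "this week": [d_term(monday + timedelta(days=i)) for i in range(7)],
        # A single Y-prefix term instead of an OR over every day of the year.
        "this year": ["Y" + today.strftime("%Y")],
    }


aliases = date_alias_terms(date(2008, 10, 16))
```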

>> the other option, to pre-process, is doable I guess, but it might be
>> more error-prone?
>
> Yes, preprocessing input to the QueryParser like that is best avoided.
>

Good, then I will strive to avoid that :)


Thanks for your help!

- andreas



