[Xapian-discuss] a strange type of alias/expanded term

Olly Betts olly at survex.com
Thu Oct 16 04:32:49 BST 2008


On Mon, Oct 13, 2008 at 03:56:23PM +0200, Andreas Marienborg wrote:
> I was wondering if there is any way I can coach queryparser into  
> something like this, so I don't have to pre-parse the query myself:  
> (pseudo code)
> 
> my $query_string = 'jazz oslo today';
> 
> $qp->add_alias('today' => 'D20081013');
> 
> my $q = $qp->parse($query_string);
> 
> is($q->get_description, '(jazz AND oslo AND D20081013)');

This version is arguably slightly better since the date should act as a
boolean filter term:

((jazz AND oslo) FILTER D20081013)

Both will match the same documents, but the weightings will be slightly
different.

Not sure about the FILTER version, but the AND version can probably be
achieved using synonyms:

http://xapian.org/docs/synonyms.html

Untested, but try something like:

    # Flag constants come from: use Search::Xapian ':qpflags';
    # $db must be a Search::Xapian::WritableDatabase.

    # Only need to do this once per day...
    $db->clear_synonyms("today");
    $db->add_synonym("today", "D20081013");

    $qp->set_database($db);
    my $q = $qp->parse_query($query_string,
        FLAG_PHRASE|FLAG_BOOLEAN|FLAG_LOVEHATE|FLAG_AUTO_SYNONYMS);

> basicly I want to somehow expand today to todays date, this week to a  
> range, tomorrow to something etc, but not sure how I might best do it?

If you define multiple synonyms for the same word (by calling
add_synonym() multiple times with the same first argument), they're
ORed, and multi-word synonyms are supported (with
FLAG_AUTO_MULTIWORD_SYNONYMS), so `this week' is doable by defining it
as a synonym for 7 D-prefix terms.  For `this year' you probably want to
add Y-prefix terms with just the year to avoid an OR of 365 or 366 date
terms...
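
As a rough, untested-against-Xapian sketch of how the term lists above
could be generated: the helper names week_terms() and year_term() are
hypothetical, and treating a week as Monday to Sunday is an assumption.
It only uses core Perl modules and does not touch the database.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Hypothetical helper: the 7 D-prefix terms for the Monday..Sunday week
# containing the given date (the week boundary is an assumption).
sub week_terms {
    my ($year, $month, $day) = @_;
    # Work in UTC at midday so stepping by whole days is safe.
    my $t = timegm(0, 0, 12, $day, $month - 1, $year);
    my $wday = (gmtime($t))[6];                  # 0 = Sunday .. 6 = Saturday
    my $monday = $t - (($wday + 6) % 7) * 86400; # step back to Monday
    return map { strftime('D%Y%m%d', gmtime($monday + $_ * 86400)) } 0 .. 6;
}

# Hypothetical helper: a single Y-prefix term for the year.
sub year_term {
    my ($year) = @_;
    return sprintf('Y%04d', $year);
}

my @week = week_terms(2008, 10, 13);
print join(' ', @week), "\n";
print year_term(2008), "\n";
```

Each term in @week would then be passed to add_synonym() with
"this week" as the first argument, and the year term likewise for
"this year".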

> the other option, to pre-process, is doable I guess, but it might be  
> more error-prone?

Yes, preprocessing input to the QueryParser like that is best avoided.

Cheers,
    Olly


