[Xapian-discuss] Xapian query language

Michel Pelletier michel at dialnetwork.com
Thu Mar 30 00:19:55 BST 2006


For a number of reasons, including not being able to express certain
queries with Xapian's QueryParser class (see the previous post on the
QueryParser bug) and the need to let users configure queries without
writing Python code, I have hacked up a quick query language for Xapian
called xaql, which can best be described as "SQL-like".  It requires the
pyparsing module by Paul McGuire.

Xaql lets you specify the query terms (both text and boolean) that
define a Xapian query, the selection variables that define which
document values you want to retrieve from the database, the values you
want the results sorted by (ascending or descending), a match limit,
and an offset from the match start, just like SQL.  For example, we
have a Xapian database here with about 600K records and various boolean
terms indexed.  Here is a xaql query and its results against that
database:

$ python Xapclient.py "select xtitle, xlastposted where xstatus 
in(published, reposted, expired) and  xcountryid=4 and 
xsubcategoryid=41 and xsalestype IN(individual, professional) order by 
xlastposted limit 10"

Returned 10 values in 532.392024994 msec
xlastposted          xtitle
1128321436           vendesi
1128325944           BUONA OCCASIONE!!!!
1128326272           Vendo PUNTO 1.7 TD ELX
1128327372           AFFARE- offro auto accessoriatissima in ottime condizioni
1128327811           vendesi
1128328366           ferrari 348
1128331517           SAAB 900 T 2000 CABRIO - COLORE NERO - FULL OPTIONAL
1128333392           lancia dedra 2000 gas
1128335062           Toyota Corolla 1.3 SW 16V ottimo stato tutti tagliandi Toyota
1128335073           Vendo una bellisima Smart pulse cdi nera

The xaql parser turns the above into the Xapian query:

'Xapian::Query(((XSTATUSpublished OR XSTATUSreposted OR XSTATUSexpired) 
AND XCOUNTRYID4 AND XSUBCATEGORYID41 AND (XSALESTYPEindividual OR 
XSALESTYPEprofessional)))'
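
The mapping visible in that output can be sketched roughly as follows.
This is not xaql's code (which uses pyparsing); it is a stdlib-only
illustration that assumes each field name is upper-cased and used as
the term prefix, that IN(...) becomes an OR group, and that conditions
are joined with AND.  The names condition_to_terms and describe_where
are made up for this example, and quoted text terms are not handled.

```python
import re

# Hypothetical helpers, not part of xaql or Xapian.  They only mimic the
# field/value -> prefixed-term mapping seen in the example output.
CONDITION = re.compile(
    r"^(\w+)\s*(?:=\s*(\w+)|in\s*\(([^)]*)\))$", re.IGNORECASE
)

def condition_to_terms(condition):
    """Turn 'field=value' or 'field in(a, b)' into prefixed boolean terms."""
    match = CONDITION.match(condition.strip())
    if match is None:
        raise ValueError("unsupported condition: %r" % condition)
    field, single, many = match.groups()
    if single is not None:
        values = [single]
    else:
        values = [value.strip() for value in many.split(",")]
    # Assumption: the prefix is simply the upper-cased field name.
    return [field.upper() + value for value in values]

def describe_where(where):
    """Render a where clause in the style of a Xapian query description."""
    groups = []
    for condition in re.split(r"\s+and\s+", where, flags=re.IGNORECASE):
        terms = condition_to_terms(condition)
        if len(terms) == 1:
            groups.append(terms[0])
        else:
            groups.append("(" + " OR ".join(terms) + ")")
    return "(" + " AND ".join(groups) + ")"

where = ("xstatus in(published, reposted, expired) and xcountryid=4 "
         "and xsubcategoryid=41 and xsalestype IN(individual, professional)")
print(describe_where(where))
```

A real implementation would build xapian.Query objects rather than a
string; the string form is used here only so the result can be compared
with the description printed above.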

Xaql's strongest support is for boolean querying (this is where
xapian.QueryParser is weakest), but it also supports positional term
searching using "quoted" syntax.  Here is the same search as above, with
an added text search for 'toyota':

$ python Xapclient.py "select xtitle, xlastposted where xstatus 
in(published, reposted, expired) and  xcountryid=4 and 
xsubcategoryid=41 and xsalestype IN(individual, professional) and 
'toyota' order by xlastposted limit 10"

Returned 10 values in 117.490053177 msec
xlastposted          xtitle
1128335062           Toyota Corolla 1.3 SW 16V ottimo stato tutti tagliandi Toyota
1128350630           vendo toyota land cruiser
1128451520           ToYota MR2
1128835801           TOYOTA Celica GTfour turbo 4wd 246cv
1129113767           vendo 4 cerchi in lega x toyota celica
1129114176           vendo sedili in pelle x toyota celica
1129283406           VENDO TOYOTA 4 RUNNER
1129538031           VENDO TOYOTA HI-LUX SR5 5 PORTE OTTOBRE 2002
1129616773           vendo Toyota Rav4 sol 5p anno 2004
1129786202           vendo toyota lj70vx preparato
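
Both example queries follow the same clause order: select, where,
order by, limit.  As a rough illustration of that shape, here is a
stdlib-only splitter.  It is not xaql's pyparsing grammar; the name
split_clauses is made up, and it only covers the clauses that appear in
the two examples (no offset or ascending/descending keyword, since
their syntax is not shown here).

```python
import re

# Hypothetical clause splitter, not xaql's actual grammar.
CLAUSES = re.compile(
    r"^\s*select\s+(?P<select>.+?)"
    r"\s+where\s+(?P<where>.+?)"
    r"(?:\s+order\s+by\s+(?P<order>\w+))?"
    r"(?:\s+limit\s+(?P<limit>\d+))?\s*$",
    re.IGNORECASE | re.DOTALL,
)

def split_clauses(query):
    """Split a query string into its select, where, order by and limit parts."""
    match = CLAUSES.match(query)
    if match is None:
        raise ValueError("not a recognised query: %r" % query)
    limit = match.group("limit")
    return {
        "select": [name.strip() for name in match.group("select").split(",")],
        # Collapse runs of whitespace so wrapped input reads cleanly.
        "where": " ".join(match.group("where").split()),
        "order_by": match.group("order"),
        "limit": int(limit) if limit is not None else None,
    }

parts = split_clauses(
    "select xtitle, xlastposted where xstatus in(published, reposted, "
    "expired) and  xcountryid=4 and xsubcategoryid=41 and xsalestype "
    "IN(individual, professional) and 'toyota' order by xlastposted limit 10"
)
print(parts["select"], parts["order_by"], parts["limit"])
```

A regular expression like this breaks as soon as a quoted text term
contains the words "where" or "order by", which is one reason a proper
parser such as pyparsing is the better tool for the real thing.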

I plan to keep maintaining xaql for our own purposes.  If anyone is
interested in the code or has any suggestions for xaql, please let me
know.

Thanks!

-Michel





