[Xapian-discuss] What is the best way to represent a category hierarchy using term prefixes in Xapian?

Jim Razmus II bonetruck at gmail.com
Sun Nov 6 19:48:27 GMT 2011


Assume I have the following example hierarchy:

US
>Michigan
>>Detroit
>>Grand Rapids
>>Lansing
>Minnesota
>>Grand Rapids
>>Minneapolis
>>St Paul
>Ohio
>>Columbus
>>Grand Rapids
>>Sandusky

I see two ways that I could index a “Grand Rapids, Michigan” document with 
prefixed terms:

XFIRSTLEVELus
XSECONDLEVELmichigan
XTHIRDLEVELgrandrapids

or

XFIRSTLEVELus
XSECONDLEVELus_michigan
XTHIRDLEVELus_michigan_grandrapids

I’m inclined to use the second approach, thinking that it will return more
intuitive results. That is, a search for Grand Rapids, Michigan is less likely
to include documents from Minnesota and Ohio.
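
To make the difference concrete, here is a minimal sketch in plain Python that
builds the term strings for both schemes. The helper names (normalize,
simple_terms, path_terms) are hypothetical and only for illustration; the
prefixes are the ones listed above. It does not call Xapian itself. In a real
indexer each generated string would presumably be added to the document as a
boolean term.

```python
# Prefixes taken from the example above, one per hierarchy level.
LEVEL_PREFIXES = ["XFIRSTLEVEL", "XSECONDLEVEL", "XTHIRDLEVEL"]


def normalize(name):
    """Lowercase and drop spaces, e.g. 'Grand Rapids' -> 'grandrapids'."""
    return name.lower().replace(" ", "")


def simple_terms(path):
    """First scheme: each level is indexed with its own name only."""
    return [prefix + normalize(name)
            for prefix, name in zip(LEVEL_PREFIXES, path)]


def path_terms(path):
    """Second scheme: each level is indexed with its full ancestor path."""
    parts = [normalize(name) for name in path]
    return [prefix + "_".join(parts[:depth + 1])
            for depth, prefix in enumerate(LEVEL_PREFIXES[:len(parts)])]


michigan = ["US", "Michigan", "Grand Rapids"]
ohio = ["US", "Ohio", "Grand Rapids"]

# First scheme: the third-level term is identical for both states.
print(simple_terms(michigan)[2] == simple_terms(ohio)[2])

# Second scheme: the third-level term already encodes the state.
print(path_terms(michigan)[2] == path_terms(ohio)[2])

# Assumption: with the Xapian Python bindings, each term would be added with
# something like doc.add_boolean_term(term) on a xapian.Document.
```

With the first scheme, a filter on the third-level term alone matches Grand
Rapids in all three states, so the second-level term has to be combined with
it to narrow the result. With the second scheme the third-level term is
sufficient on its own.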

However, two aspects of this approach bother me. First, the creation and 
maintenance of term prefixes for each level of the hierarchy feels wrong. 
Second, the concatenation of values seems like a surrogate for using weights.

So, what is the best way to represent a hierarchy with term prefixes?

Note: I also posted this question on Stack Overflow:

http://stackoverflow.com/questions/7585948/what-is-the-best-way-to-represent-a-category-hierarchy-using-term-prefixes-in-xa

I didn't get any responses so I thought I'd try here next.

Best regards,
Jim



