[Xapian-discuss] What is the best way to represent a category hierarchy using term prefixes in Xapian?
Jim Razmus II
bonetruck at gmail.com
Sun Nov 6 19:48:27 GMT 2011
Assume I have the following example hierarchy:
US
>Michigan
>>Detroit
>>Grand Rapids
>>Lansing
>Minnesota
>>Grand Rapids
>>Minneapolis
>>St Paul
>Ohio
>>Columbus
>>Grand Rapids
>>Sandusky
I see two ways that I could index a “Grand Rapids, Michigan” document with
prefixed terms:
XFIRSTLEVELus
XSECONDLEVELmichigan
XTHIRDLEVELgrandrapids
or
XFIRSTLEVELus
XSECONDLEVELus_michigan
XTHIRDLEVELus_michigan_grandrapids
I’m inclined to use the second approach, thinking that it will return more
intuitive results. That is, a search restricted to Grand Rapids, Michigan is
less likely to include documents from Minnesota and Ohio.
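To make the difference concrete, here is a rough sketch in plain Python that
builds the terms for both schemes and shows which terms a Michigan document and
a Minnesota document would share. The helper names (normalize, separate_terms,
path_terms) are made up for illustration and are not part of Xapian; the
resulting strings are what I would expect to pass to something like
Document.add_boolean_term() when indexing.

```python
# Hypothetical helpers for illustration only; nothing here calls Xapian.
LEVEL_PREFIXES = ["XFIRSTLEVEL", "XSECONDLEVEL", "XTHIRDLEVEL"]


def normalize(name):
    """Lowercase a category name and drop spaces, e.g. 'Grand Rapids' -> 'grandrapids'."""
    return name.lower().replace(" ", "")


def separate_terms(path):
    """First scheme: each level is indexed on its own, without its ancestors."""
    return [prefix + normalize(name) for prefix, name in zip(LEVEL_PREFIXES, path)]


def path_terms(path):
    """Second scheme: each level is indexed with its full ancestor path."""
    terms = []
    parts = []
    for prefix, name in zip(LEVEL_PREFIXES, path):
        parts.append(normalize(name))
        terms.append(prefix + "_".join(parts))
    return terms


michigan = ["US", "Michigan", "Grand Rapids"]
minnesota = ["US", "Minnesota", "Grand Rapids"]

# Terms that documents from the two states would have in common.
shared_separate = set(separate_terms(michigan)) & set(separate_terms(minnesota))
shared_path = set(path_terms(michigan)) & set(path_terms(minnesota))

print(separate_terms(michigan))
print(path_terms(michigan))
print(sorted(shared_separate))
print(sorted(shared_path))
```

With the first scheme the two documents share both the US term and the
XTHIRDLEVELgrandrapids term, so a filter on the third-level term alone cannot
tell the states apart. With the second scheme they share only the US term.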
However, two aspects of this approach bother me. First, the creation and
maintenance of term prefixes for each level of the hierarchy feels wrong.
Second, the concatenation of values seems like a surrogate for using weights.
So, what is the best way to represent a hierarchy with term prefixes?
Note, I posted this question to stackoverflow here:
http://stackoverflow.com/questions/7585948/what-is-the-best-way-to-represent-a-category-hierarchy-using-term-prefixes-in-xa
I didn't get any responses so I thought I'd try here next.
Best regards,
Jim