[Xapian-discuss] Always returning ALL the documents matching a query
tata668 at gmail.com
Mon Dec 29 16:48:57 GMT 2008
Thanks a lot for all this help James.
I'll take a couple of days to try to figure it all out!
James Aylett wrote:
> On Mon, Dec 29, 2008 at 10:39:23AM -0500, tata 668 wrote:
>> I'm still a little bit confused. What is the difference between a
>> "value", a "posting" and a "term"?
> See <http://www.xapian.org/docs/glossary.html> for these (and more).
>> - Being able to restrict the search on multiple criteria?
> Restrictions in searches are (usually) done using terms.
>> I have to prefix all the terms in the documents (all the words?) with
>> a prefix and then specify this prefix to the queryparser before
> You don't have to use a prefix in all cases; it is common to have
> "general" text turned into unprefixed terms when it isn't
> stemmed. The QueryParser can both be given a default prefix to apply to the
> query it's building, and given a list of prefixes that can be used
> explicitly in the query (so you can do things like author:Orwell but
> store the term in the database as Aorwell, for instance).
> for more information.
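
To make the prefix mapping above concrete, here is a minimal pure-Python sketch. It does not call Xapian at all; the `PREFIXES` table and the `to_term` function are hypothetical names used only for illustration, and the lowercasing simply mirrors the author:Orwell / Aorwell example. In Xapian itself, this kind of mapping is registered on the QueryParser with add_prefix().

```python
# Hypothetical field-to-prefix table; "A" for author follows the example above.
PREFIXES = {"author": "A"}

def to_term(token, default_prefix=""):
    """Map a user-facing query token to the term stored in the index."""
    field, sep, text = token.partition(":")
    if sep and field in PREFIXES:
        # Explicit field: swap the readable name for the short stored prefix.
        return PREFIXES[field] + text.lower()
    # No known field: apply the default prefix (empty for "general" text).
    return default_prefix + token.lower()

print(to_term("author:Orwell"))  # Aorwell
print(to_term("Farm"))           # farm
```

The point of the indirection is that users type a readable field name while the database stores a short prefix, so the two can change independently.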
>> - Being able to sort the result?
>> I have to add "values" to the documents and then use a "sorter" to
>> sort the documents by specifying which "value" to use for the
>> I'm really not sure. I would like to see an example of this kind of
>> use case in the "quickstart" guide! :-)
> The quickstart guide isn't really the place for this, because it
> should be short and get you up and running with basic Xapian usage
> quickly. However it would be great to have example code showing how to
> use the various more powerful features of Xapian.
> You can add notes about documentation that doesn't exist but you feel
> should do at <http://trac.xapian.org/wiki/MissingDocumentation>; for
> sample code, anyone can link in examples at
> <http://trac.xapian.org/wiki/SampleCode> -- we're aware that it's not
> as easy to find as it could be.
> There's actually a ticket for more sample code:
> <http://trac.xapian.org/ticket/281>, so noting specific things you'd
> like to see sample code for there would be helpful.
>> The following example would be really appreciated. For a forum
>> search page, how to:
>> - Index 3 forum posts (in a way that the following search is possible)
>> - Find which post(s) contain the phrase "hello word", were posted
>> by user "john doe" and were created on February 12th, 2008.
>> - Return them sorted by their last modification date (may be different
>> than the creation date) then by their id.
> It could be a bit simpler, by not thinking in terms of 'forum
> posts'. Possibly the easiest way of doing that would be to provide a
> scriptindex index script alongside the search code, or to extend
> simpleindex to store the needed values and terms.
> One of the difficulties here is that there are different ways of
> tackling the specific problem above. If you're only ever searching for
> posts on a single date, you might tackle that part differently to if
> you also need to search across a date range. Similarly, user "john
> doe" can be represented in different ways depending on how your system
> works (is the user name invariant? the displayed name? just some
> opaque user identifier?). For this reason, it might be better to break
> that example into several different pieces: ordering by modification
> date, searching by date, searching by date range, and restricting to
> specific creating users in this case.
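
The difference between matching a single date and matching a date range can be sketched in plain Python. This is only an illustration and does not use Xapian; the `date_term` and `date_value` helpers, the sample posts, and the YYYYMMDD layout are all assumptions made for the example.

```python
from datetime import date

def date_term(d):
    """Term for an exact-date filter, e.g. D20080212 (assumed YYYYMMDD layout)."""
    return "D" + d.strftime("%Y%m%d")

def date_value(d):
    """String value for range filters; YYYYMMDD strings sort in date order."""
    return d.strftime("%Y%m%d")

# Three hypothetical forum posts: id -> creation date.
posts = {1: date(2008, 2, 12), 2: date(2008, 2, 13), 3: date(2008, 3, 1)}

# Exact date: a post either carries the term or it does not.
wanted = date_term(date(2008, 2, 12))
on_day = [pid for pid, d in posts.items() if date_term(d) == wanted]

# Date range: compare the stored values as strings.
lo, hi = date_value(date(2008, 2, 1)), date_value(date(2008, 2, 29))
in_feb = [pid for pid, d in posts.items() if lo <= date_value(d) <= hi]

print(on_day)  # [1]
print(in_feb)  # [1, 2]
```

An exact-date term is a simple yes/no match, whereas a range needs something that can be compared, which is why the two cases tend to be indexed differently.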
> (To get you started, for those four cases you may want to use: sorting
> by date/id using a Sorter object; D-prefixed terms; values and a
> DateValueRangeProcessor; A-prefixed terms, although you'll need to
> decide what you put into the terms, and you may need to build a
> more complex Query object than the QueryParser will do for you. I
> might knock out a couple of examples if I can reduce them to simple
> enough problems, but I'm in work-avoidance mode so I really should get
> on with something else :-)
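
As a rough illustration of the first case (ordering by modification date, then by id), here is a pure-Python sketch of the sort key involved. It does not use Xapian's Sorter classes; the sample posts and the `sort_key` function are hypothetical. In Xapian the two pieces of data would be stored per document with add_value(), and a numeric id would probably need something like sortable_serialise() so that its string ordering matches numeric ordering.

```python
from datetime import date

# Hypothetical posts: (id, last-modified date).
posts = [
    (3, date(2008, 2, 14)),
    (1, date(2008, 2, 20)),
    (2, date(2008, 2, 14)),
]

def sort_key(post):
    post_id, modified = post
    # Primary key: modification date; secondary key: id.
    return (modified.strftime("%Y%m%d"), post_id)

ordered = sorted(posts, key=sort_key)
print([pid for pid, _ in ordered])  # [2, 3, 1]
```

Posts 2 and 3 share a modification date, so the id decides their relative order; post 1 was modified last and comes at the end.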
> I'm aware this sounds slightly negative, but this is part of the
> problem with providing sample code that is simple enough to understand
> quickly, when people generally have more complex problems they are
> actually trying to solve. Feel free to keep bugging me about it though.