[Xapian-discuss] Question: Query weights, Rset usage, lowercase

Andrey Kong alpha04 at netvigator.com
Sat Dec 9 01:55:11 GMT 2006


Hi

First of all, thank you for Xapian; it's fast and stable. I use it to develop a search for Chinese data, and it works well.

I've been playing with Xapian for 1 week (from PHP) and am able to insert terms, data, and values into the database and search them back out.
I add prefixes manually (PT, PD, PP, PM... for title, domain, URL path, meta...), and value slot 0 holds a timestamp.
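
A minimal sketch of this kind of manual prefixing, written in Python rather than the PHP actually used. The function name, the field keys, and the whitespace splitting are assumptions made for illustration only; nothing here calls Xapian, it only shows how the term strings are built before being added to a document.

```python
# Hypothetical mapping from field name to the manual prefixes listed above.
FIELD_PREFIXES = {"title": "PT", "domain": "PD", "path": "PP", "meta": "PM"}


def prefixed_terms(fields):
    """Return the prefixed term strings for a dict of field name -> text."""
    terms = []
    for field, text in fields.items():
        prefix = FIELD_PREFIXES.get(field, "")  # unknown fields get no prefix
        for word in text.split():  # naive whitespace split, for illustration
            terms.append(prefix + word)
    return terms


if __name__ == "__main__":
    print(prefixed_terms({"title": "xapian search", "domain": "xapian.org"}))
```

Whitespace splitting would not segment Chinese text; a real indexer would need its own segmentation step before the prefix is attached.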

Currently, I use Xapian to search, which returns the IDs of rows in a MySQL database, and then retrieve the Descriptions etc. from MySQL using those unique key IDs.

Here are my questions:

1) How much does it cost to put the Descriptions inside the Xapian.document.data field? (Assume the Descriptions are the contents of web pages with the HTML stripped.) Will the Xapian DB become very big, and will that affect performance? (I have 1M docs in my test database.)

2) Now that I am able to search the Title (prefix PT, weight=20) and Descriptions (no prefix, weight=1) in the database, I am wondering how to assign different weights to the individual terms of a Query. How can I achieve this:

Query using "OR" (Microsoft, Keyboard, Mouse)

in which the term "Microsoft" = weight 5 | "Keyboard" = weight 1 | "Mouse" = weight 1

Because it's normal that people type the most important terms first and the less important terms later, I want the query to weight its terms in the same way.
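
To make the intended weighting concrete, here is a minimal sketch in Python. The function name and the default weights 5 and 1 are hypothetical and simply mirror the example above. It only computes the (term, weight) pairs from the order in which the terms were typed; how to hand those weights to a Xapian Query is exactly the open question and is not shown.

```python
def position_weights(terms, first_weight=5, rest_weight=1):
    """Pair each query term with a weight based on its position.

    The first term typed gets first_weight; every later term gets
    rest_weight. Both defaults are illustrative assumptions.
    """
    return [
        (term, first_weight if index == 0 else rest_weight)
        for index, term in enumerate(terms)
    ]


if __name__ == "__main__":
    for term, weight in position_weights(["Microsoft", "Keyboard", "Mouse"]):
        print(term, weight)
```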

3) Since I add my own prefixes manually, I wonder: does Xapian change all terms into lowercase automatically, or do I need to do it manually?
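
If the lowercasing does have to be done by hand, a minimal sketch could look like the following (Python for illustration; `make_term` is a hypothetical helper, and the assumption that terms are stored exactly as given is the very thing being asked about). Only the word is lowercased, so the uppercase prefix stays intact.

```python
def make_term(prefix, word):
    """Build an index term: uppercase prefix kept as-is, word lowercased."""
    return prefix + word.lower()


if __name__ == "__main__":
    print(make_term("PT", "Microsoft"))
    print(make_term("", "Keyboard"))
```

The same normalisation would have to be applied to the query terms at search time, otherwise the indexed and searched forms would not match.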

4) When I query ("search engine") and add 3 docs to the RSet, does this "RSet related to -search engine-" remain in the database, so that the next time I run the same query "search engine" the 3 docs in the RSet can be retrieved from the database? How do I do that?


Finally, once again, Xapian is very fast; thank you for the great project. I think it would be even greater if the API documentation included short usage examples. If every function had 3-5 lines of example code, we could understand the function and its usage in 5 seconds. Without examples, I spent 3-5 hours testing some functions out myself, and some I just gave up on...

Thanks
Andrey K.

