[Xapian-discuss] Test Harness for testing the xapian algorithm and alternative algorithm

Frank Chang frank_chang91 at hotmail.com
Mon Sep 2 14:01:37 BST 2013


Mr. Olly Betts,

My management team is interested in using Xapian and would like me to create a C/C++ test harness for measuring the processing speed of the Xapian algorithm. Please let me know the specifications for such a C/C++ test harness.
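
To make the request concrete, here is a rough sketch of the kind of timing harness I have in mind. It is only an assumption about what such a harness could look like, not something taken from Xapian: the helper names time_runs and median_ms are hypothetical, and the workload passed in would be whatever call is being measured (for example a search such as the enquire.get_mset(0, 10) call in the code quoted below).

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of Xapian): calls `work` `runs` times and
// returns the wall-clock duration of each call in milliseconds.
std::vector<double> time_runs(const std::function<void()>& work,
                              std::size_t runs) {
    std::vector<double> ms;
    ms.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        // steady_clock is monotonic, so system clock changes cannot skew it.
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        ms.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return ms;
}

// Hypothetical helper: median of the samples, which is less sensitive to a
// single slow run (cold cache, scheduler noise) than the mean.
// Returns 0.0 for an empty sample set.
double median_ms(std::vector<double> ms) {
    if (ms.empty()) return 0.0;
    std::sort(ms.begin(), ms.end());
    std::size_t mid = ms.size() / 2;
    return ms.size() % 2 ? ms[mid] : (ms[mid - 1] + ms[mid]) / 2.0;
}
```

The intended use would be to wrap the operation under test in a lambda, for example median_ms(time_runs([&] { /* run one query */ }, 100)), and to run the same wrapper over the alternative algorithm so that the two medians can be compared.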
 
> From: xapian-discuss-request at lists.xapian.org
> Subject: Xapian-discuss Digest, Vol 112, Issue 1
> To: xapian-discuss at lists.xapian.org
> Date: Mon, 2 Sep 2013 12:00:06 +0100
> 
> Today's Topics:
> 
>    1. having trouble with prefixes (Christopher Harvey)
>    2. Re: having trouble with prefixes (Olly Betts)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Sun, 01 Sep 2013 22:37:59 -0400
> From: Christopher Harvey <chris at basementcode.com>
> To: xapian-discuss at lists.xapian.org
> Subject: [Xapian-discuss] having trouble with prefixes
> Message-ID: <871u58osmw.fsf at basementcode.com>
> Content-Type: text/plain
> 
> I've got a small test database setup with one record.
> $ delve -r 1 -V /tmp/1/
> Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
> Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
> 
> The terms were added with lines like this:
> 	doc.add_term(string("P:") + path);
> 
> Problem is, I can't seem to run a query that returns the document using
> any of the terms. Here is the outline of the code that runs the queries
> I'm trying to run:
> 
> 	Database db(db_path.string());
> 	QueryParser queryparser;
> 	Stem stemmer("english");
> 	//queryparser.set_stemmer(stemmer);
> 	queryparser.set_database(db);
> 	queryparser.add_prefix("type", "T");
> 	queryparser.add_prefix("md5sum", "Q");
> 	queryparser.add_prefix("path", "P");
> 	queryparser.add_prefix("extension", "E");
> 	//maybe set stemming strategy here (in query parser)?
> 	queryparser.set_stemming_strategy(QueryParser::STEM_NONE);
> 	Query query(queryparser.parse_query(full_string));
> 	cout<<"Query is '"<<full_string<<"'"<<endl;
> 	Enquire enquire(db);
> 	enquire.set_query(query);
> 	MSet match_set(enquire.get_mset(0, 10));
> 	for_each(match_set.begin(), match_set.end(),
> 	         [&db](docid id) {
> 		         print_doc_info(db.get_document(id));
> 	         });
> 
> I expected the following query to work,
> md5sum:DD4F2162FFFF0E43741A4A1C2B8EC0E7
> but it returns nothing. Same for all the other terms and prefixes. Terms
> without prefixes seem to be working normally. I set stemming to NONE on
> everything.
> 
> All I want is a way to ask xapian to return a list of all documents with
> specific paths and/or md5sums.
> 
> thanks for any tips,
> Chris
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Mon, 2 Sep 2013 11:09:27 +0100
> From: Olly Betts <olly at survex.com>
> To: Christopher Harvey <chris at basementcode.com>
> Cc: xapian-discuss at lists.xapian.org
> Subject: Re: [Xapian-discuss] having trouble with prefixes
> Message-ID: <20130902100926.GG19292 at survex.com>
> Content-Type: text/plain; charset=us-ascii
> 
> On Sun, Sep 01, 2013 at 10:37:59PM -0400, Christopher Harvey wrote:
> > I've got a small test database setup with one record.
> > $ delve -r 1 -V /tmp/1/
> > Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
> > Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
> > 
> > The terms were added with lines like this:
> > 	doc.add_term(string("P:") + path);
> 
> Just add the prefix "P" here.
> 
> > Problem is, I can't seem to run a query that returns the document using
> > any of the terms. Here is the outline of the code that runs the queries
> > I'm trying to run:
> > 
> > 	Database db(db_path.string());
> > 	QueryParser queryparser;
> > 	Stem stemmer("english");
> > 	//queryparser.set_stemmer(stemmer);
> > 	queryparser.set_database(db);
> > 	queryparser.add_prefix("type", "T");
> > 	queryparser.add_prefix("md5sum", "Q");
> > 	queryparser.add_prefix("path", "P");
> 
> Or if you really want that colon in there, add the prefix as "P:" here.
> 
> > 	queryparser.add_prefix("extension", "E");
> > 	//maybe set stemming strategy here (in query parser)?
> > 	queryparser.set_stemming_strategy(QueryParser::STEM_NONE);
> > 	Query query(queryparser.parse_query(full_string));
> > 	cout<<"Query is '"<<full_string<<"'"<<endl;
> 
> If you print out query.get_description() it should be clearer what's
> going on.
> 
> Cheers,
>     Olly