[Xapian-discuss] Test Harness for testing the Xapian algorithm and alternative algorithm
Frank Chang
frank_chang91 at hotmail.com
Mon Sep 2 14:01:37 BST 2013
Mr. Olly Betts,
My management team is interested in using Xapian and would like me to create a C/C++ test harness for measuring the processing speed of the Xapian algorithm. Please let me know the specifications for such a C/C++ test harness.
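To make the request concrete, here is a minimal sketch of the kind of harness meant, assuming nothing about Xapian itself: it times an arbitrary callable with std::chrono. The names TimingResult and time_operation are hypothetical, and the callable is only a placeholder for whatever search or indexing call would actually be measured.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical result type: not part of Xapian or any existing harness.
struct TimingResult {
    std::size_t iterations;  // how many times the operation was run
    double total_ms;         // wall-clock time for all runs, in milliseconds
    double mean_ms;          // total_ms / iterations (0 when iterations == 0)
};

// Runs `op` `iterations` times and reports the elapsed wall-clock time.
// `op` stands in for the operation under test (for example, one query).
inline TimingResult time_operation(const std::function<void()>& op,
                                   std::size_t iterations) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        op();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        clock::now() - start;
    const double total = elapsed.count();
    const double mean =
        iterations ? total / static_cast<double>(iterations) : 0.0;
    return TimingResult{iterations, total, mean};
}
```

Calling time_operation with a lambda that runs one query, and then with a lambda that runs the alternative algorithm on the same input, would give two comparable mean times. Whether this matches what the Xapian developers would recommend is exactly what I am asking.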
> From: xapian-discuss-request at lists.xapian.org
> Subject: Xapian-discuss Digest, Vol 112, Issue 1
> To: xapian-discuss at lists.xapian.org
> Date: Mon, 2 Sep 2013 12:00:06 +0100
>
> Send Xapian-discuss mailing list submissions to
> xapian-discuss at lists.xapian.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.xapian.org/mailman/listinfo/xapian-discuss
> or, via email, send a message with subject or body 'help' to
> xapian-discuss-request at lists.xapian.org
>
> You can reach the person managing the list at
> xapian-discuss-owner at lists.xapian.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Xapian-discuss digest..."
>
>
> Today's Topics:
>
> 1. having trouble with prefixes (Christopher Harvey)
> 2. Re: having trouble with prefixes (Olly Betts)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 01 Sep 2013 22:37:59 -0400
> From: Christopher Harvey <chris at basementcode.com>
> To: xapian-discuss at lists.xapian.org
> Subject: [Xapian-discuss] having trouble with prefixes
> Message-ID: <871u58osmw.fsf at basementcode.com>
> Content-Type: text/plain
>
> I've got a small test database setup with one record.
> $ delve -r 1 -V /tmp/1/
> Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
> Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
>
> The terms were added with lines like this:
> doc.add_term(string("P:") + path);
>
> Problem is, I can't seem to run a query that returns the document using
> any of the terms. Here is the outline of the code that runs the queries
> I'm trying to run:
>
> Database db(db_path.string());
> QueryParser queryparser;
> Stem stemmer("english");
> //queryparser.set_stemmer(stemmer);
> queryparser.set_database(db);
> queryparser.add_prefix("type", "T");
> queryparser.add_prefix("md5sum", "Q");
> queryparser.add_prefix("path", "P");
> queryparser.add_prefix("extension", "E");
> //maybe set stemming strategy here (in query parser)?
> queryparser.set_stemming_strategy(QueryParser::STEM_NONE);
> Query query(queryparser.parse_query(full_string));
> cout<<"Query is '"<<full_string<<"'"<<endl;
> Enquire enquire(db);
> enquire.set_query(query);
> MSet match_set(enquire.get_mset(0, 10));
> for_each(match_set.begin(), match_set.end(),
> [&db](docid id) {
> print_doc_info(db.get_document(id));
> });
>
> I expected the following query to work,
> md5sum:DD4F2162FFFF0E43741A4A1C2B8EC0E7
> but it returns nothing. Same for all the other terms and prefixes. Terms
> without prefixes seem to be working normally. I set stemming to NONE on
> everything.
>
> All I want is a way to ask xapian to return a list of all documents with
> specific paths and/or md5sums.
>
> thanks for any tips,
> Chris
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 2 Sep 2013 11:09:27 +0100
> From: Olly Betts <olly at survex.com>
> To: Christopher Harvey <chris at basementcode.com>
> Cc: xapian-discuss at lists.xapian.org
> Subject: Re: [Xapian-discuss] having trouble with prefixes
> Message-ID: <20130902100926.GG19292 at survex.com>
> Content-Type: text/plain; charset=us-ascii
>
> On Sun, Sep 01, 2013 at 10:37:59PM -0400, Christopher Harvey wrote:
> > I've got a small test database setup with one record.
> > $ delve -r 1 -V /tmp/1/
> > Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
> > Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
> >
> > The terms were added with lines like this:
> > doc.add_term(string("P:") + path);
>
> Just add the prefix "P" here.
>
> > Problem is, I can't seem to run a query that returns the document using
> > any of the terms. Here is the outline of the code that runs the queries
> > I'm trying to run:
> >
> > Database db(db_path.string());
> > QueryParser queryparser;
> > Stem stemmer("english");
> > //queryparser.set_stemmer(stemmer);
> > queryparser.set_database(db);
> > queryparser.add_prefix("type", "T");
> > queryparser.add_prefix("md5sum", "Q");
> > queryparser.add_prefix("path", "P");
>
> Or if you really want that colon in there, add the prefix as "P:" here.
>
> > queryparser.add_prefix("extension", "E");
> > //maybe set stemming strategy here (in query parser)?
> > queryparser.set_stemming_strategy(QueryParser::STEM_NONE);
> > Query query(queryparser.parse_query(full_string));
> > cout<<"Query is '"<<full_string<<"'"<<endl;
>
> If you print out query.get_description() it should be clearer what's
> going on.
>
> Cheers,
> Olly
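
The mismatch Olly describes can be sketched without Xapian at all. The snippet below is a simplified model, not Xapian's implementation: it assumes a term is just the prefix string followed by the value, and it ignores anything else the query parser may do to the text. The helpers make_term and terms_match are hypothetical names introduced only for illustration.

```cpp
#include <string>

// Hypothetical helper: models only "prefix followed by value".
// It does not reproduce any other processing done by a real query parser.
inline std::string make_term(const std::string& prefix,
                             const std::string& value) {
    return prefix + value;
}

// True when the term stored at index time is the same string as the term
// that would be looked up at query time under this simplified model.
inline bool terms_match(const std::string& index_prefix,
                        const std::string& query_prefix,
                        const std::string& value) {
    return make_term(index_prefix, value) == make_term(query_prefix, value);
}
```

Under this model, terms_match("P:", "P", path) is false, which corresponds to indexing with "P:" while registering "P" with add_prefix. Either of the two fixes suggested above makes both sides agree: terms_match("P", "P", path) and terms_match("P:", "P:", path) are both true.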