[Xapian-discuss] Re: UTF-8 becomes glibberish in searches

athlon athlonf athlonkmf at yahoo.com
Sun Oct 21 19:36:27 BST 2007


This was indeed the solution. Thanks :)

----- Original Message ----
From: Andrey <alpha04 at netvigator.com>
To: xapian-discuss at lists.xapian.org
Sent: Thursday, October 18, 2007 11:31:58 PM
Subject: [Xapian-discuss] Re: UTF-8 becomes glibberish in searches


Try putting this at the top of your PHP code:

header("Content-type: text/html; charset=utf-8");
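
For context, here is a minimal sketch of why the header matters. It assumes the mbstring extension is loaded and that the browser was falling back to a single-byte charset such as ISO-8859-1; the thread does not say which charset was actually used. The script sends the charset header before any output, then shows how the UTF-8 bytes for 同事 turn into more, different characters when each byte is read on its own.

```php
<?php
// Declare the charset before any output is sent (the fix from this message).
if (!headers_sent()) {
    header("Content-type: text/html; charset=utf-8");
}

// UTF-8 bytes for the two characters in the original question (同事).
$utf8 = "\xE5\x90\x8C\xE4\xBA\x8B";

// Simulate a client that treats each byte as one ISO-8859-1 character.
// ISO-8859-1 is an assumed fallback; this call requires mbstring.
$misread = mb_convert_encoding($utf8, "UTF-8", "ISO-8859-1");

// Two characters become six once the bytes are misinterpreted.
echo mb_strlen($utf8, "UTF-8"), " characters when read as UTF-8\n";
echo mb_strlen($misread, "UTF-8"), " characters when misread byte by byte\n";
```

The stored data is the same in both cases; only the declared charset changes how the bytes are displayed.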



"athlon athlonf" <athlonkmf at yahoo.com> wrote in message 
news:549946.81147.qm at web31011.mail.mud.yahoo.com...
I'm using dbi2omega and scriptindex to index a database with Chinese characters.
Searches are done with the PHP4 bindings.

While the index file is in UTF-8, the results from the searches are gibberish.

These characters (changed to HTML encoding for this message)
同事 become something like this: a?ao?a﹐


What am I doing wrong here? Is it the indexing, or is it the searching?
 How can I check if the database is indeed in utf-8?
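
One way to approach that last question is sketched below, under stated assumptions. `is_valid_utf8` is a hypothetical helper, not part of Xapian, and it relies on the mbstring extension. The two sample strings stand in for what `$document->get_data()` might return in the real script.

```php
<?php
// Hypothetical helper: true if $bytes is a well-formed UTF-8 sequence.
// Assumes the mbstring extension is loaded.
function is_valid_utf8($bytes)
{
    return mb_check_encoding($bytes, "UTF-8");
}

// Stand-ins for document data; real input would come from get_data().
$good = "name=\xE5\x90\x8C\xE4\xBA\x8B"; // complete UTF-8 bytes for 同事
$bad  = "name=\xE5\x90";                 // truncated multi-byte sequence

var_dump(is_valid_utf8($good));
var_dump(is_valid_utf8($bad));
```

A passing check only shows that the bytes are consistent with UTF-8. For text containing non-ASCII characters that is usually strong evidence that the indexed data is fine and that the problem lies in how the output is displayed.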

I'm using a fresh install of Ubuntu and therefore a fresh version (1.0.2) of Xapian.

This is part of the code I use to get the results

// Start an enquire session.
$enquire = new XapianEnquire($database);

$query_string = $_POST['terms'];

$qp = new XapianQueryParser();
$stemmer = new XapianStem("english");
$qp->set_stemmer($stemmer);
$qp->set_database($database);
$qp->set_stemming_strategy(XapianQueryParser_STEM_SOME);
$qp->add_valuerangeprocessor( new XapianDateValueRangeProcessor(1) );
$qp->set_default_op( OP_AND );


$query = $qp->parse_query($query_string);
print "Parsed query is: " . $query->get_description(). "<br/>";

// Find the top 10 results for the query.
$enquire->set_query($query);
$enquire->set_sort_by_relevance_then_value(1,1);
$matches = $enquire->get_mset(0, 10);


// Display the results.
print $matches->get_matches_estimated()." results found:\n";
echo "<pre>";


$i = $matches->begin();
while (!$i->equals($matches->end())) {
    $n = $i->get_rank() + 1;
    $document = $i->get_document();
    $data = $document->get_data();

    foreach (split("\n", $data) as $line) {
        $nameval = split("=", $line, 2);
        $field[$nameval[0]] = $nameval[1];
    }
    print_r($field);
    echo "$n: ". $i->get_percent()." % id=:". $i->get_docid();




__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com




_______________________________________________
Xapian-discuss mailing list
Xapian-discuss at lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss






