[Xapian-discuss] Problem getting Xapian working with Burmese

emmanuel at engelhart.org emmanuel at engelhart.org
Fri Aug 21 13:44:44 BST 2009


Hi

I would like to follow up on my request.
Was my question badly formulated? Too trivial? Or maybe too complicated or unclear?
I am neither a Xapian expert nor a search engine expert, so I have no idea where to start my investigation.
Even without an answer to my question, maybe someone can give me an idea of how to better understand the issue?
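
The only check I can think of so far is to look at the failing query strings character by character. This is a minimal sketch in Python (not part of my project), and it rests on two guesses that I have not confirmed for Xapian: that the indexer decides where a word starts and ends from the Unicode general category of each character, and that very long words might be skipped. The function names and the sample string are only illustrative.

```python
import unicodedata


def describe(text):
    """Return (code point, general category, name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch),
         unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


def utf8_length(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical sample: a Burmese digit, a letter and a combining sign.
    sample = "\u1041\u1000\u103a"
    for code_point, category, name in describe(sample):
        print(code_point, category, name)
    print("UTF-8 bytes:", utf8_length(sample))
```

Running it on a query that matches and on one that does not would at least show whether the failing words contain character categories, or reach byte lengths, that the working ones do not.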

Regards
Emmanuel

 On Fri 17/07/09 19:30, "Emmanuel Engelhart" emmanuel at engelhart.org wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi,
> 
> I use Xapian in my project with multiple Latin languages and it works
> well. I have also tried it with Parsi, and it seems to work too.
> 
> But with Burmese, things are a little different. Here is what I do:
> 
> mkdir html
> cd html
> wget -O doc.html http://my.wikipedia.org
> cd ..
> omindex --db=./xapdb ./html/
> 
> To run a simple search against the database I use the following Perl script (my
> code is in C++ and it does not work either):
> 
> ===================================================================
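> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> use utf8;    # only affects literals in this source file, not @ARGV
> use Search::Xapian;
> 
> die "Usage: $0 QUERY\n" unless @ARGV;
> 
> # Open the database created by omindex.
> my $db = Search::Xapian::Database->new( './xapdb' );
> 
> # enquire() turns the raw argument into a single-term query;
> # it is not run through a query parser.
> my $enq = $db->enquire( $ARGV[0] );
> 
> printf "Running query '%s'\n",
>     $enq->get_query()->get_description();
> 
> # Fetch at most the first 10 matches.
> my @matches = $enq->matches(0, 10);
> 
> print scalar(@matches) . " results found\n";
> 
> foreach my $match ( @matches ) {
>     my $doc = $match->get_document();
>     printf "ID %d %d%% [ %s ]\n", $match->get_docid(),
>         $match->get_percent(), $doc->get_data();
> }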
> ===================================================================
> 
> ./search.pl problems
> 
> ... returns the document, because the page starts with an English sentence
> that contains this word.
> 
> ./search.pl ၁၂၆၆
> 
> ... returns a result too.
> 
> ./search.pl ဝီကီပိဒိယအကြောင်း
> ./search.pl ဗဟိုစာမျက်နှာ
> 
> ... do not work. In fact, it does not work most of the time. It seems
> to work only with Burmese words which are short and/or contain only certain
> characters.
> 
> Is that normal?
> 
> Regards
> Emmanuel
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> iEYEARECAAYFAkpgtUAACgkQn3IpJRpNWtPRRgCfZukUGfG8Eliv6SKZDXoAWnlI
> SP8Animz/5IUtSl9Ba2oV8vJLkjdLcDX
> =QjZX
> -----END PGP SIGNATURE-----
> 



