[Xapian-discuss] Working Demo for WWW Search Engine

Kevin SoftDev kevin.softdev at gmail.com
Wed Mar 1 15:41:40 GMT 2006


Olly,

It works, except that the city is spelled "praha"; "prague" is the English version:
http://nitra.net/cgi-bin/hladaj.cgi?a=q&q=praha

I fixed the content-type: I had forgotten to print the header from the Perl
script. One bug is still there: the search, which is based on the Perl demo
script that came with Xapian, works only with one term. As soon as the user
types two terms, nothing comes up. I am not sure whether this is a bug in the
Perl API or in my code.

Example with no results:
http://nitra.net/cgi-bin/hladaj.cgi?a=q&q=prahga+hrad


---- One term is passed in the Perl example script like this:
    my $enq = $db->enquire( 'Praha' );

---- Are two terms passed like this?
    my $enq = $db->enquire( 'Praha Hrad' );

I do not get any results back. I am wondering if there is another API I am
supposed to use.
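
A likely explanation is that enquire( 'Praha Hrad' ) builds a query for one
single term that contains a space, and no such term is in the index. The
sketch below splits the input into separate terms and passes them to
enquire() together with an operator, which Search::Xapian hands on to the
Query constructor. The helper names query_terms and search_terms are
hypothetical, the lowercasing is an assumption about how the terms were
indexed, and the database path is taken from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split the raw query string on whitespace into individual terms.
# Lowercasing is an assumption about how the index was built.
sub query_terms {
    my ($raw) = @_;
    return grep { length } split /\s+/, lc $raw;
}

# Hypothetical helper: run an OR query over the given terms and
# return up to 20 matches.
sub search_terms {
    my ($db_path, @terms) = @_;
    require Search::Xapian;
    my $db = Search::Xapian::Database->new($db_path);
    # An operator followed by a list of terms is combined into one query.
    my $enq = $db->enquire(Search::Xapian::OP_OR(), @terms);
    return $enq->matches(0, 20);
}

my @terms = query_terms('Praha Hrad');
print "Terms: @terms\n";

# Only touch Xapian when a database path is given as the first argument.
if (my $db_path = shift @ARGV) {
    my @matches = search_terms($db_path, @terms);
    print scalar(@matches) . " results found\n";
}
```

Replacing OP_OR with OP_AND should return only documents that contain every
term, which may be closer to what users of a web search expect.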

Thanks. Xapian was very easy to implement, and it is the fastest of all the
search engines I have built, which include MySQL 5.0, MS SQL 2005, Lucene, and
some other custom commercial ones.



#!/usr/bin/perl
#-----------------------------------------#
use Search::Xapian;

  my $db = Search::Xapian::Database->new( '/path/to/database' );

  my $enq = $db->enquire( 'Praha' );

  printf "Running query '%s'\n", $enq->get_query()->get_description();

  my @matches = $enq->matches(0, 20);

  print scalar(@matches) . " results found\n";

  foreach my $match ( @matches )
  {
      my $doc = $match->get_document();
      printf "ID %d %d%% [ %s ]\n", $match->get_docid(),
          $match->get_percent(), $doc->get_data();
  }

On 3/1/06, Olly Betts <olly at survex.com> wrote:
>
> On Tue, Feb 28, 2006 at 11:08:52PM -0800, Kevin SoftDev wrote:
> > Thanks for your help. I was able to deploy simple search engine
> > http://nitra.net in Czech and Slovak language. I still need to figure out
> > how can I get multiple terms search and paging.
>
> There seems to be a content-type bug - the results are served as
> text/plain so I get the HTML source (in Firefox at least):
>
> http://nitra.net/cgi-bin/hladaj.cgi?a=q&q=prague
>
> > I know the stemmer is in English, but do you think they will notice?
>
> It's actually probably better to not stem than to stem in a different
> language, unless the languages are very similar morphologically.  I
> doubt English and Czech or Slovak are similar enough for it to be
> beneficial, and it may be harmful.
>
> Czech and Slovak may be similar enough to each other for a Slovak
> stemmer to be beneficial on Czech text and vice versa but snowball
> doesn't include either at present.
>
> Cheers,
>    Olly
>