[Xapian-devel] Auto completion using xapian

Naveen A.N. naveen at searchlabs.org
Mon Oct 8 09:53:32 BST 2012


Thank you, James. I will try to contribute :)

On Mon, Oct 8, 2012 at 1:27 PM, James Aylett <james-xapian at tartarus.org> wrote:

> Your expectations may be confounded :-)
>
> Lucene and Xapian are sufficiently different that for many user-facing
> features you may have to tackle the problem in a totally different way.
> Additionally, auto complete can be used to do a number of different things,
> so without you saying what you're actually trying to achieve it's difficult
> to be certain how to advise you.
>
> What I thought you initially asked for (and which matches my understanding
> of using edge n-grams) was a way of expanding partial terms, presumably at
> the end of a typed search string, to complete terms. This is what
> FLAG_PARTIAL is for (although it's not always that simple, particularly
> with auto complete, as you may have to read the generated Query object to
> figure out the options for completion to display).
>
> Your explanation of edge n-grams in Lucene seems to be using words as the
> particles, which you could use to auto complete the titles for instance –
> although for that you're probably better off using a left substring search
> against a relational database, or something similar.
>
> To answer your direct questions – 1) no, although if you needed it the
> thing to do would be to write your own TermGenerator and probably
> QueryParser. 2) there are some more examples in the new documentation, but
> basically if there's something that isn't documented it's because nobody's
> gotten around to documenting it yet. The API itself is reasonably
> well-documented, so if you find somewhere that you feel should have an
> additional topic document, please feel free to contribute!
>
> James
>
> On 8 Oct 2012, at 03:08, Naveen A.N. <naveen at searchlabs.org> wrote:
>
> > Hello James,
> > The code works perfectly, but that is not the way auto completion
> > works. I come from Java Lucene, and I know that Lucene and Xapian are
> > different, but I expected them to offer the same features.
> > In Lucene we index the document using an edge n-gram analyzer, so the
> > document "xapian is search engine" is indexed as "xapian", "xapian is",
> > "xapian is search", "xapian is search engine". My Xapian code, however,
> > only gives "xapian is search engine". I am sure you are aware of how
> > auto completion works.
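The word-level expansion described above can be sketched in plain C++ without either library. This is only an illustration under stated assumptions: word_edge_ngrams is a hypothetical helper, not part of Xapian or Lucene, and it splits on whitespace only, with none of the normalisation a real analyzer would apply.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not a Xapian or Lucene API): returns every leading
// run of whole words in `text`, shortest first.
std::vector<std::string> word_edge_ngrams(const std::string &text) {
    std::vector<std::string> grams;
    std::istringstream words(text);
    std::string word, prefix;
    while (words >> word) {            // operator>> skips runs of whitespace
        if (!prefix.empty()) prefix += ' ';
        prefix += word;                // grow the prefix by one word
        grams.push_back(prefix);       // and record it as one "gram"
    }
    return grams;
}
```

Called on "xapian is search engine", this yields the four strings listed in the message. Each of them would then have to be indexed as a term of its own, which is the index-time cost that query-time expansion avoids.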
> >
> > I have the questions below.
> > 1) Does Xapian have an edge n-gram analyzer or something similar?
> > 2) Why are there only examples for basic indexing and searching, and
> > none for important features like facets, filtering and auto completion?
> >    Only snippets are available. Do you have a working example of
> > facets and filters in C++? If I missed the links, can you provide them?
> >
> > On Sun, Oct 7, 2012 at 11:34 PM, James Aylett <james-xapian at tartarus.org> wrote:
> > I'm not sure I understand. You have code that works, and you are using
> > FLAG_PARTIAL the way it is documented. Why are you not sure that what
> > you're doing is correct? Is there something about the documentation that is
> > confusing, or missing? Obviously we'd like to improve things wherever
> > possible to avoid other people being uncertain in future.
> >
> > There's nothing about the code I can see that looks wrong, although it's
> > been a while since I regularly read C++ so I might be missing something
> > obvious.
> >
> > James
> >
> > On 7 Oct 2012, at 18:16, Naveen A.N. <naveen at searchlabs.org> wrote:
> >
> > > Hello James,
> > >
> > > Thank you for your reply.
> > > I tried the code below. As you said, it matches the partial query
> > > if I remove the *, but I am not sure whether this is the correct way to
> > > generate auto completion. Can you tell me how to generate auto
> > > completion, or give me an example?
> > >
> > > #include <xapian.h>
> > > #include <iostream>
> > > #include <string>
> > > #include <cstdlib> // For exit().
> > >
> > > using namespace std;
> > >
> > > int main()
> > > try {
> > >       // Index a single document, recording spelling data as we go.
> > >       Xapian::WritableDatabase db("/home/example/xapian",
> > >                       Xapian::DB_CREATE_OR_OPEN);
> > >       Xapian::TermGenerator indexer;
> > >       Xapian::Stem stemmer("english");
> > >       indexer.set_stemmer(stemmer);
> > >       indexer.set_database(db);
> > >       indexer.set_flags(Xapian::TermGenerator::FLAG_SPELLING);
> > >       string para = "master of business administration  master in c++ ";
> > >       Xapian::Document doc;
> > >       doc.set_data(para);
> > >       indexer.set_document(doc);
> > >       indexer.index_text(para);
> > >       db.add_document(doc);
> > >       db.commit();
> > >
> > >       // Parse the query; FLAG_PARTIAL treats the final word as a prefix.
> > >       Xapian::QueryParser parser;
> > >       parser.set_database(db);
> > >       parser.set_default_op(Xapian::Query::OP_AND);
> > >       parser.set_stemmer(stemmer);
> > >       parser.set_stemming_strategy(Xapian::QueryParser::STEM_SOME);
> > >       Xapian::Query query = parser.parse_query("master",
> > >                       Xapian::QueryParser::FLAG_DEFAULT
> > >                       | Xapian::QueryParser::FLAG_SPELLING_CORRECTION
> > >                       | Xapian::QueryParser::FLAG_AUTO_SYNONYMS
> > >                       | Xapian::QueryParser::FLAG_PARTIAL);
> > >
> > >       // Run the query and print docid, weight and data for the top 10 hits.
> > >       Xapian::Enquire enquire(db);
> > >       enquire.set_query(query);
> > >       Xapian::MSet mset = enquire.get_mset(0, 10);
> > >
> > >       for (Xapian::MSetIterator i = mset.begin(); i != mset.end(); ++i) {
> > >               Xapian::Document hit = i.get_document();
> > >               string data = hit.get_data();
> > >               cout << *i << ": [" << i.get_weight() << "]\n" << data << "\n";
> > >       }
> > >       cout << flush;
> > > }
> > > catch (const Xapian::Error &e) {
> > >       cout << e.get_description() << endl;
> > >       exit(1);
> > > }
> > >
> > >
> > > On Sun, Oct 7, 2012 at 10:13 PM, James Aylett <james-xapian at tartarus.org> wrote:
> > > A query string such as "m*" is using the wildcard expansion operator –
> > > if you want to use the partial support, you don't want the * at the end of
> > > your query string.
> > >
> > > It's also not clear from your message whether you've set a database
> > > before trying to parse your query. You need to do this, because Xapian's
> > > wildcard support (which is what partial uses) is done at query time,
> > > expanding to all the possibly matching terms, rather than at index time in
> > > an n-gram analyzer style.
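The query-time expansion described here can be pictured without Xapian. What follows is a minimal sketch and not Xapian's implementation: expand_prefix is a hypothetical helper, and the std::set only stands in for the term list that the parser consults once a database has been set.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not a Xapian API): collect every term in a sorted
// term list that begins with `prefix`, the way a wildcard is expanded at
// query time instead of storing prefixes at index time.
std::vector<std::string> expand_prefix(const std::set<std::string> &terms,
                                       const std::string &prefix) {
    std::vector<std::string> matches;
    // Terms sharing a prefix are contiguous in sorted order, so start at the
    // first term >= prefix and stop at the first term that no longer matches.
    for (auto it = terms.lower_bound(prefix);
         it != terms.end() && it->compare(0, prefix.size(), prefix) == 0;
         ++it) {
        matches.push_back(*it);
    }
    return matches;
}
```

With an illustrative term list of "administration", "business", "in", "master", "masters" and "of", expand_prefix(terms, "ma") returns "master" and "masters". With no term list to consult there is nothing to expand against, which is why the parser needs a database before it parses the query.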
> > >
> > > If you still can't get this to work, try posting a complete program
> > > rather than just a snippet.
> > >
> > > James
> > >
> > >
> > > On 4 Oct 2012, at 18:20, Naveen A.N. <naveen at searchlabs.org> wrote:
> > >
> > > > Hello,
> > > >
> > > > Does Xapian have an analyzer like EdgeNGram that can be used for autocomplete?
> > > >
> > > > I am trying to implement auto completion using Xapian.
> > > > For example:
> > > > e
> > > > ex
> > > > exa
> > > > exam
> > > > example
> > > > etc.
> > > > so that typing any of these prefixes finds "example".
> > > > I tried using the partial flag, but it does not work:
> > > > Xapian::Query query = parser.parse_query("m*", parser.FLAG_PARTIAL);
> > > > Any example or tutorial would be appreciated.
> > > >
> > > > --Naveen.
> > > > _______________________________________________
> > > > Xapian-devel mailing list
> > > > Xapian-devel at lists.xapian.org
> > > > http://lists.xapian.org/mailman/listinfo/xapian-devel
> > >
> > > --
> > >  James Aylett, occasional trouble-maker
> > >  xapian.org
> > >
> > >
> >
> > --
> >  James Aylett, occasional trouble-maker
> >  xapian.org
> >
> >
>
> --
>  James Aylett, occasional trouble-maker
>  xapian.org
>
>