[Xapian-devel] Auto completion using xapian

James Aylett james-xapian at tartarus.org
Mon Oct 8 08:57:07 BST 2012


Your expectations may be confounded :-)

Lucene and Xapian are sufficiently different that for many user-facing features you may have to tackle the problem in a totally different way. Additionally, auto complete can be used to do a number of different things, so without you saying what you're actually trying to achieve it's difficult to be certain how to advise you.

What I thought you initially asked for (and which matches my understanding of using edge n-grams) was a way of expanding partial terms, presumably at the end of a typed search string, to complete terms. This is what FLAG_PARTIAL is for (although it's not always that simple, particularly with auto complete, as you may have to read the generated Query object to figure out the options for completion to display).
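
To make query-time expansion of a partial term concrete, here is a minimal sketch in plain C++ rather than Xapian code. It assumes the index's terms are already available as a sorted set; expand_partial and its parameters are hypothetical names used only for illustration, and this is not how Xapian implements FLAG_PARTIAL internally. Within Xapian itself, Database::allterms_begin(prefix) can be used to walk the terms that share a prefix.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of Xapian): return every indexed term that
// starts with `partial`, in sorted order, stopping after `max_terms`.
std::vector<std::string> expand_partial(const std::set<std::string>& terms,
                                        const std::string& partial,
                                        std::size_t max_terms = 10) {
    std::vector<std::string> completions;
    // lower_bound() jumps to the first term that is >= the partial word;
    // in sorted order, all terms sharing the prefix follow contiguously.
    for (auto it = terms.lower_bound(partial);
         it != terms.end() && completions.size() < max_terms; ++it) {
        // Stop at the first term that no longer begins with the prefix.
        if (it->compare(0, partial.size(), partial) != 0) break;
        completions.push_back(*it);
    }
    return completions;
}
```

Given the unstemmed terms "administration", "business", "master" and "of", calling expand_partial(terms, "mas") would return just "master". Nothing extra is stored at index time; the work happens when the partial word is looked up.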

Your explanation of edge n-grams in Lucene seems to be using words as the particles, which you could use to auto complete the titles for instance – although for that you're probably better off using a left substring search against a relational database, or something similar.
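
As a rough illustration of the two approaches in the paragraph above, the sketch below first generates word-level edge n-grams for a title and then does a plain left substring match over a list of titles. It is standalone C++, not Xapian or Lucene code; word_edge_ngrams and left_substring_matches are hypothetical names, and a relational database would typically express the second function as a LIKE 'typed%' query.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: build the word-level prefixes of `text`, joining
// whitespace-separated words with single spaces.
std::vector<std::string> word_edge_ngrams(const std::string& text) {
    std::istringstream words(text);
    std::vector<std::string> prefixes;
    std::string word, current;
    while (words >> word) {
        if (!current.empty()) current += ' ';
        current += word;
        prefixes.push_back(current);  // one entry per leading run of words
    }
    return prefixes;
}

// Hypothetical helper: return the titles that begin with `typed`.
std::vector<std::string> left_substring_matches(
        const std::vector<std::string>& titles, const std::string& typed) {
    std::vector<std::string> matches;
    for (const std::string& title : titles) {
        // compare() returns 0 when the first typed.size() characters of
        // the title equal the typed text.
        if (title.compare(0, typed.size(), typed) == 0) {
            matches.push_back(title);
        }
    }
    return matches;
}
```

For the title "xapian is search engine", word_edge_ngrams returns "xapian", "xapian is", "xapian is search" and "xapian is search engine". The left substring match reaches the same titles without storing any of those extra strings, which is why it tends to be the simpler option for title completion.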

To answer your direct questions – 1) no, although if you needed it the thing to do would be to write your own TermGenerator and probably QueryParser. 2) there are some more examples in the new documentation, but basically if there's something that isn't documented it's because nobody's gotten around to documenting it yet. The API itself is reasonably well-documented, so if you find somewhere that you feel should have an additional topic document, please feel free to contribute!

James

On 8 Oct 2012, at 03:08, Naveen A.N. <naveen at searchlabs.org> wrote:

> Hello James,
> The code works perfectly, but that is not how auto completion works. I come from Java Lucene, and although I know that Lucene and Xapian are different, I expected the features to be the same.
> In Lucene we index the document using an edge n-gram analyzer, so the document "xapian is search engine" is indexed as "xapian", "xapian is", "xapian is search", "xapian is search engine". My Xapian code, however, gives only "xapian is search engine". I am sure you are aware of how auto completion works.
> 
> I have the below questions.
> 1) Does Xapian have an edge n-gram analyzer or something similar?
> 2) Why are there only examples for basic indexing and searching, and not for important features like facets, filtering and auto completion?
>    Only snippets are available. Do you have a working example of facets and filters in C++? If I missed the links, can you provide them?
> 
> On Sun, Oct 7, 2012 at 11:34 PM, James Aylett <james-xapian at tartarus.org> wrote:
> I'm not sure I understand. You have code that works, and you are using FLAG_PARTIAL the way it is documented. Why are you not sure that what you're doing is correct? Is there something about the documentation that is confusing, or missing? Obviously we'd like to improve things wherever possible to avoid other people being uncertain in future.
> 
> There's nothing about the code I can see that looks wrong, although it's been a while since I regularly read C++ so I might be missing something obvious.
> 
> James
> 
> On 7 Oct 2012, at 18:16, Naveen A.N. <naveen at searchlabs.org> wrote:
> 
> > Hello James,
> >
> > Thank you for your reply,
> > I tried the code below. As you said, it matches the partial query if I remove the *, but I am not sure whether this is the correct way to generate auto completions. Can you tell me how to generate them, or give me an example?
> >
> > #include <xapian.h>
> > #include <iostream>
> > #include <string>
> > #include <cstdlib> // For exit().
> > #include <cstring>
> >
> > using namespace std;
> >
> > int main(int argc, char **argv)
> > try {
> >
> >       Xapian::WritableDatabase db("/home/example/xapian",
> >                       Xapian::DB_CREATE_OR_OPEN);
> >       Xapian::TermGenerator indexer;
> >       Xapian::Stem stemmer("english");
> >       indexer.set_stemmer(stemmer);
> >       indexer.set_database(db);
> >       indexer.set_flags(indexer.FLAG_SPELLING);
> >       string para = "master of business administration  master in c++ ";
> >       Xapian::Document doc;
> >       doc.set_data(para);
> >       indexer.set_document(doc);
> >       indexer.index_text(para);
> >       db.add_document(doc);
> >       db.commit();
> >       Xapian::QueryParser parser;
> >       parser.set_database(db);
> >       parser.set_default_op(Xapian::Query::OP_AND);
> >       parser.set_stemmer(stemmer);
> >       parser.set_stemming_strategy(Xapian::QueryParser::STEM_SOME);
> >       Xapian::Query query = parser.parse_query("master",
> >                       parser.FLAG_DEFAULT | parser.FLAG_SPELLING_CORRECTION
> >                                       | parser.FLAG_AUTO_SYNONYMS | parser.FLAG_PARTIAL);
> >       Xapian::Enquire enquire(db);
> >       enquire.set_query(query);
> >       Xapian::MSet mset = enquire.get_mset(0, 10);
> >
> >       for (Xapian::MSetIterator i = mset.begin(); i != mset.end(); i++) {
> >               Xapian::Document doc = i.get_document();
> >               string data = doc.get_data();
> >               cout << *i << ": [" << i.get_weight() << "]\n" << data << "\n";
> >       }
> >       cout << flush;
> > }
> > catch (const Xapian::Error &e) {
> >       cout << e.get_description() << endl;
> >       exit(1);
> > }
> >
> >
> > On Sun, Oct 7, 2012 at 10:13 PM, James Aylett <james-xapian at tartarus.org> wrote:
> > A query string such as "m*" is using the wildcard expansion operator – if you want to use the partial support, you don't want the * at the end of your query string.
> >
> > It's also not clear from your message whether you've set a database before trying to parse your query. You need to do this, because Xapian's wildcard support (which is what partial uses) is done at query time, expanding to all the possibly matching terms, rather than at index time in an n-gram analyzer style.
> >
> > If you still can't get this to work, try posting a complete program rather than just a snippet.
> >
> > James
> >
> >
> > On 4 Oct 2012, at 18:20, Naveen A.N. <naveen at searchlabs.org> wrote:
> >
> > > Hello,
> > >
> > > Does Xapian have an analyzer like EdgeNGram that can be used for autocomplete?
> > >
> > > I am trying to implement auto completion using Xapian.
> > > For example:
> > > e
> > > ex
> > > exa
> > > exam
> > > example
> > > etc..
> > > so that we can get it.
> > > I tried using the partial flag, but it does not work: Xapian::Query query = parser.parse_query("m*",parser.FLAG_PARTIAL);
> > > Any example or tutorial would be appreciated.
> > >
> > > --Naveen.
> > > _______________________________________________
> > > Xapian-devel mailing list
> > > Xapian-devel at lists.xapian.org
> > > http://lists.xapian.org/mailman/listinfo/xapian-devel
> >
> > --
> >  James Aylett, occasional trouble-maker
> >  xapian.org
> >
> >
> 
> --
>  James Aylett, occasional trouble-maker
>  xapian.org
> 
> 

-- 
 James Aylett, occasional trouble-maker
 xapian.org



