[Xapian-discuss] patch proposal: omindex library or daemon

Liam xapian at networkimprov.net
Mon Oct 24 19:10:53 BST 2011


On Mon, Oct 24, 2011 at 10:17 AM, James Aylett <james-xapian at tartarus.org> wrote:

> On 24 Oct 2011, at 05:10, Liam wrote:
>
> >>> void Document::set_values_and_data(const std::map<std::string,
> >>> std::string>& fields, const std::vector<std::string>& omit_fields =
> >>> std::vector<std::string>());
> >>> // omit_fields is a list of field names to omit from Document values
> >>> // might live in class MimeDocument : public Document
> >>
> >> I'm not so convinced by this; and it's certainly not something that I
> >> think is needed to make a useful library around omindex.  Given the
> >> text data from the fields, it's very easy to use TermGenerator to
> >> index the content, or to call your own routines.
> >
> > Use TermGenerator? Wouldn't the user typically call Document::set_data()?
> > Forgive my inexperience…
>
>
> Document::set_data has nothing to do with terms or values (which are used
> for searching); its typical use is as a place to store information about the
> document that you'd use after having retrieved it from an MSet. So you might
> put a sample there, or a pre-rendered HTML preview blob, or (as omega does)
> a number of pieces of information that can be used to create a preview on
> the fly.
>

Ah, ok. Looking at that part of omindex, I think you'd certainly pull that
code into a separate source file. I can see it needn't be a library
component, but those of us with custom indexing logic would start with that
code and adjust as required.
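
To make the values/data split concrete, here is a rough sketch of what a
helper along the lines of the set_values_and_data() idea quoted above might
do. It is only an illustration under several assumptions: SketchDocument is a
hypothetical stand-in rather than the real Xapian::Document, the "name=value"
line layout for the data blob is my guess at a reasonable format, and handing
out value slots in map order is arbitrary. It does not index any terms; that
would still be TermGenerator's job.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for Xapian::Document (NOT the real class):
// numbered value slots plus one opaque data blob.
struct SketchDocument {
    std::map<unsigned, std::string> values;  // slot number -> value
    std::string data;                        // read back after retrieval from an MSet
};

// Hypothetical helper modelled on the proposal quoted above.
// Every field is appended to the data blob as a "name=value" line (assumed
// layout). Fields not named in omit_fields also get a value slot; slots are
// assigned in the map's key order, starting at 0 (an arbitrary choice).
inline void set_values_and_data(
    SketchDocument& doc,
    const std::map<std::string, std::string>& fields,
    const std::vector<std::string>& omit_fields = std::vector<std::string>())
{
    unsigned slot = 0;
    for (const auto& field : fields) {
        doc.data += field.first + "=" + field.second + "\n";

        const bool omitted =
            std::find(omit_fields.begin(), omit_fields.end(), field.first)
            != omit_fields.end();
        if (!omitted) {
            doc.values[slot++] = field.second;
        }
    }
}
```

With fields "author", "sample" and "title" and omit_fields containing
"sample", all three would land in the data blob but only author and title
would be given value slots. A real version would presumably let the caller
choose the slot numbers.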

So who should initiate work on this patch? (2 new src files, one of which
generates a static lib...)

