[Xapian-discuss] patch proposal: omindex library or daemon

Liam xapian at networkimprov.net
Mon Oct 24 11:39:47 BST 2011


On Mon, Oct 24, 2011 at 1:55 AM, Richard Boulton <richard at tartarus.org> wrote:

> etc).  For more flexibility, I'd quite like a library which had a very
> simple interface something like:
>
> /** Parse the file at filepath, returning a set of data found in
> fields in the file.
>  *
>  * Should always produce a "body" field; other fields produced would
> depend on the document and the abilities of the parser.
>  *
>  * @param fields A map from fieldname to field contents, used to
> return the result of parsing.
>  */
> void parse(const std::string & filepath, std::map<std::string,
> std::string> & fields);
>
> ie, something which doesn't do anything apart from get data,
> potentially in multiple fields, from the file.
>

Yes, there we go. It also needs arguments for parse options and an
(optional) MIME type.
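
To make the shape concrete, here is a rough sketch of the extended signature.
The ParseOptions struct, its max_body_bytes member, the FieldMap typedef and
the "mimetype" field name are all made up for illustration; nothing in the
thread settles them. The body treats every file as plain text, where a real
parser would dispatch on the MIME type.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical names: FieldMap and ParseOptions are not part of any
// existing Xapian or omindex API.
typedef std::map<std::string, std::string> FieldMap;

struct ParseOptions {
    std::size_t max_body_bytes;  // 0 means "no limit" (assumed option)
    ParseOptions() : max_body_bytes(0) {}
};

// Sketch only: reads the whole file as plain text into the "body" field.
// Throws std::runtime_error if the file cannot be opened.
void parse(const std::string& filepath,
           FieldMap& fields,
           const ParseOptions& options = ParseOptions(),
           const std::string& mime_type = std::string())
{
    std::ifstream in(filepath.c_str(), std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + filepath);

    // Slurp the file contents.
    std::ostringstream contents;
    contents << in.rdbuf();
    std::string body = contents.str();

    // Apply the (assumed) size limit.
    if (options.max_body_bytes != 0 && body.size() > options.max_body_bytes)
        body.resize(options.max_body_bytes);

    fields["body"] = body;

    // Record the caller-supplied MIME type, if any, under an assumed name.
    if (!mime_type.empty())
        fields["mimetype"] = mime_type;

    // Postcondition from the doc comment: a "body" field always exists.
    assert(fields.count("body") == 1);
}
```

Putting the options and MIME type last with defaults keeps the two-argument
call from the quoted proposal working unchanged.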

There's a second routine which does the default Document ops for values &
data:

  void Document::set_values_and_data(
      const std::map<std::string, std::string>& fields,
      const std::vector<std::string>& omit_fields = std::vector<std::string>());
  // omit_fields is a list of field names to omit from Document values
  // might live in class MimeDocument : public Document
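
One possible reading of that routine, sketched against a stand-in class so it
compiles without Xapian. SketchDocument is hypothetical; only the add_value()
and set_data() method names are borrowed from Xapian::Document. The slot
numbering (sequential, in field-name order) and the name=value data layout are
assumptions, as is the choice to keep omitted fields in the data.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for Xapian::Document (hypothetical). add_value() and set_data()
// mirror the real method names; everything else is for illustration.
class SketchDocument {
public:
    void add_value(unsigned slot, const std::string& value) { values_[slot] = value; }
    void set_data(const std::string& data) { data_ = data; }

    const std::map<unsigned, std::string>& values() const { return values_; }
    const std::string& data() const { return data_; }

    void set_values_and_data(
        const std::map<std::string, std::string>& fields,
        const std::vector<std::string>& omit_fields = std::vector<std::string>());

private:
    std::map<unsigned, std::string> values_;
    std::string data_;
};

void SketchDocument::set_values_and_data(
    const std::map<std::string, std::string>& fields,
    const std::vector<std::string>& omit_fields)
{
    std::string data;
    unsigned slot = 0;
    std::map<std::string, std::string>::const_iterator it;
    for (it = fields.begin(); it != fields.end(); ++it) {
        // Assumed layout: one name=value line per field. Field contents
        // that contain newlines would need escaping in real code.
        data += it->first + "=" + it->second + "\n";

        // Fields listed in omit_fields are left out of the values only.
        bool omitted = std::find(omit_fields.begin(), omit_fields.end(),
                                 it->first) != omit_fields.end();
        if (!omitted)
            add_value(slot++, it->second);
    }
    // Sanity check: there can never be more values than fields.
    assert(slot <= fields.size());
    set_data(data);
}
```

Sequential slots shift whenever the set of fields changes, so a real version
would probably want a fixed field-to-slot mapping instead.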

