[Xapian-discuss] Grouping document paragraphs

James Aylett james-xapian at tartarus.org
Tue Jan 15 00:51:12 GMT 2008


On Tue, Jan 15, 2008 at 01:04:38AM +0100, Yannick Warnier wrote:

> > Why do you want to index the paragraphs separately? Are you going to
> > want to search for them separately in some other context?
> 
> Yes, I would like to be able to search in an article base, but only
> inside titles, for example (or only on abstract sections).

You can do that using term prefixes: for instance, the abstract might
be indexed with a prefix of XA, and the title with a prefix of
S. (omindex will do 'S' for you automatically; scriptindex gives you
complete control, or you can use the optional 'prefix' argument to
Xapian::TermGenerator::index_text() and
Xapian::TermGenerator::index_text_without_positions().)
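
To make the idea concrete, here is a toy C++ sketch of what prefixing
does to the indexed terms. It is not Xapian code: prefixed_terms() is a
hypothetical helper that only lowercases words and prepends the prefix,
whereas Xapian::TermGenerator does rather more (stemming, positions and
so on).

```cpp
#include <cctype>
#include <string>
#include <vector>

// Toy model only (hypothetical helper, not part of Xapian): split text on
// non-alphanumeric characters, lowercase each word, and prepend the
// field's prefix to form the term that would be stored in the index.
std::vector<std::string> prefixed_terms(const std::string& text,
                                        const std::string& prefix) {
    std::vector<std::string> terms;
    std::string word;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            word += static_cast<char>(std::tolower(ch));
        } else if (!word.empty()) {
            // A word boundary: emit the word collected so far.
            terms.push_back(prefix + word);
            word.clear();
        }
    }
    // Emit the final word if the text did not end on a separator.
    if (!word.empty()) terms.push_back(prefix + word);
    return terms;
}
```

With this, the title "Grouping paragraphs" and the prefix "S" give the
terms Sgrouping and Sparagraphs, which stay distinct from the same words
indexed from the abstract under XA.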

If you're using the QueryParser, you can then 'map' these prefixes to
something more readable such as 'abstract' and 'title':

----------------------------------------------------------------------
// add_prefix() takes the user-visible field name first, then the term prefix
qp->add_prefix("abstract", "XA");
qp->add_prefix("title", "S");
----------------------------------------------------------------------

before calling Xapian::QueryParser::parse_query().

If you're using omega to search, look at the $setmap{} command in
omegascript, using a map called 'prefix':

----------------------------------------------------------------------
$setmap{prefix,abstract,XA,title,S}
----------------------------------------------------------------------

(This needs to happen at the start of your omegascript template.)

Either way, you can then search on 'abstract:Xapian', or
'title:prefixes' or whatever you choose. (With the QueryParser you can
also set a default prefix as an optional argument to
Xapian::QueryParser::parse_query(), which makes it easier for most
people to construct phrase searches. I don't think there's a way of
doing this with omega right now.)
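
As a rough illustration of the query side, the following toy C++ sketch
shows how a field name and a default prefix could turn a query word into
a prefixed term. It is only a model of the behaviour described above:
to_term() and its arguments are hypothetical names, and the real
QueryParser also handles phrases, operators, stemming and more.

```cpp
#include <cctype>
#include <map>
#include <string>

// Toy model only (hypothetical helper, not part of Xapian): convert one
// query word such as "abstract:Xapian" into the prefixed term to look up.
// A word with no recognised field gets default_prefix instead.
std::string to_term(const std::string& word,
                    const std::map<std::string, std::string>& prefixes,
                    const std::string& default_prefix = "") {
    std::string prefix = default_prefix;
    std::string text = word;
    const auto colon = word.find(':');
    if (colon != std::string::npos) {
        // Only treat "field:" as a field if it was registered in the map.
        const auto it = prefixes.find(word.substr(0, colon));
        if (it != prefixes.end()) {
            prefix = it->second;
            text = word.substr(colon + 1);
        }
    }
    // Lowercase the word so it matches terms lowercased at index time.
    for (char& ch : text) {
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
    return prefix + text;
}
```

With a map of abstract to XA and title to S, 'abstract:Xapian' becomes
XAxapian; with a default prefix of S, a bare 'prefixes' becomes
Sprefixes, so the user does not have to type 'title:' at all.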

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


