[Xapian-discuss] Grouping document paragraphs
James Aylett
james-xapian at tartarus.org
Tue Jan 15 00:51:12 GMT 2008
On Tue, Jan 15, 2008 at 01:04:38AM +0100, Yannick Warnier wrote:
> > Why do you want to index the paragraphs separately? Are you going to
> > want to search for them separately in some other context?
>
> Yes, I would like to be able to search in an article base, but only
> inside titles, for example (or only on abstract sections).
You can do that using term prefixes: for instance, the abstract might be
indexed with a prefix of XA, and the title with a prefix of
S. (omindex will do 'S' for you automatically; scriptindex gives you
complete control, or you can use the optional ``prefix'' argument to
Xapian::TermGenerator::index_text() and
Xapian::TermGenerator::index_text_without_positions().)
If you're using the QueryParser, you can then 'map' this prefix to
something more readable such as 'abstract' and 'title':
----------------------------------------------------------------------
// add_prefix() takes the user-visible field name first, then the term prefix.
qp->add_prefix("abstract", "XA");
qp->add_prefix("title", "S");
----------------------------------------------------------------------
before calling Xapian::QueryParser::parse_query().
If you're using omega to search, look at the $setmap{} command in
omegascript, using a map called 'prefix':
----------------------------------------------------------------------
$setmap{prefix,abstract,XA,title,S}
----------------------------------------------------------------------
(This needs to happen at the start of your omegascript template.)
Either way, you can then search on 'abstract:Xapian', or
'title:prefixes' or whatever you choose. (With the QueryParser you can
also set a default prefix as an optional argument to
Xapian::QueryParser::parse_query(), which makes it easier for most
people to construct phrase searches. I don't think there's a way of
doing this with omega right now.)
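To make the mapping concrete, here is a rough sketch of what the field-to-prefix translation amounts to. It is a toy model only, not Xapian's code: PrefixMap and to_term() are names I've made up for illustration, and I'm assuming plain lowercasing with no stemming. The real QueryParser does a good deal more (stemming, operators, phrases, and so on).

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for the field -> term prefix map; NOT part of Xapian.
using PrefixMap = std::map<std::string, std::string>;

// Translate "field:word" into a prefixed, lowercased term.
// Input without a known field is lowercased and returned unprefixed.
std::string to_term(const PrefixMap& prefixes, const std::string& input) {
    std::string prefix;
    std::string word = input;
    const std::string::size_type colon = input.find(':');
    if (colon != std::string::npos) {
        const auto it = prefixes.find(input.substr(0, colon));
        if (it != prefixes.end()) {
            prefix = it->second;             // e.g. "XA" for "abstract"
            word = input.substr(colon + 1);  // the text after the colon
        }
    }
    // Assumption: terms are stored lowercased, with no stemming applied.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return prefix + word;
}
```

With a map of abstract to XA and title to S, to_term() turns 'abstract:Xapian' into the term XAxapian and 'title:prefixes' into Sprefixes, which are the kind of terms the prefixed indexing above would have put in the database.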
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org