[Xapian-discuss] Limitation of the terms size

James Aylett james-xapian at tartarus.org
Tue Mar 24 15:00:33 GMT 2009


On Tue, Mar 24, 2009 at 03:55:41PM +0100, David Versmisse wrote:

> A small question: I found in the source that a term is limited to 245
> characters (#define MAX_SAFE_TERM_LENGTH 245). Do you plan to change
> this limit in future versions?
> 
> If not, how can I manage very long terms? For example, we store the
> "paths" of your objects in the database. These paths can be very long:
> "I/affaires-generales/ressources-humaines/formation/Concours/Acces-au-grade-de-technicien-superieur-principal-de-l'industrie-et-des-mines/Acces-au-grade-de-technicien-superieur-principal-de-l'industrie-et-des-mines/concours-TSPIM-septembre-2007.pdf"
> And it is very practical for us to index them.

Honestly, does that need to be indexed? One solution here is to do
what we do in omega with URIs: if the term would go over the length
limit, use a reduced version of it that includes a hash of the
complete one (or of the part that was cut off).

You can still put the entire thing in the document data (however
you're managing that) so you can get it out again intact, and it
allows you to use something derived from the path if you need to find
the Xapian document using it.
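
As a rough illustration of that idea (a minimal sketch, not omega's
actual code): the function name make_path_term, the "P" prefix, the
choice of SHA-1 and the byte-based length check are all assumptions
made for this example.

```python
import hashlib

# Limit quoted in the question; treated here as a byte count (assumption).
MAX_SAFE_TERM_LENGTH = 245


def make_path_term(path, prefix="P", max_len=MAX_SAFE_TERM_LENGTH):
    """Return a term for `path` that fits within `max_len` bytes.

    Short paths are used as-is. Long paths are cut down and a hash of
    the complete path is appended, so two long paths that share a
    prefix still get different terms.
    """
    raw = (prefix + path).encode("utf-8")
    if len(raw) <= max_len:
        return raw.decode("utf-8")

    # 40 hex characters identifying the complete path.
    digest = hashlib.sha1(path.encode("utf-8")).hexdigest()

    # Keep as much of the readable start as fits in front of the hash.
    # errors="ignore" drops a multi-byte character split by the cut.
    keep = max_len - len(digest)
    head = raw[:keep].decode("utf-8", errors="ignore")
    return head + digest
```

To look a document up by path you would run the path through the same
function and search for the resulting term. The full path itself would
go into the document data (for example with Document.set_data() in the
Python bindings), and the derived term would be added with
Document.add_boolean_term().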

J

-- 
  James Aylett

  talktorex.co.uk - xapian.org - uncertaintydivision.org


