[Xapian-discuss] What are the separators that scriptindex uses?
Jim Lynch
jwl at sgi.com
Wed Sep 15 20:48:54 BST 2004
I've been asked to find out what scriptindex considers to be
separators. Whitespace, obviously. What is done with special
characters? The reason for the question is that my data contains some
strange stuff, like output from core dumps, source code for various
programming languages like assembly, part numbers (not just numbers, of
course) and other weird collections of funny characters. Fortunately no
Unicode just yet. I'm trying to get a feel for how difficult it's going
to be to search for this stuff and what the rules might be.
Also, can I assume omega uses the same set of separators?
For instance if I look for something like PARAM_DEV-445*Foggy, will it
be found? Will it be multiple terms?
BTW, how are phrase searches these days?
Thanks,
Jim.