[Xapian-discuss] problems with indexing xlsx files
Olly Betts
olly at survex.com
Mon Apr 15 07:35:47 BST 2013
On Fri, Apr 05, 2013 at 03:47:11PM -0300, Chris Purves wrote:
> I have a number of Excel .xlsx files that aren't indexed properly. To illustrate, I have a file called "this is a test.xlsx". It consists of four cells:
>
> | this |
> | is |
> | a |
> | test |
>
[...]
>
> You can see that the words are all concatenated together as if they
> are a single word. If I search for "thisisatest" it comes up, but not
> otherwise.
>
> I'm using version 1.2.3 on Debian.
The xlsx extraction code changed significantly in 1.2.11, so I think
this is quite likely to already be fixed.

Could you try a newer version, or point us at a sample file which
exhibits this problem?
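
For illustration only (this is not from the original thread and is not Omega's code): the symptom reported above is what you would see if an extractor stripped the XML tags from an .xlsx file's shared-strings part (xl/sharedStrings.xml) without putting whitespace between cells. Whether that was the actual cause in 1.2.3 is an assumption. The sample XML and the function names below are hypothetical, a minimal sketch in Python:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample mimicking xl/sharedStrings.xml for the four-cell file
# described above. Real files are stored inside the .xlsx zip archive.
SHARED_STRINGS = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"'
    ' count="4" uniqueCount="4">'
    '<si><t>this</t></si><si><t>is</t></si>'
    '<si><t>a</t></si><si><t>test</t></si>'
    '</sst>'
)

# Namespace prefix used by SpreadsheetML elements.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def strip_tags_naively(xml_text):
    """Delete every tag and keep the remaining text exactly as-is."""
    return re.sub(r"<[^>]*>", "", xml_text)


def extract_cells(xml_text):
    """Return one string per <si> entry, joining its <t> text runs."""
    root = ET.fromstring(xml_text)
    return [
        "".join(t.text or "" for t in si.iter(NS + "t"))
        for si in root.iter(NS + "si")
    ]


# Tag stripping alone runs the cells together into one "word".
naive = strip_tags_naively(SHARED_STRINGS)
print(naive)      # thisisatest

# Joining the per-cell strings with a space keeps the terms separate.
separated = " ".join(extract_cells(SHARED_STRINGS))
print(separated)  # this is a test
```

An indexer fed the first string would only match a search for "thisisatest", which is the behaviour described in the report. Fed the second, it would index the four words individually.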
Cheers,
Olly