[Xapian-discuss] Tika memory problems. Omindex restrictions?

Charles xapian at catcons.co.uk
Sat Jun 18 17:09:07 BST 2011


On 18/06/11 21:08, Olly Betts wrote:
> On Sat, Jun 18, 2011 at 08:23:28PM +0530, Charles wrote:
> [snip]
> To prevent issues with run-away filters, they're limited to the size of
> physical memory and 5 minutes of CPU time.
>
> If Tika's really using > 1GB of memory to extract files under 1MB, it
> seems that's going to be problematic on a system with 1GB of memory.
>
> What Xapian version are you using?  Older versions of Omega had a
> bug which based the limit on free memory, which on Linux excludes
> that used for caching, often leaving a very small amount of memory
> apparently free.
>
> Cheers,
>      Olly
Thanks Olly -- that was quick!  :-)

It doesn't look as if Tika is using > 1 GB of memory.  Here's the vmstat
output from running, directly at a command prompt, a Tika command that
failed when run by omindex.  The command was:

    java -jar /opt/apache/tika/apache-tika-0.8-src/tika-app/target/tika-app-0.8.jar --text <whatever>.doc

The .doc file was ~4 MB:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
  0  0      0 762788    312 103696    0    0     0     0   55   60  0  1 99  0
  0  0      0 762788    312 103696    0    0     0     0   85   98  1  0 99  0
  0  0      0 762788    312 103696    0    0     0     0   34   37  0  0 100  0
  1  1      0 755276    312 110404    0    0  6696     4  256  221  2  5 69 24
  1  0      0 744872    312 116084    0    0  5700     0  659  635  6 18 53 23
  1  0      0 734268    312 119868    0    0  3784     0  688  618 17 16 42 25
  3  0      0 723720    312 122416    0    0  2560     0  586  322 47 12 30 11
  3  0      0 708592    312 122476    0    0    40     0  689  352 81  7 12  0
  1  0      0 702956    312 126272    0    0  3840     0  620  464 20 16 41 24
  3  0      0 698692    312 127560    0    0  1300     0  601  318 80  8 12  1
  0  0      0 735476    312 130472    0    0  2856     0  525  459 35 13 51  1
  0  0      0 735476    312 130472    0    0     0     0   39   42  0  0 100  0
  0  0      0 735476    312 130472    0    0     0     0   32   36  0  0 100  0
  0  0      0 735476    312 130472    0    0     0     0   31   37  0  0 100  0
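
To put a number on the vmstat samples, here is a small Python sketch of the arithmetic. It takes the "free" column (in KB) and reports the largest drop from the idle baseline. The variable names are illustrative only, and the figure is a rough system-wide indicator of physical memory, not a per-process measurement (part of the drop is the growth in the "cache" column).

```python
# "free" column (KB) copied from the vmstat samples, in order.
free_kb = [762788, 762788, 762788, 755276, 744872, 734268, 723720,
           708592, 702956, 698692, 735476, 735476, 735476, 735476]

baseline_kb = free_kb[0]      # idle system before the command started
lowest_kb = min(free_kb)      # least free memory seen during the run
peak_drop_kb = baseline_kb - lowest_kb

print(f"peak drop in free memory: {peak_drop_kb} KB "
      f"(~{peak_drop_kb / 1024:.1f} MB)")
```

On these samples the drop is 64096 KB, roughly 63 MB, which is far below 1 GB of physical memory.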

Sorry for not giving the Xapian+Omega version; it is 1.2.5.
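
The quoted note says filters are limited to the size of physical memory and 5 minutes of CPU time. A rough way to try a command under comparable caps outside omindex is sketched below in Python. It assumes the limits behave like POSIX setrlimit with RLIMIT_AS and RLIMIT_CPU, which is an assumption about omindex and not something confirmed in this thread. The helper names run_with_limits and physical_memory_bytes are hypothetical.

```python
import os
import resource
import subprocess


def physical_memory_bytes():
    """Total physical memory, via sysconf (Linux and similar systems)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def run_with_limits(cmd, mem_bytes, cpu_seconds):
    """Run cmd with address-space and CPU-time caps; return its exit status.

    ASSUMPTION: this only imitates the kind of limit described in the
    quoted text; it is not taken from the omindex source.
    """
    def apply_limits():
        # Runs in the child process just before exec (POSIX only).
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(cmd, preexec_fn=apply_limits).returncode
```

For example, run_with_limits(cmd, physical_memory_bytes(), 300) with cmd set to the java -jar command above as an argument list. Note that RLIMIT_AS caps virtual address space, not resident memory, so a process can hit such a cap while vmstat still shows plenty of free memory.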

Best

Charles


