[Xapian-tickets] [Xapian] #664: omindex hangs on indexing 10G database

Xapian nobody at xapian.org
Wed Nov 5 19:50:35 GMT 2014


#664: omindex hangs on indexing 10G database
----------------------+---------------------------
 Reporter:  hjohanns  |             Owner:  olly
     Type:  defect    |            Status:  new
 Priority:  normal    |         Milestone:
Component:  Omega     |           Version:  1.2.16
 Severity:  normal    |        Resolution:
 Keywords:  hang      |        Blocked By:
 Blocking:            |  Operating System:  Linux
----------------------+---------------------------

Comment (by olly):

 It sounds to me like it is probably flushing changes to disk when it
 appears to hang.  Xapian batches up changes to the postlist table, and
 every 10000 (by default) documents changed it will flush them to disk.
 This takes a while, especially with a big database.  It ought not to take
 hours, but if the batched changes stop fitting in memory the machine may
 start swapping, in which case it probably could.

 I would check how much swapping and disk I/O is happening - this will
 show a reading every 5 seconds until you hit {{{Ctrl+C}}}:

 {{{
 vmstat 5
 }}}

 Look at the columns si/so (the amount of memory swapped in from and out to
 disk during the last time interval) and bi/bo (blocks read from and
 written to block devices) - I'd expect you'll see quite a lot of both.
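
 If watching the raw output is awkward, the four columns can be averaged
 over a run.  This is only a rough sketch: the helper name
 summarise_vmstat is made up for illustration, and it assumes the usual
 procps column order, in which si, so, bi and bo are fields 7 to 10.  Check
 the header line on your system before trusting the numbers.

```shell
# Average the swap (si/so) and block I/O (bi/bo) columns of vmstat output
# read from stdin.  Assumes the procps layout where these are fields 7-10.
summarise_vmstat() {
    awk '
        # Header lines start with "procs" or "r"; data lines start with a number.
        $1 !~ /^[0-9]+$/ { next }
        { si += $7; so += $8; bi += $9; bo += $10; n++ }
        END {
            if (n > 0)
                printf "samples=%d si=%.1f so=%.1f bi=%.1f bo=%.1f\n",
                       n, si / n, so / n, bi / n, bo / n
        }
    '
}

# Example: 12 samples at 5 second intervals (about a minute) while omindex
# appears to be hung.  Note the first vmstat sample is an average since boot.
# vmstat 5 12 | summarise_vmstat
```

 Consistently non-zero si/so averages would suggest the machine is
 swapping rather than just doing ordinary disk I/O.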

 You can adjust the threshold lower by setting {{{XAPIAN_FLUSH_THRESHOLD}}}
 in the environment (and exporting it so that subprocesses actually see the
 value set) - e.g. to reduce it to 1000 try:

 {{{
 export XAPIAN_FLUSH_THRESHOLD=1000
 }}}
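
 To illustrate why the export matters, here is a small sketch showing that
 a child process only sees the variable once it has been exported.  The
 omindex command line in the final comment uses placeholder paths, which
 are assumptions and need replacing with your own.

```shell
# Start from a clean slate in case the variable is already in the environment.
unset XAPIAN_FLUSH_THRESHOLD

# A plain assignment creates a shell variable that child processes do not inherit.
XAPIAN_FLUSH_THRESHOLD=1000
seen_unexported=$(sh -c 'echo "${XAPIAN_FLUSH_THRESHOLD:-unset}"')

# After export, child processes (such as omindex) see the value.
export XAPIAN_FLUSH_THRESHOLD
seen_exported=$(sh -c 'echo "${XAPIAN_FLUSH_THRESHOLD:-unset}"')

echo "before export: $seen_unexported"
echo "after export:  $seen_exported"

# Alternatively, set it for a single command only (placeholder paths):
# XAPIAN_FLUSH_THRESHOLD=1000 omindex --db /path/to/db --url / /path/to/docs
```

 The single-command form avoids leaving the lower threshold set for
 anything else run from the same shell.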

 Ideally this should auto-adjust based on the amount of memory needed to
 batch the data compared to what's available, but it doesn't currently.

--
Ticket URL: <http://trac.xapian.org/ticket/664#comment:1>
Xapian <http://xapian.org/>


