[Xapian-tickets] [Xapian] #664: omindex hangs on indexing 10G database
Xapian
nobody at xapian.org
Wed Nov 5 19:50:35 GMT 2014
#664: omindex hangs on indexing 10G database
----------------------+---------------------------
Reporter: hjohanns | Owner: olly
Type: defect | Status: new
Priority: normal | Milestone:
Component: Omega | Version: 1.2.16
Severity: normal | Resolution:
Keywords: hang | Blocked By:
Blocking: | Operating System: Linux
----------------------+---------------------------
Comment (by olly):
It sounds to me like it is probably flushing changes to disk when it
appears to hang. Xapian batches up changes to the postlist table, and
flushes them to disk every 10000 changed documents (by default).
This takes a while, especially with a big database. It ought not to take
hours, but if the batched changes stop fitting in memory the machine may
start swapping, in which case it probably could.
I would check how much swapping and disk I/O is happening - this will
show a reading every 5 seconds until you hit {{{Ctrl+C}}}:
{{{
vmstat 5
}}}
Look at the columns si/so (memory swapped in from and out to disk during
the last time interval) and bi/bo (blocks read from and written to block
devices) - I'd expect you'll see quite a lot of both.
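If the full {{{vmstat}}} table is hard to read, the four columns of interest
can be pulled out with a small filter. This is only a sketch: the function
name {{{swap_io_columns}}} is made up for illustration, and it assumes the
usual procps layout where a header row starting with {{{r b}}} names each
column.

```shell
# Print only si/so/bi/bo from vmstat-style output read on stdin.
# Column positions are looked up from the header row, not hard-coded.
swap_io_columns() {
    awk '
        # Header row: remember which field number each column name has.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data rows: first field is numeric, and a header has been seen.
        ("si" in col) && $1 ~ /^[0-9]+$/ {
            printf "si=%s so=%s bi=%s bo=%s\n", \
                $(col["si"]), $(col["so"]), $(col["bi"]), $(col["bo"])
        }
    '
}
```

It would be used as {{{vmstat 5 | swap_io_columns}}}. Values that stay well
above zero in si/so while omindex appears stuck would point to swapping.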
You can adjust the threshold lower by setting {{{XAPIAN_FLUSH_THRESHOLD}}}
in the environment (and exporting it so that subprocesses actually see the
value set) - e.g. to reduce it to 1000 try:
{{{
export XAPIAN_FLUSH_THRESHOLD=1000
}}}
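To illustrate why the export matters, here is a small sketch using a child
{{{sh}}} as a stand-in for omindex (the child shell is only a placeholder
for any subprocess):

```shell
# Start from a clean state so the demonstration is deterministic.
unset XAPIAN_FLUSH_THRESHOLD

# A plain assignment stays local to the current shell...
XAPIAN_FLUSH_THRESHOLD=1000
sh -c 'echo "child sees: ${XAPIAN_FLUSH_THRESHOLD:-unset}"'   # child sees: unset

# ...whereas an exported variable is inherited by child processes.
export XAPIAN_FLUSH_THRESHOLD
sh -c 'echo "child sees: ${XAPIAN_FLUSH_THRESHOLD:-unset}"'   # child sees: 1000
```

For a single run, prefixing the command with the assignment
({{{XAPIAN_FLUSH_THRESHOLD=1000 omindex ...}}}, with your usual omindex
arguments) places the variable in that command's environment without
changing the rest of the shell session.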
Ideally this should auto-adjust based on the amount of memory needed to
batch the data compared to what's available, but it doesn't currently.
--
Ticket URL: <http://trac.xapian.org/ticket/664#comment:1>
Xapian <http://xapian.org/>