We have a distributed setup in which commits are glacially slow on only some
of the shards (10s on a good shard, 263s on a slow one). Each shard for this
index has about 10GB of Lucene index data, and documents are assigned to
shards by an md5 hash, so the mix of document/data types should be roughly
equal across all shards. I've turned off our postcommit hooks to isolate the
problem, so it's not a snapshot run amok or anything. I also moved the indexes
over to new machines, and the same indexes that were slow in production are
also slow on the test machines.

During the slow commit, the Jetty process sits at 100% CPU / 50% RAM on an
8GB quad-core machine. The slow commit happens every time I have added at
least one document. (If I don't add any documents, the commit is immediate.)
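
To make the md5-based sharding mentioned above concrete, here is a minimal
sketch of how a document ID could be mapped to a shard and how the resulting
spread could be checked. It is an illustration only: the shard count, the
doc-N ID format, and the function names shard_for and shard_histogram are
hypothetical, and reducing the digest modulo the shard count is an assumption,
not a description of the actual routing code.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count, not taken from the post


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document ID to a shard index via its md5 digest (assumed scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    # Interpret the 128-bit digest as an integer and reduce it modulo the shard count.
    return int(digest, 16) % num_shards


def shard_histogram(doc_ids, num_shards: int = NUM_SHARDS):
    """Count how many of the given IDs land on each shard."""
    counts = Counter(shard_for(d, num_shards) for d in doc_ids)
    return [counts.get(i, 0) for i in range(num_shards)]


if __name__ == "__main__":
    # Made-up IDs; with a uniform hash the counts should come out close to even.
    print(shard_histogram(f"doc-{i}" for i in range(100000)))
```

If the counts come out close to even, document distribution is unlikely to
explain why only some shards commit slowly.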

What can I do to look into this problem?
