On 7/17/2017 6:36 AM, Joe Obernberger wrote:
> We've been indexing data on a 45 node cluster with 100 shards and 3
> replicas, but our indexing processes have been stopping due to
> errors.  On the server side the error is "Error logging add".  Stack
> trace: <snip>
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
> File
> /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
> could only be replicated to 0 nodes instead of minReplication (=1).
> There are 40 datanode(s) running and no node(s) are excluded in this
> operation.
The excerpt from your log that I preserved above shows that the root of
the problem is something going wrong when Solr writes to HDFS.  I can
only tell that there was a problem; I do not know what actually went
wrong.  I think you'll need to take this information to the Hadoop
project and ask them what could cause it and what can be done about it.

Solr includes Hadoop 2.7.2 jars.  This is not the latest version of
Hadoop, so it's possible there is a known issue in this version that
has been fixed in a later release.  There is a task to upgrade Solr's
Hadoop dependency to 3.0 when it gets released:

https://issues.apache.org/jira/browse/SOLR-9515

Thanks,
Shawn
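
P.S. One way to narrow this down is to take Solr out of the picture and
try writing a file to HDFS with a bare client.  Something along these
lines might work -- this is only a sketch, and the namenode URI, test
path, and class name below are placeholders, not anything taken from
your setup:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder namenode address -- substitute your own, or leave
        // this out if core-site.xml/hdfs-site.xml are on the classpath.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");

        // Create a small file and write a line to it, closing both the
        // stream and the filesystem handle when done.
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out =
                     fs.create(new Path("/tmp/solr-write-test"))) {
            out.writeBytes("hello from a standalone HDFS client\n");
        }
        System.out.println("Write succeeded");
    }
}

If that little program fails with the same "could only be replicated to
0 nodes instead of minReplication" message, then the problem is in the
HDFS cluster itself and has nothing to do with Solr, which should help
when you ask the Hadoop folks about it.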