Thanks for the reply, Scott - our commit settings are:

<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>
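
For completeness, those elements sit inside the <updateHandler> block of our
solrconfig.xml.  A minimal sketch of the surrounding context, assuming the
stock DirectUpdateHandler2 setup (the updateLog dir property shown is the
standard placeholder from the example config, not our actual value):

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <!-- on HDFS this resolves to the tlog directory under each core's data dir -->
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
  <autoCommit>
    <!-- hard commit every 60s: fsyncs and rotates the current tlog -->
    <maxTime>60000</maxTime>
    <!-- don't open a new searcher on hard commit -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <!-- soft commit every 15s: makes docs searchable, doesn't rotate the tlog -->
    <maxTime>15000</maxTime>
  </autoSoftCommit>
</updateHandler>

Given the 60-second hard commit we'd expect each individual tlog to stay
fairly small, which is why the size of these files surprised us.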

Is that 50% free-disk-space rule measured across the entire HDFS cluster, or
on an individual spindle?
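(For what it's worth, we've been checking usage with "hdfs dfsadmin -report",
which prints DFS Used% both for the cluster as a whole and per datanode - we
just aren't sure which of those numbers the 50% rule should be applied to.)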
Thank you!

-Joe


On Tue, Feb 2, 2016 at 12:01 PM, Scott Stults <
sstu...@opensourceconnections.com> wrote:

> It seems odd that the tlog files are so large. HDFS aside, is there a
> reason why you're not committing? Also, as far as disk space goes, if you
> dip below 50% free you run the risk that the index segments can't be
> merged.
>
>
> k/r,
> Scott
>
> On Fri, Jan 29, 2016 at 12:40 AM, Joseph Obernberger <
> joseph.obernber...@gmail.com> wrote:
>
> > Hi All - we're using Apache Solr Cloud 5.2.1 with an HDFS system that is
> > 86% full.  Some of the datanodes in the HDFS cluster are closer to being
> > full than others.  We're getting "Error adding log" messages from the
> > index process, which I *think* is related to those datanodes being full.
> > Is that the case, even though HDFS as a whole still has room available?
> >
> > Upon restart of Solr Cloud, we see messages such as:
> >
> > ERROR - 2016-01-28 22:15:27.594; [   UNCLASS] org.apache.solr.common.SolrException; Failure to open existing log file (non fatal) hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282501:org.apache.solr.common.SolrException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282501 for DFSClient_NONMAPREDUCE_1371064795_29 for client 172.16.100.218 because current leaseholder is trying to recreate file.
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
> >         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
> >         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
> >         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
> >         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> >         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> >         at org.apache.solr.update.HdfsTransactionLog.<init>(HdfsTransactionLog.java:131)
> >         at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:193)
> >         at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:136)
> >         at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)
> >         at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:99)
> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> >         at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:573)
> >         at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:635)
> >         at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:928)
> >         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:786)
> >         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:658)
> >         at org.apache.solr.core.CoreContainer.create(CoreContainer.java:637)
> >         at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:381)
> >         at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:375)
> >         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:148)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >         at java.lang.Thread.run(Thread.java:745)
> > Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282501 for DFSClient_NONMAPREDUCE_1371064795_29 for client 172.16.100.218 because current leaseholder is trying to recreate file.
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
> >         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
> >         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
> >         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
> >         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> >         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1472)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1403)
> >         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> >         at com.sun.proxy.$Proxy11.append(Unknown Source)
> >         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:313)
> >         at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >         at java.lang.reflect.Method.invoke(Method.java:606)
> >         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
> >         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> >         at com.sun.proxy.$Proxy12.append(Unknown Source)
> >         at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1842)
> >         at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1878)
> >         at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1871)
> >         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:329)
> >         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
> >         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> >         at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:325)
> >         at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1172)
> >         at org.apache.solr.update.HdfsTransactionLog.<init>(HdfsTransactionLog.java:94)
> >         ... 21 more
> >
> > INFO  - 2016-01-28 22:15:27.639; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; Recovering lease on dfs file hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933
> > INFO  - 2016-01-28 22:15:27.698; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=0 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 58ms
> > INFO  - 2016-01-28 22:15:31.821; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=1 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 4181ms
> > INFO  - 2016-01-28 22:16:32.971; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=2 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 65331ms
> > INFO  - 2016-01-28 22:17:34.638; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=3 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 126998ms
> > INFO  - 2016-01-28 22:18:35.764; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=4 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 188124ms
> > INFO  - 2016-01-28 22:19:37.114; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=5 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 249474ms
> > INFO  - 2016-01-28 22:20:38.629; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=6 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 310989ms
> > INFO  - 2016-01-28 22:21:39.751; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=7 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 372111ms
> > INFO  - 2016-01-28 22:22:40.854; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=8 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 433214ms
> > INFO  - 2016-01-28 22:23:41.981; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=9 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 494341ms
> > INFO  - 2016-01-28 22:24:43.088; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=10 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 555448ms
> > INFO  - 2016-01-28 22:25:44.808; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=11 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 617168ms
> > INFO  - 2016-01-28 22:26:45.934; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=12 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 678294ms
> > INFO  - 2016-01-28 22:27:47.036; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=13 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 739396ms
> > INFO  - 2016-01-28 22:28:48.504; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=14 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 800864ms
> >
> > Some shards in the cluster can take hours to come back up.  Any ideas?  It
> > appears to wait 900 seconds for each of the tlog files, and when there are
> > 60+ files this takes a long time!
> > Thank you!
> >
>
>
>
> --
> Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
> | 434.409.2780
> http://www.opensourceconnections.com
>
