Does this happen all the time or only when bringing up Solr on some of
the nodes?
My (naive) question is whether this message (AlreadyBeingCreatedException)
could indicate that more than one Solr instance is trying to access the same
tlog....
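
If you want to test that directly, here's a rough sketch (plain Hadoop 2.x
client API; the NameNode URI and tlog path are copied from the log below)
that asks the NameNode whether the file is still open for write, i.e.
whether some client still holds its lease:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class TlogLeaseCheck {
        public static void main(String[] args) throws Exception {
            // NameNode URI and tlog path taken from the error log below.
            FileSystem fs = FileSystem.get(
                    URI.create("hdfs://nameservice1:8020"), new Configuration());
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            Path tlog = new Path(
                    "/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282501");
            // isFileClosed() returns false while any client (live or dead)
            // still holds the write lease on the file.
            System.out.println(tlog + " closed=" + dfs.isFileClosed(tlog));
        }
    }

If it prints closed=false while no Solr node should be writing that tlog,
that would point at a stale or competing lease holder.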

Best,
Erick

On Tue, Feb 2, 2016 at 9:01 AM, Scott Stults
<sstu...@opensourceconnections.com> wrote:
> It seems odd that the tlog files are so large. HDFS aside, is there a
> reason why you're not committing? Also, as far as disk space goes, if you
> dip below 50% free space you run the risk that index segments can't be
> merged, since a merge needs room for a full new copy of the segments
> involved before the old ones are deleted.
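>
> A hard commit is what lets Solr close the current tlog and clean up old
> ones; an <autoCommit> with openSearcher=false in solrconfig.xml is the
> usual way to get that automatically. Just as a sketch (SolrJ; the ZK host
> is a placeholder, the collection name is from your logs), you can also
> force one from the client:
>
> import org.apache.solr.client.solrj.impl.CloudSolrClient;
> import org.apache.solr.common.SolrInputDocument;
>
> public class HardCommitSketch {
>     public static void main(String[] args) throws Exception {
>         // "zkhost:2181" is a placeholder for your ZooKeeper ensemble.
>         try (CloudSolrClient client = new CloudSolrClient("zkhost:2181")) {
>             client.setDefaultCollection("UNCLASS");
>             SolrInputDocument doc = new SolrInputDocument();
>             doc.addField("id", "example-1");
>             client.add(doc);
>             // waitFlush=true, waitSearcher=false: a hard commit that makes
>             // updates durable without the cost of opening a new searcher.
>             client.commit(true, false);
>         }
>     }
> }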
>
>
> k/r,
> Scott
>
> On Fri, Jan 29, 2016 at 12:40 AM, Joseph Obernberger <
> joseph.obernber...@gmail.com> wrote:
>
>> Hi All - we're using Apache Solr Cloud 5.2.1 with an HDFS system that is
>> 86% full.  Some of the datanodes in the HDFS cluster are closer to being
>> full than others.  We're getting "Error adding log" messages from the
>> index process, which I **think** is related to individual datanodes being
>> full.  Is that the case, even though HDFS as a whole still has room
>> available?
>>
>> Upon restart of Solr Cloud, we see messages such as:
>>
>> ERROR - 2016-01-28 22:15:27.594; [   UNCLASS] org.apache.solr.common.SolrException; Failure to open existing log file (non fatal) hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282501:org.apache.solr.common.SolrException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282501 for DFSClient_NONMAPREDUCE_1371064795_29 for client 172.16.100.218 because current leaseholder is trying to recreate file.
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
>>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>>         at org.apache.solr.update.HdfsTransactionLog.<init>(HdfsTransactionLog.java:131)
>>         at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:193)
>>         at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:136)
>>         at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)
>>         at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:99)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>         at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:573)
>>         at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:635)
>>         at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:928)
>>         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:786)
>>         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:658)
>>         at org.apache.solr.core.CoreContainer.create(CoreContainer.java:637)
>>         at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:381)
>>         at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:375)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:148)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282501 for DFSClient_NONMAPREDUCE_1371064795_29 for client 172.16.100.218 because current leaseholder is trying to recreate file.
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
>>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1472)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1403)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>>         at com.sun.proxy.$Proxy11.append(Unknown Source)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:313)
>>         at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>>         at com.sun.proxy.$Proxy12.append(Unknown Source)
>>         at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1842)
>>         at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1878)
>>         at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1871)
>>         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:329)
>>         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
>>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>         at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:325)
>>         at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1172)
>>         at org.apache.solr.update.HdfsTransactionLog.<init>(HdfsTransactionLog.java:94)
>>         ... 21 more
>>
>> INFO  - 2016-01-28 22:15:27.639; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; Recovering lease on dfs file hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933
>> INFO  - 2016-01-28 22:15:27.698; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=0 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 58ms
>> INFO  - 2016-01-28 22:15:31.821; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=1 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 4181ms
>> INFO  - 2016-01-28 22:16:32.971; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=2 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 65331ms
>> INFO  - 2016-01-28 22:17:34.638; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=3 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 126998ms
>> INFO  - 2016-01-28 22:18:35.764; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=4 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 188124ms
>> INFO  - 2016-01-28 22:19:37.114; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=5 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 249474ms
>> INFO  - 2016-01-28 22:20:38.629; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=6 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 310989ms
>> INFO  - 2016-01-28 22:21:39.751; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=7 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 372111ms
>> INFO  - 2016-01-28 22:22:40.854; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=8 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 433214ms
>> INFO  - 2016-01-28 22:23:41.981; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=9 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 494341ms
>> INFO  - 2016-01-28 22:24:43.088; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=10 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 555448ms
>> INFO  - 2016-01-28 22:25:44.808; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=11 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 617168ms
>> INFO  - 2016-01-28 22:26:45.934; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=12 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 678294ms
>> INFO  - 2016-01-28 22:27:47.036; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=13 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 739396ms
>> INFO  - 2016-01-28 22:28:48.504; [   UNCLASS] org.apache.solr.util.FSHDFSUtils; recoverLease=false, attempt=14 on file=hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0000000000000282933 after 800864ms
>>
>> Some shards in the cluster can take hours to come back up.  Any ideas?  It
>> appears to wait 900 seconds for each of the tlog files, and when there are
>> 60+ files that adds up: 60 x 900 s = 54,000 s, or roughly 15 hours.
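>>
>> I've been wondering whether forcing lease recovery on all the tlogs
>> before starting Solr would sidestep that wait. Something like this rough
>> sketch (plain Hadoop client API; the NameNode URI and tlog directory are
>> from the core layout above):
>>
>> import java.net.URI;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FileStatus;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.fs.Path;
>> import org.apache.hadoop.hdfs.DistributedFileSystem;
>>
>> public class RecoverTlogLeases {
>>     public static void main(String[] args) throws Exception {
>>         FileSystem fs = FileSystem.get(
>>                 URI.create("hdfs://nameservice1:8020"), new Configuration());
>>         DistributedFileSystem dfs = (DistributedFileSystem) fs;
>>         Path tlogDir = new Path("/solr5.2/UNCLASS/core_node14/data/tlog");
>>         for (FileStatus stat : dfs.listStatus(tlogDir)) {
>>             // Ask the NameNode to start lease recovery; true means the
>>             // file was already closed or recovery completed immediately.
>>             boolean done = dfs.recoverLease(stat.getPath());
>>             System.out.println(stat.getPath() + " recovered=" + done);
>>         }
>>     }
>> }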
>> Thank you!
>>
>
>
>
> --
> Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
> | 434.409.2780
> http://www.opensourceconnections.com
