NumberFormatException when accessing ZK Status page in 8.6.0

2020-07-21 Thread Henrik B A
I get the following exception when accessing the Cloud -> ZK Status page in
Solr 8.6.0:

ERROR [20200721T080439,063] qtp478489615-24 org.apache.solr.servlet.HttpSolrCall - null:java.lang.NumberFormatException: null
    at java.base/java.lang.Integer.parseInt(Integer.java:620)
    at java.base/java.lang.Integer.parseInt(Integer.java:776)
    at org.apache.solr.common.cloud.ZkDynamicConfig$Server.parseLine(ZkDynamicConfig.java:142)
    at org.apache.solr.common.cloud.ZkDynamicConfig.lambda$parseLines$0(ZkDynamicConfig.java:58)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
    at org.apache.solr.common.cloud.ZkDynamicConfig.parseLines(ZkDynamicConfig.java:53)
    at org.apache.solr.handler.admin.ZookeeperStatusHandler.handleRequestBody(ZookeeperStatusHandler.java:83)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
    at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:854)
    at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:818)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:566)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.Server.handle(Server.java:500)
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
    at java.base/java.lang.Thread.run(Thread.java:830)

My zknode's /zookeeper/config has this content:

server.1=dev-confluentzoo2014.sol-osl01.host.example.com:7795:7796:participant
server.2=dev-confluentzoo2015.sol-osl01.host.example.com:7795:7796:participant
server.3=dev-confluentzoo2016.sol-osl01.host.example.com:7795:7796:participant
version=0

This probl
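For readers trying to reproduce this: ZooKeeper's dynamic configuration format lets each server.N entry carry a ";[client address:]client port" suffix, and none of the three entries above has one. The "null" message in the trace is consistent with Integer.parseInt being handed a null string, so one plausible reading (an assumption, not something confirmed in this message) is that the parser tried to read a client port that was never captured. The Python sketch below is not Solr's code; the regular expression and the parse_server_line name are hypothetical and only illustrate treating the client port as optional.

```python
import re

# Hypothetical pattern for one "server.N=..." line of a ZooKeeper dynamic config.
# The trailing ";[client_host:]client_port" part is treated as optional.
SERVER_LINE = re.compile(
    r"^server\.(?P<id>\d+)="
    r"(?P<host>[^:;]+):(?P<quorum_port>\d+):(?P<election_port>\d+)"
    r"(?::(?P<role>participant|observer))?"
    r"(?:;(?:(?P<client_host>[^:;]+):)?(?P<client_port>\d+))?$"
)


def parse_server_line(line):
    """Parse one server line; client_port is None when the suffix is absent."""
    match = SERVER_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a server line: {line!r}")
    groups = match.groupdict()
    client_port = groups["client_port"]
    return {
        "id": int(groups["id"]),
        "host": groups["host"],
        "quorum_port": int(groups["quorum_port"]),
        "election_port": int(groups["election_port"]),
        "role": groups["role"] or "participant",
        # Convert only when a value was captured, instead of parsing a missing one.
        "client_port": int(client_port) if client_port is not None else None,
    }


if __name__ == "__main__":
    line = "server.1=dev-confluentzoo2014.sol-osl01.host.example.com:7795:7796:participant"
    print(parse_server_line(line))
```

Lines such as "version=0" do not match the pattern and raise ValueError here, so a caller would filter for the "server." prefix first.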

Unable to upgrade Lucene 6.x index using IndexUpgrader

2019-05-22 Thread Henrik B A
I'm trying to upgrade an index from Lucene 6.x to 7.x, and then to 8.x,
using IndexUpgrader [1].  But it never successfully upgrades to 7, and I
cannot figure out why.

I've also tried using CheckIndex [2] with the -exorcise option to fix the
index first, but that doesn't help.

Any ideas?  I've added console output at the end of this mail.

Cheers,
Henrik

[1] https://lucene.apache.org/solr/guide/7_7/indexupgrader-tool.html
[2]
https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/index/CheckIndex.html#main-java.lang.String:A-



user@dev-appsolr2535:/home/user> java -cp
'/opt/common/apps/apache-solr-7.7.1/server/solr-webapp/webapp/WEB-INF/lib/*'
org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose
/opt/user/solr5-prog-fhh-storage-solr/solr/prog-fhh-storage-solr_shard4_replica_n259/data/index
IFD 0 [2019-05-22T09:24:14.317Z; main]: init: current segments file is
"segments_505t";
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@29774679
IFD 0 [2019-05-22T09:24:14.336Z; main]: init: load commit "segments_505t"
IFD 0 [2019-05-22T09:24:14.346Z; main]: init: seg=_11nw3 set
nextWriteFieldInfosGen=134 vs current=1
IFD 0 [2019-05-22T09:24:14.347Z; main]: init: seg=_11nw3 set
nextWriteDocValuesGen=134 vs current=1
IFD 0 [2019-05-22T09:24:14.358Z; main]: now checkpoint
"_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, mergeMaxNumSegments=-1, os.arch=amd64,
java.runtime.version=11.0.1+13, source=merge, mergeFactor=10,
os.version=3.16.0-5-amd64,
timestamp=1558499119537}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]:delGen=133
_11o9p(7.7.1):c798:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507559781}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11o9z(7.7.1):c797:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507739341}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa0(7.7.1):C1:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
source=flush,
timestamp=1558507750723}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa1(7.7.1):C2:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
source=flush,
timestamp=1558507812631}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]"
[5 segments ; isCommit = false]
IFD 0 [2019-05-22T09:24:14.358Z; main]: 11 msec to checkpoint
IW 0 [2019-05-22T09:24:14.359Z; main]: init: create=false reader=null
IW 0 [2019-05-22T09:24:14.362Z; main]:
dir=MMapDirectory@/opt/user/solr5-prog-fhh-storage-solr/solr/prog-fhh-storage-solr_shard4_replica_n259/data/index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@14bf9759
index=_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, mergeMaxNumSegments=-1, os.arch=amd64,
java.runtime.version=11.0.1+13, source=merge, mergeFactor=10,
os.version=3.16.0-5-amd64,
timestamp=1558499119537}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]:delGen=133
_11o9p(7.7.1):c798:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507559781}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11o9z(7.7.1):c797:[diagnostics={os=Linux, java.vendor=AdoptOpenJDK,
java.version=11.0.1, java.vm.version=11.0.1+13, lucene.version=7.7.1,
mergeMaxNumSegments=-1, os.arch=amd64, java.runtime.version=11.0.1+13,
source=merge, mergeFactor=10, os.version=3.16.0-5-amd64,
timestamp=1558507739341}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa0(7.7.1):C1:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
source=flush,
timestamp=1558507750723}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}]
_11oa1(7.7.1):C2:[diagnostics={java.runtime.version=11.0.1+13,
java.vendor=AdoptOpenJDK, java.version=11.0.1, java.vm.version=11.0.1+13,
lucene.version=7.7.1, os=Linux, os.arch=amd64, os.version=3.16.0-5-amd64,
sour
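Every segment descriptor in the output above carries "(7.7.1)". As a reading aid, here is a small Python sketch that tallies which Lucene version each listed segment reports, assuming the verbose output has been captured into a string. The regular expression and the segments_by_version name are hypothetical, and the sketch only summarizes the log; it does not explain why the upgrade fails.

```python
import re

# Matches descriptors such as "_11nw3(7.7.1):C16557/1395" in the verbose output.
# The number after "/" is assumed to be the deleted-document count.
SEGMENT = re.compile(
    r"(?P<name>_[0-9a-z]+)\((?P<version>\d+(?:\.\d+)+)\):"
    r"[Cc](?P<docs>\d+)(?:/(?P<deleted>\d+))?"
)


def segments_by_version(log_text):
    """Return {version: sorted segment names}, de-duplicating repeated listings."""
    found = {}
    for match in SEGMENT.finditer(log_text):
        found.setdefault(match.group("version"), set()).add(match.group("name"))
    return {version: sorted(names) for version, names in found.items()}


if __name__ == "__main__":
    sample = (
        "_11nw3(7.7.1):C16557/1395:[diagnostics={os=Linux, lucene.version=7.7.1}]"
        ":delGen=133 _11o9p(7.7.1):c798:[diagnostics={os=Linux}]"
    )
    print(segments_by_version(sample))
```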

Recovering forever after upgrade to 8.8.0: Timeout waiting for collection state

2021-02-05 Thread Henrik B A
After upgrading our Solr Cloud collections from 8.7.0 to 8.8.0 I struggle
to get a consistent state.  We have 8 servers hosting 3 collections, with
shards/replicas spread over all the servers.

All replicas on solr3577 are in "Recovering" state, and are repeating every
five minutes: "RemoteSolrException: Error from server at
http://solr3579.foo.bar:12621/solr: Timeout waiting for collection state",
as you see here:


ERROR [20210205T090741,988] recoveryExecutor-11-thread-8-processing-n:solr3579.foo.bar:12621_solr x:foo_bar_shard22_replica_n86 c:foo_bar s:shard22 r:core_node89 org.apache.solr.cloud.RecoveryStrategy - Recovery failed - trying again... (12)
ERROR [20210205T090741,995] recoveryExecutor-11-thread-9-processing-n:solr3579.foo.bar:12621_solr x:foo_bar_shard2_replica_n6 c:foo_bar s:shard2 r:core_node9 org.apache.solr.cloud.RecoveryStrategy - Error while trying to recover. core=foo_bar_shard2_replica_n6:java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr3579.foo.bar:12621/solr: Timeout waiting for collection state.
    at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:876)
    at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:614)
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333)
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:316)
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
    at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr3579.foo.bar:12621/solr: Timeout waiting for collection state.
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.lambda$httpUriRequest$0(HttpSolrClient.java:310)
    ... 5 more
ERROR [20210205T090741,995] recoveryExecutor-11-thread-9-processing-n:solr3579.foo.bar:12621_solr x:foo_bar_shard2_replica_n6 c:foo_bar s:shard2 r:core_node9 org.apache.solr.cloud.RecoveryStrategy - Recovery failed - trying again... (12)
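
To see at a glance which cores are stuck, the "Recovery failed - trying again... (N)" records can be summarized per core. The Python sketch below assumes each log record has been joined onto a single line and that the number in parentheses is the retry counter; the regular expression and the max_retries_per_core name are hypothetical helpers, not part of Solr.

```python
import re

# Pulls the x:/c:/s:/r: logging context fields and the trailing counter
# out of a "Recovery failed" record.
RETRY = re.compile(
    r"x:(?P<core>\S+) c:(?P<collection>\S+) s:(?P<shard>\S+) r:(?P<replica>\S+) "
    r".*Recovery failed - trying again\.\.\. \((?P<attempt>\d+)\)"
)


def max_retries_per_core(lines):
    """Return {core name: highest retry counter seen} for the given log lines."""
    worst = {}
    for line in lines:
        match = RETRY.search(line)
        if match:
            core = match.group("core")
            worst[core] = max(worst.get(core, 0), int(match.group("attempt")))
    return worst


if __name__ == "__main__":
    record = (
        "ERROR [20210205T090741,988] "
        "recoveryExecutor-11-thread-8-processing-n:solr3579.foo.bar:12621_solr "
        "x:foo_bar_shard22_replica_n86 c:foo_bar s:shard22 r:core_node89 "
        "org.apache.solr.cloud.RecoveryStrategy - Recovery failed - trying again... (12)"
    )
    print(max_retries_per_core([record]))
```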


At the same time solr3579 is repeating "NotInClusterStateException: Timeout
waiting for collection state", as seen here:


ERROR [20210205T090741,994] qtp313082880-176670 org.apache.solr.servlet.HttpSolrCall - org.apache.solr.cloud.ZkController$NotInClusterStateException: Timeout waiting for collection state.
    at org.apache.solr.handler.admin.PrepRecoveryOp.execute(PrepRecoveryOp.java:163)
    at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367)
    at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:397)
    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:181)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)
    at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:836)
    at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:800)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:545)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
    at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
    at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1612)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler

Re: Recovering forever after upgrade to 8.8.0: Timeout waiting for collection state

2021-02-05 Thread Henrik B A
On Fri, Feb 5, 2021 at 10:58 AM Henrik Brautaset Aronsen wrote:

> After upgrading our Solr Cloud collections from 8.7.0 to 8.8.0 I struggle
> to get a consistent state.  We have 8 servers hosting 3 collections, with
> shards/replicas spread over all the servers.
>
> All replicas on solr3577 are in "Recovering" state, and are repeating every
> five minutes: "RemoteSolrException: Error from server at
> http://solr3579.foo.bar:12621/solr: Timeout waiting for collection state",
> as you see here:
>

It seems my timeout settings were too aggressive. Things started working
when I restarted the nodes with -Dsolr.jetty.http.idleTimeout=6
-DsocketTimeout=6 -DconnTimeout=6. Previous values were
-Dsolr.jetty.http.idleTimeout=1
-DsocketTimeout=5000 -DconnTimeout=5000.

Cheers,
Henrik