NullPointerException when select query on multiple IDs after splitting a shard
We are using SolrCloud 7.7.2 and are having some trouble with multiple shards. The collection initially had one shard (with 3 replicas), and we recently split that shard into 4 shards (with 3 replicas each) using the Collections API call:

/admin/collections?action=SPLITSHARD&collection=name&shard=shardID

Now this query, "https://hostname:8983/solr/my_collection/select?ids=169455599|1", returns an NPE, although it worked before the collection had multiple shards. Could you please give any advice on why a select query on multiple IDs doesn't work after shard splitting?

null:java.lang.NullPointerException
 at org.apache.solr.handler.component.QueryComponent.unmarshalSortValues(QueryComponent.java:1034)
 at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:885)
 at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:585)
 at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:564)
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:426)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2551)
 at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
 at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
 at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
 at org.eclipse.jetty.server.Server.handle(Server.java:502)
 at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
 at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
 at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
 at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
 at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)
 at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:305)
 at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)
 at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
 at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
 at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
 at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
 at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
 at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
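For anyone reproducing this: a quick way to confirm what the collection looks like after the SPLITSHARD is the Collections API CLUSTERSTATUS action (the host and collection name below are the placeholders from the question, not real ones). The stack trace above is thrown in the distributed-merge path (QueryComponent.mergeIds), which, as far as I can tell, is only exercised once the request fans out to more than one shard, so it is worth confirming that the parent shard shows as inactive and the new sub-shards as active:

$ curl -k "https://hostname:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=my_collection&wt=json"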
Re: NullPointerException when select query on multiple IDs after splitting a shard
Yes, we are sure that this is not a typo. We actually ran more experiments and found the following:

1) https://hostname:8983/solr/my_collection/select?ids=169455599|1
2) https://hostname:8983/solr/my_collection/select?q=id:169455599|1
3) https://hostname:8983/solr/my_collection/get?ids=169455599|1

1) throws the NPE, but 2) and 3) work. However, 1) also works against the single-shard copy of the same collection that I backed up before splitting the shard. It looks very weird, and we are wondering whether we are missing some configuration. Or is 1) a kind of syntax that Solr discourages?
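For what it's worth, my understanding (not authoritative) is that the ids parameter on /select is primarily an internal parameter used by the coordinating node during the distributed stages visible in the stack trace (mergeIds/unmarshalSortValues), which would explain why it only misbehaves once the collection has more than one shard. The documented ways to fetch several documents by ID are the Real-Time Get handler and an ordinary query on the id field; a sketch against the placeholder host/collection from this thread (the second ID is made up purely for illustration):

$ # Real-Time Get: ids takes a comma-separated list of uniqueKey values
$ curl -k "https://hostname:8983/solr/my_collection/get?ids=169455599|1,169455600|1"
$ # Regular /select query; quote the values since they contain a '|'
$ curl -k "https://hostname:8983/solr/my_collection/select" --data-urlencode 'q=id:"169455599|1" OR id:"169455600|1"'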
RE: tlogs are not deleted
Thank you for the advice. I will disable it and check whether the tlogs disappear.
CDCR CPU usage 100% with some errors
* Solr version 7.7, SolrCloud with CDCR
* 3 replicas, 1 shard on both production and disaster recovery

Hi,

Last week I posted a question about tlogs - https://lucene.472066.n3.nabble.com/tlogs-are-not-deleted-td4451323.html#a4451430. I disabled the buffer based on the advice there, but the tlogs on "production" are still not being deleted (the tlogs on the "disaster recovery" nodes are cleaned up).

There is another issue, which I suspect is related to the problem I posted previously: I am getting a huge volume of logs on our "disaster recovery" nodes. The log files build up at an incredibly fast rate with the messages below, and CPU usage sits at 100% all day (the "production" nodes' CPU usage is normal). It looks as if replication from production to disaster recovery is running, but it never ends. Is this high CPU usage on the disaster recovery nodes normal? And could the tlogs that are not being cleaned up on the production nodes be related to the high CPU usage on the DR nodes?

These are sample messages from the flood of logs on the disaster recovery nodes:

2019-10-28 18:25:09.817 INFO (qtp404214852-90778) [c:test_collection s:shard1 r:core_node3 x:test_collection_shard1_replica_n1] o.a.s.c.S.Request [test_collection1_shard1_replica_n1] webapp=/solr path=/cdcr params={action=LASTPROCESSEDVERSION&wt=javabin&version=2} status=0 QTime=0
2019-10-28 18:25:09.817 INFO (qtp404214852-90778) [c:test_collection s:shard1 r:core_node3 x:test_collection_shard1_replica_n1] o.a.s.c.S.Request [test_collection2_shard1_replica_n1] webapp=/solr path=/cdcr params={action=LASTPROCESSEDVERSION&wt=javabin&version=2} status=0 QTime=0
2019-10-28 18:25:09.817 INFO (qtp404214852-90778) [c:test_collection s:shard1 r:core_node3 x:test_collection_shard1_replica_n1] o.a.s.c.S.Request [test_collection3_shard1_replica_n1] webapp=/solr path=/cdcr params={action=LASTPROCESSEDVERSION&wt=javabin&version=2} status=0 QTime=0
2019-10-28 18:18:11.729 INFO (cdcr-replicator-378-thread-1) [ ] o.a.s.h.CdcrReplicator Forwarded 0 updates to target test_collection1
2019-10-28 18:18:11.730 INFO (cdcr-replicator-282-thread-1) [ ] o.a.s.h.CdcrReplicator Forwarded 0 updates to target test_collection2
2019-10-28 18:18:11.730 INFO (cdcr-replicator-332-thread-1) [ ] o.a.s.h.CdcrReplicator Forwarded 0 updates to target test_collection3
...

And in the middle of the logs, I see the following exception for some of the collections:

2019-10-28 18:18:11.732 WARN (cdcr-replicator-404-thread-1) [ ] o.a.s.h.CdcrReplicator Failed to forward update request to target: collection_steps
java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.List
 at org.apache.solr.update.CdcrUpdateLog$CdcrLogReader.getVersion(CdcrUpdateLog.java:732) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
 at org.apache.solr.update.CdcrUpdateLog$CdcrLogReader.next(CdcrUpdateLog.java:635) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
 at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:77) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
 at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
 at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) ~[solr-solrj-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:50]
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
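For anyone hitting the same symptom, these are the standard CDCR monitoring calls one could run on the source (production) leader to see whether the replicator is stuck on a particular tlog entry; host, port and core name below are placeholders, not the real ones from this cluster:

$ # Queue depth plus tlog count/size as seen by CDCR on the source leader
$ curl -k "https://prod-host:8983/solr/test_collection_shard1_replica_n1/cdcr?action=QUEUES"
$ # Per-target error counters (consecutiveErrors, bad_request, internal)
$ curl -k "https://prod-host:8983/solr/test_collection_shard1_replica_n1/cdcr?action=ERRORS"

If the ClassCastException above keeps recurring for a target, the corresponding queue would be expected to stop draining, which would match tlogs piling up on the source.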
Re: CDCR CPU usage 100% with some errors
I just saw this issue: https://issues.apache.org/jira/browse/SOLR-13349 Could my issue be related to it?
Three questions about huge tlog problem and CDCR
* Environment: SolrCloud 7.7.0, 3 nodes / CDCR bidirectional / CDCR buffer disabled

Hello all,

I have a problem with tlogs: they just keep getting bigger. They don't seem to be deleted at all, even after a hard commit, and the total size of the tlog files is now more than 21GB. I actually see multiple tlog folders, like:

2.5GB tlog/
6.7GB tlog.20190815170021077/
6.7GB tlog.20190316225613751/
...

Are they all necessary for recovery? What are the tlog.2019... folders? Based on my understanding, tlog files are there for recovery when a graceful shutdown fails.

1) As long as I stop all the nodes gracefully, is it safe to delete the tlog files manually with rm -rf ./tlogs?

2) I think the reason the tlog files are not deleted is that CDCR is not working properly, so the tlogs just sit there waiting to be synchronized; the synchronization never happens, and the tlogs keep growing. Does my theory make sense?

3) We set the replicator element's schedule to 1 hour and the updateLogSynchronizer element to 1 hour as well. Could this be the reason CDCR is not working - because the interval is too long?
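For reference, one way to test the theory in question 2 - that updates are never actually forwarded - is to watch the CDCR queue and the last processed version on a source leader. This is only a sketch; host, port and core name are placeholders:

$ # If queueSize keeps growing and lastTimestamp never advances, nothing is being forwarded
$ curl -k "https://source-host:8983/solr/<core_name>/cdcr?action=QUEUES"
$ # A value of -1 here means CDCR has not processed any update yet
$ curl -k "https://source-host:8983/solr/<core_name>/cdcr?action=LASTPROCESSEDVERSION"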
tlogs are not purged when CDCR is enabled
Using Solr 7.7.3-snapshot, 1 shard + 3 replicas on both the source and the target cluster.

With unidirectional CDCR enabled and the buffer disabled, my understanding is that once data has been successfully forwarded to the target and committed, the tlogs on both the source and the target should be purged. However, the source node never purges its tlogs no matter what I try (including manual commits), while the tlogs on the target are purged. (If I turn CDCR off and import data, the tlogs are cleaned up nicely.)

So I ran some queries against the CDCR API, and there are no errors: the queue size is 0, and the last processed version is not -1 either. I also double-checked that the CDCR buffer is disabled on both source and target, and CDCR (unidirectional) data replication is working fine, apart from the fact that the tlogs keep growing. What am I missing, and what else should I check next?

$ curl -k https://localhost:8983/solr/tbh_manuals_uni_shard1_replica_n2/cdcr?action=QUEUES
{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "queues":[
    "host1:8981,host2:8981,host3:8981/solr",[
      "tbh_manuals_uni",[
        "queueSize",0,
        "lastTimestamp","2020-01-08T23:16:26.899Z"]]],
  "tlogTotalSize":503573,
  "tlogTotalCount":278,
  "updateLogSynchronizer":"stopped"}

$ curl -k https://localhost:8983/solr/tbh_manuals_uni_shard1_replica_n2/cdcr?action=ERRORS
{
  "responseHeader":{
    "status":0,
    "QTime":1},
  "errors":[
    "host1:8981,host2:8981,host3:8981/solr",[
      "tbh_manuals_uni",[
        "consecutiveErrors",0,
        "bad_request",0,
        "internal",0,
        "last",[}

$ curl -k https://localhost:8983/solr/tbh_manuals_uni_shard1_replica_n2/cdcr?action=LASTPROCESSEDVERSION
{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "lastProcessedVersion":1655203836093005824}

I also see some errors in the zookeeper.out file, but only on the target's leader node, as follows. Honestly, though, I don't know what they mean.

2020-01-08 15:11:42,740 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590008 type:create cxid:0xd2 zxid:0x300b4 txntype:-1 reqpath:n/a Error Path:/solr/collections Error:KeeperErrorCode = NodeExists for /solr/collections
2020-01-08 15:11:42,742 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590008 type:create cxid:0xd3 zxid:0x300b5 txntype:-1 reqpath:n/a Error Path:/solr/collections/tbh_manuals_uni Error:KeeperErrorCode = NodeExists for /solr/collections/tbh_manuals_uni
2020-01-08 15:11:42,744 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590008 type:create cxid:0xd4 zxid:0x300b6 txntype:-1 reqpath:n/a Error Path:/solr/collections/tbh_manuals_uni/terms Error:KeeperErrorCode = NodeExists for /solr/collections/tbh_manuals_uni/terms
2020-01-08 15:11:42,745 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590008 type:create cxid:0xd5 zxid:0x300b7 txntype:-1 reqpath:n/a Error Path:/solr/collections/tbh_manuals_uni/terms/shard1 Error:KeeperErrorCode = NodeExists for /solr/collections/tbh_manuals_uni/terms/shard1
2020-01-08 15:11:42,821 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590005 type:create cxid:0x23c zxid:0x300ba txntype:-1 reqpath:n/a Error Path:/solr/collections Error:KeeperErrorCode = NodeExists for /solr/collections
2020-01-08 15:11:42,823 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590005 type:create cxid:0x23d zxid:0x300bb txntype:-1 reqpath:n/a Error Path:/solr/collections/tbh_manuals_uni Error:KeeperErrorCode = NodeExists for /solr/collections/tbh_manuals_uni
2020-01-08 15:11:42,825 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590005 type:create cxid:0x23e zxid:0x300bc txntype:-1 reqpath:n/a Error Path:/solr/collections/tbh_manuals_uni/terms Error:KeeperErrorCode = NodeExists for /solr/collections/tbh_manuals_uni/terms
2020-01-08 15:11:42,827 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590005 type:create cxid:0x23f zxid:0x300bd txntype:-1 reqpath:n/a Error Path:/solr/collections/tbh_manuals_uni/terms/shard1 Error:KeeperErrorCode = NodeExists for /solr/collections/tbh_manuals_uni/terms/shard1
2020-01-08 15:11:45,185 [myid:2] - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x301d2ecaf590005 type:setData cxid:0x274 zxid:0x300ce txntype:-1 reqpath:n/a Error Pa
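For completeness, the kind of manual commit the post mentions, followed by a re-check of the CDCR counters, would look roughly like this (collection and core names are taken from the curl calls above; commit and openSearcher are standard update parameters):

$ # Force an explicit hard commit on the source collection
$ curl -k "https://localhost:8983/solr/tbh_manuals_uni/update?commit=true&openSearcher=false"
$ # Then re-check whether tlogTotalCount / tlogTotalSize move on the leader core
$ curl -k "https://localhost:8983/solr/tbh_manuals_uni_shard1_replica_n2/cdcr?action=QUEUES"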
Re: Tlogs are not purged when CDCR is enabled
Another finding: no matter how I try to disable the buffer on the target node through the configuration (the buffer's defaultState set to "disabled" in the CDCR request handler), it always comes up enabled the first time. Only once I call the CDCR API to disable the buffer does it actually become disabled. I wonder whether https://issues.apache.org/jira/browse/SOLR-11652 is related to this issue. How can I make the default state of the buffer disabled, if this setup doesn't work?
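For reference, the API call that does take effect, plus a way to verify the buffer state afterwards, would look roughly like this (host and collection name are placeholders; DISABLEBUFFER and STATUS are standard CDCR API actions):

$ # Disable the CDCR update-log buffer on the target collection
$ curl -k "https://target-host:8983/solr/my_target_collection/cdcr?action=DISABLEBUFFER"
$ # Verify: STATUS reports the replicator process state and whether the buffer is enabled or disabled
$ curl -k "https://target-host:8983/solr/my_target_collection/cdcr?action=STATUS"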
How to force delete tlog?
Hi,

I have now finally given up on getting the tlogs deleted (disabling the CDCR buffer, restarting the source node - neither worked), and I am writing a cron script to delete the tlogs instead. However, I discovered that even if I delete the tlog files manually, the QUEUES API still returns a positive tlogTotalCount (although tlogTotalSize is zero). I guess Solr keeps the tlog count in memory and tries to recover the tlog files when it reloads collections, which brings back a bunch of zero-size tlogs.

Is there any way (hopefully a REST API) to delete the tlogs from disk as well as from Solr's memory?

Thanks.
Louis
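Since the post describes a cron-style cleanup, here is a rough sketch of what that maintenance step could look like, with the strong caveat that removing tlogs out from under a running Solr is unsafe, so this assumes the node is stopped gracefully first (as discussed in the earlier tlog threads). The paths, port and core name are placeholders:

# Manual tlog cleanup for one core - only while the Solr node is stopped
SOLR_DATA=/var/solr/data                       # placeholder Solr data directory
CORE=tbh_manuals_uni_shard1_replica_n2         # placeholder core name

bin/solr stop -p 8983                          # stop the node gracefully first
rm -rf "$SOLR_DATA/$CORE"/data/tlog*           # removes tlog/ and any tlog.<timestamp>/ folders
bin/solr start -cloud -p 8983                  # restart with your usual options (e.g. -z for ZooKeeper)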
Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6
Dear all,

We have started upgrading a huge SolrCloud cluster from 5.4.1 to the latest version, 8.5.1.

Context:
. Ubuntu 16.04, 64-bit, JVM Oracle 8u101 and now OpenJDK 8u252
. We can't reindex the documents because the old source data doesn't exist anymore, so we have no choice other than upgrading the indexes.

Our upgrade strategy is based on the IndexUpgrader tool:
5.4.1 -> 5.5.5 : OK
5.5.5 -> 6.6.6 : OK
6.6.6 -> 7.7.3 : OK
7.7.3 -> 8.5.1 : fails. Here is my problem when using the 8.5.1 IndexUpgrader:

Exception in thread "main" org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/data2/solr/nodes/node1/solr/insight_dw_shard3_replica_n69/data/index/segments_2nz0"))): This index was initially created with Lucene 6.x while the current version is 8.5.1 and Lucene only supports reading the current and previous major versions.. This version of Lucene only supports indexes created with release 7.0 and later.
 at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:318)
 at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
 at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:432)
 at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:429)
 at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:680)
 at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:632)
 at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:434)
 at org.apache.lucene.index.DirectoryReader.listCommits(DirectoryReader.java:285)
 at org.apache.lucene.index.IndexUpgrader.upgrade(IndexUpgrader.java:158)
 at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:78)

But when I check the index with the 7.7.3 tools, the segment appears to be 7.7.3!

0.00% total deletions; 50756501 documents; 0 deletions
Segments file=segments_2nz0 numSegments=1 version=7.7.3 id=ay2stfke7hwy9gippl8k77tdd userData={commitTimeMSec=1589314850951}
1 of 1: name=_2rr9t maxDoc=50756501
  version=7.7.3
  id=9pubpiwgt38rzyxr7litvgcu5
  codec=Lucene70
  compound=false
  numFiles=10
  size (MB)=338,143.905
  diagnostics = {os=Linux, java.vendor=Oracle Corporation, java.version=1.8.0_101, java.vm.version=25.101-b13, lucene.version=7.7.3, mergeMaxNumSegments=1, os.arch=amd64, java.runtime.version=1.8.0_101-b13, source=merge, mergeFactor=2, os.version=3.13.0-147-generic, timestamp=1589484981711}
  no deletions
  test: open reader.OK [took 2.779 sec]

Reading the various threads, some people say that once a segment is "marked as a Lucene 6 index", that mark persists across upgrades, so we are stuck at version 7.7.3. What are my options?

Many, many thanks for your help,
Jean-Louis Vila, PhD, Sword SAS
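For readers following the same path: the chain described above is driven by Lucene's IndexUpgrader from each target release's jars, and the segment listing quoted above resembles CheckIndex output. A rough sketch of the commands (jar and index paths are placeholders; the backward-codecs jar is needed to read the previous major version's segments, and the 5.4.1 -> 5.5.5 step is analogous):

# One hop per major version, always using the *target* release's jars
java -cp lucene-core-6.6.6.jar:lucene-backward-codecs-6.6.6.jar org.apache.lucene.index.IndexUpgrader -verbose /path/to/core/data/index
java -cp lucene-core-7.7.3.jar:lucene-backward-codecs-7.7.3.jar org.apache.lucene.index.IndexUpgrader -verbose /path/to/core/data/index
# The 7.7.3 -> 8.5.1 hop is the one that fails with IndexFormatTooOldException
java -cp lucene-core-8.5.1.jar:lucene-backward-codecs-8.5.1.jar org.apache.lucene.index.IndexUpgrader -verbose /path/to/core/data/index

# Inspect segment metadata with the 7.7.3 jars (this is the kind of version=7.7.3 listing shown above)
java -cp lucene-core-7.7.3.jar org.apache.lucene.index.CheckIndex /path/to/core/data/index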
RE: Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6
Many thanks for your answers, Erick.

Indeed, I have read in many different threads that this migration path is not guaranteed, but what is strange is that there is no formal statement of this impossibility, because clearly we can't migrate to v8 if the indexes are not "pure" v7 indexes. I understand the reason (y = f(x)), but at least some documentation stating that Lucene 6 segments cannot be upgraded to Lucene 8 would be appreciated.

Moreover, the check tool just shows a v7.7.3 index, with no mention of the "real" segment version, which is v6! So refusing to open v7 Lucene indexes that were upgraded from v6 is quite brutal, and the rule that we can always migrate from the previous major version is not completely true :-( I'll stay on 7.7.3.

Thanks again,
Jean-Louis

-----Original Message-----
From: Erick Erickson
Sent: Tuesday, 19 May 2020 15:00
To: solr-user@lucene.apache.org
Subject: Re: Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6

This will not work. Lucene has never promised this upgrade path would work; the "one major version back-compat" means that Lucene X has special handling for X-1, but for X-2, all bets are off. Starting with Solr 6, a marker is written into the segments recording the version of Lucene the segment was written with. That marker is preserved through all merges/upgrades/whatever.

Starting with Lucene 8, if any segment has a marker for Lucene 6 (or no marker at all for earlier versions), then Lucene will refuse to open the index.

IndexUpgraderTool and the like simply cannot synthesize the new index format; the most succinct explanation I've seen is from Robert Muir:

"I think the key issue here is Lucene is an index not a database. Because it is a lossy index and does not retain all of the user's data, its not possible to safely migrate some things automagically. In the norms case IndexWriter needs to re-analyze the text ("re-index") and compute stats to get back the value, so it can be re-encoded. The function is y = f(x) and if x is not available its not possible, so lucene can't do it."

So you'll have to re-index your corpus with Solr 8, I'm afraid.

Best,
Erick

> On May 19, 2020, at 4:19 AM, VILA Jean-Louis wrote:
>
> Dear all,
>
> We start to upgrade a huge SolrCloud cluster from 5.4.1 to lastest version 8.5.1.
> Context :
> . Ubuntu 16.04, 64b, JVM Oracle 8 101 and now OpenJDK 8 252
> . We can't reindex documents because old ones doesn't exist anymore, so no other choices than upgrading indexes.
>
> Our upgrading strategy is based on indexUpgrader Tool.
> 5.4.1 -> 5.5.5 : Ok
> 5.5.5 -> 6.6.6 : Ok
> 6.6.6 -> 7.7.3 : ok
> Unable to upgrade 7.7.3 to 8.5.1 : here my problem using 8.5.1, indexUpgrader :
>
> Exception in thread "main" org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/data2/solr/nodes/node1/solr/insight_dw_shard3_replica_n69/data/index/segments_2nz0"))): This index was initially created with Lucene 6.x while the current version is 8.5.1 and Lucene only supports reading the current and previous major versions.. This version of Lucene only supports indexes created with release 7.0 and later.
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:318)
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:432)
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:429)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:680)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:632)
> at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:434)
> at org.apache.lucene.index.DirectoryReader.listCommits(DirectoryReader.java:285)
> at org.apache.lucene.index.IndexUpgrader.upgrade(IndexUpgrader.java:158)
> at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:78)
>
> But when I check the index version with 7.7.3, the segment seems to be 7.7.3!
> 0.00% total deletions; 50756501 documents; 0 deleteions
> Segments file=segments_2nz0 numSegments=1 version=7.7.3 id=ay2stfke7hwy9gippl8k77tdd userData={commitTimeMSec=1589314850951}
> 1 of 1: name=_2rr9t maxDoc=50756501
>   version=7.7.3
>   id=9pubpiwgt38rzyxr7litvgcu5
>   codec=Lucene70
>   compound=false
>   numFiles=10
>   size (MB)=338,143.905
>   diagnostics = {os=Linux, java.vendor=Oracle Corporat
RE: Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6
Thanks Walter, but I can't imagine that will work, because if it could, then the IndexUpgrader would also work, and it doesn't ☹ Because of the format, a v6 index can't be rewritten, whatever process you use (add replica, optimize, etc.). The only option left for me is a full reindex: 260,000,000 docs / 3TB of indexes, plus specific preprocessing - it will take a very, very long time.

-----Original Message-----
From: Walter Underwood
Sent: Tuesday, 19 May 2020 17:43
To: solr-user@lucene.apache.org
Subject: Re: Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6

Hmm, might be able to hack this with optimize (forced merge).

First, you would have to add enough extra documents to force a rewrite of all segments. That might be as many documents as are already in the index. You could set a "fake:true" field and filter them out with an fq. Or make sure they have no searchable text.

After adding all those, run optimize. This should rewrite all the segments in the new format.

Finally, delete all the extra documents. Might want to do another optimize after that.

No guarantee that this desperate hack will work.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On May 19, 2020, at 6:21 AM, VILA Jean-Louis wrote:
>
> Many thanks for your answers Erik.
>
> Effectively, I've read this into many different threads that the migration path will not be guaranteed but, what's strange is that there's no formal information on this impossibility because clearly we can't migrate to v8 if indexes are not "pure" v7 indexes. I understand reason (y =f(x)) but al least a simple documentation about the fact that a Lucene 6 segments can't be upgrade into Lucene 8 would be appreciate.
>
> More, the check tool just shows v7.7.3 index and there is no mention about "real" segment version which v6! So forbid to open v7 lucene indexes upgraded from v6, is quiet brutal and the rule about that we can migrate only from previous major version is not completely true :-( I'll stay into v7.7.3
>
> Thanks again,
> Jean-Louis
>
> -----Original Message-----
> From: Erick Erickson
> Sent: Tuesday, 19 May 2020 15:00
> To: solr-user@lucene.apache.org
> Subject: Re: Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6
>
> This will not work. Lucene has never promised this upgrade path would work, the "one major version back-compat" means that Lucene X has special handling for X-1, but for X-2, all bets are off. Starting with Solr 6, a marker is written into the segments recording the version of Lucene the segment was written with. That marker is preserved through all merges/upgrades/whatever.
>
> Starting with Lucene 8, if any segment has a marker for Lucene 6 (or no marker at all for earlier versions), then Lucene will refuse to open the index.
>
> IndexUpgraderTool and the like simply cannot synthesize the new index format, the most succinct explanation I've seen is from Robert Muir:
>
> "I think the key issue here is Lucene is an index not a database. Because it is a lossy index and does not retain all of the user's data, its not possible to safely migrate some things automagically. In the norms case IndexWriter needs to re-analyze the text ("re-index") and compute stats to get back the value, so it can be re-encoded. The function is y = f(x) and if x is not available its not possible, so lucene can't do it."
>
> So you'll have to re-index your corpus with Solr 8 I'm afraid.
>
> Best,
> Erick
>
>> On May 19, 2020, at 4:19 AM, VILA Jean-Louis wrote:
>>
>> Dear all,
>>
>> We start to upgrade a huge SolrCloud cluster from 5.4.1 to lastest version 8.5.1.
>> Context :
>> . Ubuntu 16.04, 64b, JVM Oracle 8 101 and now OpenJDK 8 252
>> . We can't reindex documents because old ones doesn't exist anymore, so no other choices than upgrading indexes.
>>
>> Our upgrading strategy is based on indexUpgrader Tool.
>> 5.4.1 -> 5.5.5 : Ok
>> 5.5.5 -> 6.6.6 : Ok
>> 6.6.6 -> 7.7.3 : ok
>> Unable to upgrade 7.7.3 to 8.5.1 : here my problem using 8.5.1, indexUpgrader :
>>
>> Exception in thread "main" org.apache.lucene.index.IndexFormatTooOldException: Format version is
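For concreteness, the forced-merge sequence Walter describes maps onto standard update-handler calls roughly like this (a sketch only: the host, collection name and the fake:true marker field are illustrative, and, as Erick notes further down the thread, the merge preserves the Lucene 6 marker, so this does not lift the restriction):

$ # 1) after indexing the extra "fake:true" filler documents, force-merge down to one segment
$ curl "http://host:8983/solr/my_collection/update?optimize=true&maxSegments=1"
$ # 2) delete the filler documents again
$ curl "http://host:8983/solr/my_collection/update?commit=true" -H 'Content-Type: application/json' -d '{"delete":{"query":"fake:true"}}'
$ # 3) optionally force-merge once more to expunge the deletes
$ curl "http://host:8983/solr/my_collection/update?optimize=true&maxSegments=1"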
Re: Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6
Erick,

I just suggest a dedicated page for the upgrade path, because from the page about the IndexUpgrader tool we understand well that we can't upgrade in one step - 6->7->8 must be done - but nowhere is it specified that, from Lucene 6 onward, segments are marked as v6 forever. Naively, by transitivity, the upgrade path 6>7>8 seems quite natural. From my point of view, we should speak of "since Lucene 6, an index is compatible with the previous major version" rather than of upgrading; the term is ambiguous. Things should be clear - I do understand the problem :-)

Jean-Louis

> On 19 May 2020 at 19:03, Erick Erickson wrote:
>
> Jean-Louis:
>
> One explication is here: https://lucene.apache.org/solr/guide/8_5/indexupgrader-tool.html, but then again the reference guide is very long, I'm not sure how to make it more findable. Or, for that matter, whether it should be part of the IndexUpgraderTool section or not. Please feel free to suggest (even better, submit a patch) if you can think of a place it'd be more easily findable. It's always useful to have someone with fresh eyes weigh in.
>
> Optimize won't work. Under the covers, optimize is just a merge. It uses the exact same low-level merging code that background merging uses, including preserving the markers in the segment files. That's why the Lucene devs use "forceMerge" rather than "optimize", the latter is easy to interpret as something that does more than it really does.
>
> This is also the same code that IndexUpgraderTool uses too for that matter. IndexUpgraderTool is, really, just a forceMerge down to one segment, which is all optimize is (assuming you specify maxSegments=1).
>
> Best,
> Erick
>
>> On May 19, 2020, at 11:42 AM, Walter Underwood wrote:
>>
>> Hmm, might be able to hack this with optimize (forced merge).
>>
>> First, you would have to add enough extra documents to force a rewrite of all segments. That might be as many documents as are already in the index. You could set a "fake:true" field and filter them out with an fq. Or make sure they have no searchable text.
>>
>> After adding all those, run optimize. This should rewrite all the segments in the new format.
>>
>> Finally, delete all the extra documents. Might want to do another optimize after that.
>>
>> No guarantee that this desperate hack will work.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>> On May 19, 2020, at 6:21 AM, VILA Jean-Louis wrote:
>>>
>>> Many thanks for your answers Erik.
>>>
>>> Effectively, I've read this into many different threads that the migration path will not be guaranteed but, what's strange is that there's no formal information on this impossibility because clearly we can't migrate to v8 if indexes are not "pure" v7 indexes. I understand reason (y =f(x)) but al least a simple documentation about the fact that a Lucene 6 segments can't be upgrade into Lucene 8 would be appreciate.
>>>
>>> More, the check tool just shows v7.7.3 index and there is no mention about "real" segment version which v6! So forbid to open v7 lucene indexes upgraded from v6, is quiet brutal and the rule about that we can migrate only from previous major version is not completely true :-(
>>> I'll stay into v7.7.3
>>>
>>> Thanks again,
>>> Jean-Louis
>>>
>>> -----Original Message-----
>>> From: Erick Erickson
>>> Sent: Tuesday, 19 May 2020 15:00
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Upgrade 5.5.5 to 8.5.1 / Segment stuck in Lucene v6
>>>
>>> This will not work. Lucene has never promised this upgrade path would work, the "one major version back-compat" means that Lucene X has special handling for X-1, but for X-2, all bets are off. Starting with Solr 6, a