Howdy, I'm trying to test shard splitting, and it's not working for me. I've got a 4-node cloud with a single collection and 2 shards.
I've indexed 170k small documents, and I'm using the compositeId router, with an internal "client id" as the shard key, with 4 distinct values across the data set. For my testing, the values of the shard keys are 1 through 4. Before splitting, shard1 contains 100k docs (all of the docs for shard keys 1 and 4) and shard2 contains 70k docs (all of the docs for shard keys 2 and 3). In prod, we're going to have thousands of unique shard keys, but for now, I'm testing at a smaller scale.

I attempt to split shard2 with:

    http://host0:8983/solr/admin/collections?action=SPLITSHARD&collection=coll&shard=shard2

I understand the shard splitting is on hash range, not document count, and it shouldn't split up documents within a single shard key, so I'm ok with it if both shard keys end up in the same sub-shard.

I see the following in the logs:

    689524 [qtp259549756-119] ERROR org.apache.solr.servlet.SolrDispatchFilter – null:java.lang.RuntimeException: java.lang.IllegalArgumentException: maxValue must be non-negative (got: -1)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleSplitAction(CoreAdminHandler.java:290)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:186)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:209)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:368)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:662)
    Caused by: java.lang.IllegalArgumentException: maxValue must be non-negative (got: -1)
        at org.apache.lucene.util.packed.PackedInts.bitsRequired(PackedInts.java:1184)
        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesConsumer.addNumericField(Lucene42DocValuesConsumer.java:140)
        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesConsumer.addNumericField(Lucene42DocValuesConsumer.java:92)
        at org.apache.lucene.codecs.DocValuesConsumer.mergeNumericField(DocValuesConsumer.java:112)
        at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:221)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:119)
        at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2488)
        at org.apache.solr.update.SolrIndexSplitter.split(SolrIndexSplitter.java:125)
        at org.apache.solr.update.DirectUpdateHandler2.split(DirectUpdateHandler2.java:766)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleSplitAction(CoreAdminHandler.java:284)
        ... 30 more

I thought maybe there was an issue with performing a split where all docs go to one sub-shard, so I wiped everything clean and re-indexed the same data using default routing (no shard key). Now when I perform a split, I get a replication error on the second replica of both sub-shards:

    453616 [RecoveryThread] WARN org.apache.solr.update.PeerSync – PeerSync: core=coll_shard2_0_replica2 url=http://host2:8983/solr exception talking to http://host1:8983/solr/coll_shard2_0_replica1/, failed
    org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at http://host1:8983/solr/coll_shard2_0_replica1 returned non ok status:404, message:Not Found
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
        at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156)
        at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

If I browse to http://host1:8983/solr/coll_shard2_0_replica1 I do get a 404, but http://host1:8983/solr/#/coll_shard2_0_replica1 does bring up the admin UI for that core.

I've been running solr cloud 4.1.0 for a while now, and our prod cloud has ~1.8 billion docs indexed into 3 shards. I'd really like to get splitting working so we can add more hardware. Any ideas?

-Greg
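
For anyone who wants to reproduce this from a script rather than a browser, the SPLITSHARD request quoted above could be built roughly as follows. This is only a minimal sketch: the helper names build_split_url and split_shard and the timeout value are hypothetical, and the host, collection, and shard values are simply the ones from the URL in this message.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_split_url(host, collection, shard):
    """Build the Collections API SPLITSHARD URL (same shape as the one in the post)."""
    # Parameter names come from the URL quoted in the message.
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return "http://{}/solr/admin/collections?{}".format(host, urlencode(params))


def split_shard(host, collection, shard, timeout=600):
    """Send the split request and return (HTTP status, response body).

    The timeout is an assumption for illustration, not a recommended value.
    """
    with urlopen(build_split_url(host, collection, shard), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only prints the URL; call split_shard(...) against a live node to run the split.
    print(build_split_url("host0:8983", "coll", "shard2"))
```

Keeping the URL construction separate from the network call makes it easy to log the exact request that was sent alongside the server-side stack trace.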