Howdy,

I'm trying to test shard splitting, and it's not working for me.  I've got
a 4-node cloud with a single collection and 2 shards.

I've indexed 170k small documents, and I'm using the compositeId router,
with an internal "client id" as the shard key, with 4 distinct values
across the data set.  For my testing, the values of the shard keys are 1
through 4.  Before splitting, shard1 contains 100k docs (all of the docs
for shard keys 1 and 4) and shard2 contains 70k docs (all of the docs for
shard keys 2 and 3).
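
For anyone unfamiliar with the compositeId router: the shard key goes in front of the document ID, separated by "!". A minimal sketch of IDs of that shape (the function name and the doc ID values are made up for illustration, not my real data):

```python
def composite_id(shard_key, doc_id):
    """Build a compositeId-style document ID: '<shard key>!<doc id>'."""
    return f"{shard_key}!{doc_id}"

# Hypothetical documents for client IDs 1 through 4.
ids = [composite_id(client, n) for client in (1, 2, 3, 4) for n in (100, 101)]
print(ids[:3])
```

All docs sharing a prefix are routed by the hash of that prefix, which is why each client's docs stay together on one shard.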

In prod, we're going to have thousands of unique shard keys, but for now,
I'm testing at a smaller scale.  I attempt to split shard2 with
http://host0:8983/solr/admin/collections?action=SPLITSHARD&collection=coll&shard=shard2
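
If it helps to reproduce this, here is a small sketch that builds the same request URL from its parts. The helper name is hypothetical; the host, collection, and shard are the ones from my test setup.

```python
from urllib.parse import urlencode

def split_shard_url(base, collection, shard):
    """Build a Collections API SPLITSHARD URL for the given shard."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base}/admin/collections?{urlencode(params)}"

print(split_shard_url("http://host0:8983/solr", "coll", "shard2"))
```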

I understand that shard splitting is based on hash range, not document
count, and that it shouldn't split up documents within a single shard key,
so I'm OK if both shard keys end up in the same sub-shard.
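
To illustrate what I mean, here is a rough sketch of splitting on hash range. It assumes the parent shard covers 0x00000000 to 0x7fffffff and is cut into two equal halves. The two sample hash values are invented, not the real hashes of my shard keys.

```python
def split_range(lo, hi):
    """Split the inclusive range [lo, hi] into two contiguous halves."""
    mid = lo + (hi - lo) // 2
    return (lo, mid), (mid + 1, hi)

def sub_shard_for(hash_value, halves):
    """Return the index of the half that contains hash_value."""
    for i, (lo, hi) in enumerate(halves):
        if lo <= hash_value <= hi:
            return i
    raise ValueError("hash outside parent range")

# Assumed parent range for shard2 (an assumption, not read from my cluster).
halves = split_range(0x00000000, 0x7FFFFFFF)

# Made-up hashes for two shard keys; both happen to fall in the lower half.
for h in (0x10000000, 0x2FFFFFFF):
    print(hex(h), "-> sub-shard", sub_shard_for(h, halves))
```

With only a couple of shard keys, nothing prevents both from hashing into the same half, which would leave one sub-shard empty.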

I see the following in the logs:

689524 [qtp259549756-119] ERROR org.apache.solr.servlet.SolrDispatchFilter – null:java.lang.RuntimeException: java.lang.IllegalArgumentException: maxValue must be non-negative (got: -1)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleSplitAction(CoreAdminHandler.java:290)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:186)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:209)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:368)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: maxValue must be non-negative (got: -1)
        at org.apache.lucene.util.packed.PackedInts.bitsRequired(PackedInts.java:1184)
        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesConsumer.addNumericField(Lucene42DocValuesConsumer.java:140)
        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesConsumer.addNumericField(Lucene42DocValuesConsumer.java:92)
        at org.apache.lucene.codecs.DocValuesConsumer.mergeNumericField(DocValuesConsumer.java:112)
        at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:221)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:119)
        at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2488)
        at org.apache.solr.update.SolrIndexSplitter.split(SolrIndexSplitter.java:125)
        at org.apache.solr.update.DirectUpdateHandler2.split(DirectUpdateHandler2.java:766)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleSplitAction(CoreAdminHandler.java:284)
        ... 30 more



I thought maybe there was an issue with performing a split where all docs
go to one sub-shard, so I wiped everything clean and re-indexed the same
data using default routing (no shard key).  Now when I perform a split, I
get a replication error on the second replica of both sub-shards:

453616 [RecoveryThread] WARN  org.apache.solr.update.PeerSync  – PeerSync: core=coll_shard2_0_replica2 url=http://host2:8983/solr  exception talking to http://host1:8983/solr/coll_shard2_0_replica1/, failed
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at http://host1:8983/solr/coll_shard2_0_replica1 returned non ok status:404, message:Not Found
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
        at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156)
        at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)


If I browse to http://host1:8983/solr/coll_shard2_0_replica1 I do get a 404,
but http://host1:8983/solr/#/coll_shard2_0_replica1 does bring up the admin
UI for that core.
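
One thing that may account for the difference: everything after the "#" is a URL fragment, which a browser keeps to itself and does not send to the server. So the second URL only requests /solr/, and the admin UI presumably picks up the core name on the client side. A small sketch of what each URL actually asks the server for (the helper name is hypothetical):

```python
from urllib.parse import urlsplit

core_url = "http://host1:8983/solr/coll_shard2_0_replica1"
ui_url = "http://host1:8983/solr/#/coll_shard2_0_replica1"

def server_path(url):
    """Return the path portion of the URL, which excludes any fragment."""
    return urlsplit(url).path

for url in (core_url, ui_url):
    parts = urlsplit(url)
    print(f"path={server_path(url)!r} fragment={parts.fragment!r}")
```

In other words, the admin UI loading does not by itself show that the server answers requests at the core's own path.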

I've been running SolrCloud 4.1.0 for a while now, and our prod cloud has
~1.8 billion docs indexed into 3 shards.  I'd really like to get splitting
working so we can add more hardware.

Any ideas?

-Greg
