Entered issue: https://issues.apache.org/jira/browse/SOLR-13563
Please let me know if I need to include any other information. I have to say, props to anyone involved in making the "ant idea" target a thing. It makes it ridiculously easy for someone who can code, but not in Java specifically, to look at and suggest possible fixes to the code. 10/10, would submit a ticket again!

________________________________
From: Andrzej Białecki <a...@getopt.org>
Sent: Wednesday, June 19, 2019 7:07:02 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 7.7.2 - SolrCloud - SPLITSHARD - Using LINK method fails on disk usage checks

Hi Andrew,

Please create a JIRA issue and attach this patch, I’ll look into fixing this. Thanks!

> On 18 Jun 2019, at 23:19, Andrew Kettmann <andrew.kettm...@evolve24.com> wrote:
>
> Attached the patch, but attachments aren't sent out on the mailing list; my mistake. Patch below:
>
> ### START
>
> diff --git a/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java b/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java
> index 24a52eaf97..e018f8a42f 100644
> --- a/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java
> +++ b/solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java
> @@ -135,7 +135,9 @@ public class SplitShardCmd implements OverseerCollectionMessageHandler.Cmd {
>      }
>
>      RTimerTree t = timings.sub("checkDiskSpace");
> -    checkDiskSpace(collectionName, slice.get(), parentShardLeader);
> +    if (splitMethod != SolrIndexSplitter.SplitMethod.LINK) {
> +      checkDiskSpace(collectionName, slice.get(), parentShardLeader);
> +    }
>      t.stop();
>
>      // let's record the ephemeralOwner of the parent leader node
>
> ### END
>
> ________________________________
> From: Andrew Kettmann
> Sent: Tuesday, June 18, 2019 3:05:15 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 7.7.2 - SolrCloud - SPLITSHARD - Using LINK method fails on disk usage checks
>
> Looks like the disk check here is the problem. I am no Java developer, but this patch skips the check if you are using the link method for splitting. Attached the patch. This is off of the commit for 7.7.2, d4c30fc285. The modified version only has to be run on the overseer machine, so there is that at least.
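As an aside on why skipping the check is reasonable here: the link split method hard-links the parent shard's index files instead of rewriting them, so the split itself consumes almost no extra space, while the check estimates headroom for a full rewrite (roughly twice the parent index size, judging by the numbers in the error below). A minimal, self-contained sketch of that filesystem behavior, not Solr code, with hypothetical file names (Python standard library; st_blocks is Unix-only):

import os
import tempfile

# A hard link adds a directory entry pointing at the same inode, not a
# second copy of the data, which is why linking a 2GB core's files can
# consume under 1MB of additional space.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "segment_0.cfs")        # stand-in for an index file
    with open(src, "wb") as f:
        f.write(os.urandom(1 << 20))              # 1 MiB of dummy data

    dst = os.path.join(d, "segment_0_link.cfs")
    os.link(src, dst)                             # "copy" via hard link

    st = os.stat(src)
    print("names pointing at the data:", st.st_nlink)   # 2 after linking
    print("512-byte blocks allocated:", st.st_blocks)   # unchanged by the link

The flip side is that the check remains the right safeguard for the default rewrite method, which really does write a second copy of the index.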
> ________________________________
> From: Andrew Kettmann
> Sent: Tuesday, June 18, 2019 11:32:43 AM
> To: solr-user@lucene.apache.org
> Subject: Solr 7.7.2 - SolrCloud - SPLITSHARD - Using LINK method fails on disk usage checks
>
> Using the Solr 7.7.2 Docker image, testing some of the new autoscale features; huge fan so far. I tested the link method on a 2GB core and found that it took less than 1MB of additional space. I then filled the core quite a bit larger, 12GB of a 20GB PVC, and now splitting the shard fails with the following error message on my overseer:
>
> 2019-06-18 16:27:41.754 ERROR (OverseerThreadFactory-49-thread-5-processing-n:10.0.192.74:8983_solr) [c:test_autoscale s:shard1 ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test_autoscale operation: splitshard failed:org.apache.solr.common.SolrException: not enough free disk space to perform index split on node 10.0.193.23:8983_solr, required: 23.35038321465254, available: 7.811378479003906
>         at org.apache.solr.cloud.api.collections.SplitShardCmd.checkDiskSpace(SplitShardCmd.java:567)
>         at org.apache.solr.cloud.api.collections.SplitShardCmd.split(SplitShardCmd.java:138)
>         at org.apache.solr.cloud.api.collections.SplitShardCmd.call(SplitShardCmd.java:94)
>         at org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:294)
>         at org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834)
>
> I attempted sending the request to the node itself to see if it did anything different, but no luck. My parameters are (note: Python formatting, as that is my language of choice):
>
> splitparams = {'action': 'SPLITSHARD',
>                'collection': 'test_autoscale',
>                'shard': 'shard1',
>                'splitMethod': 'link',
>                'timing': 'true',
>                'async': 'shardsplitasync'}
>
> And this is confirmed by the log message from the node itself:
>
> 2019-06-18 16:27:41.730 INFO (qtp1107530534-16) [c:test_autoscale ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={async=shardsplitasync&timing=true&action=SPLITSHARD&collection=test_autoscale&shard=shard1&splitMethod=link} status=0 QTime=20
>
> While it is true that I do not have enough space for the rewrite method, the link method on a 2GB core used less than 1MB of additional space. Is there something I am missing here? Is there an option to disable the disk space check that I need to pass? I can't find anything in the documentation at this point.
>
> Andrew Kettmann
> DevOps Engineer
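Since the split above is submitted with async=shardsplitasync, the outcome has to be pulled back via the Collections API's REQUESTSTATUS action rather than from the original response. A minimal sketch of driving the round trip from Python (the base URL is hypothetical; assumes the third-party requests library), where a rejection by the disk-space check surfaces as a failed state rather than only in the overseer log:

import time
import requests  # third-party HTTP client, assumed available

# Hypothetical base URL; point this at any node in the cluster.
BASE = "http://localhost:8983/solr/admin/collections"

splitparams = {'action': 'SPLITSHARD',
               'collection': 'test_autoscale',
               'shard': 'shard1',
               'splitMethod': 'link',
               'timing': 'true',
               'async': 'shardsplitasync'}

# Submit the split; with 'async' set, Solr returns immediately.
requests.get(BASE, params=splitparams).raise_for_status()

# Poll REQUESTSTATUS until the overseer reports a terminal state.
while True:
    rsp = requests.get(BASE, params={'action': 'REQUESTSTATUS',
                                     'requestid': 'shardsplitasync'}).json()
    state = rsp['status']['state']
    if state in ('completed', 'failed', 'notfound'):
        print(state, rsp['status'].get('msg', ''))
        break
    time.sleep(5)

With the patch applied on the overseer, the same request should come back completed for splitMethod=link; without it, the poll ends in failed once checkDiskSpace rejects the split.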