Hi Jason,

Thanks for pointing me to SOLR-6595. It looks to me as though the async handling is similar to the handling of distributed collection commands. I hope I can find the time to test whether your patch fixes it. Yes, I will try your suggestion and see whether we can build a workaround that checks the collection with a query after the restore.
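A minimal sketch of that workaround, assuming the localhost URL from the example in this thread and Solr's JSON response format (`wt=json`). The function names, the polling interval and the `expected_min_docs` threshold are illustrative assumptions, not part of Solr. It polls REQUESTSTATUS until the async task is done and then runs a match-all query, treating the restore as successful only if both checks pass:

```python
import json
import time
import urllib.parse
import urllib.request

SOLR = "http://localhost:8983/solr"  # base URL from the example in this thread


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def wait_for_request(request_id, fetch=fetch_json, poll_seconds=5, max_polls=120):
    """Poll REQUESTSTATUS until the task leaves the submitted/running states."""
    params = urllib.parse.urlencode(
        {"action": "REQUESTSTATUS", "requestid": request_id, "wt": "json"})
    url = f"{SOLR}/admin/collections?{params}"
    for _ in range(max_polls):
        state = fetch(url)["status"]["state"]
        if state not in ("submitted", "running"):
            return state
        time.sleep(poll_seconds)
    return "timeout"  # label used by this sketch only, not a Solr state


def count_docs(collection, fetch=fetch_json):
    """Return numFound for a match-all query, or None if the query fails."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    url = f"{SOLR}/{urllib.parse.quote(collection)}/select?{params}"
    try:
        return fetch(url)["response"]["numFound"]
    except (OSError, KeyError, ValueError):
        # HTTP errors, missing keys and malformed JSON all count as "not usable"
        return None


def restore_succeeded(request_id, collection, expected_min_docs,
                      fetch=fetch_json, poll_seconds=5):
    """Trust 'completed' only if the collection also answers a query."""
    state = wait_for_request(request_id, fetch, poll_seconds)
    if state != "completed":
        return False
    found = count_docs(collection, fetch)
    return found is not None and found >= expected_min_docs
```

For the example in this thread the call would be `restore_succeeded("1000", "gettingstarted", expected_min_docs=1)`. Whether a document count is a sufficient health check depends on how the collection ends up "unusable"; a broken core may need a different probe.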
Regards
Steffen

> -----Original Message-----
> From: Jason Gerlowski [mailto:gerlowsk...@gmail.com]
> Sent: Monday, 4 February 2019 15:43
> To: solr-user@lucene.apache.org
> Subject: Re: Asynchronous Calls to Backup/Restore Collections ignoring errors
>
> Hi Steffen,
>
> There are a few "known issues" in this area. Probably most relevant is
> SOLR-6595, which covers a few error-reporting issues for "collection-admin"
> operations. I don't think we've gotten any reports yet of success/failure
> determination being broken for asynchronous operations, but that's not
> too surprising given my understanding of how that bit of the code works.
> So "yes", this is a known issue.
> We've made some progress towards improving the situation, but there's
> still work to be done.
>
> As for workarounds, I can't think of any clever suggestions. You might be
> able to issue a query to the collection to see if it returns any docs, or a
> particular number of expected docs. But that may not be possible,
> depending on what you meant by the collection being "unusable" above.
>
> Best,
>
> Jason
>
> On Thu, Jan 31, 2019 at 10:10 AM Steffen Moldenhauer
> <s.moldenha...@intershop.de> wrote:
> >
> > Hi all,
> >
> > we are using the collection API backup and restore to transfer
> > collections from a pre-prod to a production system. We are currently
> > using Solr version 6.6.5. But sometimes that automated process fails and
> > collections are not working on the production system.
> >
> > It seems that the asynchronous API calls backup and restore do not report
> > some errors/exceptions.
> >
> > I tried it with the solrcloud gettingstarted example:
> >
> > http://localhost:8983/solr/admin/collections?action=BACKUP&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
> >
> > http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted
> >
> > Now I simulate an error just by deleting something from the backup in the
> > file system and try to restore the incomplete backup:
> >
> > http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup&async=1000
> >
> > http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000
> > <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst><lst name="status"><str name="state">completed</str><str name="msg">found [1000] in completed tasks</str></lst></response>
> >
> > The status is completed but the collection is not usable.
> >
> > With a synchronous restore call I get:
> >
> > http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
> > <response><lst name="responseHeader"><int name="status">500</int><int name="QTime">6456</int></lst><str name="Operation restore caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not restore core</str><lst name="exception"><str name="msg">Could not restore core</str><int name="rspCode">500</int></lst><lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst><str name="msg">Could not restore core</str><str name="trace">org.apache.solr.common.SolrException: Could not restore core
> >     at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:300)
> >     at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:237)
> >     at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
> >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> >     at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:748)
> >     at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:729)
> >     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:510)
> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> >     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> >     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> >     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> >     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> >     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> >     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> >     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> >     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> >     at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> >     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> >     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> >     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> >     at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> >     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> >     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> >     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> >     at java.lang.Thread.run(Thread.java:748)
> > </str><int name="code">500</int></lst></response>
> >
> >
> > But we cannot use the sync call because we are running into a timeout even if
> > we increase the socket timeout of the client.
> > And we cannot use the async because it does not report errors.
> >
> > Is this a known bug? Any ideas for a workaround?
> >
> > Kind regards
> > Steffen Moldenhauer
> >