Hi Steffen,

There are a few "known issues" in this area. Probably most relevant is SOLR-6595, which covers a few error-reporting issues for "collection-admin" operations. I don't think we've gotten any reports yet of success/failure determination being broken for asynchronous operations, but that's not too surprising given my understanding of how that bit of the code works. So "yes", this is a known issue. We've made some progress towards improving the situation, but there's still work to be done.

As for workarounds, I can't think of any clever suggestions. You might be able to issue a query to the collection to see if it returns any docs, or a particular number of expected docs. But that may not be possible, depending on what you meant by the collection being "unusable" above.

Best,

Jason

On Thu, Jan 31, 2019 at 10:10 AM Steffen Moldenhauer <s.moldenha...@intershop.de> wrote:
>
> Hi all,
>
> we are using the collection API backup and restore to transfer collections
> from a pre-prod to a production system. We are currently using Solr version
> 6.6.5.
> But sometimes that automated process fails and collections are not working on
> the production system.
>
> It seems that the asynchronous API calls backup and restore do not report
> some errors/exceptions.
>
> I tried it with the solrcloud gettingstarted example:
>
> http://localhost:8983/solr/admin/collections?action=BACKUP&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
>
> http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted
>
> Now I simulate an error just by deleting something from the backup in the
> file-system and try to restore the incomplete backup:
>
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup&async=1000
>
> http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000
> <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst><lst name="status"><str name="state">completed</str><str name="msg">found [1000] in completed tasks</str></lst></response>
>
> The status is completed but the collection is not usable.
>
> With a synchronous restore call I get:
>
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
> <response><lst name="responseHeader"><int name="status">500</int><int name="QTime">6456</int></lst><str name="Operation restore caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not restore core</str><lst name="exception"><str name="msg">Could not restore core</str><int name="rspCode">500</int></lst><lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst><str name="msg">Could not restore core</str><str name="trace">org.apache.solr.common.SolrException: Could not restore core
>         at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:300)
>         at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:237)
>         at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>         at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:748)
>         at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:729)
>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:510)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> </str><int name="code">500</int></lst></response>
>
> But we cannot use the sync call because we are running into a timeout even if we
> increase the socket timeout of the client.
> And we cannot use the async call because it does not report errors.
>
> Is this a known bug? Any ideas for a workaround?
>
> Kind regards
> Steffen Moldenhauer
>
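
A minimal sketch of the query-based check mentioned at the top of this reply, in Python. It polls REQUESTSTATUS for the async request and, since "completed" alone does not seem to be trustworthy here, then runs a q=*:* query and compares numFound against an expected minimum. The function names, the expected_min_docs threshold, the polling interval and the use of wt=json are assumptions made for illustration; the base URL, collection name and request id come from the example in the quoted message. It will not help if "unusable" means the collection cannot be queried in a way that a simple count would reveal.

```python
import json
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: same host/port as in the quoted example.
SOLR = "http://localhost:8983/solr"


def parse_request_state(xml_text):
    """Return the 'state' string from a REQUESTSTATUS XML response, or None."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='status']/str[@name='state']")
    return node.text if node is not None else None


def num_found(select_json_text):
    """Return numFound from the body of a /select?wt=json response."""
    return json.loads(select_json_text)["response"]["numFound"]


def restore_looks_ok(state, found, expected_min_docs):
    """Treat the restore as good only if the task completed AND docs are there."""
    return state == "completed" and found >= expected_min_docs


def fetch(url, timeout=30):
    """GET a URL and return the decoded body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def wait_and_verify(collection, request_id, expected_min_docs,
                    poll_seconds=5, max_polls=120):
    """Poll the async request, then double-check the collection with a query."""
    status_url = SOLR + "/admin/collections?" + urllib.parse.urlencode(
        {"action": "REQUESTSTATUS", "requestid": request_id})
    state = None
    for _ in range(max_polls):
        state = parse_request_state(fetch(status_url))
        if state in ("completed", "failed", "notfound"):
            break
        time.sleep(poll_seconds)
    if state != "completed":
        return False

    query_url = SOLR + "/" + collection + "/select?" + urllib.parse.urlencode(
        {"q": "*:*", "rows": 0, "wt": "json"})
    try:
        found = num_found(fetch(query_url))
    except (OSError, ValueError, KeyError):
        # A failing or malformed query is itself a sign the restore went wrong.
        return False
    return restore_looks_ok(state, found, expected_min_docs)
```

For the example in the quoted message this would be called as wait_and_verify("gettingstarted", "1000", expected_min_docs=1), with the threshold replaced by whatever document count the pre-prod collection is known to have.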