Hi all, we are using the Collections API BACKUP and RESTORE actions to transfer collections from a pre-prod to a production system. We are currently on Solr 6.6.5. Sometimes that automated process fails, and the collections do not work on the production system.
It seems that the asynchronous BACKUP and RESTORE API calls do not report some errors/exceptions. I tried it with the SolrCloud gettingstarted example:

  http://localhost:8983/solr/admin/collections?action=BACKUP&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
  http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted

Now I simulate an error just by deleting something from the backup in the file system and try to restore the incomplete backup:

  http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup&async=1000
  http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000

  <response>
    <lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst>
    <lst name="status">
      <str name="state">completed</str>
      <str name="msg">found [1000] in completed tasks</str>
    </lst>
  </response>

The status is completed, but the collection is not usable. With a synchronous restore call I get:

  http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup

  <response>
    <lst name="responseHeader"><int name="status">500</int><int name="QTime">6456</int></lst>
    <str name="Operation restore caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not restore core</str>
    <lst name="exception"><str name="msg">Could not restore core</str><int name="rspCode">500</int></lst>
    <lst name="error">
      <lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst>
      <str name="msg">Could not restore core</str>
      <str name="trace">org.apache.solr.common.SolrException: Could not restore core
        at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:300)
        at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:237)
        at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
        at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:748)
        at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:729)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:510)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:748)
      </str>
      <int name="code">500</int>
    </lst>
  </response>

But we cannot use the synchronous call because we run into a timeout, even if we increase the socket timeout of the client. And we cannot use the asynchronous call because it does not report errors.

Is this a known bug? Any ideas for a workaround?

Kind regards
Steffen Moldenhauer
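P.S. In case it helps with reproducing this, here is a minimal Python sketch of the async part of the steps above: it builds the Collections API URLs, polls REQUESTSTATUS, and reads the reported state. The helper names (collections_url, parse_state, wait_for_request) and the polling interval are my own assumptions for illustration, not part of Solr; only the URLs, parameters and response layout come from the calls shown in this mail.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL taken from the calls in this mail; adjust host/port as needed.
SOLR = "http://localhost:8983/solr/admin/collections"


def collections_url(action, **params):
    """Build a Collections API URL (hypothetical helper, not a Solr API)."""
    query = urllib.parse.urlencode({"action": action, **params})
    return f"{SOLR}?{query}"


def parse_state(xml_text):
    """Return the <str name="state"> value of a REQUESTSTATUS XML response."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='status']/str[@name='state']")
    return node.text if node is not None else None


def http_get(url):
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def wait_for_request(request_id, poll_seconds=2.0, fetch=http_get):
    """Poll REQUESTSTATUS until the task is no longer submitted or running."""
    while True:
        url = collections_url("REQUESTSTATUS", requestid=request_id)
        state = parse_state(fetch(url))
        if state not in ("submitted", "running"):
            return state
        time.sleep(poll_seconds)


if __name__ == "__main__":
    # Submit the async restore of the (deliberately damaged) backup, then poll.
    http_get(collections_url(
        "RESTORE",
        name="backup-gettingstarted",
        collection="gettingstarted",
        location="D:\\solr_backup",
        **{"async": "1000"},  # "async" is a Python keyword, so pass it via a dict
    ))
    print(wait_for_request("1000"))
```

With the damaged backup described above, the last line prints "completed" on my system, which is exactly the problem: the state alone does not tell the script that the restore failed.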