Hi all,

We are using the Collections API BACKUP and RESTORE actions to transfer
collections from a pre-prod to a production system. We are currently on Solr
version 6.6.5. Sometimes that automated process fails, and the restored
collections do not work on the production system.

It seems that the asynchronous BACKUP and RESTORE calls do not report some
errors/exceptions.

I reproduced it with the SolrCloud gettingstarted example:

http://localhost:8983/solr/admin/collections?action=BACKUP&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup

http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted

Now I simulate an error by deleting something from the backup in the
file system, and then try to restore the incomplete backup:

http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup&async=1000

http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000
<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst>
  <lst name="status">
    <str name="state">completed</str>
    <str name="msg">found [1000] in completed tasks</str>
  </lst>
</response>

The status is completed but the collection is not usable.
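
For reference, this is roughly how the async restore and the status poll above can be scripted. It is only a minimal Python sketch: the function names, the poll interval and the retry limit are my own assumptions, and only the URL, the action names and the parameters are taken from the calls above. The set of non-final states ("running", "submitted") is what I understand REQUESTSTATUS to return while a task is pending.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Same endpoint as in the calls above.
BASE_URL = "http://localhost:8983/solr/admin/collections"


def build_url(action, **params):
    """Build a Collections API URL with properly escaped parameters."""
    query = urllib.parse.urlencode({"action": action, **params})
    return f"{BASE_URL}?{query}"


def parse_request_status(xml_text):
    """Return (state, msg) from a REQUESTSTATUS XML response."""
    root = ET.fromstring(xml_text)
    status = root.find("./lst[@name='status']")
    if status is None:
        raise ValueError("no 'status' element in response")
    state = status.findtext("./str[@name='state']")
    msg = status.findtext("./str[@name='msg']")
    return state, msg


def wait_for_request(request_id, poll_seconds=5.0, max_polls=120):
    """Poll REQUESTSTATUS until the task is no longer pending.

    poll_seconds and max_polls are arbitrary values chosen for this sketch.
    """
    url = build_url("REQUESTSTATUS", requestid=request_id)
    for _ in range(max_polls):
        with urllib.request.urlopen(url) as resp:
            state, msg = parse_request_status(resp.read().decode("utf-8"))
        if state not in ("running", "submitted"):
            return state, msg
        time.sleep(poll_seconds)
    raise TimeoutError(f"request {request_id} still pending after {max_polls} polls")


if __name__ == "__main__":
    # Submit the restore asynchronously, then poll for its state.
    restore_url = build_url(
        "RESTORE",
        name="backup-gettingstarted",
        collection="gettingstarted",
        location="D:\\solr_backup",
        **{"async": "1000"},  # 'async' is a Python keyword, so pass it via a dict
    )
    urllib.request.urlopen(restore_url).close()
    print(wait_for_request("1000"))
```

The problem is that with the damaged backup the poll ends with state "completed", so a script like this cannot tell the failed restore from a successful one.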

With a synchronous restore call I get:

http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
<response>
  <lst name="responseHeader"><int name="status">500</int><int name="QTime">6456</int></lst>
  <str name="Operation restore caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not restore core</str>
  <lst name="exception"><str name="msg">Could not restore core</str><int name="rspCode">500</int></lst>
  <lst name="error">
    <lst name="metadata">
      <str name="error-class">org.apache.solr.common.SolrException</str>
      <str name="root-error-class">org.apache.solr.common.SolrException</str>
    </lst>
    <str name="msg">Could not restore core</str>
    <str name="trace">org.apache.solr.common.SolrException: Could not restore core
        at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:300)
        at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:237)
        at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
        at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:748)
        at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:729)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:510)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:748)
</str>
    <int name="code">500</int>
  </lst>
</response>


But we cannot use the synchronous call because we run into a timeout, even if
we increase the socket timeout of the client.
And we cannot use the asynchronous call because it does not report errors.

Is this a known bug? Any ideas for a workaround?

Kind regards
Steffen Moldenhauer
