Hi Steffen,

There are a few "known issues" in this area.  Probably most relevant
is SOLR-6595, which covers a few error-reporting issues for
"collection-admin" operations.  I don't think we've had any reports
yet of success/failure determination being broken specifically for
asynchronous operations, but the behavior you describe isn't too
surprising given my understanding of how that bit of the code works.
So "yes", this is a known issue.
We've made some progress towards improving the situation, but there's
still work to be done.

As for workarounds, I can't think of any clever suggestions.  You
might be able to issue a query to the collection to see if it returns
any docs, or a particular number of expected docs.  But that may not
be possible, depending on what you meant by the collection being
"unusable" above.
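
For what it's worth, here is a minimal sketch of that kind of
post-restore check in Python.  It assumes the host and collection name
from your example, an expected document count that you would record on
the pre-prod side before taking the backup, and the usual JSON shape of
a /select response (responseHeader.status and response.numFound).  The
helper names are made up for illustration, so treat it as a starting
point rather than a tested recipe.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumption: host/port taken from the example URLs in the quoted message.
SOLR_BASE = "http://localhost:8983/solr"


def restore_looks_ok(select_response, expected_docs):
    """Return True if a parsed /select JSON response reports success
    and at least `expected_docs` matching documents."""
    header = select_response.get("responseHeader", {})
    if header.get("status") != 0:
        return False
    found = select_response.get("response", {}).get("numFound", -1)
    return found >= expected_docs


def check_collection(collection, expected_docs, base_url=SOLR_BASE, timeout=10):
    """Run a match-all query with rows=0 and validate the hit count.

    Any HTTP, connection, or parse error is treated as "not usable".
    """
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    url = "{}/{}/select?{}".format(base_url, collection, params)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, ValueError, OSError):
        return False
    return restore_looks_ok(body, expected_docs)

# Hypothetical usage, once REQUESTSTATUS reports "completed":
#   check_collection("gettingstarted", expected_docs=1)
```

You would run this only after REQUESTSTATUS reports "completed", and
treat a False result as a failed restore regardless of what the async
status said.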

Best,

Jason

On Thu, Jan 31, 2019 at 10:10 AM Steffen Moldenhauer
<s.moldenha...@intershop.de> wrote:
>
> Hi all,
>
> we are using the collection API backup and restore to transfer collections
> from a pre-prod to a production system. We are currently using Solr version
> 6.6.5.
> But sometimes that automated process fails and collections are not working on
> the production system.
>
> It seems that the asynchronous API calls backup and restore do not report 
> some errors/exceptions.
>
> I tried it with the solrcloud gettingstarted example:
>
> http://localhost:8983/solr/admin/collections?action=BACKUP&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
>
> http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted
>
> Now I simulate an error just by deleting something from the backup in the
> file-system and try to restore the incomplete backup:
>
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup&async=1000
>
> http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000
> <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst>
> <lst name="status"><str name="state">completed</str><str name="msg">found [1000] in completed tasks</str></lst></response>
>
> The status is completed but the collection is not usable.
>
> With a synchronous restore call I get:
>
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
> <response><lst name="responseHeader"><int name="status">500</int><int name="QTime">6456</int></lst>
> <str name="Operation restore caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not restore core</str>
> <lst name="exception"><str name="msg">Could not restore core</str><int name="rspCode">500</int></lst>
> <lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst>
> <str name="msg">Could not restore core</str>
> <str name="trace">org.apache.solr.common.SolrException: Could not restore core
>         at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:300)
>         at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:237)
>         at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>         at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:748)
>         at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:729)
>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:510)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> </str><int name="code">500</int></lst></response>
>
>
> But we cannot use the sync call because we are running into a timeout even if
> we increase the socket timeout of the client.
> And we cannot use the async call because it does not report errors.
>
> Is this a known bug? Any ideas for a workaround?
>
> Kind regards
> Steffen Moldenhauer
>
