Hi All - when building machine learning models using information gain, I
sometimes get the error below, usually when the number of iterations is
high. My training set is about 20k news articles (about 10k positive and
10k negative), and this particular run uses 500 terms and 25,000
iterations. I have also gotten the error with far fewer iterations
(1,000).
The specific stream command was:
update(models, batchSize="50",
  train(MODEL1024_1522696624083,
    features(MODEL1024_1522696624083,
      q="*:*",
      featureSet="FSet_MODEL1024_1522696624083",
      field="Text",
      outcome="out_i",
      positiveLabel=1,
      numTerms=500),
    q="*:*",
    name="MODEL1024",
    field="Text",
    outcome="out_i",
    maxIterations="25000"))
The training data is split across 20 shards; the collection was created with:
http://icarus.querymasters.com:9100/solr/admin/collections?action=CREATE&name=MODEL1024_1522696624083&numShards=20&replicationFactor=2&maxShardsPerNode=5&collection.configName=TRAINING
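For anyone reading along, here is a rough sketch of what those CREATE parameters work out to. It only parses the URL above and does the arithmetic; it does not talk to Solr. The function name replica_layout is made up for illustration, and the "minimum nodes" figure assumes maxShardsPerNode caps the number of replicas of this collection placed on a single node.

```python
from math import ceil
from urllib.parse import urlsplit, parse_qs

# The CREATE call from this post, split across lines for readability.
CREATE_URL = (
    "http://icarus.querymasters.com:9100/solr/admin/collections"
    "?action=CREATE&name=MODEL1024_1522696624083&numShards=20"
    "&replicationFactor=2&maxShardsPerNode=5&collection.configName=TRAINING"
)


def replica_layout(url):
    """Hypothetical helper: derive replica counts from a CREATE URL."""
    # parse_qs returns a list per key; each key appears once here.
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    shards = int(params["numShards"])
    replication = int(params["replicationFactor"])
    per_node = int(params["maxShardsPerNode"])
    total = shards * replication  # every shard has `replication` replicas
    return {
        "collection": params["name"],
        "total_replicas": total,
        # Assumption: at most `per_node` replicas of the collection per node.
        "min_nodes": ceil(total / per_node),
    }


layout = replica_layout(CREATE_URL)
print(layout)
```

So the training stream fans out over 40 replica cores spread across at least 8 nodes, and the core named in the error (shard20_replica_n75) is one of them.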
Any ideas? The complete error is:
java.io.IOException: java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://vesta:9100/solr/MODEL1024_1522696624083_shard20_replica_n75: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/MODEL1024_1522696624083_shard20_replica_n75/select. Reason:
<pre> Not Found</pre></p>
</body>
</html>
    at org.apache.solr.client.solrj.io.stream.TextLogitStream.read(TextLogitStream.java:498)
    at org.apache.solr.client.solrj.io.stream.PushBackStream.read(PushBackStream.java:87)
    at org.apache.solr.client.solrj.io.stream.UpdateStream.read(UpdateStream.java:109)
    at org.apache.solr.client.solrj.io.stream.ExceptionStream.read(ExceptionStream.java:68)
    at org.apache.solr.handler.StreamHandler$TimerStream.read(StreamHandler.java:627)
    at org.apache.solr.client.solrj.io.stream.TupleStream.lambda$writeMap$0(TupleStream.java:87)
    at org.apache.solr.response.JSONWriter.writeIterator(JSONResponseWriter.java:523)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:180)
    at org.apache.solr.response.JSONWriter$2.put(JSONResponseWriter.java:559)
    at org.apache.solr.client.solrj.io.stream.TupleStream.writeMap(TupleStream.java:84)
    at org.apache.solr.response.JSONWriter.writeMap(JSONResponseWriter.java:547)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:198)
    at org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:209)
    at org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:325)
    at org.apache.solr.response.JSONWriter.writeResponse(JSONResponseWriter.java:120)
    at org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:71)
    at org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
    at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:806)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:535)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:534)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://vesta:9100/solr/MODEL1024_1522696624083_shard20_replica_n75: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/MODEL1024_1522696624083_shard20_replica_n75/select. Reason:
<pre> Not Found</pre></p>
</body>
</html>
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.solr.client.solrj.io.stream.TextLogitStream.read(TextLogitStream.java:459)
    ... 47 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://vesta:9100/solr/MODEL1024_1522696624083_shard20_replica_n75: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/MODEL1024_1522696624083_shard20_replica_n75/select. Reason:
<pre> Not Found</pre></p>
</body>
</html>
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:590)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:195)
    at org.apache.solr.client.solrj.io.stream.TextLogitStream$LogitCall.call(TextLogitStream.java:640)
    at org.apache.solr.client.solrj.io.stream.TextLogitStream$LogitCall.call(TextLogitStream.java:582)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Last Check: 4/2/2018, 3:47:15 PM
Thank you!
-Joe Obernberger