> If I'm reading this right, you have 420M docs on a single shard?

Yep, you were reading it right. Thanks for your guidance. We will prototype
various configurations following "the sizing exercise".

Best,
Stephen

On Tue, Apr 26, 2016 at 6:17 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> If I'm reading this right, you have 420M docs on a single shard? If that's
> true you are pushing the envelope of what I've seen work and be performant.
> Your OOM errors are the proverbial 'smoking gun' that you're putting too
> many docs on too few nodes.
>
> You say that the document count is "growing quite rapidly". My expectation
> is that your problems will only get worse as you cram more docs into your
> shard.
>
> You're correct that adding more memory (and consequently more JVM
> memory?) only gets you so far before you start running into GC trouble.
> When you hit full GC pauses, they'll get longer and longer, which is its
> own problem. And you don't want to have huge JVM memory at the expense
> of operating system memory, because Lucene uses MMapDirectory; see
> Uwe's excellent blog:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
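
To make the heap versus page cache trade-off concrete, here is a rough sketch of the arithmetic. The 61 GiB of RAM and 550 GiB index come from the original message; the heap sizes are hypothetical, since the thread never states the actual heap, and memory used by other processes is ignored.

```python
def page_cache_fraction(ram_gib: float, heap_gib: float, index_gib: float) -> float:
    """Fraction of the index that could sit in the OS page cache.

    Simplification: everything that is not JVM heap is treated as
    available to the page cache, which overstates the real figure.
    """
    if heap_gib >= ram_gib:
        raise ValueError("heap must be smaller than physical RAM")
    free_for_cache = ram_gib - heap_gib
    return min(1.0, free_for_cache / index_gib)


RAM_GIB = 61.0     # per node, from the original message
INDEX_GIB = 550.0  # index size, from the original message

# Hypothetical heap sizes; the thread does not say what is configured.
for heap_gib in (8.0, 16.0, 31.0):
    fraction = page_cache_fraction(RAM_GIB, heap_gib, INDEX_GIB)
    print(f"heap={heap_gib:>4.0f} GiB -> at most {fraction:.1%} of the index cacheable")
```

With any of these heap sizes, less than a tenth of the index fits in memory on one node, and every GiB added to the heap shrinks that further.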
>
> I'd _strongly_ recommend you do "the sizing exercise". There are lots of
> details here:
>
> https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> You've already done some of this inadvertently; unfortunately, it sounds
> like it's in production. If I were going to guess, I'd say the maximum
> number of docs on any shard should be less than half what you currently
> have. So you need to figure out how many docs you expect to host in this
> collection eventually (call it N) and have N/200M shards. At least.
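
The N/200M rule of thumb can be sketched as a small calculation. The 200M ceiling is the guess quoted above, not a measured limit, and the one-billion projection below is a hypothetical figure, not something stated in the thread.

```python
MAX_DOCS_PER_SHARD = 200_000_000  # rough ceiling from the advice above, not a hard limit


def min_shards(expected_docs: int, max_docs_per_shard: int = MAX_DOCS_PER_SHARD) -> int:
    """Smallest shard count that keeps every shard at or under the ceiling."""
    if expected_docs <= 0 or max_docs_per_shard <= 0:
        raise ValueError("document counts must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-expected_docs // max_docs_per_shard)


current_docs = 420_000_000      # from the original message
projected_docs = 1_000_000_000  # hypothetical growth target

print(min_shards(current_docs))    # 3
print(min_shards(projected_docs))  # 5
```

This is a lower bound only ("At least."); the sizing exercise is what tells you whether 200M per shard actually holds for your documents and queries.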
>
> There are various strategies when the answer is "I don't know". You
> might add new collections when you max out and then use "collection
> aliasing" to query them, etc.
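
As a sketch of the aliasing idea, the snippet below only builds the request URL for the Collections API CREATEALIAS action; it does not send it. The host, alias name, and collection names are all hypothetical placeholders.

```python
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collections: list[str]) -> str:
    """Build a Collections API URL that points one alias at several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    # Keep commas readable instead of percent-encoding them.
    return f"{base_url}/admin/collections?{urlencode(params, safe=',')}"


# Hypothetical names: one collection per year behind a single query alias.
url = create_alias_url(
    "http://localhost:8983/solr",
    "panopto_all",
    ["panopto_2016", "panopto_2017"],
)
print(url)
```

Queries sent to the alias are then fanned out to the collections behind it, so a new collection can be added by reissuing the request with a longer list.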
>
> Best,
> Erick
>
> On Tue, Apr 26, 2016 at 3:49 PM, Stephen Lewis <sle...@panopto.com> wrote:
> > Hello,
> >
> > I'm looking for some guidance on the best steps for tuning a solr cloud
> > cluster which is heavy on writes. We are currently running a solr cloud
> > fleet composed of one core, one shard, and three nodes. The cloud is
> > hosted in AWS, and each solr node is on its own linux r3.2xl instance
> > with 8 cpu and 61 GiB mem, and a 2TB EBS volume attached. Our index is
> > currently 550 GiB over 420M documents, and growing quite rapidly. We are
> > currently doing a bit more than 1000 document writes/deletes per second.
> >
> > Recently, we've hit some trouble with our production cloud. We have had
> > the process on individual instances die a few times, and we see the
> > following error messages being logged (expanded logs at the bottom of
> > the email):
> >
> > ERROR - 2016-04-26 00:56:43.873; org.apache.solr.common.SolrException;
> > null:org.eclipse.jetty.io.EofException
> >
> > WARN  - 2016-04-26 00:55:29.571; org.eclipse.jetty.servlet.ServletHandler;
> > /solr/panopto/select
> > java.lang.IllegalStateException: Committed
> >
> > WARN  - 2016-04-26 00:55:29.571; org.eclipse.jetty.server.Response;
> > Committed before 500 {trace=org.eclipse.jetty.io.EofException
> >
> >
> > Another time we saw this happen, we had java OOM errors (expanded logs at
> > the bottom):
> >
> > WARN  - 2016-04-25 22:58:43.943; org.eclipse.jetty.servlet.ServletHandler;
> > Error for /solr/panopto/select
> > java.lang.OutOfMemoryError: Java heap space
> > ERROR - 2016-04-25 22:58:43.945; org.apache.solr.common.SolrException;
> > null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
> > ...
> > Caused by: java.lang.OutOfMemoryError: Java heap space
> >
> >
> > When the cloud goes into recovery during live indexing, it takes about
> > 4-6 hours for a node to recover, but when we turn off indexing, recovery
> > only takes about 90 minutes.
> >
> > Moreover, we see that deletes are extremely slow. We do batch deletes of
> > about 300 documents based on two value filters, and this takes about one
> > minute:
> >
> > Research online suggests that a larger disk cache
> > <https://wiki.apache.org/solr/SolrPerformanceProblems> could be helpful,
> > but I also see from an older page
> > <http://wiki.apache.org/lucene-java/ImproveSearchingSpeed> on tuning for
> > Lucene that turning down the swappiness on our Linux instances may be
> > preferred to simply increasing space for the disk cache.
> >
> > Moreover, to scale in the past, we've simply rolled our cluster while
> > increasing the memory on the new machines, but I wonder if we're hitting
> > the limit for how much we should scale vertically. My impression is that
> > sharding will allow us to warm searchers faster and maintain a more
> > effective cache as we scale. Will we really be helped by sharding, or is
> > it only a matter of total CPU/Memory in the cluster?
> >
> > Thanks!
> >
> > Stephen
> >
> > (206)753-9320
> > stephen-lewis.net
> >
> > Logs:
> >
> > ERROR - 2016-04-26 00:56:43.873; org.apache.solr.common.SolrException;
> > null:org.eclipse.jetty.io.EofException
> > at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)
> > at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)
> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> > at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
> > at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
> > at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
> > at org.apache.solr.util.FastWriter.flush(FastWriter.java:141)
> > at org.apache.solr.util.FastWriter.flushBuffer(FastWriter.java:155)
> > at org.apache.solr.response.TextResponseWriter.close(TextResponseWriter.java:83)
> > at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:42)
> > at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:765)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:426)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> > at org.eclipse.jetty.server.Server.handle(Server.java:368)
> > at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> > at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
> > at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
> > at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
> > at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
> > at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> > at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
> > at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> > at java.lang.Thread.run(Thread.java:745)
> >
> > WARN  - 2016-04-25 22:58:43.943; org.eclipse.jetty.servlet.ServletHandler;
> > Error for /solr/panopto/select
> > java.lang.OutOfMemoryError: Java heap space
> > ERROR - 2016-04-25 22:58:43.945; org.apache.solr.common.SolrException;
> > null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
> > at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:793)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:434)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> > at org.eclipse.jetty.server.Server.handle(Server.java:368)
> > at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> > at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
> > at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
> > at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
> > at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
> > at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
> > at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
> > at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> > at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.OutOfMemoryError: Java heap space
> >
> > WARN  - 2016-04-26 00:56:43.873; org.eclipse.jetty.server.Response;
> > Committed before 500 {trace=org.eclipse.jetty.io.EofException
> > at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)
> > at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)
> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> > at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
> > at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
> > at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
> > at org.apache.solr.util.FastWriter.flush(FastWriter.java:141)
> > at org.apache.solr.util.FastWriter.flushBuffer(FastWriter.java:155)
> > at org.apache.solr.response.TextResponseWriter.close(TextResponseWriter.java:83)
> > at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:42)
> > at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:765)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:426)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> > at org.eclipse.jetty.server.Server.handle(Server.java:368)
> > at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> > at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
> > at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
> > at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
> > at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
> > at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> > at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
> > at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> > at java.lang.Thread.run(Thread.java:745)
> > ,code=500}
>



-- 
Stephen

(206)753-9320
stephen-lewis.net
