From my own test program; there's nothing standard here. As the blog points out, I was indexing fairly simple documents, so you should _not_ expect to see those indexing rates. The point of the article was just to show the _relative_ changes when I sent batches.
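For illustration only, here is a minimal sketch of how a test program might arrive at a docs/second figure for a given batch size. It is not the code behind the blog numbers. The names `batches` and `docs_per_second` are hypothetical, and `send` is a placeholder for whatever client call actually posts one batch to Solr.

```python
import time
from typing import Callable, Iterator, Sequence


def batches(docs: Sequence[dict], batch_size: int) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


def docs_per_second(
    docs: Sequence[dict],
    batch_size: int,
    send: Callable[[Sequence[dict]], None],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Send docs in batches and return documents sent per elapsed second.

    `send` is a hypothetical callback; in a real test it would post the
    batch to the Solr update handler. `clock` is injectable so the
    timing can be replaced with a fake clock in tests.
    """
    started = clock()
    sent = 0
    for batch in batches(docs, batch_size):
        send(batch)
        sent += len(batch)
    elapsed = clock() - started
    # Guard against a zero interval on very fast runs.
    return sent / elapsed if elapsed > 0 else float("inf")
```

Running this over the same document set with several batch sizes gives rates that are only meaningful relative to each other, which is all the article was trying to show.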
Best,
Erick

On Wed, Aug 17, 2016 at 1:59 PM, Jaspal Sawhney <jsawh...@sapient.com> wrote:
> Erick
> Going through the article which you shared. Where are you getting the
> Docs/second value?
> Thanks
>
> On 8/17/16, 4:37 PM, "Jaspal Sawhney" <jsawh...@sapient.com> wrote:
>
>>Erick
>>Thanks - My batch size was 30 and thread size also 30.
>>Thanks
>>
>>On 8/17/16, 3:48 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>>
>>>What this probably indicates is that the size of the packets you send
>>>to Solr is large enough that it exceeds the transport protocol's
>>>limit. This is reinforced by your statement that reducing the batch
>>>size fixes the problem even though it increases indexing time.
>>>
>>>So the place I'd be looking is the jetty configurations for any limits
>>>there.
>>>
>>>That said, what is your batch size? In my testing I pretty quickly get
>>>into diminishing returns, here's a writeup from some time ago:
>>>https://lucidworks.com/blog/2015/10/05/really-batch-updates-solr-2/
>>>
>>>Best,
>>>Erick
>>>
>>>On Wed, Aug 17, 2016 at 12:03 PM, Jaspal Sawhney <jsawh...@sapient.com>
>>>wrote:
>>>> Bump !
>>>>
>>>> On 8/16/16, 10:53 PM, "Jaspal Sawhney" <jsawh...@sapient.com> wrote:
>>>>
>>>>>Hello
>>>>>We are running solr 4.6 in master-slave configuration wherein our
>>>>>master is used entirely for indexing. No search traffic comes to
>>>>>master ever.
>>>>>Of late we have started to get the early EOF error on the solr Master
>>>>>which results in a Broken Pipe error on the commerce application from
>>>>>where Indexing was kicked off.
>>>>>
>>>>>Things to mention
>>>>>
>>>>> 1. We have a couple of sites each of which has the same document
>>>>>    size but different document count.
>>>>> 2. This error is being observed in the site which has the highest
>>>>>    document count, i.e. 2204743
>>>>> 3. The way I have understood solr to work is that irrespective of
>>>>>    number of documents the throughput is controlled by the 'Number
>>>>>    of Threads' and 'Batch size' - Am I correct?
>>>>>     * In our case we have not touched the batch size and Number of
>>>>>       Threads when this error started coming
>>>>>     * However when I do touch these parameters (specifically reduce
>>>>>       them) the error does not come, however indexing time
>>>>>       increases a lot.
>>>>> 4. We have to index overnight daily because we put product prices in
>>>>>    the Index which get updated nightly
>>>>> 5. Solr master is running with a 20 GB Heap
>>>>>
>>>>>What we have tried
>>>>>
>>>>> 1. I disabled autoCommit (i.e. Hard commit) and put the
>>>>>    autoSoftCommit as 5 mins
>>>>>     * I realized afterwards that this was a wrong test because my
>>>>>       understanding of soft commit was incorrect. My understanding
>>>>>       now is that hard commit just truncates the tlog, so hard
>>>>>       commit should give better indexing performance.
>>>>>     * This test failed for lack of space, however because disabling
>>>>>       autoCommit did not make sense I did not retry this test yet.
>>>>> 2. Increased the RAMBufferSizeMB from 100MB to 1000MB
>>>>>     * This test did not yield anything favorable, the master gave
>>>>>       the early EOF exception
>>>>> 3. Increased the merge factor from 20 -> 100
>>>>>     * This test did not yield anything favorable, the master gave
>>>>>       the early EOF exception
>>>>> 4. Flipped the autoCommit to 15 secs and disabled auto commit
>>>>>     * This test did not yield anything favorable, the master gave
>>>>>       the early EOF exception
>>>>>     * I got the input for this from
>>>>>       https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>>>>       - Heavy (Bulk) Indexing section
>>>>> 5. Tried to bypass transaction log altogether. This test is
>>>>>    underway currently
>>>>>
>>>>>Questions
>>>>>
>>>>> 1. Since we are not using solrCloud I want to understand the impact
>>>>>    of bypassing transaction log
>>>>> 2. How does solr take documents which are sent to it to storage, as
>>>>>    in what is the journey of a document from segment to tlog to
>>>>>    storage.
>>>>>
>>>>>It would be great if there are any pointers which you can share.
>>>>>
>>>>>Thanks
>>>>>J./
>>>>>
>>>>>The actual Error Log
>>>>>ERROR - 2016-08-16 22:59:55.988; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: early EOF
>>>>>    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
>>>>>    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>>>>>    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>>>>    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>>>>    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
>>>>>    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
>>>>>    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)
>>>>>    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
>>>>>    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>>>>>    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>>>>>    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>>>>>    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>>>>>    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>>>>>    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>>>>>    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
>>>>>    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>>>>>    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>>>>>    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>>>>>    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>>>>>    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>>>>>    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>>>>>    at org.eclipse.jetty.server.Server.handle(Server.java:368)
>>>>>    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
>>>>>    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
>>>>>    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
>>>>>    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
>>>>>    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953)
>>>>>    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
>>>>>    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
>>>>>    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
>>>>>    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>>>>>    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>>>>>    at java.lang.Thread.run(Thread.java:745)
>>>>>Caused by: com.ctc.wstx.exc.WstxIOException: early EOF
>>>>>    at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
>>>>>    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
>>>>>    at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:389)
>>>>>    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
>>>>>    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
>>>>>    ... 32 more
>>>>>Caused by: org.eclipse.jetty.io.EofException: early EOF
>>>>>    at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:65)
>>>>>    at java.io.InputStream.read(InputStream.java:101)
>>>>>    at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
>>>>>    at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
>>>>>    at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
>>>>>    at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
>>>>>    at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
>>>>>    at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
>>>>>    at com.ctc.wstx.sr.StreamScanner.parseLocalName2(StreamScanner.java:1796)
>>>>>    at com.ctc.wstx.sr.StreamScanner.parseLocalName(StreamScanner.java:1756)
>>>>>    at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:2981)
>>>>>    at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2936)
>>>>>    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
>>>>>    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
>>>>>    ... 35 more
>>>>>
>>>>
>>
>