From my testing program; there's nothing standard here.

As the blog points out, since I was indexing fairly
simple documents you should _not_ be expecting to
see those indexing rates. The point of the article was
just to show the _relative_ changes when I sent
batches.
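
For what it's worth, a docs/second figure of that kind is just the number of documents sent divided by the wall-clock time for the run. Here is a minimal sketch in Python, not my actual test program: `send` is a hypothetical stand-in for whatever client call posts one batch to Solr, and the function names are made up for illustration.

```python
import time
from typing import Callable, Iterable, Iterator, List


def batches(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly short, batch
        yield batch


def docs_per_second(
    docs: Iterable[dict],
    batch_size: int,
    send: Callable[[List[dict]], None],  # hypothetical: posts one batch to Solr
) -> float:
    """Time a whole indexing run and return documents sent per second."""
    start = time.perf_counter()
    count = 0
    for batch in batches(docs, batch_size):
        send(batch)
        count += len(batch)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")
```

Run it over the same documents with a few different batch sizes and compare the ratios between runs, not the absolute numbers.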

Best,
Erick

On Wed, Aug 17, 2016 at 1:59 PM, Jaspal Sawhney <jsawh...@sapient.com> wrote:
> Erick
> Going through the article which you shared. Where are you getting the
> Docs/second value?
> Thanks
>
> On 8/17/16, 4:37 PM, "Jaspal Sawhney" <jsawh...@sapient.com> wrote:
>
>>Erick
>>Thanks - My batch size was 30 and the thread count was also 30.
>>Thanks
>>
>>On 8/17/16, 3:48 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>>
>>>What this probably indicates is that the size of the packets you send
>>>to Solr is large enough that it exceeds the transport protocol's
>>>limit. This is reinforced by your statement that reducing the batch
>>>size fixes the problem even though it increases indexing time.
>>>
>>>So the place I'd be looking is the jetty configurations for any limits
>>>there.
>>>
>>>That said, what is your batch size? In my testing I pretty quickly get
>>>into diminishing returns; here's a writeup from some time ago:
>>>https://lucidworks.com/blog/2015/10/05/really-batch-updates-solr-2/
>>>
>>>Best,
>>>Erick
>>>
>>>On Wed, Aug 17, 2016 at 12:03 PM, Jaspal Sawhney <jsawh...@sapient.com>
>>>wrote:
>>>> Bump!
>>>>
>>>> On 8/16/16, 10:53 PM, "Jaspal Sawhney" <jsawh...@sapient.com> wrote:
>>>>
>>>>>Hello
>>>>>We are running Solr 4.6 in a master-slave configuration in which the
>>>>>master is used entirely for indexing. No search traffic ever reaches
>>>>>the master.
>>>>>Of late we have started to get an early EOF error on the Solr master,
>>>>>which results in a Broken Pipe error in the commerce application from
>>>>>which indexing was kicked off.
>>>>>
>>>>>Things to mention
>>>>>
>>>>>  1.  We have a couple of sites - each of which has the same document
>>>>>size but a different document count.
>>>>>  2.  This error is being observed on the site which has the largest
>>>>>document count, i.e. 2204743
>>>>>  3.  The way I have understood Solr to work is that, irrespective of
>>>>>the number of documents, the throughput is controlled by the 'Number of
>>>>>Threads' and 'Batch size' - Am I correct?
>>>>>     *   In our case we had not touched the batch size or the number of
>>>>>threads when this error started appearing
>>>>>     *   However, when I do touch these parameters (specifically reduce
>>>>>them) the error does not occur - but indexing time increases a lot.
>>>>>  4.  We have to index overnight daily because we put product prices in
>>>>>the index, which get updated nightly
>>>>>  5.  Solr master is running with a 20 GB heap
>>>>>
>>>>>What we have tried
>>>>>
>>>>>  1.  I disabled autoCommit (i.e. hard commit) and set autoSoftCommit
>>>>>to 5 mins
>>>>>     *   I realized afterwards that this was the wrong test because my
>>>>>understanding of soft commit was incorrect. My understanding now is
>>>>>that a hard commit just truncates the tlog, so hard commits should give
>>>>>better indexing performance.
>>>>>     *   This test failed for lack of space; however, because
>>>>>disabling autoCommit did not make sense, I have not retried this test
>>>>>yet.
>>>>>  2.  Increased the ramBufferSizeMB from 100MB to 1000MB
>>>>>     *   This test did not yield anything favorable - the master gave
>>>>>the early EOF exception
>>>>>  3.  Increased the merge factor from 20 to 100
>>>>>     *   This test did not yield anything favorable - the master gave
>>>>>the early EOF exception
>>>>>  4.  Flipped the autoCommit to 15 secs and disabled auto commit
>>>>>     *   This test did not yield anything favorable - the master gave
>>>>>the early EOF exception
>>>>>     *   I got the input for this from
>>>>>https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>>>>- Heavy (Bulk) Indexing section
>>>>>  5.  Tried to bypass the transaction log altogether - This test is
>>>>>underway currently
>>>>>
>>>>>Questions
>>>>>
>>>>>  1.  Since we are not using SolrCloud, I want to understand the
>>>>>impact of bypassing the transaction log
>>>>>  2.  How does Solr take documents which are sent to it to storage, as
>>>>>in what is the journey of a document from segment to tlog to storage?
>>>>>
>>>>>It would be great if there are any pointers which you can share.
>>>>>
>>>>>Thanks
>>>>>J./
>>>>>
>>>>>The actual Error Log
>>>>>ERROR - 2016-08-16 22:59:55.988; org.apache.solr.common.SolrException;
>>>>>org.apache.solr.common.SolrException: early EOF
>>>>>        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
>>>>>        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>>>>>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>>>>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>>>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
>>>>>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
>>>>>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)
>>>>>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
>>>>>        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>>>>>        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>>>>>        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>>>>>        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>>>>>        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>>>>>        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>>>>>        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
>>>>>        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>>>>>        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>>>>>        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>>>>>        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>>>>>        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>>>>>        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>>>>>        at org.eclipse.jetty.server.Server.handle(Server.java:368)
>>>>>        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
>>>>>        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
>>>>>        at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
>>>>>        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
>>>>>        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953)
>>>>>        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
>>>>>        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
>>>>>        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
>>>>>        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>>>>>        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>>>>>        at java.lang.Thread.run(Thread.java:745)
>>>>>Caused by: com.ctc.wstx.exc.WstxIOException: early EOF
>>>>>        at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
>>>>>        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
>>>>>        at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:389)
>>>>>        at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
>>>>>        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
>>>>>        ... 32 more
>>>>>Caused by: org.eclipse.jetty.io.EofException: early EOF
>>>>>        at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:65)
>>>>>        at java.io.InputStream.read(InputStream.java:101)
>>>>>        at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
>>>>>        at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
>>>>>        at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
>>>>>        at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
>>>>>        at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
>>>>>        at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
>>>>>        at com.ctc.wstx.sr.StreamScanner.parseLocalName2(StreamScanner.java:1796)
>>>>>        at com.ctc.wstx.sr.StreamScanner.parseLocalName(StreamScanner.java:1756)
>>>>>        at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:2981)
>>>>>        at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2936)
>>>>>        at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
>>>>>        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
>>>>>        ... 35 more
