Usually EOF errors indicate that the packets you're sending are too big. Wait, though. 50K is not buffered docs; I think it's buffered _requests_. So you're creating a ginormous queue and asking 2 threads to empty it.
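To make the queue point concrete, here is a rough sketch in plain Java. It is not SolrJ code: it only assumes, as I read it, that the second constructor argument bounds a queue of pending requests that the worker threads drain. The class name BoundedQueueSketch, the method drain and the integer "requests" are all made up for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from SolrJ.
class BoundedQueueSketch {

    // Pushes totalRequests fake requests through a queue holding at most
    // queueSize entries, drained by threadCount workers.
    // Returns the number of requests the workers processed.
    static int drain(int queueSize, int threadCount, int totalRequests) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(queueSize);
        AtomicInteger sent = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        int id = queue.take();   // waits while the queue is empty
                        if (id < 0) {
                            return;              // negative id = stop signal
                        }
                        sent.incrementAndGet();  // stand-in for sending one request
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }

        try {
            for (int i = 0; i < totalRequests; i++) {
                queue.put(i);                    // blocks once queueSize entries are pending
            }
            for (int i = 0; i < threadCount; i++) {
                queue.put(-1);                   // one stop signal per worker
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return sent.get();
    }

    public static void main(String[] args) {
        // Small queue, same 2 workers: the producer is throttled instead of
        // piling up tens of thousands of pending requests.
        System.out.println(drain(100, 2, 10000));
    }
}
```

With a small queueSize the producer simply waits in put() until a worker frees a slot, so very little is pending at any moment. With 50000, a lot of work can pile up behind those 2 threads.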
But that's not really the issue, I suspect. How many documents are you adding at a time when you call server.add? I.e., are you using server.add(doc) or server.add(doclist)? If the latter and you're adding a bunch of docs, try lowering that number. If you're sending one doc at a time, I'm on the wrong track.

Best
Erick

On Thu, Jul 18, 2013 at 2:51 PM, Beale, Jim (US-KOP) <jim.be...@hibu.com> wrote:
> Hey folks,
>
> I've been migrating an application which indexes about 15M documents from
> straight-up Lucene into SolrCloud. We've set up 5 Solr instances with a 3
> zookeeper ensemble using HAProxy for load balancing. The documents are
> processed on a quad core machine with 6 threads and indexed into SolrCloud
> through HAProxy using ConcurrentUpdateSolrServer in order to batch the
> updates. The indexing box is heavily-loaded during indexing but I don't
> think it is so bad that it would cause issues.
>
> I'm using Solr 4.3.1 on client and server side, zookeeper 3.4.5 and HAProxy
> 1.4.22.
>
> I've been accepting the default HttpClient with 50K buffered docs and 2
> threads, i.e.,
>
> int solrMaxBufferedDocs = 50000;
> int solrThreadCount = 2;
> solrServer = new ConcurrentUpdateSolrServer(solrHttpIPAddress,
>     solrMaxBufferedDocs, solrThreadCount);
>
> autoCommit is configured in the solrconfig as follows:
>
> <autoCommit>
>   <maxTime>600000</maxTime>
>   <maxDocs>500000</maxDocs>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> I'm getting the following errors on the client and server sides respectively:
>
> Client side:
>
> 2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-4] INFO
> SystemDefaultHttpClient - I/O exception (java.net.SocketException) caught
> when processing request: Software caused connection abort: socket write error
> 2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-4] INFO
> SystemDefaultHttpClient - Retrying request
> 2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-5] INFO
> SystemDefaultHttpClient - I/O exception (java.net.SocketException) caught
> when processing request: Software caused connection abort: socket write error
> 2013-07-16 19:02:47,002 [concurrentUpdateScheduler-1-thread-5] INFO
> SystemDefaultHttpClient - Retrying request
>
> Server side:
>
> 7988753 [qtp1956653918-23] ERROR org.apache.solr.core.SolrCore –
> java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException]
> early EOF
>     at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
>     at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
>     at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
>     at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
>     at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:393)
>
> When I disabled autoCommit on the server side, I didn't see any errors there
> but I still get the issue client-side after about 2 million documents - which
> is about 45 minutes.
>
> Has anyone seen this issue before? I couldn't find anything useful on the
> usual places.
>
> I suppose I could setup wireshark to see what is happening but I'm hoping
> that someone has a better suggestion.
>
> Thanks in advance for any help!
>
>
> Best regards,
> Jim Beale
>
> hibu.com
> 2201 Renaissance Boulevard, King of Prussia, PA, 19406
> Office: 610-879-3864
> Mobile: 610-220-3067
>
> The information contained in this email message, including any attachments,
> is intended solely for use by the individual or entity named above and may be
> confidential. If the reader of this message is not the intended recipient,
> you are hereby notified that you must not read, use, disclose, distribute or
> copy any part of this communication. If you have received this communication
> in error, please immediately notify me by email and destroy the original
> message, including any attachments. Thank you.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org