Sorry, meant to forward that to another developer at work.

--wunder

On Jun 13, 2014, at 1:03 PM, Walter Underwood <wun...@wunderwood.org> wrote:

> You can't, because it never reports them. We might be building with
> HttpSolrServer instead.
>
> wunder
>
> On Jun 13, 2014, at 11:57 AM, Shawn Heisey <s...@elyograg.org> wrote:
>
>> On 6/13/2014 12:06 PM, Tang, Rebecca wrote:
>>> I've been working with this issue for a while and I really don’t know what
>>> the root cause is. Any insight would be great!
>>>
>>> I have 14 million records in a mysql DB. I grab 100,000 records from the
>>> DB at a time and then use ConcurrentUpdateSolrServer (with queue size = 50
>>> and thread count = 4 and using the internally managed solr client) to write
>>> the documents to the solr index.
>>
>> A side note, not directly related to your problem:
>> ConcurrentUpdateSolrServer will swallow all indexing exceptions. In
>> real terms, this means that you will *never* be notified that anything
>> failed - from the point of view of your SolrJ application, indexing will
>> always succeed, even if your Solr server is completely powered off.
>>
>> Instead of using ConcurrentUpdateSolrServer, use HttpSolrServer and
>> configure your application to do indexing with several threads.
>> HttpSolrServer is completely threadsafe.
>>
>>> If I build metadata only (I.e. Only from DB to Solr), then the index build
>>> takes 4 hrs with no errors.
>>>
>>> But if I build metadata + ocr text (ocr text is stored on the file system
>>> and can be very large), then the index build takes 15 – 16 hrs and often
>>> times I get a few early EOF errors on the Solr server.
>>> From Solr.log:
>>> INFO - 2014-06-13 06:28:27.113;
>>> org.apache.solr.update.processor.LogUpdateProcessor; [ltdl3testperf]
>>> webapp=/solr path=/update params={wt=javabin&version=2} {add=[trpy0136
>>> (1470801743195406336), nfhc0136 (1470801743199600640), sfhc0136
>>> (1470801743205892096), kghc0136 (1470801743218475008), zfhc0136
>>> (1470801743220572160), jghc0136 (1470801743237349376), rghc0136
>>> (1470801743268806656), ffhc0136 (1470801743270903808), pghc0136
>>> (1470801743285583872), sghc0136 (1470801743286632448), ... (14165 adds)]} 0
>>> 260102
>>> ERROR - 2014-06-13 06:28:27.114; org.apache.solr.common.SolrException;
>>> java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException]
>>> early EOF
>>> at
>>> com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
>>
>> EofException from Jetty means one specific thing: The client software
>> disconnected before Solr was finished with the request and sent its
>> response. Chances are good that this is because of a configured socket
>> timeout on your SolrJ client or its HttpClient. This might have been
>> done with the setSoTimeout method on the server object.
>>
>> If you must configure a socket timeout, make it VERY long -- longer than
>> a single request is going to take, which often means several minutes.
>>
>> Thanks,
>> Shawn
>>
>
> --
> Walter Underwood
> wun...@wunderwood.org
>
>

--
Walter Underwood
wun...@wunderwood.org
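
For reference, here is a minimal sketch of the pattern Shawn describes in the quoted message: index from several threads, but keep every failure visible to the application. It uses only the JDK and is an illustration, not code from this thread. The `BatchSender` interface, the `indexAll` method, and the `ThreadedIndexer` class name are all hypothetical; `BatchSender` stands in for a blocking call on a shared client (in a SolrJ application that would presumably be an `add` call on a single HttpSolrServer instance).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadedIndexer {

    /** Hypothetical stand-in for a blocking, thread-safe send to the index. */
    public interface BatchSender<T> {
        void send(List<T> batch) throws Exception;
    }

    /**
     * Sends every batch on a fixed-size thread pool and returns the batches
     * whose send threw an exception, so the caller can log or retry them.
     */
    public static <T> List<List<T>> indexAll(List<List<T>> batches,
                                             BatchSender<T> sender,
                                             int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> results = new ArrayList<>();
            for (List<T> batch : batches) {
                Callable<Void> task = () -> {
                    sender.send(batch); // an exception here is captured by the Future
                    return null;
                };
                results.add(pool.submit(task));
            }

            List<List<T>> failed = new ArrayList<>();
            for (int i = 0; i < results.size(); i++) {
                try {
                    results.get(i).get(); // blocks; rethrows the worker's exception, wrapped
                } catch (ExecutionException e) {
                    System.err.println("Batch " + i + " failed: " + e.getCause());
                    failed.add(batches.get(i));
                }
            }
            return failed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("bad"),
                Arrays.asList("c"));

        // Simulated sender: fails for any batch containing "bad".
        List<List<String>> failed = indexAll(batches, batch -> {
            if (batch.contains("bad")) {
                throw new java.io.IOException("simulated indexing failure");
            }
        }, 2);

        System.out.println("Failed batches: " + failed); // prints "Failed batches: [[bad]]"
    }
}
```

Each worker's exception is held by its `Future` and rethrown from `get()`, so a failed batch is reported to the caller instead of being dropped silently.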