You can't, because it never reports them. We might be building with 
HttpSolrServer instead.

wunder

On Jun 13, 2014, at 11:57 AM, Shawn Heisey <s...@elyograg.org> wrote:

> On 6/13/2014 12:06 PM, Tang, Rebecca wrote:
>> I've been working with this issue for a while and I really don’t know what 
>> the root cause is.  Any insight would be great!
>> 
>> I have 14 million records in a mysql DB.  I grab 100,000 records from the DB 
>> at a time and then use ConcurrentUpdateSolrServer (with queue size = 50 and 
>> thread count = 4 and using the internally managed solr client) to write the 
>> documents to the solr index.
> 
> A side note, not directly related to your problem:
> ConcurrentUpdateSolrServer will swallow all indexing exceptions.  In
> real terms, this means that you will *never* be notified that anything
> failed - from the point of view of your SolrJ application, indexing will
> always succeed, even if your Solr server is completely powered off.
> 
> Instead of using ConcurrentUpdateSolrServer, use HttpSolrServer and
> configure your application to do indexing with several threads. 
> HttpSolrServer is completely threadsafe.
> 
>> If I build metadata only (i.e., only from DB to Solr), then the index build 
>> takes 4 hrs with no errors.
>> 
>> But if I build metadata + OCR text (the OCR text is stored on the file system 
>> and can be very large), then the index build takes 15 – 16 hrs and I often 
>> get a few early EOF errors on the Solr server.
>> From Solr.log:
>> INFO  - 2014-06-13 06:28:27.113; 
>> org.apache.solr.update.processor.LogUpdateProcessor; [ltdl3testperf] 
>> webapp=/solr path=/update params={wt=javabin&version=2} {add=[trpy0136 
>> (1470801743195406336), nfhc0136 (1470801743199600640), sfhc0136 
>> (1470801743205892096), kghc0136 (1470801743218475008), zfhc0136 
>> (1470801743220572160), jghc0136 (1470801743237349376), rghc0136 
>> (1470801743268806656), ffhc0136 (1470801743270903808), pghc0136 
>> (1470801743285583872), sghc0136 (1470801743286632448), ... (14165 adds)]} 0 
>> 260102
>> ERROR - 2014-06-13 06:28:27.114; org.apache.solr.common.SolrException; 
>> java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException] 
>> early EOF
>>        at 
>> com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
> 
> EofException from Jetty means one specific thing:  The client software
> disconnected before Solr was finished with the request and sent its
> response.  Chances are good that this is because of a configured socket
> timeout on your SolrJ client or its HttpClient.  This might have been
> done with the setSoTimeout method on the server object.
> 
> If you must configure a socket timeout, make it VERY long -- longer than
> a single request is going to take, which often means several minutes.
> 
> Thanks,
> Shawn
> 
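For reference, here is a minimal sketch of the pattern Shawn describes above: several worker threads, each making a blocking call, so a failed batch surfaces as an exception instead of being swallowed. It uses only the JDK. BatchSender is a hypothetical stand-in for a blocking call such as HttpSolrServer.add(docs); the class and method names are made up for illustration and are not part of SolrJ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadedIndexer {

    /** Hypothetical stand-in for a blocking call such as HttpSolrServer.add(docs). */
    interface BatchSender {
        void send(List<String> batch) throws Exception;
    }

    /** Sends every batch on a fixed-size pool and returns how many batches failed. */
    static int indexAll(List<List<String>> batches, BatchSender sender, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Void>> results = new ArrayList<>();
        try {
            for (List<String> batch : batches) {
                Callable<Void> task = () -> {
                    sender.send(batch); // blocking; any exception is kept by the Future
                    return null;
                };
                results.add(pool.submit(task));
            }
            int failures = 0;
            for (Future<Void> result : results) {
                try {
                    result.get(); // rethrows the worker's exception, if there was one
                } catch (ExecutionException e) {
                    failures++;
                    System.err.println("Batch failed: " + e.getCause());
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real indexer the sender would wrap one shared HttpSolrServer instance, and failed batches would probably be retried or logged with their document IDs rather than just counted.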

--
Walter Underwood
wun...@wunderwood.org


