Hi there,

I've been working on this issue for a while and I really don't know what the 
root cause is.  Any insight would be great!

I have 14 million records in a MySQL DB.  I fetch 100,000 records from the DB 
at a time and then use ConcurrentUpdateSolrServer (with queue size = 50, thread 
count = 4, and the internally managed Solr client) to write the documents 
to the Solr index.
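
A minimal sketch of the kind of paging loop described above, using only the 
Java standard library so it runs without Solr. DocSink, BatchIndexSketch, 
pageCount and indexPage are hypothetical names for illustration, not SolrJ 
classes and not the actual build code. The sink is assumed to fail 
synchronously, which is a simplification: ConcurrentUpdateSolrServer sends 
documents from background threads, so a real build would have to collect 
failures from the client's error handling rather than from the add call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Solr client; NOT part of SolrJ.
interface DocSink {
    void add(String docId) throws Exception;
}

class BatchIndexSketch {
    // Number of pages needed to cover totalRecords at pageSize per fetch.
    static int pageCount(long totalRecords, int pageSize) {
        return (int) ((totalRecords + pageSize - 1) / pageSize);
    }

    // Sends every id in one page to the sink and returns the ids that failed,
    // so dropped records can be listed and retried instead of silently lost.
    static List<String> indexPage(List<String> ids, DocSink sink) {
        List<String> failed = new ArrayList<>();
        for (String id : ids) {
            try {
                sink.add(id);
            } catch (Exception e) {
                failed.add(id);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // 14 million records fetched 100,000 at a time.
        System.out.println("pages: " + pageCount(14_000_000L, 100_000));

        // Toy sink that rejects one document (assumption for the demo only).
        DocSink flaky = id -> {
            if (id.equals("b")) {
                throw new Exception("early EOF");
            }
        };
        List<String> failed = indexPage(Arrays.asList("a", "b", "c"), flaky);
        System.out.println("failed ids: " + failed);
    }
}
```

Recording failed ids per page makes it possible to say exactly which records 
were dropped in a given build and to resend only those.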

If I build metadata only (i.e., only from the DB to Solr), then the index build 
takes 4 hours with no errors.

But if I build metadata + OCR text (the OCR text is stored on the file system 
and can be very large), then the index build takes 15 to 16 hours and I often 
get a few early EOF errors on the Solr server.
From solr.log:
INFO  - 2014-06-13 06:28:27.113; 
org.apache.solr.update.processor.LogUpdateProcessor; [ltdl3testperf] 
webapp=/solr path=/update params={wt=javabin&version=2} {add=[trpy0136 
(1470801743195406336), nfhc0136 (1470801743199600640), sfhc0136 
(1470801743205892096), kghc0136 (1470801743218475008), zfhc0136 
(1470801743220572160), jghc0136 (1470801743237349376), rghc0136 
(1470801743268806656), ffhc0136 (1470801743270903808), pghc0136 
(1470801743285583872), sghc0136 (1470801743286632448), ... (14165 adds)]} 0 
260102
ERROR - 2014-06-13 06:28:27.114; org.apache.solr.common.SolrException; java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException] early EOF
        at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
…

We tried increasing the Solr server from 4 to 6 CPUs.  We moved the Solr server 
to a faster disk.  I reduced the queue size for 
ConcurrentUpdateSolrServer from 100 to 50.  But we still cannot consistently 
complete a full index build without any early EOF errors.

In my past three builds (I build them overnight):

  1.  The first one succeeded
  2.  The second one had one early EOF error and dropped 3 records out of 14 
million
  3.  The third one had many early EOFs and dropped around 200,000 records

One cluster of the errors occurred at around 6:28 AM.  I looked at the CPU and 
file I/O stats around that time and didn't see anything out of the ordinary.

> sar
06:00:01 AM     all     42.13      0.00      1.54      2.13      0.00     54.20
06:10:01 AM     all     43.30      0.00      1.68      2.77      0.00     52.24
06:20:01 AM     all     47.73      0.00      1.83      2.43      0.00     48.01
06:30:01 AM     all     47.71      0.00      1.76      3.15      0.00     47.38
06:40:01 AM     all     47.01      0.00      1.68      2.55      0.00     48.76

> sar -d
06:00:01 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
06:20:01 AM    dev8-0      1.84      2.35    370.95    203.01      0.05     27.60      9.58      1.76
06:20:01 AM   dev8-16     83.05    464.90  44384.81    540.05     13.25    160.17      2.53     21.03
06:20:01 AM   dev8-32      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
06:20:01 AM  dev253-0      1.41      1.71     10.90      8.95      0.01     10.16      3.03      0.43
06:20:01 AM  dev253-1     45.09      0.64    360.06      8.00      2.46     54.66      0.30      1.37
06:20:01 AM  dev253-2   5513.98    464.90  44092.00      8.08   1623.60    295.54      0.04     21.04
06:30:01 AM    dev8-0      2.52    100.62     83.64     72.99      0.03     10.42      6.59      1.66
06:30:01 AM   dev8-16     52.56   1502.75  18736.64    385.06      5.67    107.95      2.17     11.42
06:30:01 AM   dev8-32     42.55      0.01  38923.71    914.83     15.33    360.27      3.84     16.35
06:30:01 AM  dev253-0      3.03     98.24     13.55     36.93      0.03      9.44      2.99      0.90
06:30:01 AM  dev253-1      9.06      2.38     70.09      8.00      0.26     29.19      0.84      0.77
06:30:01 AM  dev253-2   7216.35   1502.76  57660.35      8.20   2599.49    360.22      0.04     26.58


Does anyone have any suggestions of where I can dig for the root cause?


Thanks!
Rebecca Tang
Applications Developer, UCSF CKM
Legacy Tobacco Document Library<legacy.library.ucsf.edu/>
E: rebecca.t...@ucsf.edu
