One more thing that might help to identify the source of the problem (which I've only just discovered): at the time the optimize "finished" (or rather broke off), Tomcat logged the following:
r...@cms004:~# /opt/apache-tomcat-sba1-live/logs/catalina.2009-01-16.log
...
Jan 16, 2009 11:40:14 AM org.apache.solr.core.SolrCore execute
INFO: /update/csv header=false&separator=;&commit=true&fieldnames=key,siln,biln,prln,date,validfrom,contractno,currency,reserve1,reserve2,posno,processingcode,ean,ek,pb,quantityunit,p_s_ek1,p_s_m1,p_s_q1,p_s_ek2,p_s_m2,p_s_q2,p_s_ek3,p_s_m3,p_s_q3,p_s_ek4,p_s_m4,p_s_q4,p_s_ek5,p_s_m5,p_s_q5,p_s_ek6,p_s_m6,p_s_q6,p_s_ek7,p_s_m7,p_s_q7,p_s_ek8,p_s_m8,p_s_q8,p_s_ek9,p_s_m9,p_s_q9,p_s_ek10,p_s_m10,p_s_q10,a_s_ek1,a_s_m1,a_s_q1,a_s_ek2,a_s_m2,a_s_q2,a_s_ek3,a_s_m3,a_s_q3,a_s_ek4,a_s_m4,a_s_q4,a_s_ek5,a_s_m5,a_s_q5,a_s_ek6,a_s_m6,a_s_q6,a_s_ek7,a_s_m7,a_s_q7,a_s_ek8,a_s_m8,a_s_q8,a_s_ek9,a_s_m9,a_s_q9,a_s_ek10,a_s_m10,a_s_q10&stream.file=/opt/asdf/bla/bla 3754
Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Jan 16, 2009 12:08:01 PM org.apache.solr.core.SolrException log
SEVERE: java.io.IOException: No space left on device
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:456)
    at org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:589)
    at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
    at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
    at org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
    at org.apache.lucene.store.FSDirectory$FSIndexOutput.close(FSDirectory.java:594)
    at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:48)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:211)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
    at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1835)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1195)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:508)
    at de.businessmart.cai.solr.CAIUpdateHandler.commit(CAIUpdateHandler.java:343)
    at org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:214)
    at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:191)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:159)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:833)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:639)
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1285)
    at java.lang.Thread.run(Thread.java:595)
...

So a new question is: how much free disk space does an optimize need?
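In the meantime I've put a guard in front of the nightly optimize so it fails early instead of dying halfway through a merge. This is just a sketch: the index path is from our setup, and the factor of two is my own conservative assumption (the old segments and the newly merged segment coexist on disk until the merge completes, and more can be pinned if a reader still holds the old files):

```shell
#!/bin/sh
# Sketch: refuse to trigger the optimize unless free disk space is at
# least twice the current index size. The 2x margin is an assumption
# (old segments and the merged segment coexist until the merge ends);
# the index path in the usage line is from our setup.

check_optimize_space() {
    index_dir=$1
    index_kb=$(du -sk "$index_dir" | awk '{print $1}')
    free_kb=$(df -Pk "$index_dir" | awk 'NR == 2 {print $4}')
    if [ "$free_kb" -lt $((2 * index_kb)) ]; then
        echo "refusing to optimize: ${free_kb} kB free," \
             "want $((2 * index_kb)) kB" >&2
        return 1
    fi
}

# Usage: check_optimize_space /opt/solr1/data/index && /opt/solr1/bin/optimize
```

If the check trips, the cron job just logs and exits instead of leaving half-written segments behind on a full partition.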
The size of the index again, on top of what is already there?

Dominik

Dominik Schramm wrote:
> Hello,
>
> we are running "Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02
> 17:35:12" with "Lucene Implementation Version: build 2007-05-20" in a
> Tomcat application server "Apache Tomcat/5.5.20" on 64-bit Ubuntu 7.10.
>
> For some time now (probably due to the continuous growth of the index,
> which is now roughly 40 GB in size) we have been experiencing a problem
> with deleted but still growing index files:
>
> r...@cms004:~# lsof | grep deleted
> ...
> java 10601 root 84u REG 8,9 237359104 2981966 /opt/solr1/data/index/_3tmy9.frq (deleted)
> java 10601 root 85u REG 8,9 120507392 2981967 /opt/solr1/data/index/_3tmy9.prx (deleted)
> java 10601 root 86u REG 8,9  14528512 2981968 /opt/solr1/data/index/_3tmy9.tis (deleted)
> java 10601 root 87u REG 8,9    233472 2981971 /opt/solr1/data/index/_3tmy9.tii (deleted)
> ...
> r...@cms004:~# ps -fp 10601
> UID   PID   PPID C STIME TTY   TIME     CMD
> root  10601 1    82 Jan15 pts/2 20:37:04 /usr/lib/jvm/java-1.5.0-sun-1.5.0.13/bin/java -Djava.awt.headless=true -
> r...@cms004:~#
>
> During the nightly runs of the optimize.pl script, the number of files
> marked as deleted increases and drops again, but not to zero!
> Several large files always remain even after the optimizer has finished
> (the lsof snapshot above was taken after an optimize run). Even more
> important is the fact that the Tomcat process still writes to them,
> eventually filling up the partition. The only workaround right now is
> to restart the Tomcat process once a day. This is not super critical,
> because we run a staging environment, but it is a nuisance.
>
> I've noticed that the commit preceding the optimize always fails:
>
> 2009/01/16 12:08:01 started by d
> 2009/01/16 12:08:01 command: /opt/solr1/bin/commit
> 2009/01/16 12:08:04 commit request to Solr at http://localhost:8083/solr1/update failed:
> 2009/01/16 12:08:04 <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int name="QTime">2423</int></lst>
> </response>
> 2009/01/16 12:08:04 failed (elapsed time: 3 sec)
>
> What can I do about this? Is this a known phenomenon in version 1.2.0
> or with Tomcat etc., and has it been fixed in subsequent versions? I
> couldn't find any specific hints in the changelogs. An upgrade would be
> non-trivial and time-consuming, so I would like to make sure beforehand
> that the problem will actually go away.
>
> BTW: Danilo Fantinato described what is presumably the same problem on
> Thu, 27 Sep 2007 16:37:01 GMT (when the version we are still using was
> more or less current); the subject was "Problem with handle hold deleted
> files" -- there was no reply. See here:
> http://www.nabble.com/Problem-with-handle-hold-deleted-files-td12925293.html
>
> Thanks in advance for any help.
>
> Dominik
>
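PS: Until the root cause is found, we are considering triggering the Tomcat restart from a threshold rather than a fixed daily schedule, based on how much space the deleted-but-open handles are actually pinning. A sketch of the measurement (the function names are mine; it assumes the stock lsof column layout, where field 7 is SIZE/OFF, as in the output quoted above):

```shell
#!/bin/sh
# Sketch: measure how much disk space is still pinned by files a process
# holds open after deletion (the _3tmy9.* handles quoted above). Assumes
# the default lsof column layout, where field 7 is SIZE/OFF.

sum_deleted() {
    # stdin: lsof output lines; prints the total size of "(deleted)" files
    awk '/\(deleted\)/ { bytes += $7 } END { printf "%d\n", bytes + 0 }'
}

deleted_bytes_held() {
    # $1 = pid of the process to inspect (e.g. the Tomcat java process)
    lsof -p "$1" 2>/dev/null | sum_deleted
}

# Usage: deleted_bytes_held 10601
```

A cron job could then restart Tomcat only once this number exceeds some threshold, instead of unconditionally once a day.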