Own Similarity Class in Solr
Hello,

I would like to alter the similarity behaviour of Solr to remove the fieldnorm factor from the similarity calculations. As far as I have read, I need to create my own similarity class and register it with Solr through the config in schema.xml.

Has anybody already tweaked or played with this, and could give me some code or advice? That would be really great.

Thanks,
Best Greetings,
Tom
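A minimal sketch of what such a class might look like. To keep it compilable without the Lucene jar, it uses a stand-in base class (StandInDefaultSimilarity, a name invented here); the assumption is that in Lucene 2.x the real hook is lengthNorm(String fieldName, int numTerms) on org.apache.lucene.search.DefaultSimilarity, and that Solr picks up the class through a <similarity class="..."/> element in schema.xml. Norms are written at index time, so documents would likely need to be reindexed before a change like this shows up in scores.

```java
// Stand-in for Lucene's DefaultSimilarity so this sketch compiles on its own.
// Assumption: with Lucene 2.x on the classpath you would instead extend
// org.apache.lucene.search.DefaultSimilarity and override the same method.
class StandInDefaultSimilarity {
    /** Default length normalisation: shorter fields get a larger norm, 1/sqrt(numTerms). */
    public float lengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }
}

/** Hypothetical custom similarity that neutralises the length part of the field norm. */
class NoLengthNormSimilarity extends StandInDefaultSimilarity {
    @Override
    public float lengthNorm(String fieldName, int numTerms) {
        // Same norm for every field length, so field length no longer affects the score.
        return 1.0f;
    }
}

final class SimilarityDemo {
    public static void main(String[] args) {
        StandInDefaultSimilarity byLength = new StandInDefaultSimilarity();
        StandInDefaultSimilarity flat = new NoLengthNormSimilarity();
        int[] fieldLengths = {1, 4, 100};
        for (int n : fieldLengths) {
            System.out.println(n + " terms: default=" + byLength.lengthNorm("body", n)
                    + " custom=" + flat.lengthNorm("body", n));
        }
    }
}
```

Note that this only flattens the length component; index-time document and field boosts are also folded into the stored norm and are not touched by this override.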
Recompilation of latest lucene seems to break update of Solr
Hello,

I just compiled the latest version of lucene (), and updated webapps/solr/WEB-INF/lib/ with the 3 jar files:

lucene-core-2.0.1-dev.jar
lucene-snowball-2.0.1-dev.jar
lucene-highlighter-2.0.1-dev.jar

I restarted Solr. The admin interface of Solr is still running, but trying to send updates to Solr gives me the following error:

java.lang.NoClassDefFoundError: org/apache/lucene/document/Fieldable
    at org.apache.solr.core.SolrCore.update(SolrCore.java:673)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:52)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:595)

I read on the Lucene mailing lists that they changed Field into Fieldable:

> LUCENE-545 that was recently committed breaks backward compatibility
> with Document.getField(), a non-expert level API that is *very* widely
> used. Something simple like Field x = mydoc.getField("x"); no longer
> compiles (and neither do other methods with Field in the signature).
> Is this intentional? If not, uses of "Field" in unit tests should not
> have been changed to Fieldable.

Does anybody have an idea how to solve my problem? I just want to change some things in Lucene itself and replace the jar files in Solr with the newly compiled ones.

Thanks,
Tom
Recompilation of latest lucene seems to break update of Solr - Addendum
Hi again,

with the latest nightly build of Solr and the latest Lucene, the error still persists, but the error message is different. Just to be complete, here is the actual error message:

java.lang.NoSuchMethodError: org.apache.lucene.document.Document.add(Lorg/apache/lucene/document/Fieldable;)V
    at org.apache.solr.update.DocumentBuilder.addSingleField(DocumentBuilder.java:64)
    at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:70)
    at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:80)
    at org.apache.solr.core.SolrCore.readDoc(SolrCore.java:915)
    at org.apache.solr.core.SolrCore.update(SolrCore.java:685)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:52)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:595)

Tom
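Both a NoClassDefFoundError and a NoSuchMethodError at this point usually indicate that the Lucene jars found at runtime are not the ones Solr was compiled against. One way to check, sketched below under that assumption, is to ask the class loader whether a class such as org.apache.lucene.document.Fieldable is visible and which jar it comes from. ClasspathProbe is a name made up for this example; to be meaningful inside Tomcat it would have to run within the Solr webapp (for instance from a JSP), so that the webapp's own class loader is the one consulted.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper: reports whether a class is visible and where it was loaded from. */
final class ClasspathProbe {

    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initialisers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory a class was loaded from, or a placeholder string. */
    static String origin(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source
            return src == null ? "(bootstrap/JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.lucene.document.Fieldable",
            "org.apache.lucene.document.Field",
            "org.apache.lucene.document.Document"
        };
        for (String name : names) {
            System.out.println(name + " -> " + origin(name));
        }
    }
}
```

If Fieldable is reported as not found while Field and Document resolve to a lucene-core jar, the jar on the classpath predates the Fieldable change; if several Lucene jars show up, an older copy may be shadowing the new one.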
Re: Recompilation of latest lucene seems to break update of Solr
Hi Tom,

I had fixed the LUCENE-545 backward incompatibility in Lucene here: http://issues.apache.org/jira/browse/LUCENE-609

Although it shouldn't be necessary, maybe it would work if you put the new Lucene libs in Solr's lib dir and rebuild Solr?

-Yonik
Recompilation of latest lucene seems to break update of Solr - Solution
Hi again,

sorry to spam the list, but I wanted to share the solution that got the system to compile. I had to use the latest Lucene version, which is not available on their site but only in the Subversion repository (version called 2.1 - 419723 2006-07-06 22:14:07). To get this version, use the "svn" tool with the following syntax:

svn checkout http://svn.apache.org/repos/asf/lucene/java/trunk/ lucene

This version compiles, Solr compiles against it as well, and the application runs with the just-built .jar files.

Best Greetings,
Tom
Doc add limit
Hey there... I'm having an issue with large doc updates on my Solr installation. I'm adding in batches between 2-20,000 docs at a time and I've noticed Solr seems to hang at 6,144 docs every time. Breaking the adds into smaller batches works just fine, but I was wondering if anyone knew why this would happen. I've tried doubling memory as well as tweaking various config options but nothing seems to let me break the 6,144 barrier.

This is the output from Solr admin. Any help would be greatly appreciated.

name: updateHandler
class: org.apache.solr.update.DirectUpdateHandler2
version: 1.0
description: Update handler that efficiently directly updates the on-disk main lucene index
stats:
    commits : 0
    optimizes : 0
    docsPending : 6144
    deletesPending : 6144
    adds : 6144
    deletesById : 0
    deletesByQuery : 0
    errors : 0
    cumulative_adds : 6144
    cumulative_deletesById : 0
    cumulative_deletesByQuery : 0
    cumulative_errors : 0
    docsDeleted : 0
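The workaround mentioned above, breaking the adds into smaller batches, can be sketched as follows. BatchSplitter and the batch size of 1,000 are assumptions made for illustration and do not come from the thread; sending each batch to Solr's update URL is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that cuts a large list of documents into fixed-size batches. */
final class BatchSplitter {

    /** Splits items into consecutive batches of at most batchSize elements, preserving order. */
    static <T> List<List<T>> split(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> docIds = new ArrayList<>();
        for (int i = 0; i < 20000; i++) {
            docIds.add(i);
        }
        // 1000 is an arbitrary size, chosen only to stay well below the observed 6,144
        List<List<Integer>> batches = split(docIds, 1000);
        System.out.println(batches.size() + " batches to post one after another");
    }
}
```

Posting one batch per HTTP request and reading the full response before sending the next keeps each request small, which is the behaviour reported to work here.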
Re: Doc add limit
It's possible it's not hanging, but just takes a long time on a specific add. This is because Lucene will occasionally merge segments. When very large segments are merged, it can take a long time.

In the log file, add commands are followed by the number of milliseconds the operation took. Next time Solr hangs, wait for a number of minutes until you see the operation logged and note how long it took.

How many documents are in the index before you do a batch that causes a hang? Does it happen on the first batch? If so, you might be seeing some other bug.

What appserver are you using? Do the admin pages respond when you see this hang? If so, what does a stack trace look like?

-Yonik
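To pull those timings out of a log, something like the sketch below could be used. It assumes add lines of the form "INFO: add (id=110705) 0 36596", as in the log excerpt posted later in this thread, and takes the trailing number to be the milliseconds figure described above; the meaning of the number before it is not stated in the thread. AddLogTimes is a made-up name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that extracts the trailing number from Solr "add" log lines. */
final class AddLogTimes {

    // Groups: 1 = document id, 2 = first number, 3 = trailing number (assumed milliseconds)
    private static final Pattern ADD_LINE =
            Pattern.compile("INFO: add \\(id=([^)]+)\\)\\s+(\\d+)\\s+(\\d+)");

    /** Returns the trailing number of an add line, or -1 if the line is not an add line. */
    static long trailingMillis(String line) {
        Matcher m = ADD_LINE.matcher(line);
        return m.find() ? Long.parseLong(m.group(3)) : -1L;
    }

    /** Returns the document id of an add line, or null if the line is not an add line. */
    static String docId(String line) {
        Matcher m = ADD_LINE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "INFO: add (id=110705) 0 36596",
            "INFO: add (id=110700) 0 36600",
            "FINE: Checking context[] reload resource"
        };
        for (String line : lines) {
            long ms = trailingMillis(line);
            if (ms >= 0) {
                System.out.println("id=" + docId(line) + " millis=" + ms);
            }
        }
    }
}
```

A sudden jump in that figure from one add to the next would point at a slow operation such as a segment merge; no new add line at all points at a stall elsewhere.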
Re: Doc add limit
Thanks for your help Yonik, I've responded to your questions below:

On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> It's possible it's not hanging, but just takes a long time on a
> specific add. This is because Lucene will occasionally merge
> segments. When very large segments are merged, it can take a long
> time.

I've left it running (hung) for up to a half hour at a time and I've verified that my CPU idles during the hang. I have witnessed much shorter hangs on the ramp up to my 6,144 limit but they have been more like 2 - 10 seconds in length. Perhaps this is the Lucene merging you mentioned.

> In the log file, add commands are followed by the number of
> milliseconds the operation took. Next time Solr hangs, wait for a
> number of minutes until you see the operation logged and note how long
> it took.

Here are the last 5 log entries before the hang; the last one is doc #6,144. Also, it looks like Tomcat is trying to redeploy the webapp: those last Tomcat entries repeat indefinitely every 10 seconds or so. Perhaps this is a Tomcat problem?

Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110705) 0 36596
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110700) 0 36600
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110688) 0 36603
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110690) 0 36608
Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
INFO: add (id=110686) 0 36611
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] redeploy resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] redeploy resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/META-INF/context.xml
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/WEB-INF/web.xml
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/META-INF/context.xml
Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/conf/context.xml

> How many documents are in the index before you do a batch that causes
> a hang? Does it happen on the first batch? If so, you might be
> seeing some other bug. What appserver are you using? Do the admin
> pages respond when you see this hang? If so, what does a stack trace
> look like?

I actually don't think I had the problem on the first batch; in fact my first batch contained very close to 6,144 documents, so perhaps there is a relation there. Right now, I'm adding to an index with close to 90,000 documents in it.

I'm running Tomcat 5.5.17 and the admin pages respond just fine when it's hung... I did a thread dump and this is the trace of my update:

"http-8080-Processor25" Id=33 in RUNNABLE (running in native) total cpu time=6330.7360ms user time=5769.5920ms
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
    at java.io.PrintStream.write(PrintStream.java:412)
    at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:112)
    at sun.net.www.http.HttpClient.writeRequests(HttpClient.java:533)
    at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:410)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:934)
    at com.gawker.solr.update.GanjaUpdate.doUpdate(GanjaUpdate.java:169)
    at com.gawker.solr.update.GanjaUpdate.update(GanjaUpdate.java:62)
    at org.apache.jsp.update_jsp._jspService(update_jsp.java:57)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
    at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
    at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
Re: Doc add limit
So it looks like your client is hanging while trying to send something over the socket to the server and blocking... probably because Tomcat isn't reading anything from the socket because it's busy trying to restart the webapp.

What is the heap size of the server? Try increasing it... maybe Tomcat detected low memory and tried to reload the webapp.

-Yonik
Re: Doc add limit
Right now the heap is set to 512M, but I've increased it up to 2GB and yet it still hangs at the same number, 6,144...

Here's something interesting... I pushed this code over to a different server and tried an update. On that server it's hanging on #5,267. Then Tomcat seems to try to reload the webapp... indefinitely.

So I guess this is looking more like a Tomcat problem than a Lucene/Solr problem, huh?

-Sangraal
Re: Doc add limit
Tomcat problem, or a Solr problem that is only manifesting on your platform, or a JVM or libc problem, or even a client update problem... (possibly you might be exhausting the number of sockets in the server by using persistent connections with a long timeout and never reusing them?)

What is your OS/JVM?

-Yonik
Re: Doc add limit
I see the problem on Mac OS X/JDK: 1.5.0_06 and Debian/JDK: 1.5.0_07.

I don't think it's a socket problem, because I can initiate additional updates while the server is hung... weird, I know.

Thanks for all your help. I'll send a post if/when I find a solution.

-S

On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> Tomcat problem, or a Solr problem that is only manifesting on your
> platform, or a JVM or libc problem, or even a client update problem...
> (possibly you might be exhausting the number of sockets in the server
> by using persistent connections with a long timeout and never reusing
> them?)
>
> What is your OS/JVM?
>
> -Yonik
>
> On 7/26/06, sangraal aiken <[EMAIL PROTECTED]> wrote:
> > Right now the heap is set to 512M but I've increased it up to 2GB and yet it
> > still hangs at the same number 6,144...
> >
> > Here's something interesting... I pushed this code over to a different
> > server and tried an update. On that server it's hanging on #5,267. Then
> > Tomcat seems to try to reload the webapp... indefinitely.
> >
> > So I guess this is looking more like a Tomcat problem than a
> > Lucene/Solr problem, huh?
> >
> > -Sangraal
> >
> > On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > > So it looks like your client is hanging trying to send something over
> > > the socket to the server and blocking... probably because Tomcat isn't
> > > reading anything from the socket because it's busy trying to restart
> > > the webapp.
> > >
> > > What is the heap size of the server? Try increasing it... maybe Tomcat
> > > could have detected low memory and tried to reload the webapp.
> > >
> > > -Yonik
> > >
> > > On 7/26/06, sangraal aiken <[EMAIL PROTECTED]> wrote:
> > > > Thanks for your help Yonik, I've responded to your questions below:
> > > >
> > > > On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > > > > It's possible it's not hanging, but just takes a long time on a
> > > > > specific add. This is because Lucene will occasionally merge
> > > > > segments. When very large segments are merged, it can take a long
> > > > > time.
> > > >
> > > > I've left it running (hung) for up to a half hour at a time and I've
> > > > verified that my CPU idles during the hang. I have witnessed much shorter
> > > > hangs on the ramp up to my 6,144 limit but they have been more like 2 - 10
> > > > seconds in length. Perhaps this is the Lucene merging you mentioned.
> > > >
> > > > > In the log file, add commands are followed by the number of
> > > > > milliseconds the operation took. Next time Solr hangs, wait for a
> > > > > number of minutes until you see the operation logged and note how long
> > > > > it took.
> > > >
> > > > Here are the last 5 log entries before the hang; the last one is doc #6,144.
> > > > Also, it looks like Tomcat is trying to redeploy the webapp; those last Tomcat
> > > > entries repeat indefinitely every 10 seconds or so. Perhaps this is a Tomcat
> > > > problem?
> > > >
> > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > INFO: add (id=110705) 0 36596
> > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > INFO: add (id=110700) 0 36600
> > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > INFO: add (id=110688) 0 36603
> > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > INFO: add (id=110690) 0 36608
> > > > Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update
> > > > INFO: add (id=110686) 0 36611
> > > > Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
> > > > FINE: Checking context[] redeploy resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT
> > > > Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
> > > > FINE: Checking context[] redeploy resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/META-INF/context.xml
> > > > Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
> > > > FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/WEB-INF/web.xml
> > > > Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
> > > > FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/webapps/ROOT/META-INF/context.xml
> > > > Jul 26, 2006 1:25:36 PM org.apache.catalina.startup.HostConfig checkResources
> > > > FINE: Checking context[] reload resource /source/solr/apache-tomcat-5.5.17/conf/context.xml
> > > >
> > > > > How many documents are in the index before you do a batch that causes
> > > > > a hang? Does it happen on the first batch? If so, you might be
> > > > > seeing some other bug. What appserver are you using? Do the admin
> > > > > pages respond when you see this hang? If so, what does a stack trace
> > > > > look like?
> > > >
> > > > I actually don't think I had the problem on the first batch, in fact my
> > > > first batch contained very close to 6,144 documents so perhaps there is a
> > > > relation there. Right now, I'm adding to an inde
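Yonik's suggestion in the thread above (note how long each logged add took) can be checked mechanically. The following is only a rough sketch: `AddLogScan` and `slowAdds` are hypothetical names, not part of Solr, and it assumes that in a line such as `INFO: add (id=110705) 0 36596` the first number is a status code and the second is the elapsed time in milliseconds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: scans update-log lines like
// "INFO: add (id=110705) 0 36596" and reports the slow ones.
final class AddLogScan {
    // Assumption: group 2 is a status code, group 3 is elapsed milliseconds.
    private static final Pattern ADD_LINE =
            Pattern.compile("INFO: add \\(id=([^)]+)\\) (\\d+) (\\d+)");

    /** Returns "id=elapsedMs" for every add line at or above thresholdMs. */
    static List<String> slowAdds(List<String> lines, long thresholdMs) {
        List<String> slow = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ADD_LINE.matcher(line);
            // Lines that are not add entries (timestamps, Tomcat output) are skipped.
            if (m.find() && Long.parseLong(m.group(3)) >= thresholdMs) {
                slow.add(m.group(1) + "=" + m.group(3));
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update",
                "INFO: add (id=110690) 0 36608",
                "Jul 26, 2006 1:25:28 PM org.apache.solr.core.SolrCore update",
                "INFO: add (id=110686) 0 36611");
        System.out.println(slowAdds(log, 36610L));
    }
}
```

If the last add before a hang is not unusually slow compared with its neighbours, a long segment merge on that particular add becomes a less likely explanation.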
Re: Doc add limit
If you narrow the docs down to just the "id" field, does it still happen at the same place?

-Yonik

On 7/26/06, sangraal aiken <[EMAIL PROTECTED]> wrote:
> I see the problem on Mac OS X/JDK: 1.5.0_06 and Debian/JDK: 1.5.0_07.
>
> I don't think it's a socket problem, because I can initiate additional
> updates while the server is hung... weird, I know.
>
> Thanks for all your help. I'll send a post if/when I find a solution.
>
> -S
>
> On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > Tomcat problem, or a Solr problem that is only manifesting on your
> > platform, or a JVM or libc problem, or even a client update problem...
> > (possibly you might be exhausting the number of sockets in the server
> > by using persistent connections with a long timeout and never reusing
> > them?)
> >
> > What is your OS/JVM?
> >
> > -Yonik
> >
> > On 7/26/06, sangraal aiken <[EMAIL PROTECTED]> wrote:
> > > Right now the heap is set to 512M but I've increased it up to 2GB and yet it
> > > still hangs at the same number 6,144...
> > >
> > > Here's something interesting... I pushed this code over to a different
> > > server and tried an update. On that server it's hanging on #5,267. Then
> > > Tomcat seems to try to reload the webapp... indefinitely.
> > >
> > > So I guess this is looking more like a Tomcat problem than a
> > > Lucene/Solr problem, huh?
> > >
> > > -Sangraal
> > >
> > > On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > > > So it looks like your client is hanging trying to send something over
> > > > the socket to the server and blocking... probably because Tomcat isn't
> > > > reading anything from the socket because it's busy trying to restart
> > > > the webapp.
> > > >
> > > > What is the heap size of the server? Try increasing it... maybe Tomcat
> > > > could have detected low memory and tried to reload the webapp.
Re: Recompilation of latest lucene seems to break update of Solr - Solution
: I had to use the latest lucene version, not available on their
: site, but only on the subversioning system. (Version called 2.1 -
: 419723 2006-07-06 22:14:07)

Just to clarify: Solr is definitely a little more "bleeding edge" than the stable releases of Lucene. The Lucene JARs in Solr's lib directory come from the nightly builds and are rev'ed as needed. You should always be able to verify which nightly build is in use in the CHANGES.txt, which currently says...

  3. Upgrade to Lucene 2.0 nightly build 2006-07-15, lucene SVN revision 422302,

-Hoss
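For anyone who wants to do that CHANGES.txt check from a script, here is a small sketch. `LuceneBuildNote` and `parse` are hypothetical names, and the pattern assumes the entry keeps exactly the wording quoted above ("nightly build <date>, lucene SVN revision <number>"); other releases may phrase it differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the nightly-build date and SVN revision
// from a CHANGES.txt entry worded like the one quoted in this message.
final class LuceneBuildNote {
    // Assumption: the entry always uses this exact phrasing.
    private static final Pattern UPGRADE = Pattern.compile(
            "nightly build (\\d{4}-\\d{2}-\\d{2}), lucene SVN revision (\\d+)");

    /** Returns {date, revision}, or null when the line is not such an entry. */
    static String[] parse(String line) {
        Matcher m = UPGRADE.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String entry = "3. Upgrade to Lucene 2.0 nightly build 2006-07-15, "
                + "lucene SVN revision 422302,";
        String[] build = parse(entry);
        System.out.println("date=" + build[0] + " revision=" + build[1]);
    }
}
```

Comparing the extracted revision with the revision of a locally compiled Lucene gives a quick hint about whether the two are likely to be API-compatible.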
Re: Doc add limit
I removed everything from the add XML so the docs looked like this:

187880
187852

and it still hung at 6,144...

-S

On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> If you narrow the docs down to just the "id" field, does it still
> happen at the same place?
>
> -Yonik
>
> On 7/26/06, sangraal aiken <[EMAIL PROTECTED]> wrote:
> > I see the problem on Mac OS X/JDK: 1.5.0_06 and Debian/JDK: 1.5.0_07.
> >
> > I don't think it's a socket problem, because I can initiate additional
> > updates while the server is hung... weird, I know.
> >
> > Thanks for all your help. I'll send a post if/when I find a solution.
> >
> > -S
> >
> > On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > > Tomcat problem, or a Solr problem that is only manifesting on your
> > > platform, or a JVM or libc problem, or even a client update problem...
> > > (possibly you might be exhausting the number of sockets in the server
> > > by using persistent connections with a long timeout and never reusing
> > > them?)
> > >
> > > What is your OS/JVM?
> > >
> > > -Yonik
> > >
> > > On 7/26/06, sangraal aiken <[EMAIL PROTECTED]> wrote:
> > > > Right now the heap is set to 512M but I've increased it up to 2GB and yet it
> > > > still hangs at the same number 6,144...
> > > >
> > > > Here's something interesting... I pushed this code over to a different
> > > > server and tried an update. On that server it's hanging on #5,267. Then
> > > > Tomcat seems to try to reload the webapp... indefinitely.
> > > >
> > > > So I guess this is looking more like a Tomcat problem than a
> > > > Lucene/Solr problem, huh?
> > > >
> > > > -Sangraal
> > > >
> > > > On 7/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > > > > So it looks like your client is hanging trying to send something over
> > > > > the socket to the server and blocking... probably because Tomcat isn't
> > > > > reading anything from the socket because it's busy trying to restart
> > > > > the webapp.
> > > > >
> > > > > What is the heap size of the server? Try increasing it...
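The markup of the stripped-down docs in the message above did not survive in the archive, only the two ids did. As a guess at what an id-only payload looks like, the sketch below builds one in Solr's XML update format (`<add><doc><field name="id">...`). `IdOnlyAdd`, `build`, and `escape` are hypothetical names, and the exact XML that was actually sent is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds an <add> payload whose docs carry only an
// "id" field, for reproducing a hang with the smallest possible documents.
final class IdOnlyAdd {
    /** Escapes the characters that are special inside XML element text. */
    static String escape(String s) {
        // '&' must be replaced first so later entities are not double-escaped.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Returns one <add> element containing one id-only <doc> per id. */
    static String build(List<String> ids) {
        StringBuilder sb = new StringBuilder("<add>");
        for (String id : ids) {
            sb.append("<doc><field name=\"id\">")
              .append(escape(id))
              .append("</field></doc>");
        }
        return sb.append("</add>").toString();
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("187880", "187852")));
    }
}
```

Since the hang reportedly shows up at the same document count even with docs this small, payloads like this make it easier to tell whether the count of adds or the volume of data is what matters.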