Thanks a ton Mark. I have tried SOLR-4816 and it didn't help. But I will try Mark's patch next week, and see what happens.
-Kevin

On Thu, Sep 5, 2013 at 4:46 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> If you run into this again, try a jstack trace. You should see
> evidence of being stuck in SolrCmdDistributor on a variable
> called "semaphore"... On current 4x this is around line 420.
>
> If you're using SolrJ, then SOLR-4816 is another thing to try.
>
> But Mark's patch would be best of all to test. If that doesn't
> fix it, then the jstack suggestion would at least tell us if it's
> the issue we think it is.
>
> FWIW,
> Erick
>
> On Wed, Sep 4, 2013 at 12:51 PM, Mark Miller <markrmil...@gmail.com> wrote:
>
> > It would be great if you could give this patch a try:
> > http://pastebin.com/raw.php?i=aaRWwSGP
> >
> > - Mark
> >
> > On Wed, Sep 4, 2013 at 8:31 AM, Kevin Osborn <kevin.osb...@cbsi.com> wrote:
> >
> > > Thanks. If there is anything I can do to help you resolve this issue,
> > > let me know.
> > >
> > > -Kevin
> > >
> > > On Wed, Sep 4, 2013 at 7:51 AM, Mark Miller <markrmil...@gmail.com> wrote:
> > >
> > > > I'll look at fixing the root issue for 4.5. I've been putting it off
> > > > for way too long.
> > > >
> > > > Mark
> > > >
> > > > On Sep 3, 2013, at 2:15 PM, Kevin Osborn <kevin.osb...@cbsi.com> wrote:
> > > >
> > > > > I was having problems updating SolrCloud with a large batch of
> > > > > records. The records are coming in bursts with lulls between updates.
> > > > >
> > > > > At first, I just tried large updates of 100,000 records at a time.
> > > > > Eventually, this caused Solr to hang. When hung, I can still query
> > > > > Solr. But I cannot do any deletes or other updates to the index.
> > > > >
> > > > > At first, my updates were going as SolrJ CSV posts. I have also tried
> > > > > local file updates and had similar results. I finally slowed things
> > > > > down to just use SolrJ's Update feature, which is basically just JavaBin.
> > > > > I am also sending over just 100 at a time in 10 threads. Again, it
> > > > > eventually hung.
> > > > >
> > > > > Sometimes, Solr hangs in the first couple of chunks. Other times, it
> > > > > hangs right away.
> > > > >
> > > > > These are my commit settings:
> > > > >
> > > > > <autoCommit>
> > > > >   <maxTime>15000</maxTime>
> > > > >   <maxDocs>5000</maxDocs>
> > > > >   <openSearcher>false</openSearcher>
> > > > > </autoCommit>
> > > > > <autoSoftCommit>
> > > > >   <maxTime>30000</maxTime>
> > > > > </autoSoftCommit>
> > > > >
> > > > > I have tried quite a few variations with the same results. I also
> > > > > tried various JVM settings with the same results. The only thing that
> > > > > seems to help is reducing the cluster size from 2 to 1.
> > > > >
> > > > > I also did a jstack trace. I did not see any explicit deadlocks, but I
> > > > > did see quite a few threads in WAITING or TIMED_WAITING. It is
> > > > > typically something like this:
> > > > >
> > > > > java.lang.Thread.State: WAITING (parking)
> > > > >   at sun.misc.Unsafe.park(Native Method)
> > > > >   - parking to wait for <0x000000074039a450> (a java.util.concurrent.Semaphore$NonfairSync)
> > > > >   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> > > > >   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> > > > >   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> > > > >   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> > > > >   at java.util.concurrent.Semaphore.acquire(Semaphore.java:317)
> > > > >   at org.apache.solr.util.AdjustableSemaphore.acquire(AdjustableSemaphore.java:61)
> > > > >   at org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:418)
> > > > >   at org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:368)
> > > > >   at org.apache.solr.update.SolrCmdDistributor.flushAdds(SolrCmdDistributor.java:300)
> > > > >   at org.apache.solr.update.SolrCmdDistributor.distribAdd(SolrCmdDistributor.java:139)
> > > > >   at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:474)
> > > > >   at org.apache.solr.handler.loader.CSVLoaderBase.doAdd(CSVLoaderBase.java:395)
> > > > >   at org.apache.solr.handler.loader.SingleThreadedCSVLoader.addDoc(CSVLoader.java:44)
> > > > >   at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:364)
> > > > >   at org.apache.solr.handler.loader.CSVLoader.load(CSVLoader.java:31)
> > > > >   at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
> > > > >   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> > > > >   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> > > > >   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
> > > > >   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
> > > > >   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
> > > > >   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
> > > > >   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > > > >   at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > > > >   at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > > > >   at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
> > > > >   at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > > > >   at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > > > >   at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > > > >   at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> > > > >   at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> > > > >   at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> > > > >   at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> > > > >   at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> > > > >
> > > > > It basically appears that Solr gets stuck while trying to acquire a
> > > > > semaphore that never becomes available.
> > > > >
> > > > > Anyone have any ideas? This is definitely causing major problems for us.
> > > > >
> > > > > --
> > > > > *KEVIN OSBORN*
> > > > > LEAD SOFTWARE ENGINEER
> > > > > CNET Content Solutions
> > > > > OFFICE 949.399.8714
> > > > > CELL 949.310.4677 SKYPE osbornk
> > > > > 5 Park Plaza, Suite 600, Irvine, CA 92614
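
For anyone landing on this thread later: the jstack check Erick describes (looking for threads stuck on a semaphore) can be approximated in-process. The following is only a rough sketch using plain JDK classes, not Solr code. The class name `SemaphoreHangSketch`, the thread name `blocked-sender`, and the two-permit semaphore are all made up for illustration, and the real `AdjustableSemaphore` / `SolrCmdDistributor` internals are not reproduced. It parks one thread on a semaphore whose permits are never returned, then scans live stack traces for the same `Semaphore.acquire` frame that appears in the trace above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Solr.
class SemaphoreHangSketch {

    // Names of threads in WAITING state that have Semaphore.acquire on their
    // stack, i.e. the same signature as the jstack output quoted in the thread.
    static List<String> threadsParkedOnSemaphore() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            if (entry.getKey().getState() != Thread.State.WAITING) {
                continue;
            }
            for (StackTraceElement frame : entry.getValue()) {
                if ("java.util.concurrent.Semaphore".equals(frame.getClassName())
                        && "acquire".equals(frame.getMethodName())) {
                    names.add(entry.getKey().getName());
                    break;
                }
            }
        }
        return names;
    }

    // Reproduces the symptom: all permits are taken and never released,
    // so the next acquire() parks indefinitely.
    static List<String> demo() throws InterruptedException {
        Semaphore permits = new Semaphore(2);
        permits.acquire(2); // assumption: stands in for requests that never complete

        Thread sender = new Thread(() -> {
            try {
                permits.acquire(); // parks here because no permit is available
                permits.release();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interrupted by the cleanup below
            }
        }, "blocked-sender");
        sender.setDaemon(true);
        sender.start();

        // Wait up to 5 seconds for the sender to park.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (sender.getState() != Thread.State.WAITING
                && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }

        List<String> found = threadsParkedOnSemaphore();

        // Clean up so the sketch terminates instead of hanging like the real case.
        sender.interrupt();
        sender.join();
        return found;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Threads parked on a semaphore: " + demo());
    }
}
```

This only shows how to recognize the symptom. It says nothing about why the permits are not returned in Solr itself, which is what Mark's patch is meant to address.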