This is embarrassing. I just realized that in the experiments where I saw Solr providing good service in the face of the overload requests, I was actually sending requests at a rate of 30 requests per second, not 300 requests per second. Once I ratcheted up the rate a little bit, Solr started to overload like the other applications I've tested.
Thanks for your time, and sorry for the mistake!

Mike

On Fri, Sep 21, 2012 at 7:19 AM, Mike Gagnon <mikegag...@gmail.com> wrote:

> Thanks. If Solr doesn't have any special logic for dealing with
> algorithmic-complexity attack-like overloads, then it sounds like Jetty and
> Tomcat are responsible for Solr's unusually good performance in my
> experiments (unusual compared to other non-Java web applications).
>
> Cheers,
> Mike
>
> On Wed, Sep 19, 2012 at 8:30 AM, Walter Underwood
> <wun...@wunderwood.org> wrote:
>
>> The front-end code protection that I mentioned was outside of Solr. At
>> that time, requests with very large start values were slow, so we put code
>> in the front end to never request those. Even if the user wanted page 5000
>> of the results, they would get page 100.
>>
>> Now, those requests are fast, so that external protection is not needed.
>>
>> I was running overload tests this summer and could not get Solr to behave
>> badly. The throughput would drop off with overload, but not too bad. This
>> was all with simple queries on a 1.2M doc index.
>>
>> wunder
>> Walter Underwood
>> Search Guy, Chegg
>>
>> On Sep 19, 2012, at 8:20 AM, Erik Hatcher wrote:
>>
>> > How are you triggering an infinite loop in your requests to Solr?
>> >
>> > Erik
>> >
>> > On Sep 19, 2012, at 11:12 , Mike Gagnon wrote:
>> >
>> >> [ I am sorry for breaking the thread, but my inbox has neither received my
>> >> original post to the mailing list, nor Otis's response (so I can't reply to
>> >> his response) ]
>> >>
>> >> Thanks a bunch for your response Otis. Let me more thoroughly explain my
>> >> experimental workload and why I am surprised Solr works so well.
>> >>
>> >> The most important characteristic of my workload is that many of the
>> >> requests (60 per second) cause infinite loops within Solr. That is, each of
>> >> those requests causes a separate infinite loop within its request context.
>> >>
>> >> This workload is similar to an algorithmic-complexity attack --- a type of
>> >> DoS. In every web-app stack I've tested (except Solr/Jetty and
>> >> Solr/Tomcat) such workloads cause an immediate and complete denial of
>> >> service. What happens in these vulnerable applications is that the thread
>> >> pool fills up with infinite loops, and incoming requests are rejected.
>> >>
>> >> But Solr manages to survive such an attack. My best guess is that Solr has
>> >> an especially good overload strategy that quickly kicks out the infinite
>> >> loop requests -- which lowers CPU contention, and allows other requests to
>> >> be admitted.
>> >>
>> >> My first guess would be that Tomcat or Jetty is responsible for the good
>> >> response to overload. However, there was a good discussion in 2008 on this
>> >> mailing list about Solr Security:
>> >> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200811.mbox/browser
>> >>
>> >> In this discussion Walter Underwood commented: "We have protected against
>> >> several different DoS problems in our front-end code."
>> >>
>> >> Perhaps it is these front-end defenses that help Solr survive my workloads?
>> >>
>> >> Thanks!
>> >> Mike Gagnon
>> >>
>> >>
>> >>> Hm, I'm not sure how to approach this. Solr is not alone here - there's
>> >>> a container like jetty, solr inside it and lucene inside solr.
>> >>> Next, that index is reeeeally small, so there is no disk IO. The request
>> >>> rate is also not super high and if you did this over a fast connection then
>> >>> there are also no issues with slow response writing or with having lots of
>> >>> concurrent connections or running out of threads ...
>> >>>
>> >>> ...so it's not really that surprising solr keeps working :)
>> >>>
>> >>> But...tell us more.
>> >>>
>> >>> Otis
>> >>> --
>> >>> Performance Monitoring - http://sematext.com/spm
>> >>>
>> >>>
>> >>> On Sep 12, 2012 8:51 PM, "Mike Gagnon" <mikegag...@gmail.com> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have been studying how server software responds to requests that cause
>> >>> CPU overloads (such as infinite loops).
>> >>>
>> >>> In my experiments I have observed that Solr performs unusually well when
>> >>> subjected to such loads. Every other piece of web software I've
>> >>> experimented with drops to zero service under such loads. Do you know how
>> >>> Solr achieves such good performance? I am guessing that when Solr is
>> >>> overloaded it sheds load to make room for incoming requests, but I could
>> >>> not find any documentation that describes Solr's overload strategy.
>> >>>
>> >>> Experimental setup: I ran Solr 3.1 on a 12-core machine with 12 GB RAM,
>> >>> using it to index and search about 10,000 pages on MediaWiki. I tested both
>> >>> Solr+Jetty and Solr+Tomcat. I submitted a variety of Solr queries at a rate
>> >>> of 300 requests per second. At the same time, I submitted "overload
>> >>> requests" at a rate of 60 requests per second. Each overload request caused
>> >>> an infinite loop in Solr via
>> >>> https://issues.apache.org/jira/browse/SOLR-2631.
>> >>>
>> >>> With Jetty about 70% of non-overload requests completed --- 95% of requests
>> >>> completing within 0.6 seconds.
>> >>> With Tomcat about 34% of non-overload requests completed --- 95% of
>> >>> requests completing within 0.6 seconds.
>> >>>
>> >>> I also ran Solr+Jetty with non-overload requests coming in at 65 requests
>> >>> per second (overload requests remain at 60 requests per second). In this
>> >>> workload, the completion rate drops to 15% and the 95th percentile latency
>> >>> increases to 25.
>> >>>
>> >>> Cheers,
>> >>> Mike Gagnon
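
The figures quoted in this thread (request rate, completion percentage, 95th percentile latency) can all be recomputed from a per-request log, which is one way to catch a rate mix-up like the 30 versus 300 requests per second one. What follows is only a minimal sketch under stated assumptions, not the harness used in these experiments: the function name summarize_run and the input layout (a list of send timestamps plus a parallel list of latencies, with None for requests that never completed) are hypothetical, and the percentile uses the nearest-rank definition.

```python
import math


def summarize_run(send_times, latencies):
    """Summarize one load-test run (hypothetical helper, not from the thread).

    send_times: send timestamps in seconds, one per request issued.
    latencies:  response time in seconds for each request, or None if the
                request was rejected or never completed.
    Returns (achieved_rate, completion_rate, p95_latency).
    """
    if len(send_times) != len(latencies):
        raise ValueError("one latency entry is required per request")
    if len(send_times) < 2:
        raise ValueError("need at least two requests to measure a rate")

    # Achieved rate: number of send intervals divided by the elapsed time
    # between the first and the last send.
    elapsed = max(send_times) - min(send_times)
    if elapsed <= 0:
        raise ValueError("send timestamps must span a positive interval")
    achieved_rate = (len(send_times) - 1) / elapsed

    # Completion rate: fraction of issued requests that got a response.
    completed = sorted(lat for lat in latencies if lat is not None)
    completion_rate = len(completed) / len(latencies)

    # Nearest-rank 95th percentile, computed over completed requests only.
    p95_latency = None
    if completed:
        rank = math.ceil(0.95 * len(completed))
        p95_latency = completed[rank - 1]

    return achieved_rate, completion_rate, p95_latency
```

Comparing achieved_rate against the intended rate before reading the completion figures shows whether the load generator actually delivered the workload it was configured for.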