Hi Shawn, Wow! Thank you for your considered reply!
I'm going to dig into these issues, but I have a few questions:

Regarding memory: Including duplicate data in shard replicas, the entire index is 350GB. Each server hosts a total of 44GB of data. Each server has 28GB of memory. I haven't been setting -Xmx or -Xms, in the hopes that Java would take the memory it needs and leave the rest to the OS for cache.

Given that I'll never need to serve 200 concurrent connections in production, do you think my servers need more memory? Should I be tinkering with -Xmx and -Xms?

Regarding commits: My end-users want new data to be made available quickly. Thankfully I'm only inserting between 1 and 3 documents per second, so the change rate isn't crazy.

Should I just slow down my commit frequency, and depend on soft commits? If I do this, will the commits take even longer? Given 1000 documents, is it generally faster to do 10 commits of 100, or 1 commit of 1000?

Thanks so much!

-D

On Fri, Nov 22, 2013 at 2:27 AM, Shawn Heisey <s...@elyograg.org> wrote:

> On 11/21/2013 6:41 PM, Dave Seltzer wrote:
> > In digging a little deeper and looking at the config I see that
> > <nrtMode>true</nrtMode> is commented out. I believe this is the default
> > setting. So I don't know if NRT is enabled or not. Maybe just a red
> > herring.
>
> I had never seen this setting before. The default is true. SolrCloud
> requires that it be set to true. Looks like it's a new parameter in
> 4.5, added by SOLR-4909. From what I can tell reading the issue,
> turning it off effectively disables soft commits.
>
> https://issues.apache.org/jira/browse/SOLR-4909
>
> You've said that you are adding about 3 documents per second, but you
> haven't said anything about how often you are doing commits. Erick's
> question basically boils down to this: How quickly after indexing do
> you expect the changes to be visible on a search, and how often are you
> doing commits?
>
> Generally speaking (and ignoring the fact that nrtMode now exists), NRT
> is not something you enable, it's something you try to achieve, by using
> soft commits quickly and often, and by adjusting the configuration to
> make the commits go faster.
>
> If you are trying to keep the interval between indexing and document
> visibility down to less than a few seconds (especially if it's less than
> one second), then you are trying to achieve NRT.
>
> There's a lot of information on the following wiki page about
> performance problems. This specific link is to the last part of that
> page, which deals with slow commits:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_commits
>
> > I don't know what Garbage Collector we're using. In this test I'm running
> > Solr 4.5.1 using Jetty from the example directory.
>
> If you aren't using any tuning parameters beyond setting the max heap,
> then you are using the default parallel collector. It's a poor choice
> for Solr unless your heap is very small. At 6GB, yours isn't very
> small. It's not particularly huge either, but not small.
>
> > The CPU on the 8 nodes all stay around 70% use during the test. The nodes
> > have 28GB of RAM. Java is using about 6GB and the rest is being used by
> > OS cache.
>
> How big is your index? If it's larger than about 30 GB, you probably
> need more memory. If it's much larger than about 40 GB, you definitely
> need more memory.
>
> > To perform the test we're running 200 concurrent threads in JMeter. The
> > threads hit HAProxy which loadbalances the requests among the nodes. Each
> > query is for a random word out of a list of about 10,000 words. Some of
> > the queries have faceting turned on.
>
> That's a pretty high query load. If you want to get anywhere near top
> performance out of it, you'll want to have enough memory to fit your
> entire index into RAM. You'll also need to reduce the load introduced
> by indexing.
> A large part of the load from indexing comes from commits.
>
> > Because we're heavily loading the system the queries are returning quite
> > slowly. For a simple search, the average response time was 300ms. The
> > peak response time was 11,000ms. The spikes in latency seem to occur
> > about every 2.5 minutes.
>
> I would bet that you're having one or both of the following issues:
>
> 1) Garbage collection issues from one or more of the following:
>    a) Heap too small.
>    b) Using the default GC instead of CMS with tuning.
> 2) General performance issues from one or more of the following:
>    a) Not enough cache memory for your index size.
>    b) Too-frequent commits.
>    c) Commits taking a lot of time and resources due to cache warming.
>
> With a high query and index load, any problems become magnified.
>
> > I haven't spent that much time messing with SolrConfig, so most of the
> > settings are the out-of-the-box defaults.
>
> The defaults are very good for small to medium indexes and low to medium
> query load. If you have a big index and/or high query load, you'll
> generally need to tune.
>
> Thanks,
> Shawn
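
For reference, the memory figures mentioned in this thread can be put side by side with a small calculation. This is only a rough sketch: it assumes the index is spread evenly across the 8 nodes, that the Java heap stays near the roughly 6GB observed during the test, and that all remaining RAM is available to the OS disk cache. The constant and function names are made up for illustration.

```python
# Figures taken from the thread. The 6 GB heap is the observed Java usage
# during the test, not a configured -Xmx value (assumption).
TOTAL_INDEX_GB = 350      # whole index, shard replicas included
NODES = 8
RAM_PER_NODE_GB = 28
JAVA_HEAP_GB = 6


def cache_coverage(index_gb, ram_gb, heap_gb):
    """Fraction of a node's index that could fit in the OS disk cache.

    Assumes everything outside the Java heap is usable as cache, which
    ignores memory used by the OS itself and by other processes.
    """
    cache_gb = ram_gb - heap_gb
    return cache_gb / index_gb


# Assumes an even split of the index across nodes.
index_per_node_gb = TOTAL_INDEX_GB / NODES
coverage = cache_coverage(index_per_node_gb, RAM_PER_NODE_GB, JAVA_HEAP_GB)

print(f"index per node:     {index_per_node_gb:.2f} GB")
print(f"cacheable fraction: {coverage:.0%}")
```

Under these assumptions each node can cache roughly half of the index it hosts, which is consistent with the guidance quoted above that an index larger than about 30 GB on such a node probably needs more memory.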