Just tried it with no changes other than upping the RAM to 128GB total, and it's flying. I think that proves that RAM is good. =) Will implement the suggested changes later, though.
cheers,
Tom

On 22 October 2013 09:04, Tom Mortimer <tom.m.f...@gmail.com> wrote:

> Boogie, Shawn,
>
> Thanks for the replies. I'm going to try out some of your suggestions today, although without more RAM I'm not that optimistic.
>
> Tom
>
> On 21 October 2013 18:40, Shawn Heisey <s...@elyograg.org> wrote:
>
>> On 10/21/2013 9:48 AM, Tom Mortimer wrote:
>>
>>> Hi everyone,
>>>
>>> I've been working on an installation recently which uses SolrCloud to index 45M documents into 8 shards on 2 VMs running 64-bit Ubuntu (with another 2 identical VMs set up for replicas). The reason we're using so many shards for a relatively small index is that there are complex filtering requirements at search time, to restrict users to items they are licensed to view. Initial tests demonstrated that multiple shards would be required.
>>>
>>> The total size of the index is about 140GB, and each VM has 16GB RAM (32GB total) and 4 CPU units. I know this is far under what would normally be recommended for an index of this size, and I'm working on persuading the customer to increase the RAM (basically, telling them it won't work otherwise). Performance is currently pretty poor and I would expect more RAM to improve things. However, there are a couple of other oddities which concern me.
>>
>> Running multiple shards like you are, where each operating system is handling more than one shard, is only going to perform better if your query volume is low and you have lots of CPU cores. If your query volume is high or you only have 2-4 CPU cores on each VM, you might be better off with fewer shards, or not sharded at all.
>>
>> The way that I read this is that you've got two physical machines with 32GB RAM, each running two VMs that have 16GB. Each VM houses 4 shards, or 70GB of index.
>>
>> There's a scenario that might be better if all of the following are true: 1) I'm right about how your hardware is provisioned. 2) You or the client owns the hardware. 3) You have an extremely low-end third machine available - a single CPU with 1GB of RAM would probably be enough. In this scenario, you run one Solr instance and one ZooKeeper instance on each of your two "big" machines, and use the third wimpy machine as a third ZooKeeper node. No virtualization. For the rest of my reply, I'm assuming that you haven't taken this step, but it will probably apply either way.
>>
>>> The first is that I've been reindexing a fixed set of 500 docs to test indexing and commit performance (with soft commits within 60s). The time taken to complete a hard commit after this is longer than I'd expect, and highly variable - from 10s to 70s. This makes me wonder whether the SAN (which provides all the storage for these VMs and the customer's several other VMs) is being saturated periodically. I grabbed some iostat output on different occasions to (possibly) show the variability:
>>>
>>> Device:  tps     Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
>>> sdb      64.50   0.00        2476.00     0         4952
>>> ...
>>> sdb      8.90    0.00        348.00      0         6960
>>> ...
>>> sdb      1.15    0.00        43.20       0         864
>>
>> There are two likely possibilities for this. One or both of them might be in play. 1) Because the OS disk cache is small, not much of the index can be cached. This can result in a lot of disk I/O for a commit, slowing things way down. Increasing the size of the OS disk cache is really the only solution for that. 2) Cache autowarming, particularly the filter cache. In the cache statistics, you can see how long each cache took to warm up after the last searcher was opened. The solution for that is to reduce the autowarmCount values.
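For reference, a minimal SolrJ sketch of the soft-commit indexing / hard-commit timing test described above - it assumes SolrJ 4.x with a CloudSolrServer client, and the ZooKeeper address, collection name and field names are placeholders rather than the real setup:

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class CommitTimingTest {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper ensemble and collection name.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181/solr");
        server.setDefaultCollection("mycollection");

        // Index a fixed batch of test documents. commitWithin triggers a
        // soft commit by default in Solr 4.x, so the batch becomes
        // searchable within 60s without forcing segments to disk.
        UpdateRequest update = new UpdateRequest();
        for (int i = 0; i < 500; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "timing-test-" + i);
            doc.addField("title", "timing test doc " + i);
            update.add(doc);
        }
        update.setCommitWithin(60000);
        update.process(server);

        // Time an explicit hard commit (waitFlush=true, waitSearcher=true).
        long start = System.currentTimeMillis();
        server.commit();
        System.out.println("Hard commit took "
                + (System.currentTimeMillis() - start) + " ms");

        server.shutdown();
    }
}

Running something like this repeatedly while watching iostat on the index volume should show whether the slow hard commits coincide with bursts of SAN write I/O.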
>>> The other thing that confuses me is that after a Solr restart or hard commit, search times average about 1.2s under light load. After searching the same set of queries for 5-6 iterations this improves to 0.1s. However, in either case - cold or warm - iostat reports no device reads at all:
>>>
>>> Device:  tps     Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
>>> sdb      0.40    0.00        8.00        0         160
>>> ...
>>> sdb      0.30    0.00        10.40       0         104
>>>
>>> (the writes are due to logging). This implies to me that the 'hot' blocks are being completely cached in RAM - so why the variation in search time and the number of iterations required to speed it up?
>>
>> Linux is pretty good about making limited OS disk cache resources work. Sounds like the caching is working reasonably well for queries. It might not be working so well for updates or commits, though.
>>
>> Running multiple Solr JVMs per machine, virtual or not, causes more problems than it solves. Solr has no limits on the number of cores (shard replicas) per instance, assuming there are enough system resources. There should be exactly one Solr JVM per operating system. Running more than one results in quite a lot of overhead, and your memory is precious. When you create a collection, you can give the collections API the "maxShardsPerNode" parameter to create more than one shard per instance.
>>
>>> I don't have a great deal of experience in low-level performance tuning, so please forgive any naivety. Any ideas of what to do next would be greatly appreciated. I don't currently have details of the VM implementation but can get hold of this if it's relevant.
>>
>> I don't think the virtualization details matter all that much. Please feel free to ask questions or supply more info based on what I've told you.
>>
>> Thanks,
>> Shawn
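To illustrate the maxShardsPerNode point above, a minimal SolrJ sketch of a Collections API CREATE call that lets one Solr JVM per machine host several shards. It assumes SolrJ 4.x; the node URL, collection name, config name and shard/replica counts are placeholders, not recommendations for this particular setup:

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class CreateCollectionSketch {
    public static void main(String[] args) throws Exception {
        // Any node in the cluster can receive Collections API calls.
        HttpSolrServer server = new HttpSolrServer("http://solr-node1:8983/solr");

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("action", "CREATE");
        params.set("name", "mycollection");
        params.set("collection.configName", "myconfig");
        params.set("numShards", 8);
        params.set("replicationFactor", 2);
        // Allow several shard replicas on one node, so a single JVM per
        // machine can host all of its shards.
        params.set("maxShardsPerNode", 8);

        QueryRequest request = new QueryRequest(params);
        request.setPath("/admin/collections");
        System.out.println(server.request(request));

        server.shutdown();
    }
}

The same parameters can of course be sent directly over HTTP to /admin/collections; SolrJ is just a convenient wrapper here.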