Boogie, Shawn,

Thanks for the replies. I'm going to try out some of your suggestions today, although without more RAM I'm not that optimistic.
Tom

On 21 October 2013 18:40, Shawn Heisey <s...@elyograg.org> wrote:

> On 10/21/2013 9:48 AM, Tom Mortimer wrote:
>> Hi everyone,
>>
>> I've been working on an installation recently which uses SolrCloud to
>> index 45M documents into 8 shards on 2 VMs running 64-bit Ubuntu (with
>> another 2 identical VMs set up for replicas). The reason we're using so
>> many shards for a relatively small index is that there are complex
>> filtering requirements at search time, to restrict users to items they
>> are licensed to view. Initial tests demonstrated that multiple shards
>> would be required.
>>
>> The total size of the index is about 140GB, and each VM has 16GB RAM
>> (32GB total) and 4 CPU units. I know this is far under what would
>> normally be recommended for an index of this size, and I'm working on
>> persuading the customer to increase the RAM (basically, telling them it
>> won't work otherwise). Performance is currently pretty poor and I would
>> expect more RAM to improve things. However, there are a couple of other
>> oddities which concern me.
>
> Running multiple shards like you are, where each operating system is
> handling more than one shard, is only going to perform better if your
> query volume is low and you have lots of CPU cores. If your query volume
> is high, or you only have 2-4 CPU cores on each VM, you might be better
> off with fewer shards, or not sharded at all.
>
> The way that I read this is that you've got two physical machines with
> 32GB RAM, each running two VMs that have 16GB. Each VM houses 4 shards,
> or 70GB of index.
>
> There's a scenario that might be better if all of the following are true:
> 1) I'm right about how your hardware is provisioned.
> 2) You or the client owns the hardware.
> 3) You have an extremely low-end third machine available - a single CPU
> with 1GB of RAM would probably be enough.
> In this scenario, you run one Solr instance and one zookeeper instance on
> each of your two "big" machines, and use the third wimpy machine as a
> third zookeeper node. No virtualization. For the rest of my reply, I'm
> assuming that you haven't taken this step, but it will probably apply
> either way.
>
>> The first is that I've been reindexing a fixed set of 500 docs to test
>> indexing and commit performance (with soft commits within 60s). The time
>> taken to complete a hard commit after this is longer than I'd expect,
>> and highly variable - from 10s to 70s. This makes me wonder whether the
>> SAN (which provides all the storage for these VMs and the customer's
>> several other VMs) is being saturated periodically. I grabbed some
>> iostat output on different occasions to (possibly) show the variability:
>>
>> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>> sdb              64.50         0.00      2476.00          0       4952
>> ...
>> sdb               8.90         0.00       348.00          0       6960
>> ...
>> sdb               1.15         0.00        43.20          0        864
>
> There are two likely possibilities for this. One or both of them might be
> in play.
> 1) Because the OS disk cache is small, not much of the index can be
> cached. This can result in a lot of disk I/O for a commit, slowing things
> way down. Increasing the size of the OS disk cache is really the only
> solution for that.
> 2) Cache autowarming, particularly the filter cache. In the cache
> statistics, you can see how long each cache took to warm up after the
> last searcher was opened. The solution for that is to reduce the
> autowarmCount values.
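>
> As an illustration (the class and size values below are just the stock
> example settings, not a recommendation for your index), a filterCache
> entry along these lines in solrconfig.xml turns autowarming off for that
> cache:
>
>   <filterCache class="solr.FastLRUCache"
>                size="512"
>                initialSize="512"
>                autowarmCount="0"/>
>
> The same autowarmCount attribute exists on queryResultCache, and the
> warmupTime figures in the cache statistics will show you which cache is
> actually costing time when a new searcher opens.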
>
>> The other thing that confuses me is that after a Solr restart or hard
>> commit, search times average about 1.2s under light load. After
>> searching the same set of queries for 5-6 iterations this improves to
>> 0.1s. However, in either case - cold or warm - iostat reports no device
>> reads at all:
>>
>> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>> sdb               0.40         0.00         8.00          0        160
>> ...
>> sdb               0.30         0.00        10.40          0        104
>>
>> (The writes are due to logging.) This implies to me that the 'hot'
>> blocks are being completely cached in RAM - so why the variation in
>> search time, and the number of iterations required to speed it up?
>
> Linux is pretty good about making limited OS disk cache resources work.
> Sounds like the caching is working reasonably well for queries. It might
> not be working so well for updates or commits, though.
>
> Running multiple Solr JVMs per machine, virtual or not, causes more
> problems than it solves. Solr has no limits on the number of cores (shard
> replicas) per instance, assuming there are enough system resources. There
> should be exactly one Solr JVM per operating system. Running more than
> one results in quite a lot of overhead, and your memory is precious. When
> you create a collection, you can give the collections API the
> "maxShardsPerNode" parameter to create more than one shard per instance.
>
>> I don't have a great deal of experience in low-level performance tuning,
>> so please forgive any naivety. Any ideas of what to do next would be
>> greatly appreciated. I don't currently have details of the VM
>> implementation but can get hold of this if it's relevant.
>
> I don't think the virtualization details matter all that much. Please
> feel free to ask questions or supply more info based on what I've told
> you.
>
> Thanks,
> Shawn
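
For reference, the collections API CREATE call with maxShardsPerNode that
Shawn mentions looks roughly like this (the host, collection name and
config name here are only placeholders):

  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=8&replicationFactor=2&maxShardsPerNode=8&collection.configName=myconfig

With one Solr JVM per machine, 8 shards at replicationFactor=2 means 16
shard replicas spread over 2 nodes, so maxShardsPerNode has to be at least
8 for all of them to be placed.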