Thank you Dwane. Great info :)
On Wed, 5 Feb 2020 at 11:49, Dwane Hall <dwaneh...@hotmail.com> wrote:

Hey Dominique,

From a memory management perspective I don't do any container resource limiting specifically in Docker (although, as you mention, you certainly can). In our circumstances these hosts are used specifically for Solr, so I planned and tested my capacity beforehand. We have ~768G of RAM on each of these 5 hosts, so with 20x16G heaps we had ~320G of heap being used by Solr, plus some overhead for Docker and the other OS services, leaving ~400G for the OS cache and whatever wants to grab it on each host. Not everyone will have servers this large, which is why we really had to take advantage of multiple Solr instances per host, and Docker became important for our cluster operation management. Our disks are not SSDs either, and all instances write to the same RAID 5 spinner, which is bind mounted into the containers. With this configuration we've been able to achieve consistent median response times of under 500ms across the largest collection, but obviously query type varies this (no terms, leading wildcards, etc.). Our QPS is not huge, ranging from 2-20/sec, but if we need to scale further or speed up response times there are certainly wins to be made at the disk level. For our current circumstances we're very content with the deployment.

I'm not sure if you've read Toke's blog on his experiences at the Royal Danish Library, but I found it really useful when capacity planning and recommend reading it (https://sbdevel.wordpress.com/2016/11/30/70tb-16b-docs-4-machines-1-solrcloud/).

As always, it's recommended to test for your own conditions, and best of luck with your deployment!

Dwane
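For anyone who does want to cap individual containers in the way Dominique asks about further down the thread, a minimal docker-compose (v2.x) sketch could look something like the following. It is not the configuration described above, just an illustration of the mem_limit / mem_reservation / mem_swappiness keys with hypothetical values sized around a 16G heap; SOLR_HEAP is the usual solr.in.sh setting passed through to the container, and the image tag and service name are made up for the example.

```yaml
# Illustrative only: the deployment described above does NOT set per-container
# limits. Values assume a 16G Solr heap plus headroom for Metaspace, off-heap
# buffers and Docker overhead.
version: "2.4"
services:
  solr-node01:
    image: solr:7.7.2
    environment:
      - SOLR_HEAP=16g        # heap size handed to the Solr start script
    mem_limit: 24g           # hard cap on the container's cgroup
    mem_reservation: 20g     # soft reservation under host memory pressure
    mem_swappiness: 0        # discourage the kernel from swapping container pages
```

Keep in mind that cgroup memory limits also account for the page cache a container touches, which is exactly what MMapDirectory relies on, so tight limits can squeeze the index cache. That is one argument for generous headroom, or for skipping container limits entirely as described above.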
------------------------------
From: Scott Stults <sstu...@opensourceconnections.com>
Sent: Thursday, 30 January 2020 1:45 AM
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Re: Solr Cloud on Docker?

One of our clients has been running a big Solr Cloud (100-ish nodes, TB index, billions of docs) in Kubernetes for over a year and it's been wonderful. I think during that time the biggest scrapes we got into were when we ran out of disk space. Performance and reliability have been solid otherwise. Like Dwane alluded to, a lot of operations pitfalls can be avoided if you do your Docker orchestration through Kubernetes.

k/r,
Scott

--
Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC | 434.409.2780
http://www.opensourceconnections.com

On Tue, Jan 28, 2020 at 3:34 AM Dominique Bejean <dominique.bej...@eolya.fr> wrote:

Hi Dwane,

Thank you for sharing this great Solr/Docker user story.

Given your Solr/JVM memory requirements (heap size + Metaspace + off-heap size), are you specifying specific settings in the docker-compose files (mem_limit, mem_reservation, mem_swappiness, ...)? I suppose you are limiting the total memory used by all dockerised Solr instances in order to keep free memory on the host for MMapDirectory?

In short, can you explain the memory management?

Regards

Dominique

On Mon, 23 Dec 2019 at 00:17, Dwane Hall <dwaneh...@hotmail.com> wrote:

Hey Walter,

I recently migrated our Solr cluster to Docker and am very pleased I did so. We run relatively large servers with multiple Solr instances per physical host, and having managed Solr upgrades on bare metal installs since Solr 5, containerisation has been a blessing (currently Solr 7.7.2). In our case we run 20 Solr nodes per host over 5 hosts, totalling 100 Solr instances. Here I host 3 collections of varying size. The first contains 60m docs (8 shards), the second 360m (12 shards), and the third 1.3b (30 shards), all with 2 NRT replicas. The docs are primarily database sourced but are not tiny by any means.

Here are some of my comments from our migration journey:

- Running Solr on Docker should be no different to bare metal. You still need to test for your environment and conditions and follow the guides and best practices outlined in the excellent Lucidworks blog post: https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
- The recent Solr Docker images are built with Java 11, so if you store your indexes in HDFS you'll have to build your own Docker image, as Hadoop is not yet certified with Java 11 (or use an older Solr version image built with Java 8).
- As Docker will be responsible for quite a few Solr nodes, it becomes important to make sure the Docker daemon is configured in systemctl to restart after failure or reboot of the host. Additionally, the Docker restart=always setting is useful for restarting failed containers automatically if a single container dies (i.e. JVM explosions). I've deliberately blown up the JVM in test conditions and found the containers/Solr recover really well under Docker.
- I use Docker Compose to spin up our environment and it has been excellent for maintaining consistent settings across Solr nodes and hosts. Additionally, using a .env file makes most of the Solr environment variables per node configurable in an external file (a sketch follows after this message).
- I'd recommend Docker Swarm if you plan on running Solr over multiple physical hosts. Unfortunately we had an incompatible OS, so I was unable to utilise this approach. The same incompatibility existed for K8s, but Lucidworks has another great article on this approach if you're more fortunate with your environment than us: https://lucidworks.com/post/running-solr-on-kubernetes-part-1/
- Our Solr instances are TLS secured and use the basic auth plugin and the rules-based authorisation provider. There's nothing I have not been able to configure with the default Docker images using environment variables passed into the container. This makes upgrades to Solr versions really easy, as you just need to grab the image and pass in your environment details to the container for any new Solr version.
- If possible I'd start with the Solr 8 Docker image. The project underwent a large refactor to align it with the install script based on community feedback. If you start with an earlier version you'll need to refactor when you eventually move to Solr version 8. The Solr Docker page has more details on this.
- Martijn Koster (the project lead) is excellent and very responsive to questions on the project page. Read through the Q&A page before reaching out; I found a lot of my questions already answered there. Additionally, he provides a number of example Docker configurations, from command line parameters to docker-compose files running multiple instances and ZooKeeper quorums.
- The Docker extra_hosts parameter is useful for adding extra hosts to your containers' hosts file, particularly if you have multiple NICs with internal and external interfaces and you want to force communication over a specific one.
- We use the Solr Prometheus exporter to collect node metrics. I've found I've needed to reduce the metrics to collect, as having this many nodes overwhelmed it occasionally. From memory it had something to do with concurrent modification of Future objects the collector uses, and it sometimes misses collection cycles. This is not Docker related but Solr size related, and the exporter's ability to handle it.
- We use the zkCli script a lot for updating configsets. As I did not want to have to copy them into a container to update them, I just download a copy of the Solr binaries and use it solely for this ZooKeeper script. It's not elegant, but a number of our devs are not familiar with Docker and this was a nice compromise. Another alternative is to just use the REST API to do any configset manipulation.
- We load balance all of these nodes to external clients using an HAProxy Docker image. This, combined with the Docker restart policy and Solr's replication and autoscaling capabilities, provides a very stable environment for us.

All in all, migrating and running Solr on Docker has been brilliant. It was primarily driven by a need to scale our environment vertically on large hardware instances, as running 100 nodes on bare metal was too big a maintenance and administrative burden for us with a small dev and support team. To date it's been very stable and reliable, so I would recommend the approach if you are in a similar situation.

Thanks,

Dwane
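To make the Compose-related points in the list above a little more concrete, here is a minimal, hypothetical docker-compose fragment (not Dwane's actual files) showing the restart policy, per-node settings resolved from a .env file, extra_hosts entries, and a bind-mounted data directory. SOLR_HOST, SOLR_PORT, SOLR_HEAP and ZK_HOST are the standard solr.in.sh / docker-solr variables; the hostnames, addresses, ports and paths are made up, and the container-side data path varies by image version.

```yaml
# Hypothetical sketch of one of many per-host Solr services.
# NODE01_HOST, SOLR_HEAP and ZK_HOST are substituted from a .env file that
# docker-compose reads automatically from the project directory.
version: "2.4"
services:
  solr-node01:
    image: solr:7.7.2
    restart: always                      # bring the container back if the JVM dies
    environment:
      - SOLR_HOST=${NODE01_HOST}         # hostname this node registers in ZooKeeper
      - SOLR_PORT=8981                   # unique port per node when running many per host
      - SOLR_HEAP=${SOLR_HEAP}           # e.g. 16g
      - ZK_HOST=${ZK_HOST}               # ZooKeeper connection string (SolrCloud mode)
    ports:
      - "8981:8981"
    extra_hosts:                         # pin inter-node traffic to the internal interface
      - "solr-host-02.internal:10.1.1.2"
      - "solr-host-03.internal:10.1.1.3"
    volumes:
      - /raid5/solr/node01:/var/solr     # bind-mounted index data; container path depends on image version
```

Scaling this out to 20 nodes per host is then largely a matter of repeating (or templating) the service block with a different port and data directory per node.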
________________________________
From: Walter Underwood <wun...@wunderwood.org>
Sent: Saturday, 14 December 2019 6:04 PM
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Solr Cloud on Docker?

Does anyone have experience running a big Solr Cloud cluster on Docker containers? By “big”, I mean 35 million docs, 40 nodes, 8 shards, with 36 CPU instances. We are running version 6.6.2 right now, but could upgrade.

If people have specific things to do or avoid, I’d really appreciate it.

I got a couple of responses on the Slack channel, but I’d love more stories from the trenches. This is a direction for our company architecture.

We have a master/slave cluster (Solr 4.10.4) that is awesome. I can absolutely see running the slaves as containers. For Solr Cloud? Makes me nervous.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)