On Thu, 2017-10-26 at 19:09 +1100, Huw Davies wrote:
> I was wondering if anyone had looked into getting cluster metrics
> (counters and performance) and exposing them via collect?
>
> We’re about to go live with a large application on a four node RHEL
> and are interested in understanding the cluster loads - this will
> help us decide whether it’s better to run everything on a couple of
> nodes or to spread out across the whole cluster.
>
> Locking in particular we are concerned could be a bottleneck (it is
> in the current implementation on another clustering solution)
>
> Huw Davies | e-mail: [email protected]
> Melbourne | "If soccer was meant to be played in the
> Australia | air, the sky would be painted green"
>

I'm not familiar with any collectd metrics for pacemaker, but I would
be curious what you end up going with.

Regarding load-balancing, the key thing is that the point of
high-availability is to withstand node loss. Your resources should be
*able* to run on a quorate subset of nodes, even if you spread them out
during normal operation, so it's a good idea to test that scenario
under production load.

With the default quorum options, a four-node cluster can lose only one
node and remain quorate, so you'd want to ensure you can run
comfortably on three nodes. If you use corosync's auto tie-breaker
feature, you could go down to two nodes in some conditions.

Whether it's "better" to concentrate or load-balance in normal
operation is a matter of trade-offs.

The advantages of concentrating are (1) lower power usage on the idle
nodes, and (2) if production load increases, you'll notice more quickly
whether your subset of nodes can handle it.

The advantages of load-balancing are (1) continuously exercising all
nodes so you're not surprised in an outage if a node has become
degraded in some fashion, and (2) possibly better performance,
depending on workload and capacities.

Note that pacemaker has a "placement-strategy" option that lets you
fine-tune load-balancing vs concentration.
-- 
Ken Gaillot <[email protected]>
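
P.S. To make the quorum arithmetic above concrete, here is a minimal
sketch. It assumes one vote per node and a plain majority rule (more
than half of the expected votes). It does not model corosync's auto
tie-breaker, the special two-node mode, or any other quorum option, and
the function names are made up for illustration; they are not part of
any pacemaker or corosync API.

```python
def votes_needed(total_nodes: int) -> int:
    """Votes required for quorum, assuming one vote per node and a
    simple-majority rule (strictly more than half)."""
    return total_nodes // 2 + 1


def tolerable_losses(total_nodes: int) -> int:
    """How many nodes can be lost while the remainder stays quorate,
    under the same simple-majority assumption."""
    return total_nodes - votes_needed(total_nodes)


if __name__ == "__main__":
    # Print the figures for a few small cluster sizes.
    for n in (3, 4, 5):
        print(f"{n} nodes: quorum needs {votes_needed(n)} votes, "
              f"can lose {tolerable_losses(n)}")
```

For four nodes this gives three votes needed and one tolerable loss,
which is why the capacity test that matters is running the full
production load on three nodes.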
