On 04/18/2017 11:46 AM, Ferenc Wágner wrote:
> Ken Gaillot <[email protected]> writes:
>
>> On 04/13/2017 11:11 AM, Ferenc Wágner wrote:
>>
>>> I encountered several (old) statements on various forums along the lines
>>> of: "the CIB is not a transactional database and shouldn't be used as
>>> one" or "resource parameters should only uniquely identify a resource,
>>> not configure it" and "the CIB was not designed to be a configuration
>>> database but people still use it that way". Sorry if I misquote these,
>>> I go by my memories now, I failed to dig up the links by a quick try.
>>>
>>> Well, I've been feeling guilty in the above offenses for years, but it
>>> worked out pretty well that way which helped to suppress these warnings
>>> in the back of my head. Still, I'm curious: what's the reason for these
>>> warnings, what are the dangers of "abusing" the CIB this way?
>>> /var/lib/pacemaker/cib/cib.xml is 336 kB with 6 nodes and 155 resources
>>> configured. Old Pacemaker versions required tuning PCMK_ipc_buffer to
>>> handle this, but even the default is big enough nowadays (128 kB after
>>> compression, I guess).
>>>
>>> Am I walking on thin ice? What should I look out for?
>>
>> That's a good question. Certainly, there is some configuration
>> information in most resource definitions, so it's more a matter of degree.
>>
>> The main concerns I can think of are:
>>
>> 1. Size: Increasing the CIB size increases the I/O, CPU and networking
>> overhead of the cluster (and if it crosses the compression threshold,
>> significantly). It also marginally increases the time it takes the
>> policy engine to calculate a new state, which slows recovery.
>
> Thanks for the input, Ken! Is this what you mean?
>
> cib: info: crm_compress_string: Compressed 1028972 bytes into 69095 (ratio
> 14:1) in 138ms

yep

> At the same time /var/lib/pacemaker/cib/cib.xml is 336K, and
>
> # cibadmin -Q --scope resources | wc -c
> 330951
> # cibadmin -Q --scope status | wc -c
> 732820
>
> Even though I consume about 2 kB per resource, the status section
> weights 2.2 times the resources section. Which means shrinking the
> resource size wouldn't change the full size significantly.

good point

> At the same time, we should probably monitor the trends of the cluster
> messaging health as we expand it (with nodes and resources). What would
> be some useful indicators to graph?

I think the main concern would be CPU spikes when a new state needs to
be calculated (which is at least every cluster-recheck-interval).

Network traffic on the cluster communication link would be interesting,
especially at start-up when everything is happening at once, or after a
global clean-up of all resources.

I/O on whatever holds /var/lib/pacemaker will probably be small, but
wouldn't hurt to check.

>> 2. Consistency: Clusters can become partitioned. If changes are made on
>> one or more partitions during the separation, the changes won't be
>> reflected on all nodes until the partition heals, at which time the
>> cluster will reconcile them, potentially losing one side's changes.
>
> Ah, that's a very good point, which I neglected totally: even inquorate
> partitions can have configuration changes. Thanks for bringing this up!
> I wonder if there's any practical workaround for that.
>
>> I suppose this isn't qualitatively different from using a separate
>> configuration file, but those tend to be more static, and failure to
>> modify all copies would be more obvious when doing them individually
>> rather than issuing a single cluster command.
>
> From a different angle: if a node is off, you can't modify its
> configuration file. So you need an independent mechanism to do what the
> CIB synchronization does anyway, or a shared file system with its added
> complexity.
> On the other hand, one needn't guess how Pacemaker
> reconciles the conflicting resource configuration changes. Indeed, how
> does it?

Good question, I haven't delved deeply into that code. It's not merging
diffs or anything like that -- some changes are blessed, and anything
incompatible is discarded.

_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
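
For anyone wanting to graph the size indicators discussed in this thread, here is a rough Python sketch, not anything shipped with Pacemaker. It assumes the crm_compress_string log line keeps the format quoted above and that the CIB XML has the usual configuration/resources and status sections; the function names parse_compress_line and section_sizes are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed log format, taken from the crm_compress_string line quoted in
# the thread; other Pacemaker versions may word it differently.
COMPRESS_RE = re.compile(
    r"crm_compress_string:\s+Compressed (\d+) bytes into (\d+) "
    r"\(ratio\s+\d+:1\) in (\d+)ms"
)


def parse_compress_line(line):
    """Return (raw_bytes, compressed_bytes, millis), or None if no match."""
    m = COMPRESS_RE.search(line)
    if m is None:
        return None
    raw, packed, millis = (int(g) for g in m.groups())
    return raw, packed, millis


def section_sizes(cib_xml):
    """Serialized size in bytes of the resources and status sections.

    Roughly what `cibadmin -Q --scope <section> | wc -c` reports, though
    ElementTree's serialization will not match cibadmin byte for byte.
    """
    root = ET.fromstring(cib_xml)
    sizes = {}
    for name, path in (("resources", "configuration/resources"),
                       ("status", "status")):
        node = root.find(path)
        sizes[name] = 0 if node is None else len(ET.tostring(node))
    return sizes
```

Feeding it the output of `cibadmin -Q` periodically and plotting the two sizes, together with the parsed compression times, would show whether the status section keeps dominating as nodes and resources are added.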
