On 02/13/2019 04:29 PM, Ulrich Windl wrote:
> Hi!
>
> I wonder: Can we close this thread with "You have been warned, so please
> don't come back later, crying! In the meantime you can do what you want
> to do."?
I think something like Digimer's answer is the better and more general
advice: if you think you don't need fencing, then you probably don't need
a cluster (or you missed something ;-) ).

Klaus

> Regards,
> Ulrich
>
>>>> Jehan-Guillaume de Rorthais <[email protected]> wrote on 13.02.2019 at
>>>> 15:05 in message <20190213150549.47634671@firost>:
>> On Wed, 13 Feb 2019 13:50:17 +0100
>> Maciej S <[email protected]> wrote:
>>
>>> Can you describe at least one situation when it could happen?
>>> I see situations where data on two masters can diverge, but I can't
>>> find one where data gets corrupted. Or maybe you think that some kind
>>> of restoration is required in case of diverged data, but this is not
>>> my use case (I can live with the loss of some data on one branch and
>>> recover it from the working master).
>>
>> With imagination and some "ifs" we can describe a scenario, but chaos
>> is much more creative than I am. Anyway, below is one situation:
>>
>> PostgreSQL does no sanity check when starting as a standby and catching
>> up with a primary. If your old primary crashed and catches up with the
>> new one without some housecleaning first by a human (rebuilding it or
>> using pg_rewind), it will be corrupted.
>>
>> Please do not leave dangerous assumptions like "fencing is just an
>> additional precaution" on a public mailing list. It is not, in a lot of
>> situations, PostgreSQL included.
>>
>> I know there are use cases where extreme HA failure coverage is not
>> required. Typically, implementing 80% of the job is enough, or you just
>> make sure the service is up, no matter the data loss. In such cases,
>> maybe you can avoid the complexity of a "state-of-the-art full HA stack
>> with seat belt, helmet and parachute" and have something cheaper.
>>
>> For instance, Patroni is a very good alternative, but a PostgreSQL-only
>> solution.
>> At least, it has the elegance to use an external DCS for quorum, and a
>> watchdog as a poor man's fencing / self-fencing solution.
>>
>>> On Wed, 13 Feb 2019 at 13:10, Jehan-Guillaume de Rorthais
>>> <[email protected]> wrote:
>>>
>>>> On Wed, 13 Feb 2019 13:02:30 +0100
>>>> Maciej S <[email protected]> wrote:
>>>>
>>>>> Thank you all for the answers. I can see your point, but anyway it
>>>>> seems that fencing is just an additional precaution.
>>>>
>>>> It's not.
>>>>
>>>>> If my requirements allow some manual intervention in some cases
>>>>> (e.g. an unknown resource state after failover), then I might go
>>>>> ahead without fencing. At least as long as STONITH is not
>>>>> mandatory :)
>>>>
>>>> Well, then sooner or later we'll talk again about how to quickly
>>>> restore your service and/or data. And the answer will be difficult
>>>> to swallow.
>>>>
>>>> Good luck :)
>>>>
>>>>> On Mon, 11 Feb 2019 at 17:54, Digimer <[email protected]> wrote:
>>>>>
>>>>>> On 2019-02-11 6:34 a.m., Maciej S wrote:
>>>>>>> I was wondering if anyone can give a plain answer on whether
>>>>>>> fencing is really needed when there are no shared resources in
>>>>>>> use (as far as I define a shared resource).
>>>>>>>
>>>>>>> We want to use PAF or another Postgres failover agent (with data
>>>>>>> files replicated on local drives) together with Corosync,
>>>>>>> Pacemaker and a virtual IP resource, and I am wondering whether
>>>>>>> there is a need for fencing (which is bound very closely to the
>>>>>>> infrastructure) if Pacemaker is already controlling resource
>>>>>>> state. I know that in a failover case there might be a need to
>>>>>>> add functionality to recover a master that entered a dirty
>>>>>>> shutdown state (e.g. in case of a power outage), but I can't see
>>>>>>> any case where fencing is really necessary. Am I wrong?
>>>>>>>
>>>>>>> I was looking for a strict answer but I couldn't find one...
>>>>>>>
>>>>>>> Regards,
>>>>>>> Maciej
>>>>>>
>>>>>> Fencing is as required as wearing a seat belt in a car. You can
>>>>>> physically make things work, but the first time you're "in an
>>>>>> accident", you're screwed.
>>>>>>
>>>>>> Think of it this way:
>>>>>>
>>>>>> If services can run in two or more places at the same time without
>>>>>> coordination, you don't need a cluster; just run things everywhere.
>>>>>> If you need coordination, though, you need fencing.
>>>>>>
>>>>>> The role of fencing is to take a node that has entered an unknown
>>>>>> state and force it into a known state. In a system that requires
>>>>>> coordination, fencing is often the only way to ensure sane
>>>>>> operation.
>>>>>>
>>>>>> Also, with Pacemaker v2, fencing (stonith) became mandatory at a
>>>>>> programmatic level.
>>>>>>
>>>>>> --
>>>>>> Digimer
>>>>>> Papers and Projects: https://alteeve.com/w/
>>>>>> "I am, somehow, less interested in the weight and convolutions of
>>>>>> Einstein's brain than in the near certainty that people of equal
>>>>>> talent have lived and died in cotton fields and sweatshops."
>>>>>> - Stephen Jay Gould

_______________________________________________
Users mailing list: [email protected]
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
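[Editor's note: the "housecleaning" Jehan-Guillaume refers to, before an old
primary may rejoin as a standby, might look roughly like the sketch below.
This is not from the thread: the data directory path, the host name
"new-primary" and the user "replicator" are placeholder assumptions, and the
exact steps depend on the PostgreSQL version.]

```shell
# Hedged sketch: resync a crashed old primary so it can safely rejoin as a
# standby. All paths, hosts and user names here are illustrative assumptions.

# 1. Make sure the old primary's PostgreSQL instance is stopped.
pg_ctl -D /var/lib/pgsql/data stop -m fast

# 2. Rewind the old timeline against the new primary rather than trusting
#    the data files blindly (PostgreSQL itself does no such sanity check).
pg_rewind \
    --target-pgdata=/var/lib/pgsql/data \
    --source-server='host=new-primary user=replicator dbname=postgres'

# 3. Mark the instance as a standby following the new primary
#    (standby.signal on PostgreSQL 12+; recovery.conf on older releases).
touch /var/lib/pgsql/data/standby.signal

pg_ctl -D /var/lib/pgsql/data start
```

Rebuilding the node from a fresh base backup (pg_basebackup) is the simpler,
slower alternative to pg_rewind.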

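[Editor's note: since the thread mentions that fencing became mandatory at a
programmatic level in Pacemaker v2, here is a minimal sketch of what
configuring stonith with pcs might look like. The agent choice
(fence_ipmilan), node names, IPMI addresses and credentials are all
placeholder assumptions, not details from the thread.]

```shell
# Hedged sketch: one IPMI-based fence device per node, configured with pcs.
# Node names, addresses and credentials below are placeholders.

pcs stonith create fence_node1 fence_ipmilan \
    ip=10.0.0.101 username=admin password=secret lanplus=1 \
    pcmk_host_list=node1

pcs stonith create fence_node2 fence_ipmilan \
    ip=10.0.0.102 username=admin password=secret lanplus=1 \
    pcmk_host_list=node2

# Keep fencing enabled (the default) rather than setting
# stonith-enabled=false just to silence errors, as the thread warns against.
pcs property set stonith-enabled=true

# Check the resulting configuration and device state.
pcs status
```

A device should never be allowed to fence the node it runs on; Pacemaker
handles this by default, but location constraints can make it explicit.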