[ClusterLabs] Antw: [EXT] Postgres Cluster PAF problems

Ulrich Windl Wed, 30 Jun 2021 05:17:45 -0700

>>> damiano giuliani <[email protected]> wrote on 30.06.2021 at 13:44
in message
<CAG=zYNNe=azzalehe3jzkahnsev88nr+yeo0m06hljl4l11...@mail.gmail.com>:
> Hi Guys,
> 
> sorry to bother you; unfortunately I was called in for an issue with a
> cluster I built months ago, which was fully functional until last Saturday.
> 
> It looks like some applications lost their connection to the master, losing
> some updates/inserts.
> 
> I found the cause in the logs: the psqld-monitor timed out after
> 10000ms, the master resource was demoted, and the instance was stopped and
> then promoted to master again, causing a few seconds of service disruption
> (no master during the described process).


Well, I think YOU have to find out why the monitor timed out. Maybe the disks 
being used were too busy, maybe the memory was tight, ...
WE don't know.
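
One rough way to check whether load was the cause is to time a probe repeatedly and compare it against the 10000 ms monitor timeout from the logs. The sketch below is only an illustration in Python: it is not part of PAF or Pacemaker, the names `time_probe` and `slow_samples` and the `warn_ratio` threshold are made up for this example, and the probe is a stand-in for whatever check you choose to run.

```python
import time
from typing import Callable, List, Tuple

# Timeout reported in the quoted logs (psqld-monitor timed out after 10000ms).
MONITOR_TIMEOUT_MS = 10000


def time_probe(probe: Callable[[], object]) -> float:
    """Run the probe once and return the elapsed wall-clock time in ms."""
    start = time.monotonic()
    probe()
    return (time.monotonic() - start) * 1000.0


def slow_samples(
    durations_ms: List[float],
    timeout_ms: float = MONITOR_TIMEOUT_MS,
    warn_ratio: float = 0.5,  # hypothetical threshold: flag at 50% of timeout
) -> List[Tuple[int, float]]:
    """Return (index, duration) pairs at or above warn_ratio * timeout_ms."""
    limit = timeout_ms * warn_ratio
    return [(i, d) for i, d in enumerate(durations_ms) if d >= limit]


if __name__ == "__main__":
    # Stand-in probe: replace the sleep with a real check of your own.
    samples = [time_probe(lambda: time.sleep(0.01)) for _ in range(5)]
    print("samples close to the timeout:", slow_samples(samples))
```

Samples that come close to the timeout during busy periods would suggest looking at disk and memory pressure on that node at the same times.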

> 
> I also noticed this repeated message:
> Update score of "ltaoperdbsXX" from 990 to 1000 because of a change in the
> replication lag
> Does this point to some kind of network lag?
> 
> The network should be 10 Gb/s and carries both the corosync and the
> production traffic, with network bonding on all of the nodes.
> PAF version resource-agents-paf-2.3.0-1.rhel7.noarch
> Postgres psql (13.1)
> pacemaker-1.1.23-1.el7.x86_64
> pcs-0.9.169-3.el7.centos.x86_64
> 
> I attached the log, which could be useful to dig further.
> Could someone point me in the right direction? It would be really appreciated.
> 
> thanks for the support
> Pepe




_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
