What should happen if the DNS resolution does not result in the expected number of peers either? How would a deliberate shrinking or growing of a cluster work?
Another solution I have seen (e.g. in Cassandra) is to have a cluster identity, such as a cluster name. Instances would refuse to talk to other instances if they announce the wrong cluster name. There could be a default cluster name (or a special case for when it's empty), so that it doesn't change anything for single-cluster use cases. It should also support the transition from older versions, or no cluster name, to a named cluster, with a rolling restart.

/MR

On Thu, Sep 9, 2021, 10:33 Андрей Еньшин <[email protected]> wrote:

> Hi prometheus folks,
>
> I have a question about alertmanager.
>
> Here is an one year old issue about merging few HA alertmanager clusters
> into one big over time:
> https://github.com/prometheus/alertmanager/issues/2250
>
> I managed to reproduce it on my local k8s kind cluster. Seems there is
> small discrepancy between a list of peers reported by gossip library and a
> list of peers from am config file.
>
> We can workaround it by using k8s network policy. However more proper fix
> would be on alertmanager side: keep eye on number of peers and compare with
> desired number. In case there is some unexpected state, clear table of
> peers, do DNS resolution once more and do form a new peer table. Maybe
> there is better solution. What do you think?
>
> Probably I even can introduce a PR if we can agree on a way to fix it and
> someone can support me with review : )
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-developers/45dd29f4-cae7-4c42-9756-0ca92aa76884n%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-developers/45dd29f4-cae7-4c42-9756-0ca92aa76884n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
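To make the cluster-name idea a bit more concrete, here is a minimal sketch in Go of the acceptance check I have in mind. Everything in it is hypothetical: the function name `acceptPeer`, the `strict` switch, and the use of an empty string as the "no cluster name" default are my own assumptions, not existing Alertmanager or gossip-library API. It only shows the decision logic, not how the name would be announced between peers.

```go
package main

import "fmt"

// acceptPeer is a hypothetical helper: it reports whether a peer that
// announces remoteName may join a cluster configured with localName.
//
// An empty name stands for "no cluster name" (older versions, or the default).
// While strict is false, an unnamed side is always accepted, so a rolling
// restart from unnamed to named instances keeps working. Once strict is true,
// the names must match exactly (two unnamed instances still match each other).
func acceptPeer(localName, remoteName string, strict bool) bool {
	if !strict && (localName == "" || remoteName == "") {
		return true
	}
	return localName == remoteName
}

func main() {
	fmt.Println(acceptPeer("prod-a", "prod-a", false)) // true: same cluster
	fmt.Println(acceptPeer("prod-a", "prod-b", false)) // false: other cluster
	fmt.Println(acceptPeer("prod-a", "", false))       // true: peer not yet named
	fmt.Println(acceptPeer("prod-a", "", true))        // false: transition is over
}
```

With the non-strict default nothing changes for single-cluster setups, and the strict mode could be switched on once every instance has been restarted with a name.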

