There have been many improvements on that front:
* RabbitMQ's memory consumption has been studied closely, bringing many fixes
in autopkgtest-cloud.
* We have a watchdog restarting RabbitMQ when things go bad.
* This watchdog has been exercised a lot before the memory consumption fixes
and
I first thought we could log some data like:
$ rabbitmqctl list_queues name durable owner_pid messages_ready \
    messages_unacknowledged messages messages_ready_ram \
    messages_unacknowledged_ram messages_ram messages_persistent \
    message_bytes message_bytes_ram message_bytes_persistent memory state
via
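As a rough sketch of that logging idea, the snippet below wraps the same list_queues call in a shell function and prefixes every output line with a UTC timestamp, so that successive samples can be compared later. The function name log_queue_stats, the RABBITMQCTL override variable and the log path in the usage comment are assumptions made for illustration; nothing like this is known to be deployed.

```shell
# Path to rabbitmqctl; overridable, which is handy for dry runs (assumption).
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Print one timestamped line per line of `rabbitmqctl list_queues` output.
# log_queue_stats is a hypothetical helper name.
log_queue_stats() {
    # UTC timestamp, e.g. 2018-05-20T03:58:24Z
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    "$RABBITMQCTL" list_queues \
        name durable owner_pid messages_ready \
        messages_unacknowledged messages messages_ready_ram \
        messages_unacknowledged_ram messages_ram messages_persistent \
        message_bytes message_bytes_ram message_bytes_persistent \
        memory state |
        sed "s/^/$ts /"
}

# Usage, e.g. from a cron job or timer (the log path is hypothetical):
#   log_queue_stats >> /var/log/rabbitmq-queue-stats.log
```

Keeping the timestamp on every line means the log can be grepped per queue name without losing track of when each sample was taken.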
Worth noting that we ran for many months before hitting this, and then:
ubuntu@juju-prod-ues-proposed-migration-machine-1:~$ dmesg -T | grep "Out of memory: Kill" | uniq
[Wed May 9 16:06:23 2018] Out of memory: Kill process 1408 (beam) score 215 or sacrifice child
[Sun May 20 03:58:24 2018] Out
For autopkgtest-cloud I just cowboyed a change to add Restart=on-failure
to rabbitmq-server.service. Maybe that helps us mitigate, in combination
with delivery_mode=2.
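
For reference, a change like this could also be made as a systemd drop-in rather than by editing the unit file directly. The sketch below is only an illustration: the helper name write_restart_dropin and the file name restart.conf are assumptions, and it is not the change that was actually applied.

```shell
# Write a drop-in that makes systemd restart rabbitmq-server after an
# unclean exit (non-zero status or a fatal signal such as the OOM
# killer's SIGKILL).
# write_restart_dropin and restart.conf are hypothetical names.
write_restart_dropin() {
    # Default to the standard drop-in directory for the unit; a different
    # directory can be passed as $1 to try this without touching /etc.
    dropin_dir="${1:-/etc/systemd/system/rabbitmq-server.service.d}"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
EOF
}

# Usage (as root):
#   write_restart_dropin && systemctl daemon-reload
```

A restart alone only brings the broker back; messages published with delivery_mode=2 to durable queues are written to disk, which is why the two are meant to work in combination.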
--
** Also affects: rabbitmq-server (Ubuntu)
   Importance: Undecided
       Status: New
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1772236
Title:
rabbit died and everything else died
To manage not