Control: tags -1 +confirmed
I can confirm both bugs here. During the last upgrade, needrestart
noticed fail2ban needed a restart, so it restarted it. Here's what
systemd sees right now:
root@marcos:/home/anarcat# systemctl status fail2ban
● fail2ban.service - Fail2Ban Service
Loaded: loaded (/lib/systemd/system/fail2ban.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2019-02-17 15:32:58 EST; 9min ago
Docs: man:fail2ban(1)
Process: 5829 ExecStop=/usr/bin/fail2ban-client stop (code=exited, status=255)
Process: 12738 ExecStart=/usr/bin/fail2ban-client -x start (code=exited, status=0/SUCCESS)
Main PID: 12745 (fail2ban-server)
Tasks: 14 (limit: 4915)
Memory: 33.3M
CPU: 3min 14.842s
CGroup: /system.slice/fail2ban.service
└─12745 /usr/bin/python3 /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ba
fév 17 15:42:06 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.80
fév 17 15:42:07 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.82
fév 17 15:42:07 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.84
fév 17 15:42:07 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.85
fév 17 15:42:07 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.86
fév 17 15:42:07 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.87
fév 17 15:42:07 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.88
fév 17 15:42:08 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.90
fév 17 15:42:08 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.91
fév 17 15:42:08 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.93
fév 17 15:42:08 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.94
fév 17 15:42:08 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.95
fév 17 15:42:08 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.178.97
... and those lines are still being added there:
fév 17 15:43:28 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] Ban 60.168.182.250
etc...
In other words, fail2ban has been loading a list of IP addresses into
the firewall for the last 9 minutes. Before that, it spent another three
minutes *removing* IP addresses from the firewall (until systemd gave
up, see #...):
root@marcos:/home/anarcat# journalctl -u fail2ban.service | grep systemd
fév 17 15:29:57 marcos systemd[1]: Stopping Fail2Ban Service...
fév 17 15:31:27 marcos systemd[1]: fail2ban.service: Stopping timed out. Terminating.
fév 17 15:31:27 marcos systemd[1]: fail2ban.service: Control process exited, code=exited status=255
fév 17 15:32:57 marcos systemd[1]: fail2ban.service: State 'stop-sigterm' timed out. Killing.
fév 17 15:32:57 marcos systemd[1]: fail2ban.service: Killing process 2176 (fail2ban-server) with signal SIGKILL.
fév 17 15:32:57 marcos systemd[1]: fail2ban.service: Main process exited, code=killed, status=9/KILL
fév 17 15:32:57 marcos systemd[1]: Stopped Fail2Ban Service.
fév 17 15:32:57 marcos systemd[1]: fail2ban.service: Unit entered failed state.
fév 17 15:32:57 marcos systemd[1]: fail2ban.service: Failed with result 'timeout'.
fév 17 15:32:57 marcos systemd[1]: Starting Fail2Ban Service...
fév 17 15:32:58 marcos systemd[1]: Started Fail2Ban Service.
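To make the timing explicit, here is a small sketch that computes the
intervals from the timestamps in the journal excerpt above. It assumes
GNU date (for the -d option); the helper name "elapsed" is made up for
this example. The two 90-second phases line up with systemd's default
stop timeout of 90 seconds.

```shell
#!/bin/sh
# Seconds elapsed between two timestamps (needs GNU date for -d).
# -u avoids any daylight-saving surprises when subtracting.
elapsed() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $((end - start))
}

# Timestamps copied from the journal; the date comes from the status output.
elapsed "2019-02-17 15:29:57" "2019-02-17 15:31:27"   # ExecStop until "Stopping timed out": 90
elapsed "2019-02-17 15:31:27" "2019-02-17 15:32:57"   # SIGTERM until SIGKILL: 90
elapsed "2019-02-17 15:29:57" "2019-02-17 15:32:57"   # whole stop phase: 180
```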
Now systemd considers the thing "started" even though it's still
loading the IP list. Because the stop timed out and the server was
killed, we also see this in the logs:
fév 17 15:45:39 marcos fail2ban.actions[12745]: NOTICE [postfix-auth] 183.160.227.26 already banned
... obviously, fail2ban didn't have time to remove all addresses from
the firewall.
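As a stopgap only, the stop timeout could be raised with a systemd
drop-in so the unban pass is not killed halfway. This is a sketch, not a
fix for the underlying problems: the function name, the file name
stop-timeout.conf and the 10min value are all arbitrary choices made for
this example. TimeoutStopSec= is the standard systemd setting.

```shell
#!/bin/sh
# Write a systemd drop-in that raises the stop timeout for a unit.
# $1 = drop-in directory, $2 = timeout (a systemd time span, e.g. 10min)
write_stop_timeout_dropin() {
    mkdir -p "$1" &&
    printf '[Service]\nTimeoutStopSec=%s\n' "$2" > "$1/stop-timeout.conf"
}

# Demo against a scratch directory so nothing on the system is touched.
# On a real host the directory would be
# /etc/systemd/system/fail2ban.service.d, followed by
# "systemctl daemon-reload" (both need root).
dropin_dir=$(mktemp -d)
write_stop_timeout_dropin "$dropin_dir" 10min
cat "$dropin_dir/stop-timeout.conf"
```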
There are many things wrong here:
1. "service fail2ban restart" shouldn't try to remove all IP addresses
from the firewall if it's going to re-add them all a second later
2. even if it *does* decide to do that, it shouldn't fail halfway
through.
3. if "stop" waits for all IPs to be cleared out, "start" should wait
for all of them to be loaded as well
4. loading IP addresses shouldn't be that slow
I don't have that many addresses loaded in that jail:
Status for the jail: postfix-auth
|- Filter
| |- Currently failed: 4
| |- Total failed: 456
| `- File list: /var/log/mail.log
`- Actions
|- Currently banned: 4499
|- Total banned: 4499
`- Banned IP list: [...]
A 5000-IP block list is really not that much. Note that I'm using the
`ipset` functionality in my jails to speed up that process (it was
even slower before):
# head -5 /etc/fail2ban/jail.local
[DEFAULT]
findtime = 60
ignoreip = 192.168.0.7
banaction = iptables-ipset-proto6
banaction_allports = iptables-ipset-proto6-allports
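For comparison, ipset can take a whole batch of entries in a single
"ipset restore" call. The sketch below only generates the restore input.
It assumes the set already exists, and the set name f2b-postfix-auth is
a guess (the real name depends on the fail2ban version). Whether
fail2ban's slowness comes from adding addresses one call at a time is
also an assumption I have not verified.

```shell
#!/bin/sh
# Turn a list of addresses (one per line on stdin) into "ipset restore"
# input. $1 = name of an existing set. Blank lines are skipped.
to_ipset_restore() {
    while IFS= read -r ip; do
        if [ -n "$ip" ]; then
            echo "add $1 $ip"
        fi
    done
}

# Prints one "add" line per address.
printf '%s\n' 60.168.178.80 60.168.178.82 | to_ipset_restore f2b-postfix-auth

# To actually load the batch (needs root and an existing set):
#   ... | to_ipset_restore f2b-postfix-auth | ipset -exist restore
```

The -exist option makes ipset ignore entries that are already in the
set, which would also cover the "already banned" case above.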
--
You cannot solve a problem with the modes of thinking that created
it.
- Albert Einstein