I'm currently running a redundant firewall solution in our production
environment using OpenBSD (version 4.5-stable) with carp(4), pf(4),
pfsync(4) and sasyncd(8); pfsync and sasyncd sync the pf state table and
the IPsec security associations over a dedicated crossover cable. We're
also running an ipsec(4) tunnel between our production and internal
networks, terminated on the internal side by a single OpenBSD machine
(version 4.5-stable) running pf(4).
In the year since I implemented this solution we've experienced a
recurring problem: roughly every 4 months our firewalls begin to act
erratically, resulting in loss of SSH connectivity, SNMP monitoring
failure and the inability to run any command from the console. Despite
these problems, both production firewalls remain pingable and continue
to filter packets as they should.
      +----| Production Network |----+
      |                              |
  bnx2|                              |bnx2
   +-----+                        +-----+
   | fw1 |-bnx0--------------bnx0-| fw2 |
   +-----+                        +-----+
  bnx1|                              |bnx1
      |                              |
   ---+-------- WAN/Internet --------+---
                     |
               {IPSEC tunnel}
                     |
                  +------+
                  |  fw  |
                  +------+
       +----| Internal Network |----+
       *                            *
These problems can be fixed simply by rebooting the master and then the
slave production firewall; however, this is obviously not a long-term
solution to the problem at hand.
Since I'm not able to view or salvage any of the log files, or even run
top, while this problem is occurring, I've had a hard time
troubleshooting this issue. However, the order of events leading up to
the problem seems to be:
1.) Our monitoring reports that the process load of one or both of the
firewalls can no longer be checked via SNMP
2.) Our IPSEC tunnel goes down
3.) SSH connectivity fails and console command line usage fails (I'm
still able to type a command but then I'm not able to ctrl-c back to the
command line)
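Since nothing can be inspected once the hang sets in, one option is to
ship periodic snapshots of kernel and pf counters to a remote host
beforehand, so that the last readings before a hang survive the reboot.
The following is only a minimal sketch, not something running on these
firewalls: it assumes a Python 3.7+ interpreter is available, and the
collector address, the `fwsnap` tag and all function names are
hypothetical. The commands it calls (`vmstat -m`, `netstat -m`,
`pfctl -si`) are standard OpenBSD tools, and each one is run with a
timeout so a stuck command cannot block the script.

```python
import socket
import subprocess

# Hypothetical values: adjust the collector address and command list.
LOG_HOST = ("192.0.2.10", 514)   # assumed remote syslog collector (UDP)
COMMANDS = [
    ["vmstat", "-m"],    # kernel memory usage
    ["netstat", "-m"],   # mbuf usage
    ["pfctl", "-si"],    # pf status and state-table counters
]


def run_command(argv, timeout=10):
    """Run one command and return its stdout, never blocking past timeout."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True,
                                timeout=timeout)
        return result.stdout
    except subprocess.TimeoutExpired:
        return "TIMEOUT after %ds" % timeout
    except OSError as exc:
        return "ERROR: %s" % exc


def collect_snapshot(commands, runner=run_command):
    """Return a list of (command string, output) pairs."""
    return [(" ".join(argv), runner(argv)) for argv in commands]


def format_lines(snapshot, tag="fwsnap"):
    """Flatten a snapshot into one syslog-style line per output row."""
    lines = []
    for name, output in snapshot:
        for row in output.splitlines() or ["(no output)"]:
            # <14> = facility user (1) * 8 + severity info (6)
            lines.append("<14>%s: [%s] %s" % (tag, name, row.strip()))
    return lines


def send_lines(lines, addr=LOG_HOST):
    """Send each line as a separate UDP datagram to the collector."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for line in lines:
            sock.sendto(line.encode("utf-8", "replace")[:1024], addr)
    finally:
        sock.close()


def snapshot_once():
    """Collect one snapshot and ship it; intended to be run from cron."""
    send_lines(format_lines(collect_snapshot(COMMANDS)))
```

Calling `snapshot_once()` from cron every few minutes on both firewalls
would leave a trail on the collector. If the counters climb steadily
over the 4 months, that would point towards a resource leak rather than
a sudden failure.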
Please let me know if you have any ideas why this issue might be
occurring. Thanks in advance.
Regards,
Jeff