On 14-3-2012 10:43, Kapetanakis Giannis wrote:
>> While heavily demoted, it still assumes the master role. I guess it's
>> not seeing the carp announcements from firewall-2 at all. Do you use
>> spanning tree in the network?
>
> Yes. The latest change which I did on the switch where the firewalls are
> connected is adding:
> spanning-tree portfast trunk
> spanning-tree bpdufilter enable
> in order to startup the port faster. Don't know if this is causing the
> problem, cause now the ports are coming up really fast. They used to
> come up after 1 minute.
Fast is good.
>> How many states do you typically have? The bulk pfsync is taking a
>> really long time here... 4 minutes. Any errors on the pfsync interface?
>> What speed is it?
> I usually have around 90k states (pfctl -ss |wc -l)
> On both firewalls it's 1Gbps
> media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
> media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
>
> # netstat -id
>
> Name              Mtu   Network  Address            Ipkts      Ierrs  Opkts      Oerrs  Colls  Drop
> em2(sync_if_f1)   1500  <Link>   00:19:99:98:e4:ea  682406     225    255969304  0      0      0
> bge1(sync_if_f2)  1500  <Link>   00:0a:e4:80:73:3d  387753797  461    1152887    0      0      0
Hmm, 225 errors on 682406 packets is low, but a bit higher than I would
expect to see.
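
To put a rough number on that, here is a minimal sketch that turns the
Ierrs/Ipkts figures from the netstat -id output into a percentage. The
helper name err_rate is made up for illustration; the inputs are the
counters quoted above.

```shell
# err_rate ERRORS PACKETS
# Print input errors as a percentage of received packets (4 decimals).
# err_rate is a hypothetical helper name, not an existing tool.
err_rate() {
    awk -v e="$1" -v p="$2" 'BEGIN { printf "%.4f\n", e / p * 100 }'
}

err_rate 225 682406      # em2 on f1
err_rate 461 387753797   # bge1 on f2
```

That works out to roughly 0.03% on em2 and about 0.0001% on bge1, so the
em2 side sees errors at a noticeably higher rate than its peer.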
>
> f1# netstat -s
> carp:
> 12 packets received (IPv4)
> 0 packets received (IPv6)
> 0 packets discarded for bad interface
> 0 packets discarded for wrong TTL
> 0 packets shorter than header
> 0 discarded for bad checksums
> 0 discarded packets with a bad version
> 0 discarded because packet too short
> 0 discarded for bad authentication
> 0 discarded for unknown vhid
> 0 discarded because of a bad address list
> 1586084 packets sent (IPv4)
> 0 packets sent (IPv6)
> 0 send failed due to mbuf memory error
> 8 transitions to master
> pfsync:
> 682381 packets received (IPv4)
> 0 packets received (IPv6)
> 0 packets discarded for bad interface
> 0 packets discarded for bad ttl
> 0 packets shorter than header
> 0 packets discarded for bad version
> 0 packets discarded for bad HMAC
> 0 packets discarded for bad action
> 0 packets discarded for short packet
> 0 states discarded for bad values
> 88 stale states
> 809627 failed state lookup/inserts
> 256080550 packets sent (IPv4)
> 0 packets sent (IPv6)
> 0 send failed due to mbuf memory error
> 0 send error
This is not from just after the reboot, right? The "failed state
lookup/inserts" counter might be interesting to look at just after the
firewalls have stabilized.
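
One way to watch that counter, as a sketch only: pull the number out of
the netstat -s output twice and compare. The function name pfsync_failed
and the 60 second interval are assumptions for illustration; the pattern
matches the counter line exactly as it appears in the output quoted above.

```shell
# pfsync_failed: read "netstat -s" output on stdin and print the value of
# the "failed state lookup/inserts" counter.
# (hypothetical helper name; it only parses text, it does not call netstat)
pfsync_failed() {
    awk '/failed state lookup\/inserts/ { print $1 }'
}

# Intended use on a firewall once both boxes have settled:
#   a=$(netstat -s | pfsync_failed); sleep 60
#   b=$(netstat -s | pfsync_failed)
#   echo "$((b - a)) failed lookup/inserts in 60s"
```

If the difference keeps growing while the pair is otherwise idle and
stable, that would be worth following up.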
> f2# netstat -s
> carp:
> 2236176 packets received (IPv4)
> 0 packets received (IPv6)
> 0 packets discarded for bad interface
> 0 packets discarded for wrong TTL
> 0 packets shorter than header
> 0 discarded for bad checksums
> 0 discarded packets with a bad version
> 0 discarded because packet too short
> 0 discarded for bad authentication
> 0 discarded for unknown vhid
> 0 discarded because of a bad address list
> 460 packets sent (IPv4)
> 0 packets sent (IPv6)
> 0 send failed due to mbuf memory error
> 12 transitions to master
> pfsync:
> 387828563 packets received (IPv4)
> 0 packets received (IPv6)
> 0 packets discarded for bad interface
> 0 packets discarded for bad ttl
> 0 packets shorter than header
> 0 packets discarded for bad version
> 0 packets discarded for bad HMAC
> 0 packets discarded for bad action
> 0 packets discarded for short packet
> 0 states discarded for bad values
> 435 stale states
> 1173653 failed state lookup/inserts
> 1152819 packets sent (IPv4)
> 0 packets sent (IPv6)
> 0 send failed due to mbuf memory error
> 0 send error
>
>
>
>> What does your ifstated.conf look like?
>>
>
> ifstated runs only on primary firewall.
> Primary firewall runs with advbase 1 advskew 10
> secondary firewall runs with advbase 1 advskew 100
>
> carp_up = "carp0.link.up && carp1.link.up && carp2.link.up && carp3.link.up"
> carp_down = "!carp0.link.up && !carp1.link.up && !carp2.link.up && !carp3.link.up"
> carp_sync = "carp0.link.up && carp1.link.up && carp2.link.up && carp3.link.up || \
>     !carp0.link.up && !carp1.link.up && !carp2.link.up && !carp3.link.up"
>
> # check remote gateways
> net = '( "ping -q -c 1 -w 1 aaa.aaa.aaa.aaa > /dev/null" every 10 && \
>     "ping -q -c 1 -w 1 bbb.bbb.bbb.bbb > /dev/null" every 10 && \
>     "ping -q -c 1 -w 1 ccc.ccc.ccc.ccc > /dev/null" every 10 && \
>     "ping -q -c 1 -w 1 ddd.ddd.ddd.ddd > /dev/null" every 10 )'
>
> # check firewall-2
> peer = '( "ping -q -c 1 -w 1 eee.eee.eee.eee > /dev/null" every 10 )'
>
> state auto {
>     if $carp_up
>         set-state primary
>     if $carp_down
>         set-state backup
> }
>
> state primary {
>     init {
>         run "ifconfig carp0 advskew 10"
>         run "ifconfig carp1 advskew 10"
>         run "ifconfig carp2 advskew 10"
>         run "ifconfig carp3 advskew 10"
>     }
>     if ! $net
>         set-state demoted
> }
>
> state demoted {
>     init {
>         run "ifconfig carp0 advskew 200"
>         run "ifconfig carp1 advskew 200"
>         run "ifconfig carp2 advskew 200"
>         run "ifconfig carp3 advskew 200"
>     }
>     if $net
>         set-state primary
> }
>
> state promoted {
>     init {
>         run "ifconfig carp0 advskew 101"
>         run "ifconfig carp1 advskew 101"
>         run "ifconfig carp2 advskew 101"
>         run "ifconfig carp3 advskew 101"
>     }
>     if $net
>         set-state primary
>     if ! $net && $peer
>         set-state backup
> }
>
> state backup {
>     init {
>         run "ifconfig carp0 advskew 254"
>         run "ifconfig carp1 advskew 254"
>         run "ifconfig carp2 advskew 254"
>         run "ifconfig carp3 advskew 254"
>     }
>     # The "sleep 5" below is a hack to dampen the $carp_sync when we come
>     # out of promoted state. Thinking about the correct fix...
>     if ! $carp_sync && $net && "sleep 5" every 10
>         if ! $carp_sync && $net
>             set-state promoted
> }
I would not muck with the advskew like that anymore. Demotion based on
link state works automatically now.
If you really want to keep the ping test, just use "ifconfig -g carp" with
carpdemote / -carpdemote for the demotion and promotion.
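
Something along these lines, as an untested sketch rather than a drop-in
config: it keeps only one ping test and raises or lowers the demotion
counter of the carp interface group instead of touching advskew. The
state names (normal, demoted, recover) and the demotion amount of 50 are
assumptions chosen for illustration; the address is the same placeholder
as in your config.

```
# Untested sketch. State names and the value 50 are illustrative only.
net = '( "ping -q -c 1 -w 1 aaa.aaa.aaa.aaa > /dev/null" every 10 )'

# First state defined is the initial state: no demotion applied.
state normal {
    if ! $net
        set-state demoted
}

# Entered only when the ping test fails: raise the group demotion counter.
state demoted {
    init {
        run "ifconfig -g carp carpdemote 50"
    }
    if $net
        set-state recover
}

# Entered only from "demoted": undo exactly the demotion added there.
state recover {
    init {
        run "ifconfig -g carp -carpdemote 50"
    }
    if $net
        set-state normal
    if ! $net
        set-state demoted
}
```

The separate recover state is there so that every carpdemote is matched
by exactly one -carpdemote. Undoing the demotion in the init block of the
normal state would also run at startup, before anything had been demoted.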