Anything in your modem logs? DOCSIS layer 2 is a strange beast :)

Any cabling issues, such as attenuators or splitters behind the modem?

Regards
Patrick



> On Aug 19, 2015, at 2:34 PM, Devin Reade <[email protected]> wrote:
> 
> I'm trying to understand an odd behavior during carp failover
> where one uplink goes numb until the demarc equipment is power
> cycled.
> 
> Consider the following:
> 
> ISP1-demarc   ISP2-demarc
>         |   |
> SW1 (Net1) SW2 (Net2) ----- C
>         |\ /|
>         | X |
>         |/ \|
>      FW-A - FW-B
>         |\ /|
>         | X |
>         |/ \|
> SW3 (Net3) SW4 (Net4)
>   (no NAT) (NAT)
>             |
>             H4
> 
> ISP1-demarc and ISP2-demarc are the respective ISPs' equipment (outside
> of my control, other than power cycling them).  SWn are all unmanaged
> switches.
> 
> FW-A, FW-B, and C are all OpenBSD boxes.  FW-A and FW-B, in particular,
> are running 5.7-STABLE in a master/slave carp configuration.  Things
> are set up so that traffic to/from Net3 is sent via ISP1 (no NAT) and
> traffic to/from Net4 is sent via ISP2 (using NAT on FW-A and FW-B).
> H4 is a host sitting on Net4 in private address space.
> 
> Static IPs are used throughout, including on both the SW1 and SW2
> subnets.  FW-n are routers, not bridges.  Pfsync is running via
> a crossover cable between FW-A and FW-B.
> 
> Behavior:
> 
> In normal operations everything works as expected.  During a carp
> failover, everything for Net3 via ISP1 also works as expected.
> However, during a failover I lose connectivity on Net4, in a qualified
> manner (see below) until ISP2-demarc is power cycled.
> 
> The obvious first answer is that ISP2-demarc (which is a Motorola
> cable modem) probably has a limited number of MAC slots available
> to it.  However, that doesn't seem quite right.  More details ...
> 
> Before failover, I set up a 'ping -n' running on H4 and going to
> a host elsewhere on the Internet (call it EXT).  I also set up
> a 'ping -n' on C going to the carp IP of FW-A and FW-B on Net2
> (let's call that Carp2).
> 
> Now comes the weird part.  If I shut down the master, FW-A, I see
> the following:
> 
> 1. the running pings from C to Carp2 continue to work until ^C
> 2. the running pings from H4 to EXT continue to work until ^C
> 3. a concurrent newly created ping from C to Carp2 fails
> 4. a concurrent newly created ping from H4 to EXT fails
> 5. all other outbound traffic from Net4 fails (this is just
>    a generalization of (4)).
> 
> If I power cycle ISP2-demarc, sanity returns.  That is, until
> FW-A comes back up and FW-B is demoted again.  Then I get the same
> type of failures until ISP2-demarc is power cycled again.
> 
> Power cycling switch SW2 instead of ISP2-demarc does not affect the
> outcome.
> 
> Ok, so how about the MACs?  On Net2 we have the following MACs:
> 
> - ISP2-demarc-mac (on ISP2-demarc)
> - C-mac (on C)
> - FW-A-mac (physical MAC on FW-A)
> - FW-B-mac (physical MAC on FW-B)
> - Carp2-mac (the virtual MAC used by Carp2, which I've verified
>   to be the same for both FW-A and FW-B when they are respectively
>   running as master).
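
For reference, the shared MAC can be cross-checked against the vhid. As
far as I know, carp(4) uses the VRRP range for IPv4, so the virtual MAC
should be 00:00:5e:00:01:<vhid in hex>. A minimal Python sketch, where
the function names are made up for illustration and the vhid must come
from your own hostname.carpN:

```python
def carp_virtual_mac(vhid: int) -> str:
    """Return the expected IPv4 virtual MAC for a CARP vhid.

    Assumes the VRRP-style prefix 00:00:5e:00:01 with the vhid as the
    last octet; verify against ifconfig output on the firewalls.
    """
    if not 1 <= vhid <= 255:
        raise ValueError("vhid must be in the range 1..255")
    return "00:00:5e:00:01:%02x" % vhid


def is_carp_virtual_mac(mac: str) -> bool:
    """True if a MAC seen in tcpdump -e output falls in that range."""
    return mac.lower().startswith("00:00:5e:00:01:")


if __name__ == "__main__":
    # vhid 2 is only an example value, not taken from the setup above.
    print(carp_virtual_mac(2))
```

Comparing that value with the source MAC in 'tcpdump -e' output on C
would show whether replies leave FW-B with the virtual MAC or with its
physical one.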
> 
> One wart here, and a difference between Net1 and Net2 is that on
> Net1 both firewalls have their own IPs in addition to the Carp1
> IP.  However, on Net2 both firewalls' hostname.if files contain
> only the 'up' keyword; no IP is used on that network until the
> machine becomes the carp master.
> 
> So that means that when H4 is pinging EXT, the pings are being
> NAT'd to use the Carp2 IP.  Therefore I wouldn't expect a failover
> to cause the modem's MAC slots to overflow.
> 
> But the *really* weird part is what is happening with C; why would
> C not be able to ping Carp2 until ISP2-demarc is power-cycled, especially
> with SW2 isolating the latter from Carp2 and C?
> 
> And the story with C gets better.  If I set up a tcpdump on FW-B's Net2
> interface, I see the following sequence of events:
> 
> - before killing FW-A, I see arp requests and CARPv2 advertisements
>   from FW-A (based on the skew), and that's about it (as expected)
> - upon shutting down FW-A, I see a CARPv2 packet from FW-B, and then
>   start seeing the ping request/reply pairs coming in from C (as expected)
> - upon killing and restarting C's ping to Carp2, I no longer see the
>   response on C, but I'm seeing both the request and response in FW-B's
>   tcpdump.  On C, I see only the echo response. (NOT expected)
> 
> Does this last bit point the finger at SW2 being the culprit (perhaps
> not routing packets to the appropriate NIC port), even though power
> cycling SW2 isn't sufficient to fix the problem?
> 
> Any other thoughts?
> 
> Devin
