On Thu, May 20, 2010 at 08:17:48PM +0200, Henning Brauer wrote:
> > I have two identical "core" switches in one (not really so critical at
> > all) place running OSPF, with a bunch of routers connecting to both
> > switches for redundancy. Works pretty well and there has even been a
> > config reset incident, which didn't break anything - because OSPF can
> > detect link failures. Trying to do the same all the way to the end hosts
> > (i.e.  without a routing protocol) is pretty difficult.
> 
> i would never ever run any L3 on switches.
 
Bad wording on my part: the routers run OSPF and the switches are dumb
L2 devices.

Still, without OSPF et al there would be no way to detect a crappy
switch failing in funny ways, which was my point.
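
To make the detection point concrete, here is a minimal sketch of the
hello/dead-interval idea that OSPF relies on. It is not how any real
OSPF daemon is written; the NeighborTable class and its method names are
made up for illustration, and the 10 s / 40 s values are just the usual
defaults on broadcast networks.

```python
from dataclasses import dataclass, field

HELLO_INTERVAL = 10.0  # seconds; common OSPF default on broadcast networks
DEAD_INTERVAL = 40.0   # seconds; conventionally four times the hello interval


@dataclass
class NeighborTable:
    """Hypothetical helper: remembers when each neighbor was last heard."""
    dead_interval: float = DEAD_INTERVAL
    last_seen: dict = field(default_factory=dict)

    def hello_received(self, neighbor: str, now: float) -> None:
        # Every hello that makes it through the switch refreshes the timer.
        self.last_seen[neighbor] = now

    def dead_neighbors(self, now: float) -> list:
        # A neighbor is declared dead once no hello has arrived for longer
        # than the dead interval, whatever the reason the switch ate them.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.dead_interval)
```

The point is that the routers notice missing hellos even when the link
light on the switch port stays on, which a plain end host has no way of
doing.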

As an extra note, if you do get a crappy switch, be very careful with
its management interface. The cheapest ones have unbelievably slow CPUs
that are easily overloaded by broadcasts, making the whole thing stop
responding. Even worse, the interrupt load seems to trigger other bugs.
One is LACP mysteriously failing and disabling one port on a trunk,
blackholing half of your traffic (this happened on a ZyXEL GS-4024,
which has otherwise totally Just Worked as an L2 switch for years).
Another is the whole switch ASIC "crashing" after a broadcast storm and
requiring a reboot (though the management CPU was still responding
through the out-of-band Ethernet and serial port after the storm was
gone).
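
A rough sketch of why one silently disabled trunk port blackholes about
half of the traffic on a two-port trunk: the sending side keeps hashing
flows across both links. The crc32-based hash here is only a stand-in
(real switches typically hash on MAC or IP address bits), and pick_link
and delivered are hypothetical names.

```python
import zlib


def pick_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Pick an egress link by hashing the address pair (illustrative hash)."""
    return zlib.crc32(f"{src_mac}-{dst_mac}".encode()) % n_links


def delivered(flows, n_links, dead_links):
    """Return the flows that survive when the sender still uses every link
    but the far end has quietly disabled the ones in dead_links."""
    return [f for f in flows
            if pick_link(f[0], f[1], n_links) not in dead_links]
```

Each flow always hashes to the same link, so the flows that land on the
dead port are lost completely while the rest work fine, which makes the
failure look random from the outside.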

Also, it's a very obvious DoS: a malicious person only needs to send a
rather small number of BPDUs to overload the tiny CPU, and the cheap
switches obviously have no rate limiting for packets going to the CPU
(only for broadcasts in general). So, blocking BPDUs from non-trusted
devices should be enabled (but that should probably be done anyway).
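
As a sketch of the two protections mentioned above (dropping BPDUs from
untrusted ports, and rate limiting whatever is still punted to the CPU),
something along these lines is what one would want the switch to do.
This is not any vendor's implementation; should_drop and TokenBucket are
hypothetical names, and only the standard 802.1D bridge group address is
matched.

```python
STP_MULTICAST = bytes.fromhex("0180c2000000")  # IEEE 802.1D bridge group address


def is_bpdu(frame: bytes) -> bool:
    """True if the frame is addressed to the standard STP multicast MAC."""
    return len(frame) >= 6 and frame[:6] == STP_MULTICAST


def should_drop(frame: bytes, ingress_port: int, trusted_ports: set) -> bool:
    """Drop BPDUs arriving on ports that are not explicitly trusted."""
    return is_bpdu(frame) and ingress_port not in trusted_ports


class TokenBucket:
    """Simple rate limiter for packets handed to the management CPU."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate      # tokens (packets) added per second
        self.burst = burst    # maximum number of stored tokens
        self.tokens = burst
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With the filter in place, an end host on an untrusted port cannot reach
the STP code at all, and the bucket caps how much work the remaining
control traffic can generate.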

Even among "trusted" devices, STP and LACP involve the shitty code
running on the underpowered management CPU, and that is not the part
that shines in the cheap switches. Static link aggregation works OK.
