On 03/01/16 23:03, Martin Pieuchot wrote:
> On 18/02/16(Thu) 16:46, Florian Riehm wrote:
>> On 02/16/16 11:23, Martin Pieuchot wrote:
>>> On 12/02/16(Fri) 16:33, Florian Riehm wrote:
>>>> Hi Tech,
>>>>
>>>> I have noticed that CARP IP-Balancing is broken, so I am testing and
>>>> fixing it.
>>>>
>>>> The first problem came in with this commit:
>>>> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/if_ethersubr.c.diff?r1=1.176&r2=1.177
>>>> It enforces that outgoing packets use the src mac of the carp interface 
>>>> instead
>>>> the mac of its carpdev. That's a good idea for failover carp but it breaks
>>>> balancing ip(-stealth), because the same source MAC is seen by multiple
>>>> switchports now. That leads to continues MAC flapping at switches.
>>>> My attached patch fixes the problem. Thanks Martin's refactoring, we can
>>>> enforce the right MAC on a central place in carp_start() now.
>>>
>>> Can't you completely get rid of the else chunk then?  It looks like a
>>> bad refactoring.
>>
>> I was also thinking about deleting the else chunk. At the end I decided to 
>> keep
>> it, because I wasn't sure about its purpose.
>> My intention was:
>> Fix the more or less obviously broken balancing case and don't change the
>> behavior of other CARP modes.
>> I will remove the else chunk and do some more testing to determine if it's
>> really unnecessary.
>>
>>> I'm also wondering can't we use "sc_realmac" here?
>>
>> I have not know "sc_realmac" until now. As far as I understand the intention 
>> of
>> the sc_realmac flag is, to indicate that the user has overwritten the mac
>> address of the carp interface with the mac of its parent. I have never had a
>> use case to do this. Without overwriting the mac by hand sc_realmac is 0 for
>> carp with balancing ip/ip-stealth AND carp without balancing.
>>
>> I don't see how sc_realmac could help us by this problem?!
>> Can you give me a hint please?
> 
> It was an open question, if that does not make any sense, then don't do
> it ;)
> 
>>>> A second problem was introduced two weeks ago. Since we always check the
>>>> destination MAC address of received unicast packets, not only when in
>>>> promiscuous mode, carp balancing ip is always broken, not only when in
>>>> promiscuous mode. The new check collides with "Clear mcast if received on a
>>>> carp IP balanced address." in carp_input().
>>>
>>> Could you describe the problem?  Which Destination MAC the packet
>>> contains and why it doesn't match the interface?  I still don't grok why
>>> a carp(4) interface in balancing mode cannot have the right MAC
>>> configured...  Do you know?  But if what you want is the parent's MAC,
>>> then I'd rather go with the symmetric of your other change:  modify
>>> carp_input() to overwrite ``ether_dhost''.
>>
>> If using carp balancing, incoming packets contain the destination mac address
>> of the carp interface, let's say: 01:00:5e:00:01:01.
>> The MCAST-Bit is set here!
>>
>> carp_input() removes the MCAST-Bit:
>>      /*
>>       * Clear mcast if received on a carp IP balanced address.
>>       */
>>      if (sc->sc_balancing == CARP_BAL_IP &&
>>          ETHER_IS_MULTICAST(eh->ether_dhost))
>>              *(eh->ether_dhost) &= ~0x01;
>>
>> Then we run into ether_input() and 
>>     memcmp(ac->ac_enaddr, eh->ether_dhost, ETHER_ADDR_LEN)
>> evaluates to false because 
>>     ac->ac_enaddr (01:00:5e:00:01:01) != eh->ether_dhost (00:00:5e:00:01:01)
> 
> Taking a step back why do you want to clear the MULTICAST bit?
> 
> If it is just to differentiate between advertisement and normal traffic
> couldn't we use a mbuf_tag(9) for CARP advertisement instead?  The tag
> could be set in carp_input() and checked in carp_lsdrop() in the input
> path.  It could also be used in the output path to get rid of the
> horrible ``cur_vhe'' hack.
> 
Thanks for your comments. I took a look to the big picture:
The idea behind carp balancing is that clients send all their packets to layer 2
multicast addresses. Switches flood those packets to all ports and every
CARP-System receives them. Only one system processes the packet, the other
systems drop it. The processing system handles the packet like a normal unicast
packet. The packet distribution described above is the only reason why the
multicast bit is required. We don't want further multicast handling.

What happens if we don't remove the MCAST bit inside carp_input()?
In ether_input() packets with MCAST bit get the mbuf flag M_MCAST. From now the
kernel assumes to deal with real multicast packets. Upper layers will do many
things we don't want, e.g. tcp_input() and icmp_input_if() would drop the
packet.
 
What can we do?
1.) Introduce a new mbuf flag to mark the special carp MCAST packets and handle
them properly.
=> Bad idea, because we have no unused bits inside m_flags. It would also spread
the CARP hacks to more places - that's not what we want.

2.) Remove the multicast bit of the carp interface's mac address to make the
check "memcmp(ac->ac_enaddr, eh->ether_dhost, ETHER_ADDR_LEN)" in
ether_input() passing. This means:
- Carp-Interfaces would have lladdr like 00:00:5e:00:01:01 instead of
  01:00:5e:00:01:01 (Same as in balancing-stealth-Mode)
- We have to add the MCAST-Bit to outgoing arp-replies to make sure that clients
  sending their packets to layer-2 multicast addresses.
- Keep removing the MCAST-Bit inside carp_input()
=> Could work, but it looks like 'move the hack to an other place'. Actually
it's not that bad because we already have carp logic inside in_arpinput().

What do you think about solution 2? I would submit a diff if you like the idea.

Regards,

Florian

Reply via email to