Patrick McHardy wrote:
> Chinh Nguyen wrote:
>
>>Patrick McHardy wrote:
>>
>>
>>>What values does skb->ip_summed have before that?
>>
>>
>>The skb->ip_summed value before the checksum check in tcp_v4_rcv is
>>CHECKSUM_NONE. Hence tcp_v4_rcv verifies the checksum, which fails because
>>the checksum was computed over the private source IP but the NAT device has
>>rewritten the source IP.
>
>
> Netfilter recalculates the checksum when NATing it.
The NATing is not done by netfilter but by the NAT device between the IPsec
peers.
>
>>I believe that skb->ip_summed is set to CHECKSUM_NONE by esp_input
>>(net/ipv4/esp4.c:180) which is called by xfrm4_rcv_encap
>>(net/ipv4/xfrm4_input.c:101).
>
>
> The question is why the checksum is invalid. Please start by describing
> what you're trying to do.
[Linux ipsec client C] ------ [NAT device] ---------- [Linux ipsec server S]
C negotiates an IPsec transport mode connection with S. Because of transport
mode with NAT-T, two things happen to an IPsec packet:
1. It is UDP-encapsulated, typically on port 4500/udp.
2. Transport mode leaves the original IP header in place, whereas tunnel mode
wraps the entire packet in a second IP header. As such, when the packet passes
through the NAT device, its source IP is rewritten to N. However, the original
unencrypted packet had source IP C.
S strips the UDP-encap header, decrypts the payload, and "joins" the content
back to the IP header. If the decrypted content is UDP or TCP, the UDP/TCP
checksum is now incorrect: it covers a pseudo-header that includes the source
IP, and the sender computed it with C while the IP header now says N.
(In tunnel mode, we would ignore the NAT-ted outer IP header because the
decrypted content has an entire IP header + UDP/TCP etc)
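
To make the failure concrete, here is a minimal user-space sketch (not kernel
code) of the Internet checksum over the IPv4 pseudo-header plus a TCP segment.
All function names and the example addresses are made up for illustration. A
segment sealed with source C verifies against C but not against N, which is
what S sees after the NAT device has rewritten the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit big-endian words of buf to acc; an odd trailing byte is
 * padded with a zero byte, as the Internet checksum requires. */
uint32_t csum_add(const uint8_t *buf, size_t len, uint32_t acc)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        acc += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        acc += (uint32_t)buf[len - 1] << 8;
    return acc;
}

/* Fold the carries back in and return the one's complement. */
uint16_t csum_fold(uint32_t acc)
{
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)(~acc & 0xffff);
}

/* Checksum over the IPv4 pseudo-header (src, dst, zero, proto, length)
 * followed by the TCP/UDP segment. Addresses are in host byte order. */
uint16_t l4_checksum(uint32_t src, uint32_t dst, uint8_t proto,
                     const uint8_t *seg, size_t len)
{
    uint8_t ph[12];

    ph[0] = (uint8_t)(src >> 24); ph[1] = (uint8_t)(src >> 16);
    ph[2] = (uint8_t)(src >> 8);  ph[3] = (uint8_t)src;
    ph[4] = (uint8_t)(dst >> 24); ph[5] = (uint8_t)(dst >> 16);
    ph[6] = (uint8_t)(dst >> 8);  ph[7] = (uint8_t)dst;
    ph[8] = 0;                    ph[9] = proto;
    ph[10] = (uint8_t)(len >> 8); ph[11] = (uint8_t)len;

    return csum_fold(csum_add(seg, len, csum_add(ph, sizeof(ph), 0)));
}

/* Receiver side: a segment with a correct checksum field sums to zero. */
int l4_checksum_ok(uint32_t src, uint32_t dst, uint8_t proto,
                   const uint8_t *seg, size_t len)
{
    return l4_checksum(src, dst, proto, seg, len) == 0;
}

/* Sender side: fill in the TCP checksum field (bytes 16-17 of the header). */
void tcp_seal(uint32_t src, uint32_t dst, uint8_t *seg, size_t len)
{
    uint16_t c;

    seg[16] = 0;
    seg[17] = 0;
    c = l4_checksum(src, dst, 6, seg, len);
    seg[16] = (uint8_t)(c >> 8);
    seg[17] = (uint8_t)(c & 0xff);
}
```

With, say, C = 192.168.1.10, N = 203.0.113.5 and S = 198.51.100.7, calling
tcp_seal(C, S, ...) and then l4_checksum_ok(N, S, 6, ...) returns 0, while
l4_checksum_ok(C, S, 6, ...) returns 1. A receiver that learns the original
address C (which is what NAT-OA conveys) could verify or fix up the checksum
instead of ignoring it.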
This is a well-known problem with transport mode/NAT. One solution is to use
NAT-OA and NAT-OR to recalculate the checksum. The Linux kernel does the simpler
thing of ignoring the UDP/TCP checksum altogether in this particular case:
function esp_post_input (net/ipv4/esp4.c):

290             /*
291              * 2) ignore UDP/TCP checksums in case
292              *    of NAT-T in Transport Mode, or
293              *    perform other post-processing fixes
294              *    as per * draft-ietf-ipsec-udp-encaps-06,
295              *    section 3.1.2
296              */
297             if (!x->props.mode)
298                     skb->ip_summed = CHECKSUM_UNNECESSARY;
299
300             break;

As noted, esp_post_input is called from xfrm4_policy_check. Decrypted UDP
traffic through transport mode/NAT also has bad checksums. However, it is passed
through udp_queue_rcv_skb after decryption, and that function calls
xfrm4_policy_check before checking the UDP checksum, so line 298 makes the
kernel ignore the bad checksum.
Decrypted TCP traffic has bad checksums too. But since tcp_v4_rcv checks the TCP
checksum before calling xfrm4_policy_check, the bad checksum means the TCP
packet is dropped as a bad segment.
The end result is that UDP and other traffic (e.g., ICMP) can pass through
transport mode/NAT but TCP cannot.
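
The ordering difference can be modelled in a few lines of user-space C. This
is a toy sketch, not the kernel code: every toy_* name is made up, and the
"policy check" only stands in for the effect of line 298.

```c
#include <stdbool.h>

enum toy_summed { TOY_CHECKSUM_NONE, TOY_CHECKSUM_UNNECESSARY };

struct toy_pkt {
    enum toy_summed ip_summed;
    bool nat_t_transport; /* decrypted from a UDP-encapsulated transport mode SA */
    bool csum_valid;      /* would the L4 checksum verify against the IP header? */
};

/* Stand-in for the policy check and its post-input hook: for NAT-T
 * transport mode, tell later code to skip checksum verification. */
void toy_policy_check(struct toy_pkt *p)
{
    if (p->nat_t_transport)
        p->ip_summed = TOY_CHECKSUM_UNNECESSARY;
}

/* Checksum verification honours ip_summed. */
bool toy_csum_ok(const struct toy_pkt *p)
{
    return p->ip_summed == TOY_CHECKSUM_UNNECESSARY || p->csum_valid;
}

/* UDP-like path: policy check first, then the checksum. */
bool toy_udp_rcv(struct toy_pkt p)
{
    toy_policy_check(&p);
    return toy_csum_ok(&p);
}

/* TCP-like path: checksum first, so the hook runs too late to help. */
bool toy_tcp_rcv(struct toy_pkt p)
{
    if (!toy_csum_ok(&p))
        return false; /* dropped as a bad segment */
    toy_policy_check(&p);
    return true;
}
```

For a packet with ip_summed = TOY_CHECKSUM_NONE, nat_t_transport = true and
csum_valid = false, toy_udp_rcv() accepts it and toy_tcp_rcv() drops it.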
I don't know what the correct fix is. Adding an extra call to xfrm4_policy_check
in tcp_v4_rcv before the checksum check fixes this problem and doesn't seem to
break anything else. On the other hand, moving some of the code in
esp_post_input into esp_input (especially line 298) would work too.
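
As a rough illustration of the two options, here is a self-contained toy model
in user-space C. It is not a patch: the fix_* names and the struct are
hypothetical, and the functions only mimic the order of operations.

```c
#include <stdbool.h>

enum fix_summed { FIX_CHECKSUM_NONE, FIX_CHECKSUM_UNNECESSARY };

struct fix_pkt {
    enum fix_summed ip_summed;
    bool nat_t_transport; /* NAT-T transport mode packet after decryption */
    bool csum_valid;      /* L4 checksum matches the current IP header */
};

/* The effect of line 298, wherever it ends up being called from. */
void fix_mark_unnecessary(struct fix_pkt *p)
{
    if (p->nat_t_transport)
        p->ip_summed = FIX_CHECKSUM_UNNECESSARY;
}

bool fix_csum_ok(const struct fix_pkt *p)
{
    return p->ip_summed == FIX_CHECKSUM_UNNECESSARY || p->csum_valid;
}

/* Current TCP ordering: checksum first, policy check afterwards. */
bool fix_tcp_rcv_current(struct fix_pkt p)
{
    if (!fix_csum_ok(&p))
        return false;
    fix_mark_unnecessary(&p);
    return true;
}

/* Option 1: an extra policy check before the checksum is verified. */
bool fix_tcp_rcv_policy_first(struct fix_pkt p)
{
    fix_mark_unnecessary(&p);
    return fix_csum_ok(&p);
}

/* Option 2: mark the packet at decryption time and leave TCP untouched. */
bool fix_tcp_rcv_marked_at_decrypt(struct fix_pkt p)
{
    fix_mark_unnecessary(&p); /* done in the decrypt step, before TCP sees it */
    return fix_tcp_rcv_current(p);
}
```

In this model both options accept the NAT-T transport mode packet that the
current ordering drops, and both still drop a packet with a bad checksum that
did not arrive through NAT-T transport mode.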