Why not use all the syn queues? in the function "tcp_conn_request", I have some questions.
Hello everyone,

Recently I have been reading the source code that handles the TCP three-way handshake (Linux kernel version 4.18.5). I found two strange places in the code that handles SYN packets, both in the function "tcp_conn_request".

The following code is executed when syncookies are not enabled:

    if (!net->ipv4.sysctl_tcp_syncookies &&
        (net->ipv4.sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) <
         (net->ipv4.sysctl_max_syn_backlog >> 2)) &&
        !tcp_peer_is_proven(req, dst)) {
            /* Without syncookies last quarter of
             * backlog is filled with destinations,
             * proven to be alive.
             * It means that we continue to communicate
             * to destinations, already remembered
             * to the moment of synflood.
             */
            pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
                        rsk_ops->family);
            goto drop_and_release;
    }

But why don't we use the whole SYN queue? Why do we need to keep (net->ipv4.sysctl_max_syn_backlog >> 2) entries of the queue free? Even if the system is under a SYN flood attack, I see no need to hold part of the queue back.

The value of sysctl_max_syn_backlog is the maximum length of the queue only if syncookies are enabled.

That is the first strange place. Here is the second:

    __u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;

    if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
         inet_csk_reqsk_queue_is_full(sk)) && !isn) {

    if (!want_cookie && !isn) {

The value of "isn" comes from TCP_SKB_CB(skb)->tcp_tw_isn, and it is then tested twice for being 0. But "tcp_tw_isn" is initialized in the function "tcp_v4_fill_cb":

    TCP_SKB_CB(skb)->tcp_tw_isn = 0;

So it seems to always be 0. I tested with printk and the result was always 0. If it is always 0, why does it need to be tested twice?

These are the two strange places I found. Can anyone tell me why the code is written like this?
Re: Why not use all the syn queues? in the function "tcp_conn_request", I have some questions.
On 4 September 2018 9:06 PM, Neal Cardwell wrote:

> On Tue, Sep 4, 2018 at 1:48 AM Ttttabcd a...@protonmail.com wrote:
>
> > Hello everyone,recently I am looking at the source code for handling TCP
> > three-way handshake(Linux Kernel version 4.18.5).
> > I found some strange places in the source code for handling syn messages.
> > in the function "tcp_conn_request"
> > This code will be executed when we don't enable the syn cookies.
> >
> >     if (!net->ipv4.sysctl_tcp_syncookies &&
> >         (net->ipv4.sysctl_max_syn_backlog -
> >          inet_csk_reqsk_queue_len(sk) <
> >          (net->ipv4.sysctl_max_syn_backlog >> 2)) &&
> >         !tcp_peer_is_proven(req, dst)) {
> >             /* Without syncookies last quarter of
> >              * backlog is filled with destinations,
> >              * proven to be alive.
> >              * It means that we continue to communicate
> >              * to destinations, already remembered
> >              * to the moment of synflood.
> >              */
> >             pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
> >                         rsk_ops->family);
> >             goto drop_and_release;
> >     }
> >
> > But why don't we use all the syn queues?
>
> If tcp_peer_is_proven() returns true then we do allow ourselves to use
> the whole queue.
>
> > Why do we need to leave the size of (net->ipv4.sysctl_max_syn_backlog >> 2)
> > in the queue?
> > Even if the system is attacked by a syn flood, there is no need to leave a
> > part. Why do we need to leave a part?
>
> The comment describes the rationale. If syncookies are disabled, then
> the last quarter of the backlog is reserved for filling with
> destinations that were proven to be alive, according to
> tcp_peer_is_proven() (which uses RTTs measured in previous
> connections). The idea is that if there is a SYN flood, we do not want
> to use all of our queue budget on attack traffic but instead want to
> reserve some queue space for SYNs from real remote machines that we
> have actually contacted in the past.
>
> > The value of sysctl_max_syn_backlog is the maximum length of the queue only
> > if syn cookies are enabled.
>
> Even if syncookies are disabled, sysctl_max_syn_backlog is the maximum
> length of the queue.
>
> > This is the first strange place, here is another strange place
> >
> >     __u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn;
> >
> >     if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
> >          inet_csk_reqsk_queue_is_full(sk)) && !isn) {
> >
> >     if (!want_cookie && !isn) {
> >
> > The value of "isn" comes from TCP_SKB_CB(skb)->tcp_tw_isn, then it is
> > judged twice whether its value is indeed 0.
> > But "tcp_tw_isn" is initialized in the function "tcp_v4_fill_cb"
> >
> >     TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> >
> > So it has always been 0, I used printk to test, and the result is always 0.
>
> That field is also set in tcp_timewait_state_process():
>
>     TCP_SKB_CB(skb)->tcp_tw_isn = isn;
>
> So there can be cases where it is not 0.
>
> Hope that helps,
> neal

Thank you very much, I understand now.
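The reserve rule discussed above can be restated as a small user-space sketch. This is only a model of the quoted condition, not kernel code: the struct `syn_admission_input` and the function `syn_dropped_by_reserve` are hypothetical names invented for illustration, and the fields simply stand in for the values that tcp_conn_request() reads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the values tcp_conn_request() consults. */
struct syn_admission_input {
    bool syncookies_enabled; /* net->ipv4.sysctl_tcp_syncookies != 0 */
    int  max_syn_backlog;    /* net->ipv4.sysctl_max_syn_backlog */
    int  queue_len;          /* inet_csk_reqsk_queue_len(sk) */
    bool peer_is_proven;     /* result of tcp_peer_is_proven(req, dst) */
};

/*
 * Mirrors the quoted condition: with syncookies off, a SYN from an
 * unproven peer is dropped once fewer than a quarter of the backlog
 * slots remain free. Proven peers may use the whole queue.
 */
static bool syn_dropped_by_reserve(const struct syn_admission_input *in)
{
    assert(in->max_syn_backlog >= 0 && in->queue_len >= 0);

    int free_slots = in->max_syn_backlog - in->queue_len;
    int reserve    = in->max_syn_backlog >> 2; /* one quarter */

    return !in->syncookies_enabled &&
           free_slots < reserve &&
           !in->peer_is_proven;
}
```

For example, with a backlog of 1024 the reserve is 256 entries, so under this model an unproven peer is still admitted at queue length 768 and dropped at 769, while a proven peer is not affected by this check.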
Re: Why not use all the syn queues? in the function "tcp_conn_request", I have some questions.
Thank you very much for your previous answer, and sorry for the inconvenience. I would like to ask one more question: why do we need two variables to control the SYN queue?

The first is the "backlog" parameter of the "listen" system call, which limits the maximum length of the SYN queue and also controls the accept queue.

The second is /proc/sys/net/ipv4/tcp_max_syn_backlog, which also limits the maximum length of the SYN queue.

So changing only one of them is not enough to enlarge the SYN queue.

From our last discussion, I understand that tcp_max_syn_backlog reserves the last quarter of the queue for IPs that have been proven to be alive. But even if tcp_max_syn_backlog is very large, the SYN queue will still fill up.

So I don't understand why a single variable is not used to control the SYN queue. For example, tcp_max_syn_backlog alone could be the maximum length of the SYN queue, with the last quarter still reserved for IPs proven to be alive, and the backlog parameter of the listen system call would control only the accept queue. I feel this would be more reasonable.

Without looking at the source code, I could never have guessed that the backlog parameter also controls the SYN queue. Before reading the source I always thought it controlled only the accept queue, because that is what the man page says. Here are the man page's original words:

    The behavior of the backlog argument on TCP sockets changed with
    Linux 2.2. Now it specifies the queue length for completely
    established sockets waiting to be accepted, instead of the number of
    incomplete connection requests. The maximum length of the queue for
    incomplete sockets can be set using
    /proc/sys/net/ipv4/tcp_max_syn_backlog. When syncookies are enabled
    there is no logical maximum length and this setting is ignored. See
    tcp(7) for more information.
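To make the interaction of the two limits concrete, here is a rough user-space model of the behaviour described in this message, for the case where syncookies are disabled. It rests on two assumptions taken from the thread rather than verified here: that the SYN queue counts as full once it holds as many entries as the listen() backlog, and that the quarter reserve is computed from tcp_max_syn_backlog as in the code quoted earlier. The enum and the function `classify_syn` are hypothetical names, and other kernel checks (such as a full accept queue) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

enum syn_verdict {
    SYN_ACCEPT,
    SYN_DROP_QUEUE_FULL, /* hit the listen() backlog limit */
    SYN_DROP_RESERVED    /* hit the tcp_max_syn_backlog reserve */
};

/* Simplified model, syncookies assumed disabled. */
static enum syn_verdict classify_syn(int listen_backlog,
                                     int max_syn_backlog,
                                     int queue_len,
                                     bool peer_is_proven)
{
    assert(listen_backlog >= 0 && max_syn_backlog >= 0 && queue_len >= 0);

    /* Limit 1 (assumption): queue is full at listen_backlog entries. */
    if (queue_len >= listen_backlog)
        return SYN_DROP_QUEUE_FULL;

    /* Limit 2: last quarter of tcp_max_syn_backlog is for proven peers. */
    if (max_syn_backlog - queue_len < (max_syn_backlog >> 2) &&
        !peer_is_proven)
        return SYN_DROP_RESERVED;

    return SYN_ACCEPT;
}
```

Under this model, `classify_syn(128, 1024, 128, false)` is rejected by the listen() limit no matter how large tcp_max_syn_backlog is, and `classify_syn(4096, 1024, 769, false)` is rejected by the reserve no matter how large the listen() backlog is, which is the "changing only one of them is not enough" effect described above.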
Re: Why not use all the syn queues? in the function "tcp_conn_request", I have some questions.
On Sunday, 9 September 2018 02:24, Neal Cardwell wrote:

> By default, and essentially always in practice (AFAIK), Linux
> installations enable syncookies. With syncookies, there is essentially
> no limit on the syn queue, or number of incomplete passive connections
> (as the man page you quoted notes). So in practice the listen()
> parameter usually controls only the accept queue.
>
> That discussion pertains to a code path that is relevant if syncookies
> are disabled, which is very uncommon (see above).

Yes, I disabled syncookies when I tested. I want to know how the kernel handles SYN attacks when syncookies are disabled.

> Keep in mind that the semantics of the listen() argument and the
> /proc/sys/net/ipv4/tcp_max_syn_backlog sysctl knob, as described in
> the man page, are part of the Linux kernel's user-visible API. So, in
> essence, they cannot be changed. Changing the semantics of system
> calls and sysctl knobs breaks applications and system configuration
> scripts. :-)

So, as you said, is it for historical reasons that two variables control the SYN queue?
Does the kernel IPv6 module plan to implement Secure Neighbor Discovery?
IPv6 is being deployed rapidly around the world. In IPv6, NDP takes over the role of ARP and provides the mapping from IP address to MAC address.

However, NDP is as insecure as ARP. It can easily be spoofed, which lets an attacker carry out man-in-the-middle attacks.

The solution to this weakness is Secure Neighbor Discovery, abbreviated SeND. SeND uses Cryptographically Generated Addresses and public keys to authenticate the information carried in NDP messages.

I think SeND is a very important security facility for IPv6. I found some implementations in user space, but none in the kernel. I searched the mail archive on lkml.org and found no discussion of SeND.

So I am confused: is the kernel planning to implement SeND, or should SeND be implemented in user space?
Re: Does the kernel IPv6 module plan to implement Secure Neighbor Discovery?
> Usually it requires someone motivated to step up and do the work. You
> sound motivated. The easiest thing would be for you to step up and
> write the implementation.
>
> Having looked at this once long ago my memory is that SeND only protects
> against an attacker on a local lan. That is not an attack scenario I am
> particularly worried about. If my memory is correct there are
> additional issues with how you perform the initial key distribution.
> All of which is why I am personally not interested.
>
> But if you are interested and would like to make this happen more power
> to you.
>
> Eric

I also thought about implementing it myself, but I am still a kernel beginner. I am studying the source of the network protocol stack, and I think I still have a long way to go before I can implement a complete feature.

I will work hard, and I hope to contribute to the kernel. Thank you.
I found a strange place while reading “net/ipv6/reassembly.c”
Hello everyone who develops the kernel.

At first I tried to contact the author of the source file, but his email address is no longer valid, so I can only ask here.

The question is about the file net/ipv6/reassembly.c, whose author is Pedro Roque. I found something strange while reading this file (Linux kernel version 4.18).

In the function "ip6_frag_queue":

    offset = ntohs(fhdr->frag_off) & ~0x7;
    end = offset + (ntohs(ipv6_hdr(skb)->payload_len) -
                    ((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));

    if ((unsigned int)end > IPV6_MAXPLEN) {
            *prob_offset = (u8 *)&fhdr->frag_off - skb_network_header(skb);
            return -1;
    }

Here the length of the payload is checked.

And in the function "ip6_frag_reasm":

    payload_len = ((head->data - skb_network_header(head)) -
                   sizeof(struct ipv6hdr) + fq->q.len -
                   sizeof(struct frag_hdr));
    if (payload_len > IPV6_MAXPLEN)
            goto out_oversize;
    ..
    out_oversize:
            net_dbg_ratelimited("ip6_frag_reasm: payload len = %d\n", payload_len);
            goto out_fail;

Here the length of the payload is checked again, so it is checked twice.

In my tests the code under the label "out_oversize:" was never executed, because "ip6_frag_queue" had already returned. Only when I commented out the payload length check in "ip6_frag_queue" could the code under "out_oversize:" be reached.

So, is this check redundant?
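For readers following the arithmetic, the per-fragment check in "ip6_frag_queue" can be reproduced in user space roughly as below. This is a sketch under stated assumptions: `frag_end_offset` is a hypothetical helper, its parameters are assumed to be already converted to host byte order, and `hdr_bytes` stands for the pointer difference `(u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)` from the quoted code. It models only the first check and does not by itself settle whether the second check in "ip6_frag_reasm" is redundant.

```c
#include <assert.h>
#include <stdint.h>

#define IPV6_MAXPLEN 65535 /* largest value of the 16-bit payload length */

/*
 * frag_off_host: fragment header frag_off field, host byte order
 * payload_len:   IPv6 header payload_len, host byte order
 * hdr_bytes:     bytes from the end of the IPv6 header to the end of
 *                the fragment header (hypothetical parameter)
 * Returns the end offset of this fragment's data, or -1 if it would
 * exceed IPV6_MAXPLEN.
 */
static long frag_end_offset(uint16_t frag_off_host,
                            uint16_t payload_len,
                            unsigned int hdr_bytes)
{
    assert(hdr_bytes <= payload_len);

    /* The low 3 bits of frag_off are reserved/flag bits, not offset. */
    unsigned int offset = frag_off_host & ~0x7u;
    unsigned int end = offset + (payload_len - hdr_bytes);

    if (end > IPV6_MAXPLEN)
        return -1;
    return (long)end;
}
```

A first fragment with a payload length of 1408 and only the 8-byte fragment header in front of the data ends at offset 1400. The same fragment placed at offset 65528 would end at 66928 and is rejected.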