On Fri, Oct 9, 2020 at 8:10 PM Xie He <xie.he.0...@gmail.com> wrote: > > On Fri, Oct 9, 2020 at 6:07 PM Cong Wang <xiyou.wangc...@gmail.com> wrote: > > > > Looking a bit deeper, I doubt the ipgre_header_ops->create is necessary, > > because 1) all other tunnels devices do not have it (ip_tunnel_header_ops > > only contains ->parse_protocol); 2) GRE headers are pushed in xmit > > anyway, so at least SOCK_DGRAM does not need it; 3) ipgre_header() > > creates the GRE header, later ipgre_xmit() pulls it back, then __gre_xmit() > > builds GRE header again... > > Haha, I also don't like ipgre_header_ops->create and want to get rid > of it. We are thinking the same :)
Great! > > From what I see, the flow when sending skbs (when header_ops is used) > is like this: > > 1) First ipgre_header creates the IP header and the GRE base header, > leaving space for the GRE optional fields and the "encap" header (the > space for the "encap" header should be left before the GRE header, but > it is incorrectly left after the GRE header). > > 2) Then ipgre_xmit pulls the header created by ipgre_header (but > leaves the data there). Then it calls __gre_xmit. > > 3) __gre_xmit calls gre_build_header. gre_build_header will push back > the GRE header, read the GRE base header and build the GRE optional > fields. > > 4) Then __gre_xmit calls ip_tunnel_xmit. ip_tunnel_xmit will build the > "encap" header by calling ip_tunnel_encap. > > So what ipgre_header does is actually creating the IP header and the > GRE header, and leaving some holes for the GRE optional fields and the > "encap" header to be built later. Right. For the encap header, I'd guess it is because this is relatively new. > > This seems so weird to me. If a user is using an AF_PACKET/RAW socket, > the user is supposed to do what the header_ops->create function does > (that is, creating two headers and leaving two holes to be filled in > later). I think no user would actually do that. That is so weird. Well, AF_PACKET RAW socket is supposed to construct L2 headers from the user buffer, and for a tunnel device these headers are indeed its L2. If users do not want to do this, they can switch to DGRAM anyway. I know how inconvenient it is to construct a GRE tunnel header, I guess this is why all other tunnel devices do not provide a header_ops::create. GRE tunnel is not a special case, this is why I agree on removing its ->create() although it does look like all tunnel devices should let users construct headers by definition of RAW. Of course, it may be too late to change this behavior even if we really want, users may already assume there is always no tunnel header needed to construct. Thanks.