On 2018/11/10 10:09, Cong Wang wrote:
> On Fri, Nov 9, 2018 at 6:02 PM Yunsheng Lin <linyunsh...@huawei.com> wrote:
>>
>> On 2018/11/10 9:42, Cong Wang wrote:
>>> On Fri, Nov 9, 2018 at 5:39 PM Yunsheng Lin <linyunsh...@huawei.com> wrote:
>>>>
>>>> On 2018/11/10 3:43, Cong Wang wrote:
>>>>> Currently netdev_rx_csum_fault() only shows a device name,
>>>>> we need more information about the skb for debugging.
>>>>>
>>>>> Sample output:
>>>>>
>>>>> ens3: hw csum failure
>>>>> dev features: 0x0000000000014b89
>>>>> skb len=84 data_len=0 gso_size=0 gso_type=0 ip_summed=0 csum=0, csum_complete_sw=0, csum_valid=0
>>>>>
>>>>> Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com>
>>>>> ---
>>>>>  include/linux/netdevice.h |  5 +++--
>>>>>  net/core/datagram.c       |  6 +++---
>>>>>  net/core/dev.c            | 10 ++++++++--
>>>>>  net/sunrpc/socklib.c      |  2 +-
>>>>>  4 files changed, 15 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>> index 857f8abf7b91..fabcd9fa6cf7 100644
>>>>> --- a/include/linux/netdevice.h
>>>>> +++ b/include/linux/netdevice.h
>>>>> @@ -4332,9 +4332,10 @@ static inline bool can_checksum_protocol(netdev_features_t features,
>>>>>  }
>>>>>
>>>>>  #ifdef CONFIG_BUG
>>>>> -void netdev_rx_csum_fault(struct net_device *dev);
>>>>> +void netdev_rx_csum_fault(struct net_device *dev, struct sk_buff *skb);
>>>>>  #else
>>>>> -static inline void netdev_rx_csum_fault(struct net_device *dev)
>>>>> +static inline void netdev_rx_csum_fault(struct net_device *dev,
>>>>> +					struct sk_buff *skb)
>>>>>  {
>>>>>  }
>>>>>  #endif
>>>>> diff --git a/net/core/datagram.c b/net/core/datagram.c
>>>>> index 57f3a6fcfc1e..d8f4d55cd6c5 100644
>>>>> --- a/net/core/datagram.c
>>>>> +++ b/net/core/datagram.c
>>>>> @@ -736,7 +736,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
>>>>>  	if (likely(!sum)) {
>>>>>  		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
>>>>>  		    !skb->csum_complete_sw)
>>>>> -			netdev_rx_csum_fault(skb->dev);
>>>>> +			netdev_rx_csum_fault(skb->dev, skb);
>>>>>  	}
>>>>>  	if (!skb_shared(skb))
>>>>>  		skb->csum_valid = !sum;
>>>>> @@ -756,7 +756,7 @@ __sum16 __skb_checksum_complete(struct sk_buff *skb)
>>>>>  	if (likely(!sum)) {
>>>>>  		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
>>>>>  		    !skb->csum_complete_sw)
>>>>> -			netdev_rx_csum_fault(skb->dev);
>>>>> +			netdev_rx_csum_fault(skb->dev, skb);
>>>>>  	}
>>>>>
>>>>>  	if (!skb_shared(skb)) {
>>>>> @@ -810,7 +810,7 @@ int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,
>>>>>
>>>>>  		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
>>>>>  		    !skb->csum_complete_sw)
>>>>> -			netdev_rx_csum_fault(NULL);
>>>>> +			netdev_rx_csum_fault(NULL, skb);
>>>>>  	}
>>>>>  	return 0;
>>>>>  fault:
>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>> index 0ffcbdd55fa9..2b337df26117 100644
>>>>> --- a/net/core/dev.c
>>>>> +++ b/net/core/dev.c
>>>>> @@ -3091,10 +3091,16 @@ EXPORT_SYMBOL(__skb_gso_segment);
>>>>>
>>>>>  /* Take action when hardware reception checksum errors are detected. */
>>>>>  #ifdef CONFIG_BUG
>>>>> -void netdev_rx_csum_fault(struct net_device *dev)
>>>>> +void netdev_rx_csum_fault(struct net_device *dev, struct sk_buff *skb)
>>>>>  {
>>>>>  	if (net_ratelimit()) {
>>>>>  		pr_err("%s: hw csum failure\n", dev ? dev->name : "<unknown>");
>>>>> +		if (dev)
>>>>> +			pr_err("dev features: %pNF\n", &dev->features);
>>>>> +		pr_err("skb len=%d data_len=%d gso_size=%d gso_type=%d ip_summed=%d csum=%x, csum_complete_sw=%d, csum_valid=%d\n",
>>>>> +		       skb->len, skb->data_len, skb_shinfo(skb)->gso_size,
>>>>> +		       skb_shinfo(skb)->gso_type, skb->ip_summed, skb->csum,
>>>>> +		       skb->csum_complete_sw, skb->csum_valid);
>>>>
>>>>
>>>> This function also have the netdev available, use netdev_err to log the
>>>> error?
>>>
>>> It is apparently not me who picked pr_err() from the beginning,
>>> I just follow that pr_err(). If you are not happy with it, please send
>>> a followup.
>>
>> Yes, but perhaps it is something to improve.
>
>
> Sure, no one stops you from improving it in a followup patch. :)
>
>
>> When using the netdev, then maybe it does not have to check if dev is null,
>> because netdev_err has handled the netdev being NULL case.
>> Maybe I missed something that netdev can not be used here?
>> If not, maybe I can send a followup.
>>
>
> Maybe. Again, my patch intends to add a few debugging logs,
> not to convert pr_err() to whatever else, they are totally different
> goals. I choose pr_err() only because I follow the existing one,
> not to say which one is better than the other.

Ok. :)

>
> Thanks.
>