2017-04-18 21:46 GMT-07:00 Michael Ma <make0...@gmail.com>:
> 2017-04-18 16:12 GMT-07:00 Cong Wang <xiyou.wangc...@gmail.com>:
>> On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0...@gmail.com> wrote:
>>> Hi -
>>>
>>> We've implemented a "glue" qdisc similar to mqprio which can associate
>>> one qdisc to multiple txqs as the root qdisc. Reference count of the
>>> child qdiscs have been adjusted properly in this case so that it
>>> represents the number of txqs it has been attached to. However when
>>> sending packets we saw the skb from dequeue_skb() corrupted with the
>>> following call stack:
>>>
>>> [exception RIP: netif_skb_features+51]
>>> RIP: ffffffff815292b3 RSP: ffff8817f6987940 RFLAGS: 00010246
>>>
>>> #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
>>> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
>>> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
>>> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>>>
>>> It looks like the skb has already been released since its dev pointer
>>> field is invalid.
>>>
>>> Any clue on how this can be investigated further? My current thought
>>> is to add some instrumentation to the place where skb is released and
>>> analyze whether there is any race condition happening there. However
>>
>> Either dropwatch or perf could do the work to instrument kfree_skb().
>
> Thanks - will try it out.
I'm using perf to collect the call stack for kfree_skb() and trying to
correlate it with the corrupted skb address. However, when the system
crashes, the perf.data file is corrupted as well. How can I view this
file if the system crashes before perf exits?

>>
>>> by looking through the existing code I think the case where one root
>>> qdisc is associated with multiple txqs already exists (when mqprio is
>>> not used) so not sure why it won't work when we group txqs and assign
>>> each group a root qdisc. Any insight on this issue would be much
>>> appreciated!
>>
>> How do you implement ->attach()? How does it work with netdev_pick_tx()?
>
> attach() essentially grafts the default qdisc(pfifo) to each "txq
> group" represented by a TC class. For netdev_pick_txq() we use classid
> of the socket to select a class based on a "class id base" and the
> class to txq mapping defined together with this glue qdisc - it's
> pretty much the same as mqprio with the difference of mapping one
> class to multiple txqs and selecting the txq through a hash.
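
For illustration, the class and txq selection described above could be modeled in user space roughly as follows. This is only a sketch under assumptions, not the actual glue qdisc code: the struct txq_group layout (an mqprio-style offset and count per class), the function names, and the way the hash is scaled are all hypothetical. The scaling step follows the same multiply-and-shift idea as the kernel's reciprocal_scale().

```c
#include <stdint.h>

/* Hypothetical "txq group": a contiguous range of txqs owned by one
 * TC class, similar to mqprio's per-class (offset, count) pair. */
struct txq_group {
    uint16_t offset;  /* first txq index in the group */
    uint16_t count;   /* number of txqs in the group  */
};

/* Map a socket classid to a class index relative to the "class id
 * base". Returns -1 if the classid is outside the configured range. */
static int pick_class(uint32_t classid, uint32_t class_id_base,
                      unsigned int num_classes)
{
    if (classid < class_id_base)
        return -1;
    if (classid - class_id_base >= num_classes)
        return -1;
    return (int)(classid - class_id_base);
}

/* Scale a 32-bit hash into [0, count) without a modulo. */
static uint16_t scale_hash(uint32_t hash, uint16_t count)
{
    return (uint16_t)(((uint64_t)hash * count) >> 32);
}

/* Select a txq: the classid picks the group, the flow hash picks one
 * txq inside that group. Returns -1 if no class matches. */
static int pick_txq(const struct txq_group *groups,
                    unsigned int num_classes, uint32_t class_id_base,
                    uint32_t classid, uint32_t flow_hash)
{
    int cls = pick_class(classid, class_id_base, num_classes);

    if (cls < 0)
        return -1;
    return groups[cls].offset + scale_hash(flow_hash, groups[cls].count);
}
```

With two groups of four txqs each, {0, 4} and {4, 4}, and a class id base of 1, classid 2 always lands in txqs 4..7, and the flow hash decides which of the four is used. Flows with different hashes in the same class therefore reach the same child qdisc through different txqs.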