On 2020/5/6 3:31 PM, Michael S. Tsirkin wrote:
On Tue, May 05, 2020 at 07:24:18PM -0700, Eric Dumazet wrote:
On 5/5/20 6:25 PM, Michael S. Tsirkin wrote:
On Tue, May 05, 2020 at 06:19:09PM -0700, Eric Dumazet wrote:
On 5/5/20 5:43 PM, Michael S. Tsirkin wrote:
On Tue, May 05, 2020 at 03:40:09PM -0700, Eric Dumazet wrote:
On 5/5/20 3:30 PM, Thomas Gleixner wrote:
"Michael S. Tsirkin"<m...@redhat.com>  writes:
On Tue, May 05, 2020 at 02:08:56PM +0200, Thomas Gleixner wrote:
The following lockdep splat happens reproducibly on 5.7-rc4
================================
WARNING: inconsistent lock state
5.7.0-rc4+ #79 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ip/356 [HC0[0]:SC1[1]:HE1:SE0] takes:
f3ee4cd8 (&syncp->seq#2){+.?.}-{0:0}, at: net_rx_action+0xfb/0x390
{SOFTIRQ-ON-W} state was registered at:
   lock_acquire+0x82/0x300
   try_fill_recv+0x39f/0x590
Weird. Where does try_fill_recv acquire any locks?
   u64_stats_update_begin(&rq->stats.syncp);

That's a 32bit kernel, which uses a seqcount for this. Sequence counts
are "lock" constructs where you need to make sure that writers are
serialized.
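
Roughly: on 64bit kernels u64_stats_update_begin()/end() compile away, but on
32bit they bracket a seqcount write section so that a reader can detect a torn
64bit load and retry, along the lines of this sketch (not code from the
driver):

        unsigned int start;
        u64 kicks;

        do {
                start = u64_stats_fetch_begin(&rq->stats.syncp);
                kicks = rq->stats.kicks;
        } while (u64_stats_fetch_retry(&rq->stats.syncp, start));

Two writers entering the write section concurrently leave the sequence count
skewed, which is why writers must be serialized and why lockdep models
&syncp->seq as a lock.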

Actually the problem at hand is that try_fill_recv() is called from
fully preemptible context initially and then from softirq context.

Obviously that's a non-issue for the open() path, but lockdep does not
know about that. OTOH, there is other code which calls it from
non-softirq context.
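
In miniature, the pattern lockdep complains about looks like this
(illustrative only, not the driver's code):

        /* process context, BHs enabled - e.g. the open() path */
        u64_stats_update_begin(&rq->stats.syncp);   /* records SOFTIRQ-ON-W */
        rq->stats.kicks++;
        u64_stats_update_end(&rq->stats.syncp);

        /* NAPI poll, i.e. softirq context */
        u64_stats_update_begin(&rq->stats.syncp);   /* now IN-SOFTIRQ-W */
        rq->stats.kicks++;
        u64_stats_update_end(&rq->stats.syncp);

The same write section is entered both with and without BHs disabled, so a
softirq on the same CPU can run try_fill_recv() while the process-context
writer is inside the write section, nesting two writers on the same seqcount,
which is exactly the inconsistency the splat above points at.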

The hack below made it shut up. It's obviously not ideal, but at least
it let me look at the actual problem I was chasing down :)

Thanks,

         tglx

8<-----------
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1243,9 +1243,11 @@ static bool try_fill_recv(struct virtnet
                        break;
        } while (rq->vq->num_free);
        if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) {
+               local_bh_disable();
Or use u64_stats_update_begin_irqsave(), which is a NOP on 64bit kernels
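
i.e. something along these lines in try_fill_recv() (a sketch of the
suggestion, not a tested patch):

        if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) {
                unsigned long flags;

                flags = u64_stats_update_begin_irqsave(&rq->stats.syncp);
                rq->stats.kicks++;
                u64_stats_update_end_irqrestore(&rq->stats.syncp, flags);
        }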
I applied this, but am still trying to think of something that
is 0 overhead for all configs.
Maybe we can select a lockdep class depending on whether napi
is enabled?
Do you _really_ need a 64bit counter for stats.kicks on 32bit kernels?

Adding 64bit counters just because we can might be needless overhead anyway.
Well, 32 bit kernels don't fundamentally kick less than 64 bit ones,
and we kick more or less per packet, sometimes per batch, and
people expect these counters to stay in sync ...
Well, we left many counters in the networking stack as 'unsigned long'
and nobody has complained yet about overflows on 32bit kernels.
Right.  For TX it is helpful that everything is maintained
atomically, so we do need the seqlock machinery anyway:

         u64_stats_update_begin(&sq->stats.syncp);
         sq->stats.bytes += bytes;
         sq->stats.packets += packets;
         sq->stats.xdp_tx += n;
         sq->stats.xdp_tx_drops += drops;
         sq->stats.kicks += kicks;
         u64_stats_update_end(&sq->stats.syncp);

For RX, kicks are currently updated separately, which I guess is more or
less a minor bug.

         if (rq->vq->num_free > min((unsigned int)budget,
                                    virtqueue_get_vring_size(rq->vq)) / 2) {
                 if (!try_fill_recv(vi, rq, GFP_ATOMIC))
                         schedule_delayed_work(&vi->refill, 0);
         }

         u64_stats_update_begin(&rq->stats.syncp);
         for (i = 0; i < VIRTNET_RQ_STATS_LEN; i++) {
                 size_t offset = virtnet_rq_stats_desc[i].offset;
                 u64 *item;

                 item = (u64 *)((u8 *)&rq->stats + offset);
                 *item += *(u64 *)((u8 *)&stats + offset);
         }
         u64_stats_update_end(&rq->stats.syncp);

we should update kicks in virtnet_receive.

And as long as we do that there's no cost to 64 bit counters ...
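
E.g. something along these lines (a sketch only; it assumes try_fill_recv()
grows a hypothetical out-parameter to report whether it kicked):

        bool kicked = false;

        if (rq->vq->num_free > min((unsigned int)budget,
                                   virtqueue_get_vring_size(rq->vq)) / 2) {
                if (!try_fill_recv(vi, rq, GFP_ATOMIC, &kicked)) /* &kicked: hypothetical */
                        schedule_delayed_work(&vi->refill, 0);
                if (kicked)
                        stats.kicks++;  /* folded into rq->stats by the existing loop, under the seqcount */
        }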



Or find a way to count the kicks done from the workqueue path (which should be
rare) separately in a u32, e.g. along the lines of the sketch below.
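
Sketch of that idea (hypothetical; no such field exists in the driver): kicks
from the rare process-context refill path go into a plain u32 that needs no
syncp, and get added back in when the stats are read:

        /* in struct receive_queue (hypothetical field): */
        u32 refill_kicks;       /* only written from process context */

        /* process-context callers of try_fill_recv() account the kick here ... */
        rq->refill_kicks++;

        /* ... and readers report rq->stats.kicks + rq->refill_kicks */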

Thanks
