On 10/1/25 11:39 AM, Jaehoon Kim wrote:
> Previously, set_ind_atomic() returned the entire byte containing
> multiple summary bits. This meant that if any other summary bit in the
> byte was set, interrupt injection could be incorrectly blocked, even
> when the current device's summary bit was not set. As a result, the
> guest could remain blocked after I/O completion during FIO tests.
> 
> This patch replaces set_ind_atomic() with set_ind_bit_atomic(), which
> returns true if the bit was set by this function, and false if it was
> already set or mapping failed. Interrupts are now blocked only when
> the device's own summary bit was not previously set, avoiding
> unintended blocking when multiple PCI summary bits exist within the
> same byte.
> 
> Signed-off-by: Jaehoon Kim <[email protected]>

Thanks Jaehoon!

To give a little additional background info, this issue was noticed when 
issuing simultaneous fio jobs across multiple virtio-blk-pci devices on an 
s390x guest (e.g. not the s390 default virtio-ccw tranpsort nor PCI 
passthrough).
In doing so, occasional I/O hangs were observed in the guest that were 
determined to be loss of initiative.
The root cause of this loss of initiative is the erroneous summary bit checking 
that Jaehoon is fixing here, where a decision might be made to not deliver an 
adapter interrupt because a different summary bit in the same byte happened to 
already be on.  We really only want to skip the interrupt if the exact same bit 
was already on.

I have also tested this against an s390x guest virtio-pci+fio setup where I 
could reliably reproduce the issue and verified that this fix resolves the 
issue.

Reviewed-by: Matthew Rosato <[email protected]>

> ---
>  hw/s390x/s390-pci-bus.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index f87d2748b6..e8e41c8a9a 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -652,7 +652,16 @@ static const PCIIOMMUOps s390_iommu_ops = {
>      .get_address_space = s390_pci_dma_iommu,
>  };
>  
> -static uint8_t set_ind_atomic(uint64_t ind_loc, uint8_t to_be_set)
> +/**
> + * set_ind_bit_atomic - Atomically set a bit in an indicator
> + *
> + * @ind_loc:   Address of the indicator
> + * @to_be_set: Bit to set
> + *
> + * Returns true if the bit was set by this function, false if it was
> + * already set or mapping failed.
> + */
> +static bool set_ind_bit_atomic(uint64_t ind_loc, uint8_t to_be_set)
>  {
>      uint8_t expected, actual;
>      hwaddr len = 1;
> @@ -662,7 +671,7 @@ static uint8_t set_ind_atomic(uint64_t ind_loc, uint8_t 
> to_be_set)
>      ind_addr = cpu_physical_memory_map(ind_loc, &len, true);
>      if (!ind_addr) {
>          s390_pci_generate_error_event(ERR_EVENT_AIRERR, 0, 0, 0, 0);
> -        return -1;
> +        return false;
>      }
>      actual = *ind_addr;
>      do {
> @@ -671,7 +680,7 @@ static uint8_t set_ind_atomic(uint64_t ind_loc, uint8_t 
> to_be_set)
>      } while (actual != expected);
>      cpu_physical_memory_unmap((void *)ind_addr, len, 1, len);
>  
> -    return actual;
> +    return (actual & to_be_set) ? false : true;
>  }
>  
>  static void s390_msi_ctrl_write(void *opaque, hwaddr addr, uint64_t data,
> @@ -693,10 +702,10 @@ static void s390_msi_ctrl_write(void *opaque, hwaddr 
> addr, uint64_t data,
>      ind_bit = pbdev->routes.adapter.ind_offset;
>      sum_bit = pbdev->routes.adapter.summary_offset;
>  
> -    set_ind_atomic(pbdev->routes.adapter.ind_addr + (ind_bit + vec) / 8,
> +    set_ind_bit_atomic(pbdev->routes.adapter.ind_addr + (ind_bit + vec) / 8,
>                     0x80 >> ((ind_bit + vec) % 8));
> -    if (!set_ind_atomic(pbdev->routes.adapter.summary_addr + sum_bit / 8,
> -                                       0x80 >> (sum_bit % 8))) {
> +    if (set_ind_bit_atomic(pbdev->routes.adapter.summary_addr + sum_bit / 8,
> +                   0x80 >> (sum_bit % 8))) {
>          css_adapter_interrupt(CSS_IO_ADAPTER_PCI, pbdev->isc);
>      }
>  }


Reply via email to