------- Comment From niklas.schne...@ibm.com 2020-05-26 04:04 EDT-------
(In reply to comment #12)
> Hi Niklas, please can you have a look at the SRU Justification that I've now
> added to the bug description, with special focus on Test Case and Regression
> Potential.
> Please let me know if it is inaccurate or missing or in case additional
> information should be added (this is mandatory information needed for the
> SRU).
> Thx

Sounds good. It may be worth adding that this can be triggered with user
space tools that are already shipping (including in the Ubuntu repos), e.g.:

1. install the rdma tools:
sudo apt-get install ibverbs-providers ibverbs-utils

2. verify you have some RDMA devices (requires ConnectX adapter)
pcidev@T224LP06:~$ ibv_devices
device                     node GUID
------                  ----------------
mlx5_0                  98039b0300c682b4

3. verify MIO instructions are enabled for the device
pcidev@T224LP06:~$ cat /sys/bus/pci/devices/0000\:00\:00.0/mio_enabled
1

4. try to run an RDMA application from user space, e.g. ibv_rc_pingpong

server process:
ibv_rc_pingpong -d mlx5_0 -g 0 &

client process:
ibv_rc_pingpong -d mlx5_0 -g 0 localhost

5. verify that the kernel crashes

[92406.190525] Unable to handle kernel pointer dereference in virtual kernel address space
[92406.190529] Failing address: ed00000000090000 TEID: ed00000000090403
[92406.190529] Fault in home space mode while using kernel ASCE.
[92406.190531] AS:0000000c1c98c007 R3:0000000ff3bd0007 S:0000000ff3bd6000 P:000000000009013d
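
The checks in steps 2 and 3 could be scripted roughly as follows. This is
only a sketch: the function names are made up for illustration, the sysfs
root is passed as a parameter so the check can be pointed at a test tree,
and the device count assumes the two header lines that ibv_devices prints
in the output shown above.

```shell
# Sketch only: function names are hypothetical, not part of any package.

# Print the PCI functions whose mio_enabled attribute reads 1 (step 3).
# $1 is the sysfs device directory; it defaults to the path used above.
list_mio_enabled() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip entries that do not expose the attribute at all.
        [ -r "$dev/mio_enabled" ] || continue
        if [ "$(cat "$dev/mio_enabled")" = "1" ]; then
            basename "$dev"
        fi
    done
}

# Count the device rows in ibv_devices output read from stdin (step 2).
# Assumption: the first two lines are the header and the separator.
count_rdma_devices() {
    awk 'NR > 2 && NF >= 2 { n++ } END { print n + 0 }'
}
```

For example, "ibv_devices | count_rdma_devices" should print a non-zero
count and "list_mio_enabled" should print at least one PCI function before
step 4 is attempted.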

Also this patch made it into v5.7-rc7 and Linus himself commented:

"And none of the fixes look like there's anything particularly scary
going on. Most of it is very small, and the slightly larger patches
aren't huge either and are well-contained (the two slightly larger
patches are to s390 and rxrpc - and even those patches aren't really
all _that_ big)"

(this patch obviously being the bigger s390 change)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1874055

Title:
  [UBUNTU 20.04] s390x/pci: s390_pci_mmio_write/read fail when MIO
  instructions are available

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

Bug description:
  SRU Justification:
  ==================

  [Impact]

  * Code that is using s390_pci_mmio_write/read system calls on a z15
  (that comes with enhanced PCI load/store instructions), fails with
  "Unable to handle kernel pointer dereference in virtual kernel address
  space".

  * This issue occurs when the enablement for z15 PCI enhancements is in
  place and customers run workloads that access PCI adapters from user
  space, like RoCE/RDMA.

  * To solve this, the system call implementation needs to be improved to
  execute the enhanced PCI load/store instructions on behalf of the user
  space application, making use of the mappings into its virtual address
  space.

  [Fix]

  * f058599e22d59e594e5aae1dc10560568d8f4a8b f058599e22d5 "s390/pci: Fix
  s390_mmio_read/write with MIO"

  [Test Case]

  * Set up a z15 with at least one PCI card (like RoCE) using an
  operating system that includes support and enablement for z15 (like
  20.04).

  * A little user space program can be written to provoke this error
  situation using the RoCE adapter.

  * Verification needs to be done by IBM on z15 hardware.
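
  As a rough aid for that verification, the kernel log can be scanned for
  the message quoted under [Impact]. The sketch below assumes the system
  stays usable after the fault (or that the log was captured elsewhere);
  the function name is hypothetical.

```shell
# Sketch only: check_mmio_oops is an illustrative name.
# Reads kernel log text on stdin and reports whether the fault message
# from this bug is present. Returns 1 if it is found, 0 otherwise.
check_mmio_oops() {
    if grep -q 'Unable to handle kernel pointer dereference'; then
        echo "oops signature found"
        return 1
    fi
    echo "no oops signature"
    return 0
}
```

  For example, "sudo dmesg | check_mmio_oops" can be run after the test
  program; on a fixed kernel it is expected to report no signature.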

  [Regression Potential]

  * There is some regression potential with having code changes in the
  zPCI sub-system.

  * However, the zPCI system is s390x only and the patch was accepted
  upstream (next-20200515).

  * Nevertheless, PCI hardware could be affected, but PCI hardware is
  not as widespread on s390x as ccw hardware components.

  * Only z15 hardware is affected - no other s390x hardware that is
  supported by Ubuntu.
  __________

  One of the PCI enhancements on Z15 is the set of enhanced PCI
  load/store instructions, which can be executed directly from user
  space code. When these instructions are available and preexisting user
  space code still uses the old s390_pci_mmio_write/read system calls,
  the system calls fail with an "Unable to handle kernel pointer
  dereference in virtual kernel address space" in the kernel.  This
  issue affects distributions which have the enablement for Z15 PCI
  enhancements and where customers run workloads which access PCI
  adapters from user space, e.g. RDMA applications.  To solve this, the
  system call implementation needs to be enhanced to execute the
  enhanced PCI load/store instructions on behalf of the user space
  application, making use of the mappings into its virtual address
  space.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1874055/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
