Launchpad has imported 21 comments from the remote bug at
https://bugzilla.kernel.org/show_bug.cgi?id=194521.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2017-02-08T20:32:58+00:00 vaclav.ovsik wrote:

Created attachment 254611
crash logged using netconsole

I bought my daughter a notebook HP 15-ba062nc
(http://support.hp.com/us-en/product/HP-15-ba000-Notebook-PC-series/10862317/model/11792430).
Installed is Debian Stretch/Sid with kernel 4.9.6.

Successful boot without crash is possible with
    - disabled amdgpu (e.g. old nomodeset)
    - or disabled iommu (iommu=off)
otherwise the kernel crashes and the file-system is corrupted.
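
A minimal sketch of how one might check whether a given kernel command line carries one of these workaround parameters. The helper name has_boot_param and the sample command line are hypothetical and are not taken from the attached logs.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel command line contains a given
# parameter as an exact word (e.g. "iommu=off" or "nomodeset").
# Usage: has_boot_param PARAM [CMDLINE]; CMDLINE defaults to /proc/cmdline.
has_boot_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    for word in $cmdline; do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

# Illustration only: a made-up command line, not one from this report.
sample='BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro iommu=off quiet'
if has_boot_param iommu=off "$sample"; then
    echo "iommu=off is set"
fi
if ! has_boot_param nomodeset "$sample"; then
    echo "nomodeset is not set"
fi
```

Called without a second argument, the helper reads the running kernel's /proc/cmdline instead.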

iommu=off is the better workaround for now, because the notebook runs in
an energy-efficient manner: the fan is quiet or stopped.

Attached are kernel messages captured using netconsole.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/0

------------------------------------------------------------------------
On 2017-02-08T20:35:58+00:00 vaclav.ovsik wrote:

Created attachment 254621
nomodeset - no crash

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/1

------------------------------------------------------------------------
On 2017-02-08T20:37:02+00:00 vaclav.ovsik wrote:

Created attachment 254631
iommu=off - no crash

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/2

------------------------------------------------------------------------
On 2017-02-08T20:48:42+00:00 vaclav.ovsik wrote:

Created attachment 254641
lspci -vvv

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/3

------------------------------------------------------------------------
On 2017-02-08T20:52:17+00:00 vaclav.ovsik wrote:

Created attachment 254651
/proc/cpuinfo

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/4

------------------------------------------------------------------------
On 2017-02-08T21:09:53+00:00 vaclav.ovsik wrote:

Created attachment 254661
crash logged using netconsole

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/5

------------------------------------------------------------------------
On 2017-02-09T12:07:28+00:00 fin4478 wrote:

Do you have the amdgpu firmware installed?

When you create bugs against amdgpu driver, use the latest kernel and mesa code:
https://cgit.freedesktop.org/~agd5f/linux/?h=drm-next-4.11-wip
https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers

Problems might be fixed in the latest code.

Latest polaris firmware:
https://people.freedesktop.org/~agd5f/radeon_ucode/polaris/


How to create a custom kernel, see:
https://bugzilla.kernel.org/show_bug.cgi?id=193651

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/6

------------------------------------------------------------------------
On 2017-02-09T23:34:15+00:00 vaclav.ovsik wrote:

I doubt the problem is in the amdgpu driver. What about a bug in amd_iommu?
I think so because I tried to switch off the external GPU using the acpi_call module.
The following command was successful:

    echo '\_SB_.PCI0.VGA.PX02' > /proc/acpi/call

while running the kernel with no KMS (no amdgpu). The fan really went silent
after this, but the kernel crashed within several seconds, in a similar way
as with amdgpu and an active iommu.

The filesystem is corrupted after every crash. I'm afraid that the storage
controller goes through the iommu too and the crash causes some random writes to
disk :(. But maybe I am wrong and this ACPI call is in fact illegal
and amdgpu does something wrong regarding the iommu too. Nevertheless amdgpu works
fine with iommu=off. Maybe the problem is some buggy BIOS/firmware
from the vendor.

I will try a newer kernel.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/7

------------------------------------------------------------------------
On 2017-02-10T18:36:11+00:00 vaclav.ovsik wrote:

I tried kernel 4.10.0-rc6-amd64 from the Debian experimental archive and the
result is very similar. To minimize harm to the filesystem I booted into
emergency mode with a read-only filesystem and tried to switch off the GPU
using the ACPI call. There is a warning during the call, but something
happened :)

ACPI Warning: \_SB.PCI0.VGA.PX02: Insufficient arguments - Caller passed
0, method requires 1 (20160930/nsarguments-256)

I did this twice: once with and once without iommu=off.

I'm attaching netconsole logs...

Is this proof that the problem is in amd_iommu.c?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/8

------------------------------------------------------------------------
On 2017-02-10T18:37:59+00:00 vaclav.ovsik wrote:

Created attachment 254695
logged using netconsole - 4.10.0rc6 with  iommu=off, gpu turned off using 
acpi_call

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/9

------------------------------------------------------------------------
On 2017-02-10T18:39:04+00:00 vaclav.ovsik wrote:

Created attachment 254697
logged using netconsole - 4.10.0rc6, gpu turned off using acpi_call  -> crash

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/10

------------------------------------------------------------------------
On 2017-02-13T03:25:42+00:00 fin4478 wrote:

Stock kernels have very little amdgpu driver code; see kernel.org and click
diff. You have a very new AMD GPU, so use the command:
git clone -b drm-next-4.11-wip git://people.freedesktop.org/~agd5f/linux

The kernel configuration files of official Debian kernels are available in
/boot, named after the kernel release. Copy the config file to the linux
directory as .config. Connect all your devices and run the command: make
localmodconfig. You can also use the command make defconfig to create an
initial .config file.

Use the command: make xconfig and check that you have enabled: Reroute
Broken IRQ, Virtualization KVM and 300Hz CPU timer. I also disabled
Swap, Kernel Debug, CPU Freq scaling and CPU handling in ACPI, and let the
BIOS control the CPU and devices. Under drivers->graphics->amdgpu, enable CIK
support for a GCN 1.1 GPU and SI support for a GCN 1.0 GPU.

Create a Debian kernel package:
export CONCURRENCY_LEVEL=4
fakeroot make-kpkg --initrd kernel_image

Install the kernel package with Gdebi. To make a custom kernel boot, add a
line to /etc/initramfs-tools/modules:
unix
And run: sudo update-initramfs -u
Reboot.
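
The non-interactive steps above could be collected roughly as follows. This is only a dry-run sketch: by default it prints the commands instead of running them. The names run, build_custom_kernel, DRY_RUN and SRC_DIR are hypothetical, and the interactive make xconfig step and the initramfs change are left out.

```shell
#!/bin/sh
# Dry-run sketch of the build steps described above. By default the commands
# are only printed; set DRY_RUN=0 to actually execute them.
# run(), build_custom_kernel(), DRY_RUN and SRC_DIR are illustrative names.
DRY_RUN=${DRY_RUN:-1}
SRC_DIR=${SRC_DIR:-linux}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_custom_kernel() {
    # Fetch the branch named in the comment above.
    run git clone -b drm-next-4.11-wip 'git://people.freedesktop.org/~agd5f/linux' "$SRC_DIR"
    # Debian keeps the configuration of the running kernel in /boot.
    run cp "/boot/config-$(uname -r)" "$SRC_DIR/.config"
    # Trim the configuration to the modules currently loaded.
    run make -C "$SRC_DIR" localmodconfig
    # Build the Debian package inside the source tree.
    run sh -c "cd '$SRC_DIR' && CONCURRENCY_LEVEL=4 fakeroot make-kpkg --initrd kernel_image"
}

build_custom_kernel
```

Reviewing the printed commands before setting DRY_RUN=0 seems prudent, since the clone and build take a long time.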

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/11

------------------------------------------------------------------------
On 2017-02-13T20:32:12+00:00 vaclav.ovsik wrote:

Created attachment 254739
drm-next-4.11-wip:  boot into emergency mode - crash after modprobe amdgpu

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/12

------------------------------------------------------------------------
On 2017-02-14T07:56:09+00:00 fin4478 wrote:

Comment on attachment 254739
drm-next-4.11-wip:  boot into emergency mode - crash after modprobe amdgpu

You have Carrizo and Topaz GPUs. Can you disable one of them in the BIOS?
The Linux driver does not support AMD Dual Graphics to increase fps. In
the kernel configuration you can try to disable iommu and vgaswitcheroo.
From the kernel command line you can blacklist PCI devices.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/13

------------------------------------------------------------------------
On 2017-02-15T20:41:34+00:00 vaclav.ovsik wrote:

The BIOS is really primitive; there is nearly nothing regarding HW that can be
changed :-/. I can continue to use iommu=off, it seems to be fine.
Thanks

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/14

------------------------------------------------------------------------
On 2017-06-30T15:59:08+00:00 mrromanze wrote:

The same issue persists on an HP 15-ba028ur using the latest mainline kernel
(4.12-rc7).

iommu=off  
and
amd_iommu=fullflush
make boot possible.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/15

------------------------------------------------------------------------
On 2017-06-30T16:38:48+00:00 mrromanze wrote:

Patch: https://patchwork.freedesktop.org/patch/157327/

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/16

------------------------------------------------------------------------
On 2019-03-21T08:59:30+00:00 nickel wrote:

Hi,
  I still encounter the same issue, with ext4 fs corruption, using Linux
kernels 4.19.16... 4.20.27 on an HP laptop 17-ak041ur (2 pcs on hand).

  Laptop configs are A6-9220 radeon r4 5 compute cores 2c+3g, 4G RAM, 200GB
Intel SSD (1st laptop) or 500GB Toshiba HDD (2nd laptop).
I'm using the ALT Linux distribution (www.altlinux.org).
  Booting and installing the system from the LiveCD works flawlessly if
the LAN cable is NOT attached. dmesg shows plenty of "AMD-Vi: Completion-Wait loop
timed out" errors.
  Connecting the LAN cable during LiveCD boot results in a failure to reach
the graphical target or in a kernel panic.
  After the first reboot the system won't boot anyway and ext4 filesystem
corruption occurs.

 Investigation revealed that switching the IOMMU off (amd_iommu=off and/or
iommu=soft) solves the issue. "amd_iommu=fullflush" doesn't work for me.
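
To compare boots with and without the IOMMU enabled, one could count the timeout messages quoted above in saved kernel logs. A minimal sketch follows. The helper name count_amdvi_timeouts and the two-line sample log are made up for illustration and are not taken from the attachments.

```shell
#!/bin/sh
# Hypothetical helper: count "Completion-Wait loop timed out" messages from
# the AMD IOMMU driver (AMD-Vi) in a saved kernel log.
# Usage: count_amdvi_timeouts LOGFILE
count_amdvi_timeouts() {
    grep -c 'AMD-Vi: Completion-Wait loop timed out' "$1"
}

# Illustration with a made-up two-line log; a real one could be saved with
# "dmesg > dmesg.txt".
log=$(mktemp)
printf '%s\n' \
    'AMD-Vi: Completion-Wait loop timed out' \
    'some unrelated line' > "$log"
count_amdvi_timeouts "$log"
rm -f "$log"
```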

I've found several patches addressing (amd_)iommu issues in the
linux-kernel mailing list archive, but they are either already applied to the
kernels mentioned above or applying them doesn't solve the issue for me.
The patch mentioned above (https://patchwork.freedesktop.org/patch/157327/) no
longer applies to these kernel versions.
  Therefore my question is: am I missing some patch that already solved the
issue, or should I provide a more specific bug report?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/62

------------------------------------------------------------------------
On 2019-03-21T13:51:35+00:00 nickel wrote:

Created attachment 281945
HP laptop dmesg output with plenty of errors while LAN cable present

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/63

------------------------------------------------------------------------
On 2019-03-21T13:53:07+00:00 nickel wrote:

Created attachment 281947
HP laptop dmesg output while IOMMU turned on, but no LAN cable

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/64

------------------------------------------------------------------------
On 2019-03-21T13:53:40+00:00 nickel wrote:

Created attachment 281949
HP laptop HW config via dmidecode

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/65


** Changed in: linux
       Status: Unknown => Confirmed

** Changed in: linux
   Importance: Unknown => Medium

** Bug watch added: Linux Kernel Bug Tracker #193651
   https://bugzilla.kernel.org/show_bug.cgi?id=193651

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1747463

Title:
  kernel crashes during boot unless IOMMU is disabled on Ryzen 1800X

Status in Linux:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed

Bug description:
  I'm on a Ryzen 1800X and Biostar B350GT5 on bionic kubuntu.

  There are lots of AMD-Vi logged events and I get irq crashes or acpi
  hangups with a 'normal' boot. I got it to boot by disabling IOMMU in
  the BIOS and adding "iommu=soft" to the kernel booting options in
  grub.
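
  A sketch of how the kernel option could be added to the GRUB defaults, shown
  here on a temporary copy rather than on /etc/default/grub itself. The helper
  name add_grub_param is hypothetical, and it assumes a simple parameter without
  "/" or "&" characters.

```shell
#!/bin/sh
# Hypothetical helper: append a parameter to GRUB_CMDLINE_LINUX_DEFAULT in a
# GRUB defaults file unless it is already present.
# Assumes PARAM contains no "/" or "&" (they would confuse sed).
# Usage: add_grub_param PARAM FILE
add_grub_param() {
    param=$1
    file=$2
    # Nothing to do if the parameter is already in the quoted value.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    tmp=$(mktemp)
    sed "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Illustration on a temporary file with a typical default value.
demo=$(mktemp)
echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' > "$demo"
add_grub_param iommu=soft "$demo"
cat "$demo"
rm -f "$demo"
```

  On Ubuntu the real file is /etc/default/grub, and the change only takes
  effect after running sudo update-grub and rebooting.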

  linux can then detect everything properly (all cores) and I've had
  zero crashes. The only issue is that it's using software IOMMU which
  could have a performance penalty because it has to copy all the data
  of some PCI devices to sub 4G regions.

  Alternatively it boots with the kernel option "acpi=off" but only
  detects a single core/thread.

  I attached a kernel log.

  I believe(d) this might be related to 
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1671360
  and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1690085
  ---
  ApportVersion: 2.20.8-0ubuntu8
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  fixme      1487 F.... pulseaudio
   /dev/snd/controlC0:  fixme      1487 F.... pulseaudio
  CurrentDesktop: KDE
  DistroRelease: Ubuntu 18.04
  HibernationDevice: RESUME=UUID=bc971fcc-8e63-4fa5-a149-af4af6c8eece
  InstallationDate: Installed on 2018-01-31 (4 days ago)
  InstallationMedia: Kubuntu 18.04 LTS "Bionic Beaver" - Alpha amd64 (20180131)
  IwConfig:
   lo        no wireless extensions.

   enp3s0    no wireless extensions.
  MachineType: BIOSTAR Group B350GT5
  Package: linux (not installed)
  ProcFB: 0 amdgpudrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.13.0-32-generic.efi.signed root=/dev/mapper/kubuntu--vg-root ro iommu=soft quiet splash vt.handoff=1
  ProcVersionSignature: Ubuntu 4.13.0-32.35-generic 4.13.13
  RelatedPackageVersions:
   linux-restricted-modules-4.13.0-32-generic N/A
   linux-backports-modules-4.13.0-32-generic  N/A
   linux-firmware                             1.170
  RfKill:

  Tags:  bionic
  Uname: Linux 4.13.0-32-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
  _MarkForUpload: True
  dmi.bios.date: 11/30/2017
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 5.13
  dmi.board.asset.tag: None
  dmi.board.name: B350GT5
  dmi.board.vendor: BIOSTAR Group
  dmi.chassis.asset.tag: Default string
  dmi.chassis.type: 3
  dmi.chassis.vendor: Default string
  dmi.chassis.version: Default string
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr5.13:bd11/30/2017:svnBIOSTARGroup:pnB350GT5:pvr:rvnBIOSTARGroup:rnB350GT5:rvr:cvnDefaultstring:ct3:cvrDefaultstring:
  dmi.product.family: None
  dmi.product.name: B350GT5
  dmi.sys.vendor: BIOSTAR Group

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1747463/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
