Hi Daniel,
nice to know that you got it working.
And it is an interesting observation that disabling Thunderbolt boot
support in the BIOS fixes things. I'm going to keep that in mind when other
users run into the same issue.
Thanks,
Christian.
On 09.04.2018 at 18:20, Daniel Moran wrote:
Christian,
Thanks for the response. That got me in the right direction.
After trial and error I found the cause - Thunderbolt Boot Support
option must be disabled in BIOS.
If I disable it, I can boot into Ubuntu and amdgpu appears to initialize fine.
If I enable it with no other changes, init fails.
The last issue was one of my own - forgetting to use DRI_PRIME and
xrandr correctly.
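
The thread does not show the exact commands used, so the following is only a minimal Python sketch of the usual PRIME render-offload setup: `xrandr --setprovideroffloadsink` links the offload GPU to the display GPU, and Mesa's `DRI_PRIME` environment variable selects the offload GPU for a given program. The helper names and the provider numbers are hypothetical; the real provider numbers come from `xrandr --listproviders`.

```python
import os


def prime_env(base=None, gpu_index=1):
    """Return a copy of the environment with DRI_PRIME set for render offload."""
    env = dict(os.environ if base is None else base)
    env["DRI_PRIME"] = str(gpu_index)
    return env


def offload_sink_command(provider, sink):
    """Build the xrandr call that makes `provider` a render offload device for `sink`."""
    return ["xrandr", "--setprovideroffloadsink", str(provider), str(sink)]


# Possible usage on the test machine (provider numbers are assumptions):
#   import subprocess
#   subprocess.run(offload_sink_command(1, 0), check=True)
#   subprocess.run(["glxinfo", "-B"], env=prime_env(), check=True)
```

Nothing is executed when the helpers are defined, so the commands can be reviewed before they touch the X session.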
Happy to say the Red Devil is working now in eGPU mode!
It's about a 20% perf loss over PCI-E slot and right in line with our
previous tests.
As always thank you for your continued time and support.
We'll be happy to give a shout out to you guys for the help at
article/video time.
Respectfully,
Daniel S. Moran (garwynn)
PC Hardware Editor - XDA-Developers
Phone: 1-559-316-0760/+81-90-5484-4155
Article Links: http://www.xda-developers.com/author/garwynn
E-mail: [email protected] | Twitter: @xdagarwynn
On Mon, Apr 9, 2018 at 10:48 PM, Christian König <[email protected]> wrote:
Hi Daniel,
your problem is that the system BIOS is buggy and doesn't assign
resources to the card:
Region 0: Memory at <ignored> (64-bit, prefetchable)
Region 2: Memory at <ignored> (64-bit, prefetchable)
Region 4: I/O ports at 9000 [size=256]
Region 5: Memory at <ignored> (32-bit, non-prefetchable)
Expansion ROM at <ignored> [disabled]
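
As a rough way to check for the same symptom on another machine, a minimal Python sketch can scan saved `lspci -vv` text for regions whose address is printed as `<ignored>`. The function name and the regular expression are illustrative assumptions; the sample text is the listing quoted above.

```python
import re

# Matches lines such as "Region 0: Memory at <ignored> (64-bit, prefetchable)"
REGION_RE = re.compile(r"Region (\d+): (Memory|I/O ports) at (\S+)")

SAMPLE = """\
Region 0: Memory at <ignored> (64-bit, prefetchable)
Region 2: Memory at <ignored> (64-bit, prefetchable)
Region 4: I/O ports at 9000 [size=256]
Region 5: Memory at <ignored> (32-bit, non-prefetchable)
Expansion ROM at <ignored> [disabled]
"""


def unassigned_regions(lspci_text):
    """Return the BAR indices whose address lspci reports as <ignored>."""
    missing = []
    for line in lspci_text.splitlines():
        match = REGION_RE.search(line)
        if match and match.group(3) == "<ignored>":
            missing.append(int(match.group(1)))
    return missing


if __name__ == "__main__":
    print(unassigned_regions(SAMPLE))  # -> [0, 2, 5]
```

Only the I/O port region (Region 4) received an address here; all three memory BARs are unassigned.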
The kernel actually tries to assign resources to the bridges, but
fails as well because the BIOS didn't reserve any during startup.
[ 0.179743] pci 0000:12:00.0: can't claim BAR 14 [mem 0x01c00000-0xef0fffff]: no compatible bridge window
[ 0.179745] pci 0000:12:00.0: [mem 0x01c00000-0xef0fffff] clipped to [mem 0xef000000-0xef0fffff]
[ 0.179747] pci 0000:12:00.0: bridge window [mem 0xef000000-0xef0fffff]
[ 0.179751] pci 0000:13:01.0: can't claim BAR 14 [mem 0x01c00000-0x01ffffff]: no compatible bridge window
[ 0.179753] pci 0000:14:00.0: can't claim BAR 14 [mem 0x01c00000-0x01ffffff]: no compatible bridge window
[ 0.179754] pci 0000:15:00.0: can't claim BAR 14 [mem 0x01d00000-0x01dfffff]: no compatible bridge window
[ 0.179756] pci 0000:08:04.0: can't claim BAR 13 [io 0xb000-0xcfff]: address conflict with PCI Bus 0000:12 [io 0x9000-0xbfff]
[ 0.179782] pci 0000:14:00.0: can't claim BAR 0 [mem 0x01c00000-0x01c03fff]: no compatible bridge window
[ 0.179789] pci 0000:16:00.0: can't claim BAR 0 [mem 0xd0000000-0xdfffffff 64bit pref]: no compatible bridge window
[ 0.179791] pci 0000:16:00.0: can't claim BAR 2 [mem 0xe0200000-0xe03fffff 64bit pref]: no compatible bridge window
[ 0.179793] pci 0000:16:00.0: can't claim BAR 5 [mem 0x01d00000-0x01d7ffff]: no compatible bridge window
[ 0.179798] pci 0000:16:00.1: can't claim BAR 0 [mem 0x01da0000-0x01da3fff]: no compatible bridge window
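
To see at a glance which devices are affected, a minimal Python sketch can group the "can't claim BAR" messages by PCI address. It assumes the dmesg text has one message per line (unwrapped); the function name and the pattern are illustrative, and the sample lines are taken from the log above.

```python
import re
from collections import defaultdict

# Captures the PCI address and the BAR number of each failed claim.
CLAIM_RE = re.compile(r"pci ([0-9a-f:.]+): can't claim BAR (\d+)")

SAMPLE = """\
[ 0.179743] pci 0000:12:00.0: can't claim BAR 14 [mem 0x01c00000-0xef0fffff]: no compatible bridge window
[ 0.179747] pci 0000:12:00.0: bridge window [mem 0xef000000-0xef0fffff]
[ 0.179789] pci 0000:16:00.0: can't claim BAR 0 [mem 0xd0000000-0xdfffffff 64bit pref]: no compatible bridge window
[ 0.179791] pci 0000:16:00.0: can't claim BAR 2 [mem 0xe0200000-0xe03fffff 64bit pref]: no compatible bridge window
"""


def failed_claims(dmesg_text):
    """Map each PCI address to the list of BAR numbers it could not claim."""
    failures = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = CLAIM_RE.search(line)
        if match:
            failures[match.group(1)].append(int(match.group(2)))
    return dict(failures)


if __name__ == "__main__":
    for device, bars in failed_claims(SAMPLE).items():
        print(device, bars)
```

In the full log, 0000:16:00.0 (the GPU) fails on BARs 0, 2 and 5, which matches the regions lspci shows as `<ignored>`.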
There isn't much you can do except try updating the BIOS and,
if that doesn't help, replace your motherboard.
Regards,
Christian.
On 09.04.2018 at 15:33, Daniel Moran wrote:
Christian,
Andrey,
Thank you for the responses.
Here's the requested dmesg/lspci. Also pulled journalctl just in
case but didn't see anything that stands out.
I'll take another look at the BIOS settings to see if anything
else may explain the memory error.
I've got 16GB in the system at the moment, can bump up to 32 -
also added a larger swap just in case that was the issue. (No
change.)
As always thank you for your continued time and support.
Respectfully,
Daniel S. Moran (garwynn)
On Mon, Apr 9, 2018 at 3:52 PM, Christian König <[email protected]> wrote:
Please provide the full dmesg of the system as well as the
output of "lspci -s 0000:16:00.0 -vvvv" as attachment.
Thanks,
Christian.
On 09.04.2018 at 06:00, Andrey Grodzovsky wrote:
From a quick look, it seems to fail in
amdgpu_device_init->ioremap with ENOMEM. That would explain
why you don't see any more prints: this failure is very
early in the device init process.
No idea why ioremap would fail in this case and not even
sure which implementation of ioremap to look into for your case.
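
For reference, the "-12" in the probe message is a negated errno value. A minimal Python sketch of the lookup follows; the function name is hypothetical, and it assumes the script runs on a system whose errno table matches the kernel's, as is the case on Linux.

```python
import errno
import os


def describe_probe_error(code):
    """Translate a negative kernel return code such as -12 into its errno name."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name} ({os.strerror(number)})"


if __name__ == "__main__":
    print(describe_probe_error(-12))
```

On Linux the name reported for -12 is ENOMEM, which is consistent with the ioremap failure described above.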
Adding Christian for this.
Andrey
On 04/07/2018 03:16 AM, Daniel Moran wrote:
Also, to clarify... if I move it into a regular slot and turn
off the eGPU, it works as expected.
Tested with Intel iGPU enabled and disabled, made sure i915
loaded without error and can connect display to it.
Again, thank you in advance for any time/support offered.
Respectfully,
Daniel S. Moran (garwynn)
On Sat, Apr 7, 2018 at 3:58 PM, Daniel Moran <[email protected]> wrote:
Hello all,
I've got a Powercolor Red Devil Vega 56 here that I'm
trying to get working in eGPU mode.
I think on the BIOS/hardware side it's now all fleshed out.
Now I'm at a point where amdgpu tries to init and
reaches a fatal error.
Setting loglevel=8 doesn't produce any additional messages.
Here's what it does report (full dmesg attached):
[ 429.005909] [drm] amdgpu kernel modesetting enabled.
[ 429.006080] [drm] initializing kernel modesetting (VEGA10 0x1002:0x687F 0x148C:0x2388 0xC3).
[ 429.006082] amdgpu 0000:16:00.0: Fatal error during GPU init
[ 429.006155] amdgpu: probe of 0000:16:00.0 failed with error -12
I'm using the following commands to unload & reload for
testing. Since the card is set up as an eGPU, I'm using the i7-7700K
iGPU (i915 module) as the primary, and these commands
work in a terminal without requiring a reboot.
sudo rmmod amdgpu
sudo modprobe -v amdgpu
Pulled UMR and tried to build it, but it fails at the CMake step. I'll
attach the log as a text file.
Also will attach a full dmesg and lspci dump. uname -a
below:
Linux testbox 4.15.15-041515-generic #201803311331 SMP Sat Mar 31 17:34:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Any other ideas on how I can debug this further? I feel
I'm so close and don't want to let this go.
Thank you in advance for your time.
Respectfully,
Daniel S. Moran (garwynn)
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx