Problem:
When loading the driver with lockdep debugging enabled, the WARN_ON below
was observed. Googling this warning, I found the following explanation:
https://git.sphere.ly/tucstwo/cam-test/commit/671ee198b38694cf1dfbaa0b9ea823929517c367

Fix:
Switch all sysfs attributes in RAS to static.
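
For illustration only, the lookup pattern used by the fix (a statically
allocated wrapper that holds the attribute plus a back-pointer to the
per-block object, recovered in the show callback with container_of) can be
sketched in userspace C. All demo_* names are hypothetical stand-ins and are
not part of the driver; the real code uses struct device_attribute and
struct ras_manager.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical block IDs, standing in for the AMDGPU_RAS_BLOCK__* enum. */
enum { DEMO_BLOCK_UMC, DEMO_BLOCK_GFX, DEMO_BLOCK_LAST };

/* Stands in for struct device_attribute. */
struct demo_attr {
	const char *name;
	int (*show)(struct demo_attr *attr, char *buf, size_t len);
};

/* Stands in for struct ras_manager. */
struct demo_manager {
	unsigned long err_count;
};

/* Static storage: the attribute no longer lives inside heap-allocated state. */
static struct demo_sysfs_attr {
	struct demo_attr attr;
	struct demo_manager *obj;	/* back-pointer to the per-block object */
} demo_sysfs_attrs[DEMO_BLOCK_LAST];

/* Recover the wrapper from the attribute pointer, then follow the back-pointer. */
static int demo_show(struct demo_attr *attr, char *buf, size_t len)
{
	struct demo_sysfs_attr *slot =
		container_of(attr, struct demo_sysfs_attr, attr);

	if (!slot->obj)
		return -1;	/* slot was already torn down */
	return snprintf(buf, len, "ce: %lu\n", slot->obj->err_count);
}

/* Fill the static slot for one block and hand back its attribute. */
static struct demo_attr *demo_attr_create(int block, const char *name,
					  struct demo_manager *obj)
{
	assert(block >= 0 && block < DEMO_BLOCK_LAST);
	demo_sysfs_attrs[block] = (struct demo_sysfs_attr){
		.attr = { .name = name, .show = demo_show },
		.obj = obj,
	};
	return &demo_sysfs_attrs[block].attr;
}

/* Clear the back-pointer when the node is removed. */
static void demo_attr_remove(int block)
{
	assert(block >= 0 && block < DEMO_BLOCK_LAST);
	demo_sysfs_attrs[block].obj = NULL;
}
```

Note that in this sketch the static table is shared by every caller, so it
assumes a single owner per block slot.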

Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.444744] WARNING: CPU: 1 PID: 1100 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1926 amdgpu_dm_initialize_drm_device+0x966/0x1650 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.445776] Modules linked in: amdgpu(O+) chash gpu_sched(O) ttm(O) drm_kms_helper(O) drm(O) i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek ghash_clmulni_intel snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event aesni_intel snd_rawmidi aes_x86_64 crypto_simd cryptd glue_helper eeepc_wmi snd_seq asus_wmi sparse_keymap wmi_bmof snd_seq_device snd_timer serio_raw joydev snd soundcore mei_me mei acpi_pad mac_hid binfmt_misc nfsd auth_rpcgss nfs_acl parport_pc lockd ppdev grace lp parport sunrpc autofs4 hid_generic psmouse e1000e r8169 ahci libahci usbhid hid mxm_wmi wmi video
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.452042] CPU: 1 PID: 1100 Comm: modprobe Tainted: G        W  O      5.0.0-rc1-dev+ #37
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.453417] Hardware name: System manufacturer System Product Name/Z170-PRO, BIOS 1902 06/27/2016
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.454920] RIP: 0010:amdgpu_dm_initialize_drm_device+0x966/0x1650 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.456342] Code: e4 4c 89 e7 e8 3b 5c 6f e0 48 8b 7c 24 20 e8 31 5c 6f e0 48 c7 c7 e0 77 e7 a0 45 31 e4 45 31 ed e8 2f 03 ba ff e9 b6 fa ff ff <0f> 0b e9 5a fa ff ff 31 ff 44 89 54 24 28 e8 07 5c 6f e0 48 8b 7c
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.459404] RSP: 0018:ffff8883e0faf160 EFLAGS: 00010202
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.460953] RAX: 0000000000000006 RBX: ffff8883e1880000 RCX: ffffffffa0c848d6
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.462542] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff8883d90e9118
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.464134] RBP: ffff8883d90e9100 R08: ffffed107b3050c9 R09: ffffed107b3050c9
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.465713] R10: 0000000000000001 R11: ffffed107b3050c8 R12: 0000000000000000
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.467285] R13: 0000000000000006 R14: 0000000000000006 R15: 0000000000000006
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.468833] FS:  00007fcc21833700(0000) GS:ffff8883f3e80000(0000) knlGS:0000000000000000
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.470390] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.471967] CR2: 0000562676b32138 CR3: 00000003d96de006 CR4: 00000000003606e0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.473550] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.475137] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.476729] Call Trace:
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.478328]  ? do_raw_spin_unlock+0x94/0x130
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.479942]  ? _raw_spin_unlock+0x24/0x30
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.481567]  ? __lock_acquire.isra.28+0x2f/0xc20
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.483278]  ? dm_resume+0x580/0x580 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.484930]  ? drm_dbg+0xaf/0x150 [drm]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.486589]  ? drm_dev_dbg+0x180/0x180 [drm]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.488239]  ? kasan_unpoison_shadow+0x36/0x50
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.489895]  ? kasan_kmalloc+0xae/0xf0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.491551]  ? kmem_cache_alloc_trace+0x14d/0x2b0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.493293]  ? setup_x_points_distribution+0xbd/0x120 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.495070]  amdgpu_dm_init+0x260/0x3f0 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.496848]  ? amdgpu_dm_initialize_drm_device+0x1650/0x1650 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.498642]  dm_hw_init+0xe/0x20 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.500422]  amdgpu_device_init+0x1d52/0x2950 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.502210]  ? amdgpu_device_has_dc_support+0x30/0x30 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.503918]  ? __alloc_pages_nodemask+0x232/0x460
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.505598]  ? __alloc_pages_slowpath+0x1370/0x1370
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.507270]  ? __mutex_unlock_slowpath+0xda/0x420
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.508935]  ? policy_nodemask+0x19/0xa0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.510590]  ? kasan_unpoison_shadow+0x36/0x50
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.512235]  ? kasan_kmalloc_large+0x9a/0xe0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.513964]  amdgpu_driver_load_kms+0x101/0x540 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.515699]  ? amdgpu_driver_unload_kms+0x220/0x220 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.517370]  ? drm_dev_register+0x1a4/0x320 [drm]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.519016]  ? __kasan_slab_free+0x138/0x170
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.520669]  drm_dev_register+0x1fd/0x320 [drm]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.522398]  amdgpu_pci_probe+0xef/0x1a0 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.524126]  ? amdgpu_pci_remove+0x60/0x60 [amdgpu]
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.525773]  local_pci_probe+0x76/0xe0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.527413]  pci_device_probe+0x205/0x300
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.529045]  ? kernfs_create_link+0xae/0x100
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.530672]  ? pci_device_remove+0x1c0/0x1c0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.532285]  really_probe+0x382/0x5e0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.533843]  driver_probe_device+0x171/0x1b0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.535350]  __driver_attach+0x193/0x1a0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.536783]  ? driver_probe_device+0x1b0/0x1b0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.538148]  bus_for_each_dev+0xe4/0x160
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.539450]  ? lock_downgrade+0x2f0/0x2f0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.540687]  ? subsys_dev_iter_exit+0x10/0x10
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.541906]  bus_add_driver+0x322/0x3a0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.543105]  driver_register+0xc6/0x1a0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.544275]  ? 0xffffffffa1090000
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.545387]  do_one_initcall+0xb8/0x29f
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.546451]  ? trace_event_raw_event_initcall_finish+0x150/0x150
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.547517]  ? kasan_unpoison_shadow+0x36/0x50
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.548570]  ? kasan_kmalloc+0xae/0xf0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.549616]  ? kmem_cache_alloc_trace+0x14d/0x2b0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.550669]  ? do_init_module+0x35/0x335
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.551710]  ? kasan_unpoison_shadow+0x36/0x50
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.552760]  do_init_module+0xec/0x335
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.553788]  load_module+0x3d5d/0x4780
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.554805]  ? module_frob_arch_sections+0x20/0x20
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.555810]  ? ima_read_file+0x10/0x10
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.556818]  ? vfs_read+0x127/0x190
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.557822]  ? kernel_read+0x74/0xa0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.558808]  ? kernel_read_file+0x16c/0x350
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.559793]  ? apparmor_task_free+0xc0/0xc0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.560770]  ? do_mmap+0x55e/0x790
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.561744]  ? __do_sys_finit_module+0x175/0x1b0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.562724]  __do_sys_finit_module+0x175/0x1b0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.563712]  ? __ia32_sys_init_module+0x40/0x40
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.564696]  ? check_chain_key+0x131/0x1e0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.565679]  ? syscall_trace_enter+0x1fc/0x530
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.566666]  ? vtime_user_exit+0xc8/0xe0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.567652]  do_syscall_64+0x7d/0x1f0
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.568622]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.569600] RIP: 0033:0x7fcc213654d9
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.570558] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8f 29 2c 00 f7 d8 64 89 01 48
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.572678] RSP: 002b:00007ffc8d3c2888 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.573766] RAX: ffffffffffffffda RBX: 000055b90a1363b0 RCX: 00007fcc213654d9
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.574868] RDX: 0000000000000000 RSI: 000055b9091b926b RDI: 000000000000000d
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.575973] RBP: 000055b9091b926b R08: 0000000000000000 R09: 0000000000000000
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.577089] R10: 000000000000000d R11: 0000000000000246 R12: 0000000000000000
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.578191] R13: 000055b90a13aa30 R14: 0000000000040000 R15: 0000000000040000
Mar  5 12:27:03 ubuntu-1604-test kernel: [   23.579288] ---[ end trace d006c1f8e03b5e67 ]---

Signed-off-by: Andrey Grodzovsky <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 54 ++++++++++++++++++---------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  2 --
 2 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index bf462c5..b0575b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -77,8 +77,7 @@ struct ras_manager {
        struct amdgpu_device *adev;
        /* debugfs */
        struct dentry *ent;
-       /* sysfs */
-       struct device_attribute sysfs_attr;
+
        int attr_inuse;
 
        /* fs node name */
@@ -374,10 +373,17 @@ static const struct file_operations amdgpu_ras_debugfs_ctrl_ops = {
        .llseek = default_llseek
 };
 
+static struct ras_sysfs_attr {
+       struct device_attribute sysfs_attrs;
+       struct ras_manager *obj;
+} ras_sysfs_attrs[AMDGPU_RAS_BLOCK__LAST];
+
 static ssize_t amdgpu_ras_sysfs_read(struct device *dev,
                struct device_attribute *attr, char *buf)
 {
-       struct ras_manager *obj = container_of(attr, struct ras_manager, sysfs_attr);
+       struct ras_sysfs_attr *ras_sysfs_attr = container_of(attr, struct ras_sysfs_attr, sysfs_attrs);
+       struct ras_manager *obj = ras_sysfs_attr->obj;
+
        struct ras_query_if info = {
                .head = obj->head,
        };
@@ -694,10 +700,9 @@ int amdgpu_ras_query_error_count(struct amdgpu_device *adev,
 static ssize_t amdgpu_ras_sysfs_features_read(struct device *dev,
                struct device_attribute *attr, char *buf)
 {
-       struct amdgpu_ras *con =
-               container_of(attr, struct amdgpu_ras, features_attr);
        struct drm_device *ddev = dev_get_drvdata(dev);
        struct amdgpu_device *adev = ddev->dev_private;
+       struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
        struct ras_common_if head;
        int ras_block_count = AMDGPU_RAS_BLOCK_COUNT;
        int i;
@@ -724,11 +729,12 @@ static ssize_t amdgpu_ras_sysfs_features_read(struct device *dev,
        return s;
 }
 
+static DEVICE_ATTR(features, S_IRUGO, amdgpu_ras_sysfs_features_read, NULL);
+
 static int amdgpu_ras_sysfs_create_feature_node(struct amdgpu_device *adev)
 {
-       struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
        struct attribute *attrs[] = {
-               &con->features_attr.attr,
+               &dev_attr_features.attr,
                NULL
        };
        struct attribute_group group = {
@@ -736,22 +742,13 @@ static int amdgpu_ras_sysfs_create_feature_node(struct amdgpu_device *adev)
                .attrs = attrs,
        };
 
-       con->features_attr = (struct device_attribute) {
-               .attr = {
-                       .name = "features",
-                       .mode = S_IRUGO,
-               },
-                       .show = amdgpu_ras_sysfs_features_read,
-       };
-
        return sysfs_create_group(&adev->dev->kobj, &group);
 }
 
 static int amdgpu_ras_sysfs_remove_feature_node(struct amdgpu_device *adev)
 {
-       struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
        struct attribute *attrs[] = {
-               &con->features_attr.attr,
+               &dev_attr_features.attr,
                NULL
        };
        struct attribute_group group = {
@@ -778,17 +775,22 @@ int amdgpu_ras_sysfs_create(struct amdgpu_device *adev,
                        head->sysfs_name,
                        sizeof(obj->fs_data.sysfs_name));
 
-       obj->sysfs_attr = (struct device_attribute){
-               .attr = {
-                       .name = obj->fs_data.sysfs_name,
-                       .mode = S_IRUGO,
+
+
+       ras_sysfs_attrs[head->head.block] = (struct ras_sysfs_attr){
+               .sysfs_attrs = {
+                       .attr = {
+                               .name = obj->fs_data.sysfs_name,
+                               .mode = S_IRUGO,
+                       },
+                               .show = amdgpu_ras_sysfs_read,
                },
-                       .show = amdgpu_ras_sysfs_read,
+               .obj = obj
        };
 
        if (sysfs_add_file_to_group(&adev->dev->kobj,
-                               &obj->sysfs_attr.attr,
-                               "ras")) {
+                                   &ras_sysfs_attrs[head->head.block].sysfs_attrs.attr,
+                                   "ras")) {
                put_obj(obj);
                return -EINVAL;
        }
@@ -807,8 +809,10 @@ int amdgpu_ras_sysfs_remove(struct amdgpu_device *adev,
                return -EINVAL;
 
        sysfs_remove_file_from_group(&adev->dev->kobj,
-                               &obj->sysfs_attr.attr,
+                               &ras_sysfs_attrs[head->block].sysfs_attrs.attr,
                                "ras");
+
+       ras_sysfs_attrs[head->block].obj = NULL;
        obj->attr_inuse = 0;
        put_obj(obj);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 02cb9a1..b572bae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -88,8 +88,6 @@ struct amdgpu_ras {
        struct dentry *dir;
        /* debugfs ctrl */
        struct dentry *ent;
-       /* sysfs */
-       struct device_attribute features_attr;
        /* block array */
        struct ras_manager *objs;
 
-- 
2.7.4
