On Wed, May 06, 2015 at 04:07:56PM +0900, Sergey Senozhatsky wrote: > On (05/06/15 14:25), Sergey Senozhatsky wrote: > > a-ha... found it: > > http://lkml.iu.edu/hypermail/linux/kernel/1505.0/04389.html > > > > well... that's weird. can you reproduce this easily?
Easy in a second. > > > we do > > zram_add(): zram_remove(): > ------------------------------------------- > mutex_lock(idr); > idr_alloc() > ... > ->major = 252; > ->first_minor = idr index; > add_disk(); > mumutex_unlock(idr); > mutex_lock(idr); > ... > idr_remove(idr index); > del_gendisk(); > put_disk(); > mutex_unlock(idr); > > > script attempted to create a new device with minor (idr index) 2, but there > was a kernfs node from already removed zram2: '/devices/virtual/bdi/252:2' > > from your logs: > ... > [ 98.756017] zram: Removed device: zram2 > [ 98.757087] ------------[ cut here ]------------ > ... > > locked zram_index_mutex, removed zram2, unlocked zram_index_mutex, > locked zram_index_mutex, attempted to create zram2: zram2 sysfs already exist. > > > hm... need to think. zram hot/remove is serialized. I never look at the code but I doubt others(ex, some admin process on my machine) holds a ref so kobj doesn't disapper yet but zram_add try to create it? > > > > > a separate issue: > > > /* > * FIXME: error handling > */ > void add_disk(struct gendisk *disk) does not handle any errors at all nor > it > reports any errors back to caller: > > 612 /* Register BDI before referencing it from bdev */ > 613 bdi = &disk->queue->backing_dev_info; > 614 bdi_register_dev(bdi, disk_devt(disk)); > 615 > 616 blk_register_region(disk_devt(disk), disk->minors, NULL, > 617 exact_match, exact_lock, disk); > 618 register_disk(disk); > 619 blk_register_queue(disk); > 620 > 621 /* > 622 * Take an extra ref on queue which will be put on disk_release() > 623 * so that it sticks around as long as @disk is there. > 624 */ > 625 WARN_ON_ONCE(!blk_get_queue(disk->queue)); > 626 > 627 retval = sysfs_create_link(&disk_to_dev(disk)->kobj, > &bdi->dev->kobj, > 628 "bdi"); > 629 WARN_ON(retval); > > > if bdi_register_dev()->device_add() fails (for example), then > sysfs_create_link() > is just "expected" to cause: > > > [ 98.819810] BUG: unable to handle kernel NULL pointer dereference at > 0000000000000040 > [ 98.820591] IP: [<ffffffff8126da50>] sysfs_do_create_link_sd.isra.2+0x40/0xd0 > [ 98.821290] PGD 61669067 PUD 61553067 PMD 0 > [ 98.821709] Oops: 0000 [#1] SMP > [..] > [ 98.823663] Call Trace: > [ 98.823663] [<ffffffff8135931b>] ? disk_part_iter_next+0x2b/0x280 > [ 98.823663] [<ffffffff8126db05>] sysfs_create_link+0x25/0x50 > [ 98.823663] [<ffffffff81359fc2>] add_disk+0x252/0x4e0 > > > or: > > [ 98.823663] BUG: sleeping function called from invalid context at > kernel/locking/rwsem.c:21 > [ 98.823663] in_atomic(): 1, irqs_disabled(): 1, pid: 2952, name: cat > [ 98.823663] INFO: lockdep is turned off. > [ 98.823663] irq event stamp: 3392 > [ 98.823663] hardirqs last enabled at (3391): [<ffffffff81644133>] > __mutex_unlock_slowpath+0xb3/0x180 > [ 98.823663] hardirqs last disabled at (3392): [<ffffffff81648d33>] > error_sti+0x5/0x6 > [ 98.823663] softirqs last enabled at (0): [<ffffffff8105c6a9>] > copy_process.part.35+0x4d9/0x1ce0 > [ 98.823663] softirqs last disabled at (0): [< (null)>] (null) > [ 98.823663] CPU: 5 PID: 2952 Comm: cat Tainted: G D W > 4.1.0-rc2-next-20150505+ #1260 > [ 98.823663] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > Bochs 01/01/2011 > [ 98.823663] ffff88005d40f8c8 ffff88005d40f8c8 ffffffff8163d8d8 > 0000000000000000 > [ 98.823663] 0000000000000000 ffff88005d40f8f8 ffffffff8108c18e > ffffffff8108c005 > [ 98.823663] ffffffff819ec8b3 0000000000000015 0000000000000000 > ffff88005d40f928 > [ 98.823663] Call Trace: > [ 98.823663] [<ffffffff8163d8d8>] dump_stack+0x4c/0x65 > [ 98.823663] [<ffffffff8108c18e>] ___might_sleep+0x18e/0x250 > [ 98.823663] [<ffffffff8108c005>] ? ___might_sleep+0x5/0x250 > [ 98.823663] [<ffffffff8108c29d>] __might_sleep+0x4d/0x90 > [ 98.823663] [<ffffffff8108c255>] ? __might_sleep+0x5/0x90 > [ 98.823663] [<ffffffff81644494>] down_read+0x24/0x70 > [ 98.823663] [<ffffffff81644475>] ? down_read+0x5/0x70 > [ 98.823663] [<ffffffff810722a4>] ? exit_signals+0x24/0x130 > [ 98.823663] [<ffffffff810722a4>] exit_signals+0x24/0x130 > [ 98.823663] [<ffffffff81072285>] ? exit_signals+0x5/0x130 > [ 98.823663] [<ffffffff810606f8>] ? do_exit+0xb8/0xbc0 > [ 98.823663] [<ffffffff810606f8>] do_exit+0xb8/0xbc0 > [ 98.823663] [<ffffffff81060645>] ? do_exit+0x5/0xbc0 > [ 98.823663] [<ffffffff810c4a79>] ? kmsg_dump+0x119/0x1a0 > [ 98.823663] [<ffffffff810c4985>] ? kmsg_dump+0x25/0x1a0 > [ 98.823663] [<ffffffff8100665e>] oops_end+0x8e/0xd0 > [ 98.823663] [<ffffffff810065d5>] ? oops_end+0x5/0xd0 > [ 98.823663] [<ffffffff81637f65>] no_context+0x2d9/0x33e > [ 98.823663] [<ffffffff81637c91>] ? no_context+0x5/0x33e > [ 98.823663] [<ffffffff81638042>] __bad_area_nosemaphore+0x78/0x1d1 > [ 98.823663] [<ffffffff816381a0>] ? bad_area_nosemaphore+0x5/0x15 > [ 98.823663] [<ffffffff816381ae>] bad_area_nosemaphore+0x13/0x15 > [ 98.823663] [<ffffffff8104c4de>] __do_page_fault+0x9e/0x490 > [ 98.823663] [<ffffffff8104c445>] ? __do_page_fault+0x5/0x490 > [ 98.823663] [<ffffffff8104c8dc>] do_page_fault+0xc/0x10 > [ 98.823663] [<ffffffff81648b62>] page_fault+0x22/0x30 > [ 98.823663] [<ffffffff8126da50>] ? sysfs_do_create_link_sd.isra.2+0x40/0xd0 > [ 98.823663] [<ffffffff8126da50>] ? sysfs_do_create_link_sd.isra.2+0x40/0xd0 > [ 98.823663] [<ffffffff8135931b>] ? disk_part_iter_next+0x2b/0x280 > [ 98.823663] [<ffffffff8126db05>] sysfs_create_link+0x25/0x50 > [ 98.823663] [<ffffffff81359fc2>] add_disk+0x252/0x4e0 > > > there are lots of places in add_disk() that can potentially fail. it's > probably time > to "fix" it. including numerous add_disk() callers, that don't check for > errors. > > -ss -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

