On Tue, Sep 27, 2016 at 2:27 PM, Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com> wrote: > On 09/27/2016 05:58 PM, Michael Chan wrote: >> >> On Tue, Sep 27, 2016 at 1:05 PM, Guilherme G. Piccoli >> <gpicc...@linux.vnet.ibm.com> wrote: >>> >>> From: Milton Miller <milt...@us.ibm.com> >>> >>> While the driver is probing the adapter, an error may occur before the >>> netdev structure is allocated and attached to pci_dev. In this case, >>> not only netdev isn't available, but the tg3 private structure is also >>> not available as it is just math from the NULL pointer, so dereferences >>> must be skipped. >>> >>> The following trace is seen when the error is triggered: >>> >>> [1.402247] Unable to handle kernel paging request for data at address >>> 0x00001a99 >>> [1.402410] Faulting instruction address: 0xc0000000007e33f8 >>> [1.402450] Oops: Kernel access of bad area, sig: 11 [#1] >>> [1.402481] SMP NR_CPUS=2048 NUMA PowerNV >>> [1.402513] Modules linked in: >>> [1.402545] CPU: 0 PID: 651 Comm: eehd Not tainted 4.4.0-36-generic >>> #55-Ubuntu >>> [1.402591] task: c000001fe4e42a20 ti: c000001fe4e88000 task.ti: >>> c000001fe4e88000 >>> [1.402742] NIP: c0000000007e33f8 LR: c0000000007e3164 CTR: >>> c000000000595ea0 >>> [1.402787] REGS: c000001fe4e8b790 TRAP: 0300 Not tainted >>> (4.4.0-36-generic) >>> [1.402832] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: >>> 28000422 XER: 20000000 >>> [1.403058] CFAR: c000000000008468 DAR: 0000000000001a99 DSISR: >>> 42000000 SOFTE: 1 >>> GPR00: c0000000007e3164 c000001fe4e8ba10 c0000000015c5e00 >>> 0000000000000000 >>> GPR04: 0000000000000001 0000000000000000 0000000000000039 >>> 0000000000000299 >>> GPR08: 0000000000000000 0000000000000001 c000001fe4e88000 >>> 0000000000000006 >>> GPR12: 0000000000000000 c00000000fb40000 c0000000000e6558 >>> c000003ca1bffd00 >>> GPR16: 0000000000000000 0000000000000000 0000000000000000 >>> 0000000000000000 >>> GPR20: 0000000000000000 0000000000000000 0000000000000000 >>> c000000000d52768 >>> GPR24: c000000000d52740 0000000000000100 c000003ca1b52000 >>> 0000000000000002 >>> GPR28: 0000000000000900 0000000000000000 c00000000152a0c0 >>> c000003ca1b52000 >>> [1.404226] NIP [c0000000007e33f8] tg3_io_error_detected+0x308/0x340 >>> [1.404265] LR [c0000000007e3164] tg3_io_error_detected+0x74/0x340 >>> >>> This patch avoids the NULL pointer dereference by moving the access after >>> the netdev NULL pointer check on tg3_io_error_detected(). >>> >>> Fixes: 0486a063b1ff ("tg3: prevent ifup/ifdown during PCI error >>> recovery") >>> Fixes: dfc8f370316b ("net/tg3: Release IRQs on permanent error") >>> Tested-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com> >>> Signed-off-by: Milton Miller <milt...@us.ibm.com> >>> Signed-off-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com> >> >> >> Looks good. Do we need to add !netdev check in tg3_io_resume()? > > > Thanks Michael. It's a good point - I didn't trigger any error without the > check, but looking at error handlers, every one seems to have this check > except tg3_io_resume(). > > Do you want us to send a v2 including this check? Or maybe another patch? >
I think v2 should be fine. The additional check is very much related to the v1 patch. I will ACK it once you send v2. Thanks.