Re: ARC show_regs() triggers preempt debug splat, lockdep
On Tue, Jul 31, 2018 at 02:26:32PM -0700, Vineet Gupta wrote:
> Hi Peter, Al,
>
> Reaching out about a problem I understand, but not quite sure how to fix it.
> Its the weird feeling of how was this working all along, if at all.
>
> With print-fatal-signals enabled, there's CONFIG_DEBUG_PREEMPT splat all over,
> even with a simple single threaded segv inducing program (console log below).
> This originally came to light with a glibc test suite tst-tls3-malloc which is
> a multi-threaded monster.
>
> ARC show_regs() is a bit more fancy as it tries to print the executable path,
> faulting vma name (in case it was a shared lib etc). This involves taking a
> bunch of customary locks which seems to be tripping the debug infra.

Right, so I think that that is a fairly dodgy thing to do. As shown in
your subsequent email, if a pagefault generates a signal we might
already be holding the mmap_sem.

The thing you could do is maybe use down_read_trylock() there.

diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c
index 783b20354f8b..bb7bde11d2c8 100644
--- a/arch/arc/kernel/troubleshoot.c
+++ b/arch/arc/kernel/troubleshoot.c
@@ -92,7 +92,10 @@ static void show_faulting_vma(unsigned long address, char *buf)
 	/* can't use print_vma_addr() yet as it doesn't check for
 	 * non-inclusive vma
 	 */
-	down_read(&active_mm->mmap_sem);
+	if (!down_read_trylock(&active_mm->mmap_sem)) {
+		pr_info("@Trylock failed\n");
+		return;
+	}
 	vma = find_vma(active_mm, address);
 
 	/* check against the find_vma( ) behaviour which returns the next VMA

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: ARC show_regs() triggers preempt debug splat, lockdep
On 08/01/2018 12:53 AM, Peter Zijlstra wrote:
> On Tue, Jul 31, 2018 at 02:26:32PM -0700, Vineet Gupta wrote:
>> Hi Peter, Al,
>>
>> Reaching out about a problem I understand, but not quite sure how to fix it.
>> Its the weird feeling of how was this working all along, if at all.
>>
>> With print-fatal-signals enabled, there's CONFIG_DEBUG_PREEMPT splat all
>> over, even with a simple single threaded segv inducing program (console log
>> below). This originally came to light with a glibc test suite
>> tst-tls3-malloc which is a multi-threaded monster.
>>
>> ARC show_regs() is a bit more fancy as it tries to print the executable path,
>> faulting vma name (in case it was a shared lib etc). This involves taking a
>> bunch of customary locks which seems to be tripping the debug infra.
>
> Right, so I think that that is a fairly dodgy thing to do. As shown in
> your subsequent email, if a pagefault generates a signal we might
> already be holding the mmap_sem.
>
> The thing you could do is maybe use down_read_trylock() there.
>
> diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c
> index 783b20354f8b..bb7bde11d2c8 100644
> --- a/arch/arc/kernel/troubleshoot.c
> +++ b/arch/arc/kernel/troubleshoot.c
> @@ -92,7 +92,10 @@ static void show_faulting_vma(unsigned long address, char *buf)
>  	/* can't use print_vma_addr() yet as it doesn't check for
>  	 * non-inclusive vma
>  	 */
> -	down_read(&active_mm->mmap_sem);
> +	if (!down_read_trylock(&active_mm->mmap_sem)) {
> +		pr_info("@Trylock failed\n");
> +		return;
> +	}
>  	vma = find_vma(active_mm, address);
>
>  	/* check against the find_vma( ) behaviour which returns the next VMA

That's not the only issue here. We also call the page allocator in the
show_regs() code path, and that barfs as well with __might_sleep().
I think for us, it would make sense to re-enable preemption inside ARC
show_regs(), undoing the generic disable from get_signal(). The rationale
for that in the first place, per commit 3a9f84d354ce1, was some arch (x86?)
show_regs() calling smp_processor_id(). Arguably that could call the raw_smp
variant, but we don't want to needlessly bother the rest of the world.

Do you see any pitfall with my proposal?

-Vineet
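The proposal above can be pictured with a toy model. This is only a userspace sketch under stated assumptions: a plain integer stands in for the preempt count, model_might_sleep() stands in for the sleep-in-atomic debug check, and every name below is hypothetical rather than a kernel interface. It shows why a sleeping call made under a disabled-preemption section trips the check, and how temporarily undoing the disable inside the register dump avoids that while leaving the count balanced for the caller.

```c
#include <assert.h>
#include <stdbool.h>

static int preempt_count_model;	/* stand-in for the preempt count */
static int splat_count;		/* number of debug warnings raised */

static void model_preempt_disable(void) { preempt_count_model++; }

static void model_preempt_enable(void)
{
	assert(preempt_count_model > 0);	/* must stay balanced */
	preempt_count_model--;
}

/* Models the debug check: sleeping with preemption disabled is flagged. */
static void model_might_sleep(void)
{
	if (preempt_count_model > 0)
		splat_count++;
}

/* Models a register dump that takes sleeping locks or allocates a page. */
static void model_show_regs(bool reenable_preemption)
{
	if (reenable_preemption)
		model_preempt_enable();	/* undo the caller's disable */

	model_might_sleep();

	if (reenable_preemption)
		model_preempt_disable();	/* restore the caller's state */
}

/* Models the generic fatal-signal print path; returns warnings raised. */
static int model_fatal_signal_path(bool reenable_preemption)
{
	splat_count = 0;
	model_preempt_disable();
	model_show_regs(reenable_preemption);
	model_preempt_enable();
	return splat_count;
}
```

In this model, model_fatal_signal_path(false) raises one warning and model_fatal_signal_path(true) raises none. Whether the real code can safely do the same depends on nothing in the dump relying on staying on one CPU, which is the pitfall being asked about.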
Re: [PATCH] ARC: Improve handling of fatal signals in do_page_fault()
Hi Alexey,

I was finally forced to revisit this for my glibc tst-tls3-malloc deadlock.
And indeed with this change we don't see the deadlock. But see below.

> @@ -139,12 +139,16 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	 */
>  	fault = handle_mm_fault(vma, address, flags);
>
> -	/* If Pagefault was interrupted by SIGKILL, exit page fault "early" */
> +	/* If we need to retry but a fatal signal is pending, handle the
> +	 * signal first. We do not need to release the mmap_sem because
> +	 * it would already be released in __lock_page_or_retry in
> +	 * mm/filemap.c. */

Right, and we were already doing that: up_read() was called for
!VM_FAULT_RETRY, meaning we relied on the core mm to do that already for the
VM_FAULT_RETRY case. The issue here was the additional check for
VM_FAULT_ERROR. Typically this is not set by handle_mm_fault(), meaning for
common user faults with a signal pending we were not calling up_read(),
hence the ensuing deadlock.

>  	if (unlikely(fatal_signal_pending(current))) {
> -		if ((fault & VM_FAULT_ERROR) && !(fault & VM_FAULT_RETRY))
> -			up_read(&mm->mmap_sem);
> -		if (user_mode(regs))
> +		if (fault & VM_FAULT_RETRY) {
> +			if (!user_mode(regs))
> +				goto no_context;

Given this code is really tricky, let's only solve one problem with one
patch.

>  			return;
> +		}
>  	}

The fault handling is a spaghetti mess of checks and more checks and has not
really been touched since upstreaming. I need to clean it up and essentially
rewrite it for v4.19.
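The locking rule discussed above can be written out as two predicates. This is a minimal sketch, assuming made-up flag values (the MODEL_ constants are hypothetical and are not the kernel's definitions); it only compares the condition from the removed lines with the rule described in the reply, namely that the caller must drop the lock whenever the retry flag is not set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration only. */
#define MODEL_VM_FAULT_ERROR 0x1u
#define MODEL_VM_FAULT_RETRY 0x2u

/* Condition from the removed lines: release only if an error bit is set too. */
static bool old_code_releases_lock(unsigned int fault)
{
	return (fault & MODEL_VM_FAULT_ERROR) &&
	       !(fault & MODEL_VM_FAULT_RETRY);
}

/*
 * Rule described in the reply: on retry the core mm has already dropped
 * the lock, so the caller must release it in every other case.
 */
static bool caller_must_release_lock(unsigned int fault)
{
	return !(fault & MODEL_VM_FAULT_RETRY);
}
```

For a fault value with neither flag set, which the reply describes as the common case, old_code_releases_lock() returns false while caller_must_release_lock() returns true. That mismatch is the leaked read lock behind the deadlock.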