[Bug fortran/58175] [OOP] Incorrect warning message on scalar finalizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58175

Jonathan Hogg changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jhogg41 at gmail dot com

--- Comment #9 from Jonathan Hogg ---
Still present in 6.1.1, please fix.
[Bug fortran/58175] [OOP] Incorrect warning message on scalar finalizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58175

--- Comment #11 from Jonathan Hogg ---
It looks like there's already a patch there. If you point me at a list of
what needs doing to get it into the code base, I'm happy to take a look.

Thanks,
Jonathan.

On Thu, Jul 7, 2016 at 4:14 PM, dominiq at lps dot ens.fr
<gcc-bugzi...@gcc.gnu.org> wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58175
>
> --- Comment #10 from Dominique d'Humieres ---
> > Still present in 6.1.1, please fix.
>
> You're welcome to do it!
[Bug libgomp/71781] Severe performance degradation of short parallel for loop on hardware with lots of cores
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71781

Jonathan Hogg changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jhogg41 at gmail dot com

--- Comment #1 from Jonathan Hogg ---
We also see similar behaviour when running task-based (as opposed to
parallel for) code. When the number of tasks is much smaller than the
number of cores, most time is spent in libgomp spinning. Presumably there
is too much contention on the work-queue lock.

We're running on 28 real cores (2x14-core Intel Haswell-EP chips). If we
look at our task profile, we see very little of the time is spent inside
our task code, and this is confirmed by profile data from perf:

 27.60%  spral_ssids  libgomp.so.1.0.0   [.] gomp_mutex_lock_slow
  6.96%  spral_ssids  libgomp.so.1.0.0   [.] gomp_team_barrier_wait_end
  3.78%  spral_ssids  [kernel.kallsyms]  [k] _spin_lock_irq
  2.91%  spral_ssids  [kernel.kallsyms]  [k] smp_invalidate_interrupt
  2.21%  spral_ssids  spral_ssids        [.] __CreateCoarseGraphNoMask
  2.18%  spral_ssids  [kernel.kallsyms]  [k] _spin_lock
  2.05%  spral_ssids  libmkl_avx2.so     [.] mkl_blas_avx2_dgemm_kernel_0
  1.99%  spral_ssids  spral_ssids        [.] __FM_2WayNodeRefine_OneSided
  1.78%  spral_ssids  libgomp.so.1.0.0   [.] gomp_sem_wait_slow
  1.64%  spral_ssids  libc-2.12.so       [.] __GI_strtod_l_internal

Here's an example of what we're seeing (times in seconds):

Small problems (much less work than cores):

  4 cores | 28 cores
  --------+---------
     0.02 |     0.17
     0.20 |     0.60
     0.20 |     0.58
     0.14 |     0.63
     0.75 |     2.37

Bigger problems (sufficient work exists):

  4 cores | 28 cores
  --------+---------
    48.52 |    22.16
   153.49 |    61.77
   140.89 |    54.51
   189.75 |    71.43
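For anyone wanting to experiment with this, below is a minimal sketch of the
pattern described above: far fewer tasks than threads, so most of the team
idles at the implicit barrier. It assumes gcc with -fopenmp; small_task and
the task count are invented for illustration and are not the actual
SPRAL/SSIDS code.

    /* Sketch: a handful of short tasks on a machine with many cores.
       With an active wait policy, the idle threads busy-wait at the
       barrier while the few working threads run, which is where time
       in gomp_mutex_lock_slow and gomp_team_barrier_wait_end shows up
       in perf. */
    #include <stdio.h>
    #include <omp.h>

    /* Stand-in for a short unit of real work. */
    static double small_task(int i)
    {
        double x = 0.0;
        for (int j = 0; j < 100000; j++)
            x += (double)(i + j) * 1e-9;
        return x;
    }

    int main(void)
    {
        enum { NTASKS = 4 };           /* far fewer tasks than cores */
        double result[NTASKS] = { 0.0 };
        double t0 = omp_get_wtime();

    #pragma omp parallel               /* e.g. 28 threads on 2x14 cores */
    #pragma omp single
        {
            for (int i = 0; i < NTASKS; i++) {
    #pragma omp task firstprivate(i) shared(result)
                result[i] = small_task(i);
            }
        }   /* implicit barrier: the other 24+ threads wait (spin) here */

        printf("elapsed: %f s, result[0] = %f\n",
               omp_get_wtime() - t0, result[0]);
        return 0;
    }

If the spin-wait is the problem, setting the standard OMP_WAIT_POLICY=passive
environment variable (or lowering libgomp's GOMP_SPINCOUNT) should make idle
threads block instead of spinning; that would be a workaround rather than a
fix for the underlying scalability issue.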