https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784

--- Comment #10 from Chen Baozi <cbz at baozis dot org> ---
I have attached the testcase I used to benchmark OpenMP synchronization on
AArch64, which is extracted from the EPCC OpenMP micro-benchmark suite.

The operating system I use is Ubuntu 16.04 with a 4.4.0 kernel. The hardware I
use is an experimental 16-core AArch64 platform: 4 clusters of 4 CPU cores
each, with the clusters interconnected through the L3 cache. The thrashing
seems to be more severe when the threads are placed in one cluster, e.g., 4
threads distributed across 4 different clusters look much better than 4
threads placed in a single cluster.
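
To make the two placements compared above reproducible, thread binding can be
set explicitly through the standard OMP_PLACES environment variable. The
following is only a minimal sketch of a helper that builds such a place list;
build_places is a hypothetical name, and the core numbering (cluster k owning
cores 4k to 4k+3) is an assumption that should be checked against the actual
topology of the machine.

```c
#include <assert.h>
#include <stdio.h>

/* Build an OMP_PLACES list with one single-core place per thread.
 * stride = 1 packs the threads on consecutive cores (a single cluster
 * when nthreads <= 4); stride = 4 puts each thread in a different cluster.
 * ASSUMPTION: cores are numbered so that cluster k owns cores 4k..4k+3.
 * Returns 0 on success, -1 if the buffer is too small. */
static int build_places(char *buf, size_t len, int nthreads, int stride)
{
    size_t used = 0;

    assert(buf != NULL);
    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (int i = 0; i < nthreads; i++) {
        /* Emit ",{core}" for every place except the first, which has no comma. */
        int n = snprintf(buf + used, len - used, "%s{%d}",
                         i ? "," : "", i * stride);
        if (n < 0 || (size_t)n >= len - used)
            return -1;  /* output was truncated */
        used += (size_t)n;
    }
    return 0;
}
```

With 4 threads, a stride of 1 gives the one-cluster placement and a stride of 4
gives the one-thread-per-cluster placement. The resulting string would be
exported as OMP_PLACES, together with OMP_PROC_BIND=true and
OMP_NUM_THREADS=4, before launching the benchmark.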
