Hi Nathan!

On Fri, 29 Apr 2016 10:00:43 -0400, Nathan Sidwell <[email protected]> wrote:
> currently automatic loop partitioning assigns from the innermost loop outwards
> -- that was the simplest thing to implement. A better algorithm is to assign
> the outermost loop to the outermost available axis, and then assign from the
> innermost loop outwards. That way we (generally) get gang partitioning on the
> outermost loop. Just inside that we'll get non-partitioned loops if the nest is
> too deep, and the two innermost nested loops will get worker and vector
> partitioning.
>
> gcc/
> * omp-low.c (struct oacc_loop): Add 'inner' field.
> (new_oacc_loop_raw): Initialize it to zero.
> (oacc_loop_fixed_partitions): Initialize it.
> (oacc_loop_auto_partitions): Partition outermost loop to outermost
> available partitioning.
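
For reference, the assignment order described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the omp-low.c code: it handles a single linear loop nest, assumes all three axes (gang, worker, vector) are still available, and the names `enum axis` and `assign_axes` are made up for this sketch.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only.  */
enum axis { AXIS_NONE, AXIS_GANG, AXIS_WORKER, AXIS_VECTOR };

/* Assign axes to a linear nest of DEPTH loops, loops[0] being the
   outermost.  Assumes gang, worker and vector are all available.  */
static void
assign_axes (enum axis *loops, size_t depth)
{
  /* Remaining axes, in the order they are handed out from the
     innermost loop outwards.  */
  static const enum axis inner_axes[] = { AXIS_VECTOR, AXIS_WORKER };
  size_t next = 0;

  for (size_t i = 0; i < depth; i++)
    loops[i] = AXIS_NONE;
  if (depth == 0)
    return;

  /* Step 1: the outermost loop gets the outermost available axis.  */
  loops[0] = AXIS_GANG;

  /* Step 2: walk from the innermost loop outwards, stopping before
     the outermost loop or once the axes are used up.  Loops in
     between stay unpartitioned.  */
  for (size_t i = depth; i-- > 1 && next < 2; )
    loops[i] = inner_axes[next++];
}
```

Under these assumptions a nest of depth four ends up as gang, unpartitioned, worker, vector, which matches the description quoted above.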
I'm now observing the sporadic failures (that you had mentioned before)
of libgomp.oacc-c-c++-common/atomic_capture-1.c and
libgomp.oacc-fortran/atomic_capture-1.f90. I suppose the problem is that
constructs such as libgomp.oacc-c-c++-common/atomic_capture-1.c:

      fgot = 1.0;
      fexp = 0.0;

    #pragma acc data copy (fgot, fdata[0:N])
      {
    #pragma acc parallel loop
        for (i = 0; i < N; i++)
          {
            float expr = 32.0;

    #pragma acc atomic capture
            fdata[i] = fgot = expr - fgot;
          }
      }

      for (i = 0; i < N; i++)
        if (i % 2 == 0)
          {
            if (fdata[i] != 31.0)
              abort ();
          }
        else
          {
            if (fdata[i] != 1.0)
              abort ();
          }

... are no longer executed in stable/ascending order, so instead of the
exact "i % 2 == 0" classifier, we should now verify that the 31.0 and 1.0
cases each appear with probability 0.5? Are you looking into resolving
that, or should somebody else have a look?
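
To make that idea concrete, an order-independent check could look roughly like the sketch below. It is not taken from the testsuite: `check_capture_counts` is a made-up helper name. It relies on the observation that each atomic update flips fgot between 31.0 and 1.0, so starting from fgot == 1.0, N updates must capture 31.0 exactly (N + 1) / 2 times and 1.0 exactly N / 2 times, whatever order the iterations run in.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the testsuite.  Returns 1 if the
   captured values in FDATA are consistent with N atomic updates of
   "fgot = 32.0 - fgot" starting from fgot == 1.0, independent of the
   order in which the iterations were executed; returns 0 otherwise.  */
static int
check_capture_counts (const float *fdata, size_t n)
{
  size_t n31 = 0, n1 = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (fdata[i] == 31.0f)
        n31++;
      else if (fdata[i] == 1.0f)
        n1++;
      else
        return 0;	/* Any other value means a lost or torn update.  */
    }

  /* The first, third, ... update captures 31.0; the others 1.0.  */
  return n31 == (n + 1) / 2 && n1 == n / 2;
}
```

The test would then call abort () when this returns 0, in place of the "i % 2 == 0" loop.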
I'm also seeing the following regression for C and C++,
libgomp.oacc-c-c++-common/loop-auto-1.c with -O2:

    source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: In function 'vector_1._omp_fn.0':
    source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c:104:9: internal compiler error: Segmentation fault
     #pragma acc parallel num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)
             ^

    #4  0x0000000000f73d46 in internal_error (gmsgid=gmsgid@entry=0x105be63 "%s")
        at [...]/source-gcc/gcc/diagnostic.c:1270
    #5  0x00000000009fccb0 in crash_signal (signo=<optimized out>)
        at [...]/source-gcc/gcc/toplev.c:333
    #6  <signal handler called>
    #7  0x0000000000beaf2e in same_succ_flush_bb (bb=<optimized out>, bb=<optimized out>)
        at [...]/source-gcc/gcc/hash-table.h:919
    #8  0x0000000000bec499 in same_succ_flush_bbs (bbs=<optimized out>)
        at [...]/source-gcc/gcc/tree-ssa-tail-merge.c:823
    #9  update_worklist () at [...]/source-gcc/gcc/tree-ssa-tail-merge.c:870
    #10 tail_merge_optimize (todo=todo@entry=32)
        at [...]/source-gcc/gcc/tree-ssa-tail-merge.c:1716
    #11 0x0000000000b99057 in (anonymous namespace)::pass_pre::execute (this=<optimized out>, fun=<optimized out>)
        at [...]/source-gcc/gcc/tree-ssa-pre.c:4818
    #12 0x0000000000937e9d in execute_one_pass (pass=pass@entry=0x1530970)
        at [...]/source-gcc/gcc/passes.c:2348
    #13 0x00000000009384b8 in execute_pass_list_1 (pass=0x1530970)
        at [...]/source-gcc/gcc/passes.c:2432
    #14 0x00000000009384ca in execute_pass_list_1 (pass=0x152fa10)
        at [...]/source-gcc/gcc/passes.c:2433
    #15 0x0000000000938515 in execute_pass_list (fn=0x7ffff69a5930, pass=<optimized out>)
        at [...]/source-gcc/gcc/passes.c:2443
    #16 0x00000000005fdded in cgraph_node::expand (this=this@entry=0x7ffff6990170)
        at [...]/source-gcc/gcc/cgraphunit.c:1982
    #17 0x00000000005ff8c4 in expand_all_functions ()
        at [...]/source-gcc/gcc/cgraphunit.c:2118
    #18 symbol_table::compile (this=0x7ffff68d2000)
        at [...]/source-gcc/gcc/cgraphunit.c:2474
    #19 0x0000000000561db8 in lto_main () at [...]/source-gcc/gcc/lto/lto.c:3328
    #20 0x00000000009fccef in compile_file ()
        at [...]/source-gcc/gcc/toplev.c:463
    #21 0x000000000052e5ba in do_compile ()
        at [...]/source-gcc/gcc/toplev.c:1987
    #22 toplev::main (this=this@entry=0x7fffffffcc80, argc=argc@entry=18, argv=0x150aec0, argv@entry=0x7fffffffcd88)
        at [...]/source-gcc/gcc/toplev.c:2095
    #23 0x0000000000530247 in main (argc=18, argv=0x7fffffffcd88)
        at [...]/source-gcc/gcc/main.c:39

Are you seeing that, too? I can't remember seeing that on
gomp-4_0-branch, so it may be due to a recent trunk change, independent
of your omp-low change. Are you going to have a look, or want me to?

Regards
Thomas
