[Bug libgomp/98738] task-detach-6.f90 hangs intermittently

2021-03-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98738

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
                 CC|                             |tschwinge at gcc dot gnu.org
             Status|NEW                          |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|kcy at codesourcery dot com

--- Comment #13 from Thomas Schwinge  ---
Kwok, it seems -- at least in my testing -- that your latest commit
d656bfda2d8316627d0bbb18b10954e6aaf3c88c "openmp: Fix intermittent hanging of
task-detach-6 libgomp tests [PR98738]" has broken things with nvptx offloading
enabled: because of hanging
'libgomp.c/../libgomp.c-c++-common/task-detach-6.c', I manually terminated
testing after:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24507 thomas    20   0 9157400 130444 122980 R 100.0  0.4  41:13.59 ./task-detach-6.exe

..., and another run after:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
32365 thomas    20   0 9159472 140040 128136 R 100.0  0.4  59:05.48 ./task-detach-6.exe

I had 100 % GPU utilization in this state.

Is there something wrong (are you or anyone reproducing that with nvptx
offloading?), or is something wrong on my side?

[Bug libgomp/98738] task-detach-6.f90 hangs intermittently

2021-03-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98738

--- Comment #16 from Thomas Schwinge  ---
Ugh.  :-( Where are we tracking this, and is anyone working on this?  It's
clearly not useful to have (nvptx offloading) testing run into known
TIMEOUTs.

[Bug target/99555] New: [OpenMP/nvptx] Execution-time hang for simple nested OpenMP 'target'/'parallel'/'task' constructs

2021-03-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99555

Bug ID: 99555
   Summary: [OpenMP/nvptx] Execution-time hang for simple nested
OpenMP 'target'/'parallel'/'task' constructs
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, kcy at codesourcery dot com,
vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

Discovered during OpenMP 'task' 'detach' development.  See PR98738;
when offloaded to nvptx, '-O0', the following hangs consistently:

#pragma omp target
#pragma omp parallel
#pragma omp task
  ;

This doesn't hang when offloaded to GCN or the host device, or if
'num_threads(1)' is specified on the 'parallel'.
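
For experimenting outside the testsuite, the constructs above can be wrapped in
a small function.  This is a minimal sketch, not the testcase committed for
this PR: the function name 'nested_target_parallel_task', the 'done' flag, and
the 'nthreads' parameter are assumptions added for illustration (the report's
task body is an empty statement).  Built without '-fopenmp' the pragmas are
ignored; with nvptx offloading on an affected setup, a call with more than one
thread may hang as described above.

```c
#include <assert.h>

/* Hypothetical helper, not part of the GCC testsuite.  NTHREADS is assumed
   to be positive.  Returns the flag set by the task(s).  */
int
nested_target_parallel_task (int nthreads)
{
  int done = 0;

#pragma omp target map(tofrom: done)
#pragma omp parallel num_threads(nthreads)
#pragma omp task shared(done)
  {
    /* Every task stores the same value; 'atomic write' avoids a data race.  */
#pragma omp atomic write
    done = 1;
  }

  /* The implicit barrier ending the 'parallel' region waits for all tasks.  */
  assert (done == 1);
  return done;
}
```

Calling 'nested_target_parallel_task (1)' corresponds to the 'num_threads(1)'
case, which per the report does not hang.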

---

Not yet determined whether this is a regression, or when it started.

[Bug libgomp/98738] task-detach-6.f90 hangs intermittently

2021-03-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98738

--- Comment #17 from Thomas Schwinge  ---
I've filed PR99555 "[OpenMP/nvptx] Execution-time hang for simple nested OpenMP
'target'/'parallel'/'task' constructs".

[Bug tree-optimization/90591] Avoid unnecessary data transfer out of OMP construct

2021-03-22 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90591

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|sandra at gcc dot gnu.org    |tschwinge at gcc dot gnu.org

--- Comment #6 from Thomas Schwinge  ---
(In reply to Richard Biener from comment #3)
> I think while relying on a robust IPA analysis and optimization framework
> sounds appealing the problem is _much_ easier solved before OMP/OACC
> lowering and I would strongly suggest to tackle the problem from that side
> if you want a
> workable solution in a timely manner.

ACK.  WIP.

> I realize that has a plethora of its own issues, first of all it seems
> the respective lowering is done _very_ early - aka the optimization would
> need to be part of omplower? (I see .omp_data_i constructed there)
> 
> So what you need is liveness and def/use analysis on high GIMPLE which I
> think is straight-forward enough.  You have no SSA form at your hands
> (actually SSA names can appear and there'll be use->def links but
> no immediate uses).
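
As a hand-written analogy for the transfer this PR wants to avoid (not the
compiler transformation itself): a variable that is written inside an offload
region but is dead afterwards does not need to be copied back.  The sketch
below makes that choice manually via the 'map' clauses; the function name
'sum_of_squares' and the 'scratch' array are hypothetical, and the liveness
reasoning that justifies 'alloc' is what the quoted comment suggests the
compiler could derive on high GIMPLE.

```c
#include <assert.h>

/* Hypothetical example.  'scratch' is written inside the region but is dead
   afterwards, so it is mapped with 'alloc' (no copy in either direction);
   only 'sum' is live after the construct and is copied back.  */
int
sum_of_squares (int n)
{
  int scratch[64];
  int sum = 0;

  assert (n >= 0 && n <= 64);
#pragma omp target map(alloc: scratch) map(tofrom: sum)
  {
    for (int i = 0; i < n; i++)
      scratch[i] = i * i;
    for (int i = 0; i < n; i++)
      sum += scratch[i];
  }
  return sum;
}
```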

[Bug target/99932] New: OpenACC/nvptx offloading execution regressions starting with CUDA 11.2-era Nvidia Driver 460.27.04

2021-04-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99932

Bug ID: 99932
   Summary: OpenACC/nvptx offloading execution regressions
starting with CUDA 11.2-era Nvidia Driver 460.27.04
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

We're seeing OpenACC/nvptx offloading execution regressions (including a lot of
timeouts) starting with CUDA 11.2-era Nvidia Driver 460.27.04.  Confirmed with:
CUDA 11.2-era 460.27.04, 460.32.03, 460.39, 460.56, 460.67, and CUDA 11.3-era
465.19.01, across several variants of GPU hardware.

Explicitly (re-)confirmed good are older versions such as CUDA 9.1-era 390.12,
and CUDA 11.1-era 455.38, 455.45.01.

Most of these are in the 'vector_length > 32' testcases, but also a few others.

@@ -6147,7 +6147,7 @@ PASS:
libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c -DACC_DEVICE_T
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   (test
for warnings, line 596)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   (test
for warnings, line 618)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test
for excess errors)
[-PASS:-]{+FAIL:+}
libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2 
execution test

@@ -6581,7 +6581,8 @@ PASS:
libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c -DACC_DE
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-10.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test
for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-10.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0 
execution test
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-10.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test
for excess errors)
[-PASS:-]{+WARNING: program timed out.+}
{+FAIL:+}
libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-10.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2 
execution test
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-2.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test
for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-2.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0 
execution test
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-2.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  output
pattern test
@@ -6599,32 +6600,32 @@ PASS:
libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-3.c -DACC_DE
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-3.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  output
pattern test
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-3.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  
scan-offload-tree-dump oaccdevlow "__attribute__\\(\\(oacc function \\(1, 1,
32\\)"
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test
for excess errors)
[-PASS:-]{+WARNING: program timed out.+}
{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0 
execution test[-PASS:
libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  output
pattern test-]
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  
scan-offload-tree-dump oaccdevlow "__attribute__\\(\\(oacc function \\(1, 2,
128\\)"
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test
for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2 
execution test
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nv

[Bug middle-end/99857] [11 Regression] FAIL: libgomp.c/declare-variant-1.c (test for excess errors) by r11-7926

2021-04-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99857

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|hubicka at gcc dot gnu.org
   Last reconfirmed|2021-04-01 00:00:00          |2021-4-6
                 CC|                             |tschwinge at gcc dot gnu.org
             Status|NEW                          |ASSIGNED

--- Comment #3 from Thomas Schwinge  ---
Honza stated that he's "looking into it".

---

With offloading enabled, there are more similar failures:

[-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/task-detach-6.c (test
for excess errors)

[-PASS:-]{+FAIL:+} libgomp.c/pr99555-1.c (test for excess errors)

[-PASS:-]{+FAIL:+} libgomp.c/target-42.c (test for excess errors)

[-PASS:-]{+FAIL:+} libgomp.c++/../libgomp.c-c++-common/task-detach-6.c
(test for excess errors)

[-PASS:-]{+FAIL:+} libgomp.fortran/task-detach-6.f90   -O0  (test for
excess errors)
[-PASS:-]{+FAIL:+} libgomp.fortran/task-detach-6.f90   -O1  (test for
excess errors)
[-PASS:-]{+FAIL:+} libgomp.fortran/task-detach-6.f90   -O2  (test for
excess errors)
[-PASS:-]{+FAIL:+} libgomp.fortran/task-detach-6.f90   -O3
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions 
(test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.fortran/task-detach-6.f90   -O3 -g  (test for
excess errors)
[-PASS:-]{+FAIL:+} libgomp.fortran/task-detach-6.f90   -Os  (test for
excess errors)

..., and:

during IPA pass: modref
[...]/libgomp/testsuite/libgomp.c/target-32.c:55:1: internal compiler
error: in omp_lto_output_declare_variant_alt, at omp-general.c:2368

... seen for:

[-PASS:-]{+FAIL: libgomp.c/target-32.c (internal compiler error)+}
{+FAIL:+} libgomp.c/target-32.c (test for excess errors)

[-PASS:-]{+FAIL: libgomp.c/thread-limit-2.c (internal compiler error)+}
{+FAIL:+} libgomp.c/thread-limit-2.c (test for excess errors)

[Bug target/100001] New: [GCN offloading] Occasional C++ 'libgomp.oacc-c-c++-common/static-variable-1.c' execution failure

2021-04-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100001

Bug ID: 100001
   Summary: [GCN offloading] Occasional C++
'libgomp.oacc-c-c++-common/static-variable-1.c'
execution failure
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, jules at gcc dot gnu.org
  Target Milestone: ---
Target: gcn

The testcase added in my recent commit ffa0ae6eeef3ad15d3f288283e4c477193052f1a
"Add 'libgomp.oacc-c-c++-common/static-variable-1.c' [PR84991, PR84992,
PR90779]" has -- so far ;-) -- never failed for 'libgomp.oacc-c', but for
'libgomp.oacc-c++' it occasionally/"randomly" fails with GCN offloading, for
'-O0' and/or '-O2'.  It's certainly possible that something's wrong with my
verification logic, but then why, in quite some testing, has the failure only
ever shown up for the C++ and never for the C variant?  On the other hand, why
is C++ behaving differently from C at all?  (I haven't spent any time on
understanding that.)

A few examples of failures with GCN offloading:

static-variable-1.exe:
[...]/libgomp/testsuite/libgomp.oacc-c++/../libgomp.oacc-c-c++-common/static-variable-1.c:355:
void t2(): Assertion `result_1_ == (((var_init_1 + num_gangs_actual_1 * (1 +
i)) * (1 + var_init_1 + num_gangs_actual_1 * (1 + i)) / 2) - ((var_init_1 +
num_gangs_actual_1 * (0 + i)) * (1 + var_init_1 + num_gangs_actual_1 * (0 + i))
/ 2))' failed.

static-variable-1.exe:
[...]/libgomp/testsuite/libgomp.oacc-c++/../libgomp.oacc-c-c++-common/static-variable-1.c:368:
void t2(): Assertion `result_2_ == (((t2_var_init_2 + num_gangs_actual_2 * (1 +
i)) * (1 + t2_var_init_2 + num_gangs_actual_2 * (1 + i)) / 2) - ((t2_var_init_2
+ num_gangs_actual_2 * (0 + i)) * (1 + t2_var_init_2 + num_gangs_actual_2 * (0
+ i)) / 2))' failed.

static-variable-1.exe:
[...]/libgomp/testsuite/libgomp.oacc-c++/../libgomp.oacc-c-c++-common/static-variable-1.c:381:
void t2(): Assertion `result_3_ == (((var_init_3 + num_gangs_actual_3 * (1 +
i)) * (1 + var_init_3 + num_gangs_actual_3 * (1 + i)) / 2) - ((var_init_3 +
num_gangs_actual_3 * (0 + i)) * (1 + var_init_3 + num_gangs_actual_3 * (0 + i))
/ 2))' failed.

I've -- so far ;-) -- not seen any failures with nvptx offloading.
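
For reading the assertion messages above: the right-hand side has the shape
T(a + g*(1 + i)) - T(a + g*i) with T(x) = x*(x + 1)/2, which is the sum of the
'g' consecutive integers following a + g*i.  The sketch below only restates
that arithmetic; the function names are hypothetical, the parameters stand in
for 'var_init_N', 'num_gangs_actual_N' and 'i', and non-negative inputs are
assumed.  It makes no claim about how the testcase itself computes 'result_N_'.

```c
#include <assert.h>

/* T(x) = x * (x + 1) / 2, that is, 1 + 2 + ... + x.  */
static long
triangular (long x)
{
  return x * (x + 1) / 2;
}

/* Right-hand side of the failing assertions, with shortened names.  */
long
expected_result (long var_init, long num_gangs, long i)
{
  return triangular (var_init + num_gangs * (1 + i))
         - triangular (var_init + num_gangs * (0 + i));
}

/* The same value as an explicit sum of consecutive integers.  */
long
sum_of_range (long var_init, long num_gangs, long i)
{
  long sum = 0;
  for (long v = var_init + num_gangs * i + 1;
       v <= var_init + num_gangs * (i + 1); v++)
    sum += v;
  return sum;
}
```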

[Bug middle-end/84992] [openacc] function static var in parallel

2021-04-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84992

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           See Also|                             |https://github.com/OpenACC/openacc-spec/issues/372
     Ever confirmed|0                            |1
   Last reconfirmed|                             |2021-04-09
             Status|UNCONFIRMED                  |SUSPENDED
           Assignee|unassigned at gcc dot gnu.org|tschwinge at gcc dot gnu.org

--- Comment #5 from Thomas Schwinge  ---
(In reply to Tom de Vries from comment #2)
> (In reply to Tom de Vries from comment #0)
> > When compiling this openacc testcase:
> > [...]

... this now works as expected.

Current behavior documented via commit ffa0ae6eeef3ad15d3f288283e4c477193052f1a
"Add 'libgomp.oacc-c-c++-common/static-variable-1.c' [PR84991, PR84992,
PR90779]".


> get this clarified with the OpenACC standard people?

See the OpenACC specification issue listed under "See Also" above;
waiting for OpenACC specification, thus setting "suspended" here.

[Bug middle-end/84991] [openacc] Misleading error message for function static var in routine

2021-04-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84991

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
     Ever confirmed|0                            |1
           Assignee|unassigned at gcc dot gnu.org|tschwinge at gcc dot gnu.org
           See Also|                             |https://github.com/OpenACC/openacc-spec/issues/372
             Status|UNCONFIRMED                  |SUSPENDED
   Last reconfirmed|                             |2021-04-09

--- Comment #3 from Thomas Schwinge  ---
(In reply to Tom de Vries from comment #0)
> The openacc spec tells us at 2.15.1. Routine Directive:
> ...
> In C and C++, function static variables are not supported in functions to
> which a routine directive applies.
> ...
> 
> When compiling such an example with -fopenacc:
> [...]

... this now works as expected.

Current behavior documented via commit ffa0ae6eeef3ad15d3f288283e4c477193052f1a
"Add 'libgomp.oacc-c-c++-common/static-variable-1.c' [PR84991, PR84992,
PR90779]".


See the OpenACC specification issue listed under "See Also" above;
waiting for OpenACC specification, thus setting "suspended" here.

[Bug middle-end/90115] OpenACC: predetermined private levels for variables declared in blocks

2021-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90115

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Blocks|                             |90114

--- Comment #1 from Thomas Schwinge  ---
Very much related is privatization via the corresponding 'private' clauses, at
the respective level.  (Thus not filing a new PR for that.)

OpenACC 3.1, 2.5.12 "private clause" (similar 2.5.13 "firstprivate clause")
states that on compute constructs, "The 'private' clause [...] declares that a
copy of each item on the list will be created for each gang", and OpenACC 3.1,
2.9.10 "private clause" states:

| The 'private' clause on a 'loop' construct specifies that a copy of each item
in var-list will be created. If the body of the loop is executed in
'vector-partitioned' mode, a copy of the item is created for each thread
associated with each vector lane. If the body of the loop is executed in
'worker-partitioned' 'vector-single' mode, a copy of the item is created for
and shared across the set of threads associated with all the vector lanes of
each worker. Otherwise, a copy of the item is created for and shared across the
set of threads associated with all the vector lanes of all the workers of each
gang.
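
A minimal sketch of a 'private' clause on a 'loop' construct, as in the quoted
2.9.10 text.  The function name 'acc_loop_private_sum' and the variable 'tmp'
are hypothetical; which partitioning mode the loop body runs in (and thus how
many copies of 'tmp' exist) is left to the implementation here.  Built without
'-fopenacc' the pragmas are ignored and the loop runs sequentially.

```c
#include <assert.h>

/* Hypothetical example: 'tmp' is privatized at the 'loop' level, so loop
   iterations that run concurrently do not share one copy of it.  */
int
acc_loop_private_sum (int n)
{
  int sum = 0;
  int tmp;

  assert (n >= 0);
#pragma acc parallel copy(sum)
  {
#pragma acc loop private(tmp) reduction(+:sum)
    for (int i = 0; i < n; i++)
      {
        tmp = i * i;
        sum += tmp;
      }
  }
  return sum;
}
```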


Also related is PR90114 "Predetermined private levels for variables declared in
OpenACC accelerator routines".


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90114
[Bug 90114] Predetermined private levels for variables declared in OpenACC
accelerator routines

[Bug target/100181] hot-cold partitioned code doesn't assemble

2021-04-21 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Keywords|assemble-failure             |
         Depends on|94278                        |
                 CC|                             |ams at gcc dot gnu.org,
                   |                             |burnus at gcc dot gnu.org,
                   |                             |jules at gcc dot gnu.org,
                   |                             |tschwinge at gcc dot gnu.org

--- Comment #5 from Thomas Schwinge  ---
Isn't that a duplicate of what I once filed as PR94278 "[amdgcn] Offloading
build failures due to 'llvm-mc' SIGSEGV" -- with "improved error reporting":
error diagnostic instead of SIGSEGV?

PR94278 includes a patch that I've locally been using ever since, but Andrew
has not been able to reproduce that problem.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94278
[Bug 94278] [amdgcn] Offloading build failures due to 'llvm-mc' SIGSEGV

[Bug target/100181] hot-cold partitioned code doesn't assemble

2021-04-21 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181

--- Comment #8 from Thomas Schwinge  ---
(In reply to Tobias Burnus from comment #7)
> (I could not reproduce the LLVM 9 issue in PR94278 back then.)

Hmm, but didn't you say in the LLVM issue (linked to in _See Also_ in
PR94278) that you _did_ reproduce this back then?  (Just making sure that it's
not actually a different issue that we're discussing now?)

[Bug fortran/97390] Error compiling acc data present

2020-10-13 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97390

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
     Ever confirmed|0                            |1
                 CC|                             |burnus at gcc dot gnu.org,
                   |                             |tschwinge at gcc dot gnu.org
   Last reconfirmed|                             |2020-10-13
             Status|UNCONFIRMED                  |WAITING
           Assignee|unassigned at gcc dot gnu.org|tschwinge at gcc dot gnu.org

--- Comment #1 from Thomas Schwinge  ---
Would it be the case that this line is longer than 132 characters?  After this,
"characters are ignored in typical free-form lines in the source file"; see the
'-ffree-line-length' option.  Thus, does '-ffree-line-length-none' help?

(I cannot comment on (a) why that's a useful limit to have, and (b) whether the
error reporting shouldn't be improved.)
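
To check a source line against that limit before reaching for
'-ffree-line-length-none', something like the following could be used.  It is
a sketch only: the helper name 'exceeds_free_form_limit' and the macro are
hypothetical, and the value 132 is the gfortran default mentioned above.

```c
#include <assert.h>
#include <string.h>

/* gfortran's default free-form line length (see '-ffree-line-length-n').  */
#define FREE_FORM_DEFAULT_LIMIT 132

/* Hypothetical helper: returns nonzero if LINE, not counting a trailing
   newline or carriage return, is longer than the default limit.  */
int
exceeds_free_form_limit (const char *line)
{
  assert (line != NULL);
  size_t len = strcspn (line, "\r\n");
  return len > FREE_FORM_DEFAULT_LIMIT;
}
```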

[Bug fortran/97390] Error compiling acc data present

2020-10-13 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97390

--- Comment #3 from Thomas Schwinge  ---
Thanks for checking/confirming that.


Next try:

(In reply to afernandez from comment #0)
>  6475 |   !$acc data present([...]) async(counter+1)
>   |1
> Error: Unclassifiable OpenACC directive at (1)

> I'm not the developer and the app

Is the source code freely available?

> is supposed to
> compile smoothly with the PGI compiler.

Per the current versions of the OpenACC specification, the 'data' construct
doesn't allow an 'async' clause, so this isn't valid OpenACC code.  If you
know, can you tell (is there some documentation) how PGI are implementing this?
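
For comparison, asynchronous execution can be expressed with clauses the
specification does allow, by putting 'async' on the compute construct and
waiting before the 'data' region ends.  This is a sketch in C rather than
Fortran, with a hypothetical function 'scale_async' and queue argument; it is
not a statement about what the PGI compiler does with 'async' on 'data'.
Built without '-fopenacc' the pragmas are ignored.

```c
#include <assert.h>

/* Hypothetical example: 'async' sits on the 'parallel loop', not on 'data'.
   The 'wait' ensures the kernel has finished before the data region copies
   'a' back to the host.  */
void
scale_async (float *a, int n, int queue)
{
  assert (a != 0 && n > 0);
#pragma acc data copy(a[0:n])
  {
#pragma acc parallel loop present(a[0:n]) async(queue)
    for (int i = 0; i < n; i++)
      a[i] *= 2.0f;
#pragma acc wait(queue)
  }
}
```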

[Bug fortran/97390] [OpenACC] 'async' clause on 'data' construct

2020-10-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97390

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           See Also|                             |https://github.com/OpenACC/openacc-spec/issues/318
             Status|WAITING                      |SUSPENDED
            Summary|[OpenACC] Error compiling    |[OpenACC] 'async' clause on
                   |acc data present             |'data' construct

--- Comment #7 from Thomas Schwinge  ---
(In reply to Tobias Burnus from comment #5)
> I claim that GCC is right: 'acc data' does not have a 'async' clause in
> OpenACC neither in 2.6 (= supported by GCC 10) nor in 3.0; see
> https://www.openacc.org/specification

Thanks for confirming.

I've filed "'async' clause on 'data' construct" with the OpenACC
specification (see "See Also" above; need to be a member to see).  Thus setting
this PR to SUSPENDED until a decision is made.

[Bug target/97436] [nvptx] -m32 support

2020-10-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97436

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
                 CC|                             |tschwinge at gcc dot gnu.org
   Last reconfirmed|                             |2020-10-15
     Ever confirmed|0                            |1
             Status|UNCONFIRMED                  |NEW

--- Comment #1 from Thomas Schwinge  ---
We originally had been asked to support OpenACC/nvptx offloading with both
64-bit x86_64 and 32-bit x86, but the idea of a 32-bit variant got scrapped
very early in the project.  I don't think we've ever tested that, and it
doesn't serve any practical use.  I suggest removing it.

See also PR65099.

[Bug testsuite/80219] relative line numbers only working if gcc_{error,warning}_prefix defined

2020-11-02 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80219

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
     Ever confirmed|0                            |1
                 CC|                             |tschwinge at gcc dot gnu.org
             Status|UNCONFIRMED                  |NEW
   Last reconfirmed|                             |2020-11-02

--- Comment #5 from Thomas Schwinge  ---
Fixed (only) for libgomp.

[Bug testsuite/85303] [testsuite, libgomp] dg-message not supported

2020-11-02 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85303

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|tschwinge at gcc dot gnu.org
             Status|UNCONFIRMED                  |RESOLVED
         Resolution|---                          |FIXED
                 CC|                             |tschwinge at gcc dot gnu.org

--- Comment #5 from Thomas Schwinge  ---
Fixed.

[Bug debug/97718] New: [11 regression] Excessive GDB memory usage after GCC "Save some memory at debug stream-in time"

2020-11-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97718

Bug ID: 97718
   Summary: [11 regression] Excessive GDB memory usage after GCC
"Save some memory at debug stream-in time"
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
  Target Milestone: ---

As of commit r11-4664-g104ca9cfa60aa1d5ddd3574bed012d394e8c "Save some
memory at debug stream-in time", I notice excessive GDB memory usage for
certain testcases, for example:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
26165 tschwing  20   0 6514108 6.087g   5752 R  99.1 39.0   1:07.45 gdb -nx -nw -quiet -batch -x pr54519-4.gdb ./pr54519-4.exe

(Growing much bigger, rendering the GCC testsuite unusable due to thrashing,
timeouts.)

This is on an up-to-date Ubuntu 14.04 x86_64 GNU/Linux system using:

  - GNU assembler (GNU Binutils for Ubuntu) 2.24
  - GNU ld (GNU Binutils for Ubuntu) 2.24
  - GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.3) 7.7.1

[Bug fortran/97782] New: [Fortran] Confused location information for OpenACC compute constructs

2020-11-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97782

Bug ID: 97782
   Summary: [Fortran] Confused location information for OpenACC
compute constructs
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: burnus at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Created attachment 49539
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49539&action=edit
testcase

Tobias, please see the '-fdump-tree-original-lineno' dump of the testcase I'm
attaching.  Instead of expected line 9, the '#pragma acc kernels' has location
information for line 18, which happens to be the second '#pragma acc loop'. 
The same happens for other OpenACC compute constructs.  I have not checked
OpenMP.

Amongst other things, this confuses the '-fopt-info-omp-all' checking (that
one's specific to OpenACC 'kernels').

[Bug tree-optimization/97623] [9/10/11 Regression] Extremely slow O2 compile (>>O(n^2))

2020-11-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97623

Thomas Schwinge  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
                 CC|                             |tschwinge at gcc dot gnu.org

--- Comment #18 from Thomas Schwinge  ---
Noting the following in case there's something unexpected there:

(In reply to CVS Commits from comment #17)
> commit r11-4913-gbd87cc14ebdb6789e067fb1828d5808407c308b3
> Author: Richard Biener 
> Date:   Wed Nov 11 11:51:59 2020 +0100
> 
> tree-optimization/97623 - Avoid PRE hoist insertion iteration

On x86_64 GNU/Linux native, this introduced an ICE regression in
'libgomp/plugin/plugin-gcn.c' build (only! -- all other target libraries built
fine):

during GIMPLE pass: pre
[...]/libgomp/plugin/plugin-gcn.c: In function ‘run_kernel’:
[...]/libgomp/plugin/plugin-gcn.c:2066:1: internal compiler error:
Segmentation fault
 2066 | run_kernel (struct kernel_info *kernel, void *vars,
  | ^~

Program received signal SIGSEGV, Segmentation fault.
bitmap_bit_p (head=0x20, bit=879) at [...]/gcc/bitmap.c:989
989   if (!head->tree_form)
(gdb) bt
#0  bitmap_bit_p (head=0x20, bit=879) at [...]/gcc/bitmap.c:989
#1  0x01057bb3 in bitmap_set_contains_value (value_id=879, set=0x0)
at [...]/gcc/tree-ssa-pre.c:899
#2  bitmap_value_replace_in_set (set=0x0, expr=expr@entry=0x2d7f990) at
[...]/gcc/tree-ssa-pre.c:920
#3  0x01057f35 in create_expression_by_pieces
(block=block@entry=0x763a51a0, expr=expr@entry=0x2d798f0,
stmts=stmts@entry=0x7fffc440, type=<optimized out>) at
[...]/gcc/tree-ssa-pre.c:3003
#4  0x0105e3e9 in do_hoist_insertion (block=<optimized out>) at
[...]/gcc/tree-ssa-pre.c:3648
#5  insert () at [...]/gcc/tree-ssa-pre.c:3764
#6  (anonymous namespace)::pass_pre::execute (this=<optimized out>,
fun=0x76221da8) at [...]/gcc/tree-ssa-pre.c:4299
#7  0x00d50820 in execute_one_pass (pass=pass@entry=0x2c132f0) at
[...]/gcc/passes.c:2564
#8  0x00d51198 in execute_pass_list_1 (pass=0x2c132f0) at
[...]/gcc/passes.c:2653
#9  0x00d511aa in execute_pass_list_1 (pass=0x2c12220) at
[...]/gcc/passes.c:2654
#10 0x00d511f5 in execute_pass_list (fn=<optimized out>,
pass=<optimized out>) at [...]/gcc/passes.c:2664
#11 0x0094ae6d in cgraph_node::expand (this=0x76220dd0) at
[...]/gcc/cgraphunit.c:1829
#12 0x0094c69d in expand_all_functions () at
[...]/gcc/cgraphunit.c:1997
#13 symbol_table::compile (this=this@entry=0x767a8100) at
[...]/gcc/cgraphunit.c:2361
#14 0x0094fdde in symbol_table::compile (this=0x767a8100) at
[...]/gcc/cgraphunit.c:2274
#15 symbol_table::finalize_compilation_unit (this=0x767a8100) at
[...]/gcc/cgraphunit.c:2542
#16 0x00e866b1 in compile_file () at [...]/gcc/toplev.c:485
#17 0x00742f4c in do_compile () at [...]/gcc/toplev.c:2320
#18 toplev::main (this=this@entry=0x7fffc6e0, argc=argc@entry=62,
argv=argv@entry=0x7fffc7e8) at [...]/gcc/toplev.c:2459
#19 0x00746f87 in main (argc=62, argv=0x7fffc7e8) at
[...]/gcc/main.c:39

I find this ICE is then again cured by your follow-on commit
r11-4921-g86cca5cc14602814b98e55aae313fbe237af1b04 "Fix PRE topological
expression set sorting", but I can't easily tell if that's now all
fine/expected, or if there may be some underlying problem here, which makes the
ICE just hidden again?  If relevant, please tell if you'd like me to attach a
pre-processed source file.
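
One plausible reading of frames #0 to #2 above is that a null 'set' pointer
was passed down and the address of one of its members was then taken, which
would yield a small non-null pointer such as 'head=0x20'.  The sketch below
only illustrates that general mechanism with stand-in types: the struct names,
member names, and the 0x20 size are assumptions chosen to match the backtrace,
not GCC's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the 0x20 size is an assumption, not GCC's layout.  */
struct head_sketch
{
  unsigned char opaque[0x20];
};

struct set_sketch
{
  struct head_sketch first;
  struct head_sketch second;
};

/* Offset that taking the address of 'second' adds to the base pointer.  With
   a null base pointer, the resulting address equals this offset.  */
size_t
second_member_offset (void)
{
  return offsetof (struct set_sketch, second);
}

/* Hypothetical self-check for the assumed layout.  */
void
check_offset (void)
{
  assert (second_member_offset () == 0x20);
}
```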

[Bug tree-optimization/97970] New: [11 regression] 'gcc.dg/gomp/pr82374.c scan-tree-dump-times vect "vectorized 1 loops" 2' for 32-bit x86

2020-11-24 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97970

Bug ID: 97970
   Summary: [11 regression] 'gcc.dg/gomp/pr82374.c
scan-tree-dump-times vect "vectorized 1 loops" 2' for
32-bit x86
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, uweigand at gcc dot gnu.org
  Target Milestone: ---
Target: 32-bit x86

Seen for 32-bit x86 (x86_64-pc-linux-gnu with '-m32'):

PASS: gcc.dg/gomp/pr82374.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/gomp/pr82374.c scan-tree-dump-times vect
"vectorized 1 loops" 2

[...]/gcc.dg/gomp/pr82374.c:18:9: note: vectorized [-1-] {+0+} loops in
function.

[...]/gcc.dg/gomp/pr82374.c:24:1: note: vectorized [-1-] {+0+} loops in
function.

Per my testing, this appears with recent commit
r11-5310-gc4fa3728ab4f78984a549894e0e8c4d6a253e540 "Fix -ffast-math flags
handling inconsistencies".

(I don't know whether that's a latent issue, or whether the testcase has any
issues.)

[Bug regression/97981] New: [11 regression] 32-bit x86 'gcc.dg/atomic/c11-atomic-exec-1.c' execution test

2020-11-24 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97981

Bug ID: 97981
   Summary: [11 regression] 32-bit x86
'gcc.dg/atomic/c11-atomic-exec-1.c' execution test
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: regression
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Per my testing, for 32-bit x86 (x86_64-pc-linux-gnu '-m32'), something in
71e234a5c94ddaef4070b3a74cf6d867dfe1a24b..fbd4553d99bc918b645194da1dba9e8f5f1cdece
causes:

@@ -209352,17 +209628,17 @@ PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O0 
execution test
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O1  (test for excess errors)
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O1  execution test
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O2  (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/atomic/c11-atomic-exec-1.c   -O2  execution test
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/atomic/c11-atomic-exec-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/atomic/c11-atomic-exec-1.c   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  execution test
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess
errors)
[-PASS:-]{+FAIL:+} gcc.dg/atomic/c11-atomic-exec-1.c   -O3
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions 
execution test
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -O3 -g  (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/atomic/c11-atomic-exec-1.c   -O3 -g  execution
test
PASS: gcc.dg/atomic/c11-atomic-exec-1.c   -Os  (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/atomic/c11-atomic-exec-1.c   -Os  execution test

No further details in the '*.log' file.

[Bug c++/97996] New: [OMP] Missing 'omp_mappable_type' error diagnostic inside C++ template

2020-11-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97996

Bug ID: 97996
   Summary: [OMP] Missing 'omp_mappable_type' error diagnostic
inside C++ template
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc, openmp
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 49628
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49628&action=edit
'map-1_.C'

As discussed in  (for
OpenMP, and now reproduced for OpenACC, too), see XFAILs in attached testcase:
missing "error: ‘a’ referenced in target region does not have a mappable type"
diagnostics inside C++ template.

[Bug c++/98006] New: [OpenACC] 'gcc/cp/decl.c:check_goto' should consider 'flag_openacc' in addition to 'flag_openmp'?

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98006

Bug ID: 98006
   Summary: [OpenACC] 'gcc/cp/decl.c:check_goto' should consider
'flag_openacc' in addition to 'flag_openmp'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Need to locate relevant OpenMP testcases, and verify/translate them for
OpenACC.


Relatedly, how does this C++ front end checking relate to the middle end
checking done in 'gcc/omp-low.c:pass_diagnose_omp_blocks'?

[Bug c++/98007] New: [OpenACC] 'gcc/cp/semantics.c:finish_return_stmt' should consider 'flag_openacc' in addition to 'flag_openmp' for 'gcc/cp/decl.c:check_omp_return'?

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98007

Bug ID: 98007
   Summary: [OpenACC] 'gcc/cp/semantics.c:finish_return_stmt'
should consider 'flag_openacc' in addition to
'flag_openmp' for 'gcc/cp/decl.c:check_omp_return'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Need to locate relevant OpenMP testcases, and verify/translate them for
OpenACC.


Is this related to PR98006 "[OpenACC] 'gcc/cp/decl.c:check_goto' should
consider 'flag_openacc' in addition to 'flag_openmp'?"


Relatedly, how does this C++ front end checking relate to the middle end
checking done in 'gcc/omp-low.c:pass_diagnose_omp_blocks'?

[Bug fortran/98009] New: [OpenACC] 'gcc/fortran/match.c:gfc_match_type_spec' should consider 'flag_openacc' in addition to 'flag_openmp'?

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98009

Bug ID: 98009
   Summary: [OpenACC] 'gcc/fortran/match.c:gfc_match_type_spec'
should consider 'flag_openacc' in addition to
'flag_openmp'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

'gcc/fortran/match.c:gfc_match_type_spec':

/* Found leading colon in REAL::, a trailing ')' in for example
   TYPE IS (REAL), or REAL, for an OpenMP list-item.  */
if (c == ':' || c == ')' || (flag_openmp && c == ','))
 return MATCH_YES;

If that indeed doesn't apply to OpenACC, too, then let's please add some "dummy
handling" to make this explicit, to show that we did consider this.
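
To illustrate what such "dummy handling" could look like, here is a minimal
standalone sketch in C.  It is a model of the quoted condition, not GCC code:
the function name 'ends_type_spec' is hypothetical, the flags are passed as
parameters instead of being GCC's globals, and it assumes that OpenACC really
needs no special case here, which is exactly the open question of this report.

```c
#include <stdbool.h>

/* Standalone model of the quoted condition, not GCC code.  */
bool ends_type_spec (char c, bool flag_openmp, bool flag_openacc)
{
  /* Hypothetical "dummy handling": OpenACC has been considered and, under
     the assumption that it needs no special case, is explicitly ignored.  */
  (void) flag_openacc;

  /* Leading colon in 'REAL::', trailing ')' in 'TYPE IS (REAL)', or
     'REAL,' for an OpenMP list-item.  */
  return c == ':' || c == ')' || (flag_openmp && c == ',');
}
```

With this shape, a reader of the code sees at a glance that the OpenACC flag
was looked at and deliberately left without effect.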

[Bug fortran/98010] New: [OpenACC] 'gcc/fortran/options.c:gfc_post_options' should consider 'flag_openacc' in addition to 'flag_openmp'?

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98010

Bug ID: 98010
   Summary: [OpenACC] 'gcc/fortran/options.c:gfc_post_options'
should consider 'flag_openacc' in addition to
'flag_openmp'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

'gcc/fortran/options.c:gfc_post_options':

[...]
else if (!flag_automatic && flag_openmp)
  gfc_warning_now (0, "Flag %<-fno-automatic%> overwrites %<-frecursive%> implied by "
  "%<-fopenmp%>");
else if (flag_max_stack_var_size != -2 && flag_recursive)
  gfc_warning_now (0, "Flag %<-frecursive%> overwrites %<-fmax-stack-var-size=%d%>",
  flag_max_stack_var_size);
else if (flag_max_stack_var_size != -2 && flag_openmp)
  gfc_warning_now (0, "Flag %<-fmax-stack-var-size=%d%> overwrites %<-frecursive%> "
  "implied by %<-fopenmp%>", flag_max_stack_var_size);
[...]
/* Implied -frecursive; implemented as -fmax-stack-var-size=-1.  */
if (flag_max_stack_var_size == -2 && flag_openmp && flag_automatic)
  {
flag_recursive = 1;
flag_max_stack_var_size = -1;
[...]


If that indeed doesn't apply to OpenACC, too, then let's please add some "dummy
handling" to make this explicit, to show that we did consider this.

[Bug fortran/98011] New: [OpenACC] 'gcc/fortran/scanner.c:load_line' should consider 'flag_openacc' in addition to 'flag_openmp' (and vice versa?)?

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98011

Bug ID: 98011
   Summary: [OpenACC] 'gcc/fortran/scanner.c:load_line' should
consider 'flag_openacc' in addition to 'flag_openmp'
(and vice versa?)?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

It's not obvious (to me, at least), why in 'gcc/fortran/scanner.c:load_line'
'flag_openmp' and 'flag_openacc' are handled differently:

/* For truncation and tab warnings, set seen_comment to false if one has
   either an OpenMP or OpenACC directive - or a !GCC$ attribute.  If
   OpenMP is enabled, use '!$' as as conditional compilation sentinel
   and OpenMP directive ('!$omp').  */
if (seen_comment && first_comment && flag_openmp && comment_ix + 1 == i
&& c == '$')
  first_comment = seen_comment = false;
if (seen_comment && first_comment && comment_ix + 4 == i)
  {
if (((*pbuf)[comment_ix+1] == 'g' || (*pbuf)[comment_ix+1] == 'G')
&& ((*pbuf)[comment_ix+2] == 'c' || (*pbuf)[comment_ix+2] == 'C')
&& ((*pbuf)[comment_ix+3] == 'c' || (*pbuf)[comment_ix+3] == 'C')
&& c == '$')
  first_comment = seen_comment = false;
if (flag_openacc
&& (*pbuf)[comment_ix+1] == '$'
&& ((*pbuf)[comment_ix+2] == 'a' || (*pbuf)[comment_ix+2] == 'A')
&& ((*pbuf)[comment_ix+3] == 'c' || (*pbuf)[comment_ix+3] == 'C')
&& (c == 'c' || c == 'C'))
  first_comment = seen_comment = false;
  }

Shouldn't this also be handled vice versa?

If that indeed is meant to be different, then let's please add some "dummy
handling"/commentary to make this explicit, to show that we did consider this.
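
For illustration, the asymmetry in the quoted code can be modelled with a
small standalone C sketch.  This is not GCC code: the function names are
hypothetical, the comment is assumed to start with '!' (the quoted code works
relative to 'comment_ix' instead), and the flags are passed as parameters.
As in the excerpt, with OpenMP enabled the bare '!$' sentinel is enough,
whereas for OpenACC the full '!$acc' is required.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Standalone model, not GCC code: does 's' begin with 'sentinel',
   ignoring case?  */
bool has_sentinel (const char *s, const char *sentinel)
{
  size_t n = strlen (sentinel);

  for (size_t i = 0; i < n; i++)
    if (s[i] == '\0'
        || tolower ((unsigned char) s[i])
           != tolower ((unsigned char) sentinel[i]))
      return false;
  return true;
}

/* Model of when the quoted code resets 'seen_comment': '!$' with OpenMP
   enabled, '!GCC$' always, '!$acc' with OpenACC enabled.  */
bool comment_is_directive (const char *comment, bool flag_openmp,
                           bool flag_openacc)
{
  if (flag_openmp && has_sentinel (comment, "!$"))
    return true;
  if (has_sentinel (comment, "!gcc$"))
    return true;
  if (flag_openacc && has_sentinel (comment, "!$acc"))
    return true;
  return false;
}
```

In this model, '!$acc parallel' is recognized with only OpenMP enabled
(through the bare '!$' check), while '!$omp parallel' is not recognized with
only OpenACC enabled.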

[Bug fortran/98012] New: [OpenACC] 'gcc/fortran/scanner.c:include_line' should consider 'flag_openacc' in addition to 'flag_openmp'?

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98012

Bug ID: 98012
   Summary: [OpenACC] 'gcc/fortran/scanner.c:include_line' should
consider 'flag_openacc' in addition to 'flag_openmp'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

If that indeed is meant to be different, then let's please add some "dummy
handling"/commentary to make this explicit, to show that we did consider this.

[Bug fortran/98013] New: [OpenACC] 'gcc/fortran/trans-decl.c:gfc_generate_function_code' should consider 'flag_openacc' in addition to 'flag_openmp'?

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98013

Bug ID: 98013
   Summary: [OpenACC]
'gcc/fortran/trans-decl.c:gfc_generate_function_code'
should consider 'flag_openacc' in addition to
'flag_openmp'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

'gcc/fortran/trans-decl.c:gfc_generate_function_code':

/* Reset recursion-check variable.  */
if ((gfc_option.rtcheck & GFC_RTCHECK_RECURSION)
&& !is_recursive && !flag_openmp && recurcheckvar != NULL_TREE)
  {
gfc_add_modify (&cleanup, recurcheckvar, logical_false_node);
recurcheckvar = NULL;

If that indeed is meant to be different, then let's please add some "dummy
handling"/commentary to make this explicit, to show that we did consider this.

[Bug fortran/98014] New: [Fortran OpenACC] Empty '!$acc' continuation line rejected

2020-11-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98014

Bug ID: 98014
   Summary: [Fortran OpenACC] Empty '!$acc' continuation line
rejected
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

A while ago, I noticed that given an empty '!$acc' continuation line:

!$acc parallel &
!$acc   vector_length (1) &
!$acc ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }

..., GCC doesn't like that:

[...]: Error: Failed to match clause at (1)

If I remember correctly (haven't verified now), same thing if placing a
continuation '&' after the sentinel '!$acc'.

If I remember correctly (haven't verified now), we do accept all this for the
corresponding OpenMP constructs, and it is legal for OpenACC and OpenMP.


Is that maybe related to PR98011 "[OpenACC] 'gcc/fortran/scanner.c:load_line'
should consider 'flag_openacc' in addition to 'flag_openmp' (and vice
versa?)?", or PR98012 "[OpenACC] 'gcc/fortran/scanner.c:include_line' should
consider 'flag_openacc' in addition to 'flag_openmp'?".

[Bug c++/98072] [9/10/11 Regression] ICE in cp_parser_omp_var_list_no_open, at cp/parser.c:34843

2020-12-01 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98072

Thomas Schwinge  changed:

   What|Removed |Added

 CC||dmalcolm at gcc dot gnu.org

--- Comment #3 from Thomas Schwinge  ---
(In reply to Arseny Solokha from comment #0)
> g++-11.0.0-alpha20201129 snapshot
> (g:bb67ad5cff58a707aaae645d4f45a913d8511c86) ICEs when compiling the
> following testcase, reduced from test/OpenMP/depobj_ast_print.cpp from the
> clang 11.0.0 test suite, w/ -fopenmp:

Thanks for doing such testing.

> % g++-11.0.0 -fopenmp -c wmttbiko.cpp
> wmttbiko.cpp: In function 'void dh(int*, int, int)':
> wmttbiko.cpp:4:65: internal compiler error: in
> cp_parser_omp_var_list_no_open, at cp/parser.c:34843
> 4 | #pragma omp depobj (pm) depend (iterator (ca = 0 : *vp), in: vp[ca])
>   | ^~
> 0x9e4b1b cp_parser_omp_var_list_no_open
>   
> /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201129/work/gcc-11-20201129/
> gcc/cp/parser.c:34843
> 0x9ec406 cp_parser_omp_clause_depend
>   
> /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20201129/work/gcc-11-20201129/

> The failing assert has been introduced in
> g:c0c7270cc4efd896fe99f8ad5409dbef089a407f.

ACK, and that means (a) missing testsuite coverage, and (b) likely mis-handling
these clauses (at least concerning validity checking, diagnostics) -- so, I'm
glad that the 'assert' has found this additional case.

(In reply to Jakub Jelinek from comment #2)
> Created attachment 49656 [details]
> gcc11-pr98072.patch
> 
> Untested fix.

ACK.  I suggest you also add the standard source code comment that we use in
other similar places.


Alternatively (separate change), I wondered whether we might actually move the
sentinel into 'cp_parser_omp_var_list_no_open', supposing that's the only
place where this matters?  (But I haven't verified that.)


> The ultimate right fix is of course get rid of these
> sentinels from OpenMP/OpenACC parsing and deal with location wrappers when
> handling the clauses.

ACK.  (Or, get rid of (this implementation of) these explicit location wrapper
nodes.  I understand the intention behind the location wrappers, but maybe this
should be implemented differently?)

[Bug c++/98080] New: Need 'auto_suppress_location_wrappers sentinel' in 'cp_parser_omp_scan_loop_body'?

2020-12-01 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98080

Bug ID: 98080
   Summary: Need 'auto_suppress_location_wrappers sentinel' in
'cp_parser_omp_scan_loop_body'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Do we need an 'auto_suppress_location_wrappers sentinel' in
'cp_parser_omp_scan_loop_body', given its 'cp_parser_omp_var_list' usage?

Similar to PR98072; found by code inspection, don't have testcase.

[Bug c++/98081] New: Need 'auto_suppress_location_wrappers sentinel' in 'cp_parser_omp_declare_target'?

2020-12-01 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98081

Bug ID: 98081
   Summary: Need 'auto_suppress_location_wrappers sentinel' in
'cp_parser_omp_declare_target'?
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Do we need an 'auto_suppress_location_wrappers sentinel' in
'cp_parser_omp_declare_target', given its 'cp_parser_omp_var_list' usage?

Similar to PR98072; found by code inspection, don't have testcase.

[Bug target/98321] New: [nvptx] 'atom.add.f32' for atomic add of 32-bit 'float'

2020-12-16 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98321

Bug ID: 98321
   Summary: [nvptx] 'atom.add.f32' for atomic add of 32-bit
'float'
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

Consider:

TYPE f(TYPE a, TYPE b)
{
  #pragma acc atomic update
  a += b;

  return a;
}

Compiling always with '-fopenacc', for '-DTYPE=int'/'-DTYPE=long' I do see the
expected 'atom.add.u32'/'atom.add.u64', but for '-DTYPE=float' I do not see the
expected 'atom.add.f32' but instead an 'atom.cas.b32' loop.  (I understand that
'-DTYPE=double': 'atom.add.f64' depends on PTX 5.0, SM 6.0 support.)
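
To make the term concrete, this is roughly what a compare-and-swap loop for a
'float' add looks like when written out by hand.  It is a host-side C sketch
using GCC's '__atomic' builtins, not the PTX that the nvptx back end emits;
the function name 'float_add_via_cas' and the choice to keep the value's bit
pattern in a 'uint32_t' cell are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Host-side sketch: add 'b' to the 'float' whose bit pattern is stored in
   '*cell', using a 32-bit compare-and-swap loop.  Returns the value held
   before the addition.  */
float float_add_via_cas (uint32_t *cell, float b)
{
  uint32_t expected = __atomic_load_n (cell, __ATOMIC_RELAXED);

  for (;;)
    {
      float old_val, new_val;
      uint32_t desired;

      /* Reinterpret the bits as 'float' without violating aliasing rules.  */
      memcpy (&old_val, &expected, sizeof old_val);
      new_val = old_val + b;
      memcpy (&desired, &new_val, sizeof desired);

      /* On failure, 'expected' is refreshed with the current contents and
         the loop retries with that value.  */
      if (__atomic_compare_exchange_n (cell, &expected, desired, 1 /* weak */,
                                       __ATOMIC_RELAXED, __ATOMIC_RELAXED))
        return old_val;
    }
}
```

A single 'atom.add.f32' would replace the whole retry loop, which is why
seeing the loop for '-DTYPE=float' is unexpected.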

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2020-12-17 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #35 from Thomas Schwinge  ---
In an '--enable-werror' configuration (assuming that's relevant), I'm seeing
new code from commit r11-4691-g93e79ed391b9c636f087e6eb7e70f14963cd10ad
"libstdc++: Rewrite std::call_once to use futexes [PR 66146]" fail to build:

[...]/source-gcc/libstdc++-v3/src/c++11/mutex.cc: In member function ‘void std::once_flag::_M_finish(bool)’:
[...]/source-gcc/libstdc++-v3/src/c++11/mutex.cc:77:11: error: unused variable ‘prev’ [-Werror=unused-variable]
   77 |   int prev = __atomic_exchange_n(&_M_once, newval, __ATOMIC_RELEASE);
  |   ^~~~
cc1plus: all warnings being treated as errors
Makefile:648: recipe for target 'mutex.lo' failed
make[5]: *** [mutex.lo] Error 1
make[5]: Leaving directory
'[...]/build-gcc/x86_64-pc-linux-gnu/libstdc++-v3/src/c++11'

Should a '(void) prev;' be added (my current workaround), or 'prev' get some
attribute 'used' added, or something else?
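
For reference, the '(void) prev;' workaround has this shape, shown here as a
C analogue rather than the actual libstdc++ code.  The function name
'finish_once' is hypothetical, and the sketch assumes that 'prev' is
otherwise only read by an assertion that can compile away (here: 'assert'
under '-DNDEBUG').

```c
#include <assert.h>

/* Hypothetical C analogue of the reported pattern.  */
void finish_once (int *once, int newval)
{
  int prev = __atomic_exchange_n (once, newval, __ATOMIC_RELEASE);

  /* Expands to nothing under '-DNDEBUG', leaving 'prev' without a use.  */
  assert (prev != newval);

  /* Mark 'prev' as used so that '-Werror=unused-variable' builds pass.  */
  (void) prev;
}
```

The cast to 'void' has no run-time effect; it only counts as a use for the
purposes of the warning.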

[Bug target/96428] [nvptx] nvptx_gen_shuffle does not handle V2DI mode – Fails with an ICE

2020-10-01 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96428

Thomas Schwinge  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---
 CC||tschwinge at gcc dot gnu.org
   Last reconfirmed||2020-10-01

--- Comment #7 from Thomas Schwinge  ---
First: Tobias, Tom, thanks for fixing this issue!


(In reply to Tobias Burnus from comment #3)
> Created attachment 48988 [details]
> Test case (as diff – two files)

These attachment 48988 testcases got included in commit
344f09a756ebd50510cc1eb3db111fd61c527702.  I don't understand
'libgomp.oacc-fortran/pr96628-part1.f90':

module m2
  real*8 :: mysum
  !$acc declare device_resident(mysum)

So 'mysum' lives in device-global memory.

contains
SUBROUTINE one(t)
  !$acc routine
  REAL*8,  INTENT(IN):: t(:)
  mysum = sum(t)
END SUBROUTINE one

This now writes into device-global 'mysum', potentially from several
gang/worker/vector threads in parallel, race condition?

SUBROUTINE two(t)
  !$acc routine seq
  REAL*8, INTENT(INOUT) :: t(:)
  t = (100.0_8*t)/sum
END SUBROUTINE two
end module m2

source-gcc/libgomp/testsuite/libgomp.oacc-fortran/pr96628-part1.f90: In
function ‘__m2_MOD_two’:
source-gcc/libgomp/testsuite/libgomp.oacc-fortran/pr96628-part1.f90:18:
warning: ‘sum’ is used uninitialized [-Wuninitialized]
   18 |   t = (100.0_8*t)/sum


So, is this really testing what it is meant to be testing?


Should the testcase get some 'target openacc_nvidia_accel_selected'
'scan-offload-rtl-dump' added to make sure that we're actually generating the
expected PTX instructions?


Also, the testcase files should be renamed 'libgomp.oacc-fortran/pr96428-*' to
match the PR ID.


(In reply to Tom de Vries from comment #4)
> FTR, this is not the leanest solution.

> followup patch: [...]

> we have instead: [simpler]

Any plans to apply that as a follow-up?

[Bug testsuite/96519] [11 regression] new test case gcc.dg/ia64-sync-5.c fails

2020-10-01 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96519

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |kcy at codesourcery dot com
 CC||tschwinge at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Thomas Schwinge  ---
As far as I can tell, fixed with Kwok's patch to use explicit 'signed char'.

[Bug target/95864] [11 Regression] GCN offloading execution regressions after commit f062c3f11505b70c5275e5bc0e52f3e441f8afbc "amdgcn: Switch to HSACO v3 binary format"

2020-10-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95864

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Assignee|ams at gcc dot gnu.org |jules at gcc dot gnu.org

--- Comment #2 from Thomas Schwinge  ---
This apparently got fixed by Julian's commit
r11-3057-g3aee3aaf48be2d3d81e381690ae9dd305d8b505f "openacc: Fix mkoffload
SGPR/VGPR count parsing for HSACO v3".

[Bug middle-end/90861] OpenACC 'declare' not cleaning up for VLAs

2020-10-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90861

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |vries at gcc dot gnu.org
 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED
 CC||burnus at gcc dot gnu.org

--- Comment #8 from Thomas Schwinge  ---
Tom, thanks for resolving this!  Just one more item: let's please also update
'c-c++-common/goacc/declare-pr90861.c' (current XFAIL) to match reality.

[Bug testsuite/97168] [11 Regression] FAIL: gcc.dg/plugin/diagnostic-test-expressions-1.c, diagnostic-test-paths-2.c, location-overflow-test-1.c

2020-10-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97168

Thomas Schwinge  changed:

   What|Removed |Added

 CC||dmalcolm at gcc dot gnu.org,
   ||sandra at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org
   Last reconfirmed||2020-10-09
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |dmalcolm at gcc dot gnu.org

--- Comment #2 from Thomas Schwinge  ---
Heh, I just wanted to push a fix for 'gcc.dg/plugin/diagnostic-test-paths-2.c',
but the cherry-pick from my development branch turned into Git telling me
"nothing to commit": David beat me to it, who included the expected output
update in commit r11-3700-g7345c89ecb1a31ce96c6789bffc7183268a040b3 "Add
-fdiagnostics-path-format=separate-events to -fdiagnostics-plain-output".  ;-)

For reference, this update should've been part of Sandra's commit
r11-3302-g3696a50beeb73f4ded8a584e76ee16f0bde109b9 "Change C front end to emit
structured loop and switch tree nodes".


(In reply to Martin Sebor from comment #0)
> FAIL: gcc.dg/plugin/diagnostic-test-expressions-1.c
> -fplugin=./diagnostic_plugin_test_tree_expression_range.so  1 blank line(s)
> in output
> FAIL: gcc.dg/plugin/diagnostic-test-expressions-1.c
> -fplugin=./diagnostic_plugin_test_tree_expression_range.so  expected
> multiline pattern lines 550-551 not found: "   
> __builtin_types_compatible_p \\(long, int\\) \\+ f \\(i\\)\\);.*\\n 
> ~\\^~~\\n"
> FAIL: gcc.dg/plugin/diagnostic-test-expressions-1.c
> -fplugin=./diagnostic_plugin_test_tree_expression_range.so (test for excess
> errors)

These FAILs I'm not seeing in my testing.


> FAIL: gcc.dg/plugin/location-overflow-test-1.c
> -fplugin=./location_overflow_plugin.so adding '-flarge-source-files' (test
> for warnings, line 16)

But this one I still see, too.

[Bug target/98321] [nvptx] 'atom.add.f32' for atomic add of 32-bit 'float'

2020-12-17 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98321

--- Comment #2 from Thomas Schwinge  ---
Thanks for having a look.


(In reply to Tom de Vries from comment #1)
> Ok, let's first make a runnable test-case:
> ...
> $ cat src/libgomp/testsuite/libgomp.oacc-c/test.c
> [...]
> Indeed we see the cas, but that has nothing to do with support in the nvptx
> port:
> ...
> atom.cas.b32%r29, [%r25], %r22, %r28;   
> 
> ...
> 
> This appears already at ompexp on the host, where we expand:
> [...]
> This is part of a generic problem with offloading, where choices are made in
> the host compiler which are suboptimal or even unsupported in the offload
> compiler.

Yes, I'm aware of that problem -- and we should do something about it.

> Ideally this should be addressed in the host compiler.

(Strike the "ideally"?)

> It may be possible to address this in the nvptx port by trying to detect the
> unoptimal pattern and converting it to the optimal atom.add.f32.  But
> ultimately that's a workaround, and it's better to fix this at the source.

I agree; don't see much point in investing effort in such a workaround (which
doesn't sound easy either).


However, my report was specifically for the nvptx target compiler.  Just
compile with 'nvptx-gcc -fopenacc -S' the code I posted, and compare
'-DTYPE=int'/'-DTYPE=long' vs. '-DTYPE=float'.

[Bug target/98321] [nvptx] 'atom.add.f32' for atomic add of 32-bit 'float'

2020-12-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98321

--- Comment #4 from Thomas Schwinge  ---
(In reply to Tom de Vries from comment #3)
> (In reply to Thomas Schwinge from comment #2)
> > However, my report was specifically for the nvptx target compiler.  Just
> > compile with 'nvptx-gcc -fopenacc -S' the code I posed, and compare
> > '-DTYPE=int'/'-DTYPE=long' vs. '-DTYPE=float'.
> 
> Ah, I was not aware of usage of openacc beyond the offloading setup.

;-D

> For my understanding, is this just a way for you to easily reproduce some
> problem really occurring elsewhere, or is this actually used for something?

No, I don't think this has any practical use other than testing.


I had been looking into how/when PTX 'atom' is used for reductions, and first
had a look what the back end currently might emit at all, found SDIM
'atomic_fetch_add', and SF 'atomic_fetch_addsf'.  I tried to get these
used via '(void) __atomic_fetch_add (&a, b, __ATOMIC_RELAXED);', which works
fine for integer types, but 'error: operand type ‘float *’ is incompatible with
argument 1 of ‘__atomic_fetch_add’' (didn't research the rationale behind
that), so resorted to 'acc atomic'.  Further analysis to be done.  (Can
floating-point type atomic generally not be supported, given that
'__atomic_fetch_add' rejects it?  Is OMP atomic handling doing something wrong
for these even for nvptx target (real, not via offloading)?  Is something wrong
in the nvptx back end?)
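
For comparison, the integer case that is accepted looks like the following
sketch (the function name 'fetch_add_int' is made up for illustration):

```c
/* Integer case from the text: accepted by GCC's '__atomic_fetch_add'.  */
int fetch_add_int (int *a, int b)
{
  /* Returns the value '*a' held before the addition.  */
  return __atomic_fetch_add (a, b, __ATOMIC_RELAXED);
}

/* The analogous call with a 'float *' first argument is what gets rejected
   with the "operand type ... is incompatible with argument 1" error quoted
   above, so it is deliberately left out here.  */
```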

This isn't important right now; I just filed the issue as I'd found it.

[Bug target/97348] [nvptx] Make -misa=sm_35 the default

2021-01-13 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97348

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #7 from Thomas Schwinge  ---
Documentation later updated in commit 91e4e16b550540723cca824b9674c7d8c43f4849
"nvptx - invoke.texi: Update default of -misa".

[Bug middle-end/90859] [OMP] Mappings for VLA different depending on 'target { c && { ! lp64 } }'

2021-01-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90859

Thomas Schwinge  changed:

   What|Removed |Added

 CC||jsm28 at gcc dot gnu.org

--- Comment #8 from Thomas Schwinge  ---
(In reply to David Edelsohn from comment #7)
> I'm seeing a new failure that this string no longer appears in the dump for
> 32 bit AIX.

According to  posts, this has been FAILing forever
for powerpc-ibm-aix7.2.0.0 (so, not a "new failure" as far as I can tell?),
ever since my 2019-06-19 r272452 '[PR90859] Document status quo for "[OMP]
Mappings for VLA different depending on 'target { c && { ! lp64 } }'"'.  The
problem here relates to front end 'sizetype' internals, for which 'c && { !
lp64 }' doesn't seem to be quite the right conditional.

In PR95002, there is discussion about the actual C/C++ front end and/or
gimplifier changes assumed to be necessary to fix this for good.  Spending time
on that would be more useful than figuring out the right testsuite-level
conditional -- but I don't have a good understanding of the relevant C/C++
language-level requirements, and the relevant front end vs. gimple-level
folding.

[Bug c/99137] ICE in gimplify_scan_omp_clauses, at gimplify.c:9833

2021-02-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99137

Thomas Schwinge  changed:

   What|Removed |Added

   Keywords|openmp  |openacc
 Target|x86_64-pc-linux-gnu |

--- Comment #2 from Thomas Schwinge  ---
First need to clarify if this is really 'ice-on-invalid-code' or maybe
'ice-on-valid-code'?  This depends on interpretation of the comma: is 'async(1,
2)' an invalid list, or is this a C/C++ comma operator, where '1, 2' simply
evaluates to '2'?

There are other clauses that do take a list (for example: 'wait(1, 2)'), so I
would assume that the intention is not that different clauses have different
behavior regarding interpretation of the comma, so indeed 'async(1, 2)' should
be rejected at parse-time.  I have however not yet looked up what the OpenACC
specification says about this.
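
As a reminder of the comma operator reading, a minimal C sketch (the helper
name 'comma_value' is hypothetical, and this says nothing about how the
'async' clause is actually parsed):

```c
/* Returns the value of the comma expression 'first, second'.  */
int comma_value (int first, int second)
{
  /* The left operand is evaluated and discarded (the cast to 'void' only
     silences '-Wunused-value'); the result is the right operand.  */
  return ((void) first, second);
}
```

Under that reading, 'comma_value (1, 2)' yields 2, matching the
"'1, 2' simply evaluates to '2'" interpretation above.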

[Bug c/99137] ICE in gimplify_scan_omp_clauses, at gimplify.c:9833

2021-02-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99137

--- Comment #4 from Thomas Schwinge  ---
(In reply to Jakub Jelinek from comment #3)
> ice-on-invalid-code is when an error should be reported and instead of that
> the compiler crashes.
> ice-on-valid-code is when the code should compile without errors (perhaps
> with warnings, and not considering warnings promoted to errors) but the
> compiler crashes on it instead.

Sure, I understand that.  ICE is certainly bad, but I did wonder if this is
'ice-on-invalid-code' (should get error diagnostic instead of ICE), or
'ice-on-valid-code' (should accept this code; 'async(1, 2)' evaluates to
'async(2)').

> I have no idea what OpenACC says about this if anything

I've filed  "What does
'async(1, 2)' mean?" (only visible to members of OpenACC GitHub).

> in OpenMP [...]

Thanks, that makes much sense to me.

[Bug fortran/100276] [12 regression] Many failures after r12-119

2021-04-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100276

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
   Assignee|unassigned at gcc dot gnu.org  |burnus at gcc dot gnu.org
 Target|powerpc64-linux-gnu |
  Build|powerpc64-linux-gnu |
   Host|powerpc64-linux-gnu |
 CC|kischde at gmx dot net |tschwinge at gcc dot gnu.org
   Keywords||openacc

--- Comment #1 from Thomas Schwinge  ---
Yeah, sorry for that -- unfortunate difference in diagnostics for GCC build
configuration without vs. with offloading enabled.  See

and following; now resolved.

[Bug fortran/64763] [OpenACC] !$acc region not implemented

2021-05-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64763

Thomas Schwinge  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Keywords|rejects-valid   |

--- Comment #2 from Thomas Schwinge  ---
There's some '!$acc routine' vs. '!$acc region' confusion here.  ;-) The latter
is a non-standard thing (pre-dating standard OpenACC, and certainly not in wide
use, as far as I can tell), and the former has now been available for OpenACC
C/C++/Fortran for quite some time.

[Bug c/100500] New: ICE with local label in OpenACC 'loop' region

2021-05-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100500

Bug ID: 100500
   Summary: ICE with local label in OpenACC 'loop' region
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code, openacc
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Created attachment 50786
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50786&action=edit
'c-c++-common/goacc/prN.c'

Discovered in a build of commit 3bc0d418a5d214a8ba57857656ca5c618df1a4bb
sources, both for C, C++:

[...]/prN.c: In function ‘f._omp_fn.0’:
[...]/prN.c:10:1: error: label ‘l’ in the middle of basic block 9
   10 | }
  | ^
during GIMPLE pass: fixup_cfg
[...]/prN.c:10:1: internal compiler error: verify_flow_info failed

#0  error (gmsgid=gmsgid@entry=0x1f5b4d0 "label %qD in the middle of basic
block %d") at [...]/gcc/diagnostic.c:1697
#1  0x00f0d266 in gimple_verify_flow_info () at
[...]/gcc/tree-cfg.c:5612
#2  0x00925fe4 in verify_flow_info () at [...]/gcc/cfghooks.c:267
#3  0x00d81d13 in execute_function_todo (fn=0x776ad0b8,
data=) at [...]/gcc/passes.c:2054
#4  0x00d825fa in execute_todo (flags=64) at
[...]/gcc/passes.c:2096
#5  0x00d8552f in execute_one_pass (pass=pass@entry=0x2d19f30) at
[...]/gcc/passes.c:2604
#6  0x00d85b68 in execute_pass_list_1 (pass=0x2d19f30) at
[...]/gcc/passes.c:2656
#7  0x00d85bc5 in execute_pass_list (fn=,
pass=) at [...]/gcc/passes.c:2667
#8  0x00d86c31 in do_per_function_toporder (callback=0xd85bb0
, data=0x2d19f30) at
[...]/gcc/passes.c:1773
#9  0x00d86d67 in execute_ipa_pass_list (pass=0x2d19ed0) at
[...]/gcc/passes.c:3003
#10 0x00969bbf in ipa_passes () at [...]/gcc/cgraphunit.c:2154
#11 symbol_table::compile (this=this@entry=0x77543000) at
[...]/gcc/cgraphunit.c:2289
#12 0x0096db5e in symbol_table::compile (this=0x77543000) at
[...]/gcc/cgraphunit.c:2269
#13 symbol_table::finalize_compilation_unit (this=0x77543000) at
[...]/gcc/cgraphunit.c:2537
#14 0x00ec3bce in compile_file () at [...]/gcc/toplev.c:482
#15 0x0075a4ec in do_compile () at [...]/gcc/toplev.c:2201
#16 toplev::main (this=this@entry=0x7fffd4f0, argc=argc@entry=32,
argv=argv@entry=0x7fffd608) at [...]/gcc/toplev.c:2340
#17 0x0075d0f7 in main (argc=32, argv=0x7fffd608) at
[...]/gcc/main.c:39

#0  internal_error (gmsgid=gmsgid@entry=0x1ec3d7b "verify_flow_info
failed") at [...]/gcc/diagnostic.c:1803
#1  0x009260d8 in verify_flow_info () at [...]/gcc/cfghooks.c:269
#2  0x00d81d13 in execute_function_todo (fn=0x776ad0b8,
data=) at [...]/gcc/passes.c:2054
#3  0x00d825fa in execute_todo (flags=64) at
[...]/gcc/passes.c:2096
#4  0x00d8552f in execute_one_pass (pass=pass@entry=0x2d19f30) at
[...]/gcc/passes.c:2604
#5  0x00d85b68 in execute_pass_list_1 (pass=0x2d19f30) at
[...]/gcc/passes.c:2656
#6  0x00d85bc5 in execute_pass_list (fn=,
pass=) at [...]/gcc/passes.c:2667
#7  0x00d86c31 in do_per_function_toporder (callback=0xd85bb0
, data=0x2d19f30) at
[...]/gcc/passes.c:1773
#8  0x00d86d67 in execute_ipa_pass_list (pass=0x2d19ed0) at
[...]/gcc/passes.c:3003
#9  0x00969bbf in ipa_passes () at [...]/gcc/cgraphunit.c:2154
#10 symbol_table::compile (this=this@entry=0x77543000) at
[...]/gcc/cgraphunit.c:2289
#11 0x0096db5e in symbol_table::compile (this=0x77543000) at
[...]/gcc/cgraphunit.c:2269
#12 symbol_table::finalize_compilation_unit (this=0x77543000) at
[...]/gcc/cgraphunit.c:2537
#13 0x00ec3bce in compile_file () at [...]/gcc/toplev.c:482
#14 0x0075a4ec in do_compile () at [...]/gcc/toplev.c:2201
#15 toplev::main (this=this@entry=0x7fffd4f0, argc=argc@entry=32,
argv=argv@entry=0x7fffd608) at [...]/gcc/toplev.c:2340
#16 0x0075d0f7 in main (argc=32, argv=0x7fffd608) at
[...]/gcc/main.c:39

Haven't tried very hard, but couldn't reproduce with OpenMP, so possibly a
problem in the OpenACC 'loop' handling.

[Bug libgomp/100390] FAIL: libgomp.fortran/depobj-1.f90 -O execution test

2021-05-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100390

Thomas Schwinge  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-05-11
 CC||burnus at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org
 Status|UNCONFIRMED |WAITING

--- Comment #3 from Thomas Schwinge  ---
Tom, are you still seeing that after PR100397 "New test case
libgomp.fortran/depobj-1.f90 fails erratically since its introduction in
r12-20" resolved with commit r12-399-g08fff201c92109b5476a4cc211c71de557ec87b1
"OpenMP/Fortran - fix pasto + testcase in depobj [PR100397]"?

[Bug testsuite/100655] New: 'g++.dg/tsan/pthread_cond_clockwait.C' FAILs due to 'pthread_cond_clockwait' missing

2021-05-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100655

Bug ID: 100655
   Summary: 'g++.dg/tsan/pthread_cond_clockwait.C' FAILs due to
'pthread_cond_clockwait' missing
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: kingoipo at gmail dot com, marxin at gcc dot gnu.org
  Target Milestone: ---

This new testcase recently added in commit
r12-794-g80b4ce1a5190ebe764b1009afae57dcef45f92c2 "TSAN: add new test" FAILs on
one of my testing systems:

+FAIL: g++.dg/tsan/pthread_cond_clockwait.C   -O0  (test for excess errors)
+UNRESOLVED: g++.dg/tsan/pthread_cond_clockwait.C   -O0  compilation failed
to produce executable
+FAIL: g++.dg/tsan/pthread_cond_clockwait.C   -O2  (test for excess errors)
+UNRESOLVED: g++.dg/tsan/pthread_cond_clockwait.C   -O2  compilation failed
to produce executable

[...]/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C: In function 'int
main()':
[...]/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C:26:5: error:
'pthread_cond_clockwait' was not declared in this scope; did you mean
'pthread_cond_wait'?

Apparently 'pthread_cond_clockwait' has been added in glibc 2.30, released
2019-08-01.

Leave it as it is, or conditionalize the testcase in some way?
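
One conceivable way to conditionalize at the C/C++ source level is a glibc version guard. The following is only a minimal sketch under stated assumptions: the macro 'HAVE_PTHREAD_COND_CLOCKWAIT' and the helper 'cond_wait_api' are hypothetical names for illustration, and in the testsuite itself a DejaGnu-level condition may well be the better fit.

```c
#include <assert.h>
#include <stdio.h>   /* any glibc header pulls in <features.h> */

/* __GLIBC_PREREQ only exists on glibc, so test for it in a separate step;
   a single '#if defined(X) && X(2, 30)' does not preprocess elsewhere.  */
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 30)
#  define HAVE_PTHREAD_COND_CLOCKWAIT 1
# endif
#endif
#ifndef HAVE_PTHREAD_COND_CLOCKWAIT
# define HAVE_PTHREAD_COND_CLOCKWAIT 0
#endif

/* Hypothetical helper: names the wait function a test could fall back to.  */
const char *cond_wait_api(void)
{
#if HAVE_PTHREAD_COND_CLOCKWAIT
  return "pthread_cond_clockwait";
#else
  return "pthread_cond_timedwait";
#endif
}
```

The guard only selects between code paths at compile time; it does not by itself exercise either function.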

[Bug target/100657] New: [GCN offloading] 'libgomp.c-c++-common/reduction-6.c' execution times out

2021-05-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100657

Bug ID: 100657
   Summary: [GCN offloading] 'libgomp.c-c++-common/reduction-6.c'
execution times out
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, burnus at gcc dot gnu.org,
jakub at gcc dot gnu.org, jules at gcc dot gnu.org
  Target Milestone: ---
Target: gcn

For GCN '-foffload=amdgcn-amdhsa=-march=gfx908' (only), the
'libgomp.c-c++-common/reduction-6.c' testcase recently added with commit
r12-614-g33b647956caa977d1ae489f9baed9cef70b4f382 "OpenMP: Fix SIMT for
complex/float reduction with && and ||" seems to consistently time out, both C
and C++ execution testing.  On other GCN/other offloading configurations it
seems almost instantaneous to execute.  I've confirmed this on both our
amd-instinct1 and amd-instinct2 systems.

[Bug middle-end/100669] [OpenACC] ICE with array-reduction variable & related issues

2021-05-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100669

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed||2021-05-19
   Keywords|ice-on-valid-code   |ice-on-invalid-code
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Thomas Schwinge  ---
We're currently implementing OpenACC 2.6, and only in OpenACC 2.7 we have: 1.12
"Changes from Version 2.6 to 2.7": "Arrays, subarrays and composite variables
are now allowed in 'reduction' clauses; [...]".  (Thus 'ice-on-valid-code' ->
'ice-on-invalid-code'.)


That said, of course we shouldn't run into an ICE.  I wonder if that's a
regression, or has been like that "forever"?

[Bug target/83812] nvptx-run: error getting kernel result: operation not supported on global/shared address space

2021-05-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83812

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||tschwinge at gcc dot gnu.org
   Last reconfirmed||2021-05-19

--- Comment #2 from Thomas Schwinge  ---
See also PR97444.

[Bug target/83812] nvptx-run: error getting kernel result: operation not supported on global/shared address space

2021-05-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83812

--- Comment #3 from Thomas Schwinge  ---
By the way, curious why this isn't caught at compile time ('ptxas'
verification) but only at run time (CUDA Driver/PTX JIT).

[Bug target/100678] New: [OpenACC/nvptx] 'libgomp.oacc-c-c++-common/private-atomic-1.c' FAILs (differently) in certain configurations

2021-05-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100678

Bug ID: 100678
   Summary: [OpenACC/nvptx]
'libgomp.oacc-c-c++-common/private-atomic-1.c' FAILs
(differently) in certain configurations
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jules at gcc dot gnu.org, vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

For OpenACC/nvptx offloading, the testcase
'libgomp.oacc-c-c++-common/private-atomic-1.c' that I've just pushed as commit
r12-908-g1467100fc72562a59f70cdd4e05f6c810d1fadcc "Add
'libgomp.oacc-c-c++-common/private-atomic-1.c' [PR83812]" has been expected to
fail with "operation not supported on global/shared address space" (see
PR83812).  However, I now found that on an x86_64 GNU/Linux system, Nvidia
TITAN V GPU, CUDA Driver 455.23.05, it *doesn't* fail in that way: the device
kernel execution completes normally -- but it instead returns a wrong reduction
result: zero.

At this point, it's unclear (a) whether the PR83812 restriction indeed is
supposed to be lifted for certain modern GPU hardware/SM levels/CUDA Driver
releases, and (b) what is then instead going wrong so that we don't compute the
expected reduction result.


Assuming that (a) has been done in good faith, I can see how (b) might happen
if the 'v' variable would in fact *not* be thread-private (but instead
device-global, I suppose), thus all threads atomically incrementing the
device-global variable concurrently, thus the '(v == -222 + 121)' expression
never being true?
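
The hypothesis above can be modelled in plain sequential C. This is only a sketch under assumptions: the values -222 and 121 come from the expression quoted above, but how the actual testcase uses them is assumed here, the function names are hypothetical, and the "shared" model picks just one possible interleaving (all updates land before any check).

```c
#include <assert.h>

enum { INIT = -222, DELTA = 121 };  /* values taken from the quoted expression */

/* Model 1: every logical thread owns a private copy of v.  */
int matches_if_private(int nthreads)
{
  assert(nthreads >= 0);
  int matches = 0;
  for (int t = 0; t < nthreads; ++t) {
    int v = INIT;                 /* private copy */
    v += DELTA;                   /* stands in for the atomic update */
    matches += (v == INIT + DELTA);
  }
  return matches;
}

/* Model 2: all logical threads update one shared v before any of them
   evaluates the check (one possible interleaving).  */
int matches_if_shared(int nthreads)
{
  assert(nthreads >= 0);
  int v = INIT;                   /* single shared copy */
  for (int t = 0; t < nthreads; ++t)
    v += DELTA;
  int matches = 0;
  for (int t = 0; t < nthreads; ++t)
    matches += (v == INIT + DELTA);
  return matches;
}
```

With more than one thread, the shared model never satisfies the check, which would be consistent with a zero result.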

[Bug target/100678] [OpenACC/nvptx] 'libgomp.oacc-c-c++-common/private-atomic-1.c' FAILs (differently) in certain configurations

2021-05-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100678

--- Comment #2 from Thomas Schwinge  ---
I ran into this in a different OpenACC context (OpenACC privatization levels),
where in testcases we're trying to use 'atomic' on 'private' variables.  ...
which for nvptx offloading only works for gang-private (PTX '.shared'), but
fails for everything else, per PR83812 ("old" failure mode).

(In reply to Tom de Vries from comment #1)
> I'm not sure what you aim to achieve with the test-case.

To document the current failure mode -- a deficiency in the OpenACC/nvptx
implementation.  So that there is precedent for this, and it doesn't appear
just in the upcoming OpenACC privatization levels testcases.


And then, as reported in this issue here, I found the "wrong reduction result"
problem ("new" failure mode), which seems to be another deficiency in the
OpenACC/nvptx implementation.


> My inclination would be to skip it for nvptx, which AFAIU is opposite to
> your intent when adding it.  So, perhaps just remove it?

I'd rather conditionalize the nvptx offloading XFAIL appropriately for both the
"old" and "new" failure modes, and then eventually un-XFAIL, once these issues
have been addressed.

[Bug other/100695] New: Format decoder, quoting in 'dump_printf' etc.

2021-05-20 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100695

Bug ID: 100695
   Summary: Format decoder, quoting in 'dump_printf' etc.
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: dmalcolm at gcc dot gnu.org
  Target Milestone: ---

For user-visible '-fopt-info' diagnostics ('dump_printf' etc.), I'm trying to
pretty-print a TREE DECL, using the '%T' format code.  As in other compiler
diagnostics, I'd like the output to be quoted, so I tried using '%qT' but that
prints the same as '%T' (unexpected?).  Using '%<%T%>' prints the desired
output -- but GCC bootstrap doesn't like that one:

[...]/gcc/omp-low.c:10171:33: error: ‘T’ conversion used within a quoted
sequence [-Werror=format=]
10171 |"variable %<%T%> ", decl);
  | ^

I'll work around that, but something seems inconsistent here?

[Bug target/107453] [13 Regression] New stdarg tests in r13-3549-g4fe34cdcc80ac2 fail

2023-01-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107453

--- Comment #8 from Thomas Schwinge  ---
(In reply to Jakub Jelinek from comment #7)
> No testing on nvptx

Thanks, confirming fixed for nvptx target, too.

[Bug fortran/108349] LTO mismatch for __builtin_realloc between glibc and gfortran frontend

2023-01-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108349

Thomas Schwinge  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org,
   ||vries at gcc dot gnu.org

--- Comment #6 from Thomas Schwinge  ---
As nvptx target is known to be sensitive to such mismatches (outside of the LTO
context reported here), I individually did test this commit
r13-5100-g0986c351aa8a9f08b3cb614baec13564dd62c114 "fortran: Fix up function
types for realloc and sincos{,f,l} builtins [PR108349]", and found that it also
resolves the following nvptx target compilation failures:

'gfortran.dg/pr35662.f90':

ptxas /tmp/ccYNgEEN.o, line 44; error   : Illegal operand type to
instruction 'st'
ptxas /tmp/ccYNgEEN.o, line 51; error   : Type of argument does not match
formal parameter '%in_ar0'
ptxas /tmp/ccYNgEEN.o, line 51; error   : Alignment of argument does not
match formal parameter '%in_ar0'
ptxas /tmp/ccYNgEEN.o, line 44; error   : Unknown symbol '%stack'
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status

'gfortran.fortran-torture/compile/pr37236.f':

ptxas [...]/build-gcc/gcc/testsuite/gfortran/pr37236.o, line 269; error   :
Illegal operand type to instruction 'st'
ptxas [...]/build-gcc/gcc/testsuite/gfortran/pr37236.o, line 275; error   :
Type of argument does not match formal parameter '%in_ar0'
ptxas [...]/build-gcc/gcc/testsuite/gfortran/pr37236.o, line 269; error   :
Unknown symbol '%stack'
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status

These are now all-PASS.


However, dozens more similar instances remain in the nvptx target gfortran
test suite logs.  I've not checked whether what underlies those would also
expose the same kind of LTO problem.

[Bug modula2/108373] New: Update 'contrib/gcc_update:files_and_dependencies' for Modula-2

2023-01-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108373

Bug ID: 108373
   Summary: Update 'contrib/gcc_update:files_and_dependencies' for
Modula-2
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Given the number of generated Auto* files that it brought in, I suppose we need
to update 'contrib/gcc_update:files_and_dependencies' for Modula-2.

[Bug tree-optimization/108377] New: Unexpected 'exceeds maximum object size' diagnostic, wrong-code?

2023-01-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108377

Bug ID: 108377
   Summary: Unexpected 'exceeds maximum object size' diagnostic,
wrong-code?
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Created attachment 54249
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54249&action=edit
1.c

Am I confused (it's late), or is GCC?  For '-O2' and higher:

1.c: In function ‘f’:
1.c:22:12: warning: argument 1 value ‘18446744073709551615’ exceeds maximum
object size 9223372036854775807 [-Walloc-size-larger-than=]
   22 |   needle = __builtin_malloc(n); /* { dg-bogus {exceeds maximum
object size} } */
  |^~~
1.c:22:12: note: in a call to built-in allocation function
‘__builtin_malloc’

Manually reduced from some other test case.

Same issue for actual 'malloc', and 'size_t'.

This supposedly bogus 'needle' diagnostic disappears if I disable the
'haystack' allocation of 'n + 1'.

Actually, is this wrong-code?

1.c.128t.sra:  _2 = __builtin_malloc (_1);
1.c.128t.sra:  _5 = __builtin_malloc (n_14);

1.c.129t.thread1:  _2 = __builtin_malloc (_1);
1.c.129t.thread1:  _10 = __builtin_malloc (n_14);
1.c.129t.thread1:  _5 = __builtin_malloc (n_14);

1.c.130t.dom2:  _2 = __builtin_malloc (_1);
1.c.130t.dom2:  _10 = __builtin_malloc (18446744073709551615);
1.c.130t.dom2:  _5 = __builtin_malloc (n_14);

[...]

1.c.194t.fre5:  _2 = __builtin_malloc (_1);
1.c.194t.fre5:  _10 = __builtin_malloc (18446744073709551615);
1.c.194t.fre5:  _5 = __builtin_malloc (n_14);

1.c.195t.thread2:  _2 = __builtin_malloc (_1);
1.c.195t.thread2:  _10 = __builtin_malloc (18446744073709551615);
1.c.195t.thread2:  _33 = __builtin_malloc (n_14);
1.c.195t.thread2:  _5 = __builtin_malloc (n_14);

1.c.196t.dom3:  _2 = __builtin_malloc (_1);
1.c.196t.dom3:  _10 = __builtin_malloc (18446744073709551615);
1.c.196t.dom3:  _33 = __builtin_malloc (i_51);
1.c.196t.dom3:  _5 = __builtin_malloc (0);

[...]

1.c.254t.optimized:  _2 = __builtin_malloc (_1);
1.c.254t.optimized:  _10 = __builtin_malloc (18446744073709551615);
1.c.254t.optimized:  _33 = __builtin_malloc (i_51);
1.c.254t.optimized:  _5 = __builtin_malloc (0);

[Bug tree-optimization/108377] Unexpected 'exceeds maximum object size' diagnostic, wrong-code?

2023-01-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108377

--- Comment #1 from Thomas Schwinge  ---
That's x86_64-pc-linux-gnu at today's commit
de99049f6fe5341024d4d939ac50d063280f90db.

[Bug tree-optimization/108377] Unexpected 'exceeds maximum object size' diagnostic, wrong-code?

2023-01-13 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108377

--- Comment #4 from Thomas Schwinge  ---
Thanks, Andrew, for looking into this.

(In reply to Andrew Pinski from comment #2)
> If calc_n(259) returns (__SIZE_TYPE__)-1 (aka 18446744073709551615) [...]
> the warning is correct ([...]) in some sense of correctness.

So, when GCC makes such "special" assumptions, should that really lead to a
generic diagnostic being emitted (as reported), where that diagnostic does not
spell out those assumptions?
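
For reference, the two numbers in the diagnostic can be related in a few lines of C. This is only a sketch: the helper names are hypothetical, the attached '1.c' is not reproduced here, and the concrete values quoted in the warning assume a 64-bit 'size_t' as on x86_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the bound the warning compares against.  */
int exceeds_max_object_size(size_t n)
{
  return n > (size_t)PTRDIFF_MAX;
}

/* Unsigned arithmetic wraps, so (size_t)-1 + 1 is 0.  */
size_t plus_one(size_t n)
{
  return n + 1;
}
```

So a value of '(size_t)-1' for 'n' is above the maximum object size, while 'n + 1' wraps around to zero.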

[Bug tree-optimization/108392] New: Unexpected optimization in presence of '__attribute__((noipa))'

2023-01-13 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108392

Bug ID: 108392
   Summary: Unexpected optimization in presence of
'__attribute__((noipa))'
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Consider:

__attribute__((noipa))
int f(int n)
{
  if (n < 0)
    n = f(-n);
  return n;
}

For x86_64-pc-linux-gnu at recent commit
de99049f6fe5341024d4d939ac50d063280f90db, for '-O2' I see:

--- 1.c.044t.phiopt1   2023-01-13 12:26:20.196625109 +0100
+++ 1.c.045t.tailr1   2023-01-13 12:26:20.196625109 +0100
[...]
 __attribute__((noipa, noinline, noclone, no_icf))
 int f (int n)
 {
   int _1;

:
-  gimple_cond 
+  # gimple_phi 
+  gimple_cond 
 goto ; [INV]
   else
 goto ; [INV]

:
-  gimple_assign 
-  gimple_call 
+  gimple_assign 
+  goto ; [INV]

:
-  # gimple_phi 
+  # gimple_phi 
   gimple_return 

 }

I was assuming that '__attribute__((noipa))' would inhibit optimizations such
as replacing a 'gimple_call' by knowledge of the body of the callee?

(This caused confusion in a different test case, where I didn't expect the
recursive function call to disappear.)
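
In source terms, the transformation visible in the dump corresponds roughly to turning the self-call into a jump back to the function entry. A minimal sketch, assuming that reading of the dump; the names 'f_recursive' and 'f_loop' are hypothetical, and the attribute is left out so that the sketch builds with other compilers, too:

```c
#include <assert.h>
#include <limits.h>

/* The function as written in the report (without the attribute).  */
int f_recursive(int n)
{
  assert(n != INT_MIN);  /* -INT_MIN would overflow */
  if (n < 0)
    n = f_recursive(-n);
  return n;
}

/* Roughly what remains after tail-recursion elimination: negate 'n' and
   branch back to the start instead of calling.  */
int f_loop(int n)
{
  assert(n != INT_MIN);
  while (n < 0)
    n = -n;
  return n;
}
```

Both variants return the same value for every input other than 'INT_MIN'; only the call itself disappears.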

[Bug fortran/108558] New: OpenMP/Fortran 'has_device_addr' clause getting lost?

2023-01-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108558

Bug ID: 108558
   Summary: OpenMP/Fortran 'has_device_addr' clause getting lost?
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

It's certainly possible that I'm doing something wrong here (first use of
OpenMP 'has_device_addr' clause), but please consider:

subroutine vectorAdd(a, b, M)
  implicit none
  integer(4)::a(M), b(M)
  integer:: i, M
  !$omp target teams distribute parallel do has_device_addr(a, b)
  do i = 1, M
    b(i) = a(i) + b(i)
  end do
end subroutine

(..., which is called from a '!$omp target data use_device_addr(a, b)' inside a
'!$omp target data map(tofrom:a(1:M), b(1:M))'.)

If I 'diff' the '-fopenmp -fdump-tree-all' without vs. with the
'has_device_addr(a, b)' clause, I -- unexpectedly -- get no differences (aside
from minor ones due to what seems to be different order of compiler
temporaries):

'pr.f90.005t.original':

#pragma omp target
[...]

(Decomposed combined construct.  Is that perhaps where the problem lies?)

'pr.f90.006t.gimple':

#pragma omp target num_teams(0) thread_limit(0) firstprivate(m)
map(tofrom:*b [len: D.4283][implicit]) map(alloc:b [pointer assign, bias: 0])
map(tofrom:*a [len: D.4280][implicit]) map(alloc:a [pointer assign, bias: 0])
[...]

That is, 'map' instead of 'has_device_addr'.


In contrast, for the translated C code:

void vectorAdd(int *a, int *b, int M)
{
  #pragma omp target teams distribute parallel for has_device_addr(a, b)
  for (int i = 1; i < M; ++i)
    b[i] = a[i] + b[i];
}

..., I see the expected 'diff' of 'pr.c.005t.original':

-  #pragma omp target
+  #pragma omp target has_device_addr(a) has_device_addr(b)

..., and 'diff' of 'pr.c.006t.gimple':

-  #pragma omp target num_teams(0) thread_limit(0) firstprivate(M)
map(alloc:MEM[(char *)b] [len: 0]) map(firstprivate:b [pointer assign, bias:
0]) map(alloc:MEM[(char *)a] [len: 0]) map(firstprivate:a [pointer assign,
bias: 0])
+  #pragma omp target num_teams(0) thread_limit(0) has_device_addr(a)
has_device_addr(b) firstprivate(M)

(Have not examined that one any further.)


Cross-checking with the corresponding OpenACC/Fortran 'deviceptr' clause ('!$acc
parallel loop deviceptr(a, b)'), that seems to work as expected (from a quick
look, not further examined):

'pr.f90.005t.original':

-#pragma acc parallel
+#pragma acc parallel map(force_deviceptr:*a) map(alloc:a [pointer
assign, bias: 0]) map(force_deviceptr:*b) map(alloc:b [pointer assign, bias:
0])

'pr.f90.006t.gimple':

-#pragma omp target oacc_parallel firstprivate(D.4291) map(tofrom:*b
[len: D.4298]) map(alloc:b [pointer assign, bias: 0]) map(tofrom:*a [len:
D.4295]) map(alloc:a [pointer assign, bias: 0])
+a.8_9 = a;
+b.9_10 = b;
+#pragma omp target oacc_parallel map(force_deviceptr:(*a.8_9) [len:
D.4298]) map(alloc:a [pointer assign, bias: 0]) map(force_deviceptr:(*b.9_10)
[len: D.4295]) map(alloc:b [pointer assign, bias: 0]) firstprivate(D.4291)


That's with GCC based on fairly recent commit
de99049f6fe5341024d4d939ac50d063280f90db (2023-01-11).

[Bug fortran/108558] OpenMP/Fortran 'has_device_addr' clause getting lost?

2023-01-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108558

--- Comment #2 from Thomas Schwinge  ---
(In reply to Tobias Burnus from comment #1)
> I bet that this is a problem of 'gfc_split_omp_clauses': [...]

Heh, so indeed as I suspected:

(In reply to myself from comment #0)
> (Decomposed combined construct.  Is that perhaps where the problem lies?)

:-)

With your patch (thanks!) applied, I do get what I suspect are the expected
changes:

'pr.f90.005t.original':

-  #pragma omp target
+  #pragma omp target has_device_addr(a) has_device_addr(b)

'pr.f90.006t.gimple':

-  #pragma omp target num_teams(0) thread_limit(0) firstprivate(m)
map(tofrom:*b [len: D.4283][implicit]) map(alloc:b [pointer assign, bias: 0])
map(tofrom:*a [len: D.4280][implicit]) map(alloc:a [pointer assign, bias: 0])
+  #pragma omp target num_teams(0) thread_limit(0) has_device_addr(a)
has_device_addr(b) firstprivate(D.4283) firstprivate(D.4280) firstprivate(m)

..., and my original test case behaves as expected; OpenMP/Fortran
'has_device_addr' works.

[Bug fortran/85364] Fortran '-fopenmp'/'-fopenacc' should not put array in program on the stack

2023-01-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85364

Thomas Schwinge  changed:

   What|Removed |Added

   Keywords||openacc
   Last reconfirmed|2018-04-13 00:00:00 |2023-1-27
Summary|-fopenmp should not put |Fortran
   |array in program on the |'-fopenmp'/'-fopenacc'
   |stack   |should not put array in
   ||program on the stack
 CC||burnus at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #7 from Thomas Schwinge  ---
This issue is likewise applicable to '-fopenacc'; see commit
r11-5571-gf4e7ea81d1369d4d6cb6d8e440aefb3407142e05 "Fortran: -fno-automatic and
-fopenacc / recusion check cleanup".

[Bug sanitizer/108106] [13 Regression] /usr/bin/ld: .libs/hwasan_setjmp_x86_64.o: relocation R_X86_64_PC32 against symbol `__interceptor_sigsetjmp' can not be used when making a shared object; recompi

2023-02-01 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108106

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hjl at gcc dot gnu.org
 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   See Also||https://github.com/llvm/llv
   ||m-project/issues/60426

--- Comment #12 from Thomas Schwinge  ---
Thanks, H.J. Lu!

[Bug rtl-optimization/108713] New: ICE during RTL pass: into_cfglayout for x86_64-pc-linux-gnu '-m32', C++ 'libgomp.c-c++-common/for-11.c'

2023-02-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108713

Bug ID: 108713
   Summary: ICE during RTL pass: into_cfglayout for
x86_64-pc-linux-gnu '-m32', C++
'libgomp.c-c++-common/for-11.c'
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

With GCC sources based on 2023-01-27 commit
84eb39556cc8449e04b5f48bd5c131941a7a2529, with a bunch of local OMP changes on
top (but those shouldn't be touching the relevant area of code), standard
bootstrap build, I've observed an ICE as follows on our x86_64-pc-linux-gnu
testing system amd_ryzen1, in routine libgomp testing for '-m32':

[...]
spawn -ignore SIGHUP gcc -x c++
../source-gcc/libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c
-m32 -foffload-options=amdgcn-amdhsa=-march=gfx900
-I../source-gcc/libgomp/testsuite/../../include
-I../source-gcc/libgomp/testsuite/.. -fmessage-length=0
-fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -O2 -lstdc++ -lm
-o ./for-11.exe
during RTL pass: into_cfglayout
In file included from
../source-gcc/libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-1.h:13,
 from
../source-gcc/libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c:114,
 from
../source-gcc/libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c:4:
   
../source-gcc/libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.h: In
function 'f34_ttdpfs_ds128_auto() [clone ._omp_fn.1]':
   
../source-gcc/libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.h:520:17:
internal compiler error: Segmentation fault
0x16a482f crash_signal
[...]/source-gcc/gcc/toplev.cc:314
0x268a3e2 compact_blocks()
[...]/source-gcc/gcc/cfg.cc:182
0x26951cf cleanup_cfg(int)
[...]/source-gcc/gcc/cfgcleanup.cc:3131
0x11b682a execute
[...]/source-gcc/gcc/cfgrtl.cc:3703
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
compiler exited with status 1
FAIL: libgomp.c++/../libgomp.c-c++-common/for-11.c (internal compiler
error: Segmentation fault)
[...]

Nothing interesting in 'dmesg', as far as I can tell.

'gcc/cfgrtl.cc':

3693 class pass_into_cfg_layout_mode : public rtl_opt_pass
3694 {
3695 public:
3696   pass_into_cfg_layout_mode (gcc::context *ctxt)
3697     : rtl_opt_pass (pass_data_into_cfg_layout_mode, ctxt)
3698   {}
3699
3700   /* opt_pass methods: */
3701   unsigned int execute (function *) final override
3702     {
3703       cfg_layout_initialize (0);

'gcc/cfgcleanup.cc':

3109 bool
3110 cleanup_cfg (int mode)
3111 {
[...]
3131   compact_blocks ();

'gcc/cfg.cc':

167 void
168 compact_blocks (void)
169 {
170   int i;
171
172   SET_BASIC_BLOCK_FOR_FN (cfun, ENTRY_BLOCK, ENTRY_BLOCK_PTR_FOR_FN (cfun));
173   SET_BASIC_BLOCK_FOR_FN (cfun, EXIT_BLOCK, EXIT_BLOCK_PTR_FOR_FN (cfun));
174
175   if (df)
176     df_compact_blocks ();
177   else
178     {
179       basic_block bb;
180
181       i = NUM_FIXED_BLOCKS;
182       FOR_EACH_BB_FN (bb, cfun)
183         {
184           SET_BASIC_BLOCK_FOR_FN (cfun, i, bb);
185           bb->index = i;
186           i++;
187         }

This is the first/only time I've seen this; the ICE doesn't reproduce now in a
few manual re-invocations.  Maybe just "cosmic rays"...

A run with '-wrapper valgrind' did find a lot of stuff in IRA and LRA, but it's
a GCC build without '--enable-valgrind-annotations', so I'm not sure what that
means.

[...]
==24856== Conditional jump or move depends on uninitialised value(s)
==24856==at 0x14B18E5: mark_pseudo_regno_live(int) (in
[...]/install/libexec/gcc/x86_64-pc-linux-gnu/13.0.1/cc1plus)
==24856==by 0x14B3260: process_bb_node_lives(ira_loop_tree_node*) (in
[...]/install/libexec/gcc/x86_64-pc-linux-gnu/13.0.1/cc1plus)
==24856==by 0x1491F28: ira_traverse_loop_tree(bool,
ira_loop_tree_node*, void (*)(ira_loop_tree_node*), void
(*)(ira_loop_tree_node*)) (in
[...]/install/libexec/gcc/x86_64-pc-linux-gnu/13.0.1/cc1plus)
==24856==by 0x14B45BF: ira_create_allocno_live_ranges() (in
[...]/install/libexec/gcc/x86_64-pc-linux-gnu/13.0.1/cc1plus)
==24856==by 0x1496424: ira_build() (in
[...]/install/libexec/gcc/x86_64-pc-linux-gnu/13.0.1/cc1plus)
==24856==by 0x148DB0E: (anonymous
namespace)::pass_ira::execute(function*) (in

[Bug rtl-optimization/108713] ICE during RTL pass: into_cfglayout for x86_64-pc-linux-gnu '-m32', C++ 'libgomp.c-c++-common/for-11.c'

2023-02-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108713

--- Comment #2 from Thomas Schwinge  ---
(In reply to Jakub Jelinek from comment #1)
> If you can reproduce it with vanilla trunk, it is worth it, sure, but
> without a reliable reproducer there isn't much to do.

I'll attempt to reproduce with clean sources, and Valgrind enabled.

> Is the ICE in an
> offload compiler or on the host?

Host; as of "Remove support for Intel MIC offloading", there are no offloading
configurations anymore for '-m32'...  ;-\

[Bug c++/87656] Useful flags to enable with -Wall or -Wextra

2023-02-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87656

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #19 from Thomas Schwinge  ---
(In reply to David Binderman from comment #6)
> I'd like to vote for -Wduplicated-cond being in either -Wextra or -Wall.
> 
> [...] it is proving useful in finding bugs [...]

Generally ACK, but note that '-Wduplicated-cond' was once in '-Wall' but was
later removed again; see PR67819.

[Bug c/108753] New: '-Wduplicated-cond' doesn't diagnose duplicated subexpressions

2023-02-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108753

Bug ID: 108753
   Summary: '-Wduplicated-cond' doesn't diagnose duplicated
subexpressions
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: mpolacek at gcc dot gnu.org
  Target Milestone: ---

Created attachment 54448
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54448&action=edit
pr.c

Shouldn't '-Wduplicated-cond' be able to diagnose the XFAILed duplicated
subexpressions?  (In the attached 'pr.c', 'f2' is reduced from real-world
code.)

This works:

if (a == 5) // { dg-note {previously used here} }
  return 30;
else if (a == 5) // { dg-warning {duplicated 'if' condition} }
  return 40;

..., but this and similar ones don't:

if (a == 5) // { dg-note {previously used here} TODO { xfail *-*-* } }
  return 30;
else if (a == 5 // { dg-warning {duplicated 'if' condition} TODO { xfail *-*-* } }
 || a == 6)
  return 40;
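
For reference, the two cases above can be put into one compilable file.  This
is only a sketch: the function names 'dup_plain' and 'dup_subexpr' are
hypothetical and are not taken from the attached 'pr.c', and the DejaGnu
directives are left out.

```c
/* Hypothetical reduction; names are not from the attached 'pr.c'.  */

/* The whole 'else if' condition repeats the first one: per the report,
   '-Wduplicated-cond' diagnoses this.  */
int
dup_plain (int a)
{
  if (a == 5)
    return 30;
  else if (a == 5)
    return 40;
  return 0;
}

/* Only a subexpression of the 'else if' condition repeats the first one:
   per the report, this is not diagnosed.  */
int
dup_subexpr (int a)
{
  if (a == 5)
    return 30;
  else if (a == 5
           || a == 6)
    return 40;
  return 0;
}
```

Compiling this with 'gcc -Wduplicated-cond -c' should, going by the report,
warn for 'dup_plain' only.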

[Bug translation/108890] Translation mistakes 2023

2023-02-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108890

Thomas Schwinge  changed:

   What|Removed |Added

   See Also||https://github.com/Rust-GCC
   ||/gccrs/issues/1916
 CC||gcc-rust at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #2 from Thomas Schwinge  ---
I've filed this as a *good-first-pr* issue,
'Diagnostics and option help texts: GCC PR108890 "Translation mistakes 2023"'.

[Bug libgomp/108895] [13.0.1 (exp)] Fortran + gfx90a !$acc update device produces a segfault.

2023-02-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108895

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed||2023-02-23
 CC||burnus at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Thomas Schwinge  ---
Thanks for the detailed report.

(In reply to Henry Le Berre from comment #0)
> > program p_main
> > 
> > real(kind(0d0)), allocatable, dimension(:) :: arrs
> > !$acc declare create(arrs)
> > 
> > allocate(arrs(1000))
> > !$acc enter data create(arrs(1000))
> > !$acc update device(arrs(1:1000))
> > 
> > end program
> 
> Compiled with:
> 
> > gfortran -g -fopenacc -foffload-options=-march=gfx90a sample.f90 -o sample
> 
> Produces:
> 
> > [hberre3@102:instinct]:gcc-acc-test $ ./sample
> > 
> > Program received signal SIGSEGV: Segmentation fault - invalid memory 
> > reference.
> >
> > Backtrace for this error:
> > [...]

Confirmed for GCN offloading.

For nvptx offloading, we get "similarly":

libgomp: cuMemGetAddressRange_v2 error: named symbol not found

libgomp: Copying of host object [0x10b6090..0x10b7fd0) to dev object
[0x7fb75effe2c8..0x7fb75f000208) failed

For nvfortran 23.1-0, there's no execution failure.

> Observations:
> 
> 1) If the length/size of the array were smaller (say 10 or 100) no
> segmentation fault is observed, possibly indicating silent R/W operations to
> memory we don't own.
> 
> 2) On ORNL Summit's GCC 8.3.1 (nvptx), this sample does not produce a
> segfault.

Can't really recommend using the five-year-old GCC 8 for code offloading anymore,
but as a cross-check that's been valuable, of course.

I see this execution failure also for GCC 12, 11, 10 -- don't have 9, 8 builds
around anymore.  Not sure if it's really a regression, or if this GCC 8 just
didn't run into this for other reasons.

However, I see no execution failure for devel/omp/gcc-12, devel/omp/gcc-11,
devel/omp/gcc-10 branches nvptx offloading builds.

This may be due to different code paths being taken as the latter branches
contain preliminary support for OpenACC "Changes from Version 2.0 to 2.5": "The
'declare create' directive with a Fortran 'allocatable' has new behavior",
which GCC release branches and master branch don't support yet.  (As I also
mentioned in your PR106643 "[gfortran + OpenACC] Allocate in module causes
refcount error", for example.)  However, that's no excuse for the execution
failure seen here, of course.

Will need to investigate.

> 3) If I translate this sample to C, no matter how large the array is, a
> segfault is not produced.

Very likely it's due to some Fortran-specific code paths.

[Bug libgomp/90596] 'GOACC_parallel_keyed' should use 'GOMP_MAP_VARS_TARGET'

2023-03-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90596

Thomas Schwinge  changed:

   What|Removed |Added

   Target Milestone|--- |13.0
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |tschwinge at gcc dot 
gnu.org

--- Comment #2 from Thomas Schwinge  ---
.

[Bug target/100001] [GCN offloading] Occasional 'libgomp.oacc-c-c++-common/static-variable-1.c' execution failure

2023-03-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100001

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
Summary|[GCN offloading] Occasional |[GCN offloading] Occasional
   |C++ |'libgomp.oacc-c-c++-common/
   |'libgomp.oacc-c-c++-common/ |static-variable-1.c'
   |static-variable-1.c'|execution failure
   |execution failure   |
 Ever confirmed|0   |1
   Last reconfirmed||2023-03-14

--- Comment #1 from Thomas Schwinge  ---
At some point, the 'libgomp.oacc-c-c++-common/static-variable-1.c' execution
test also started FAILing for C, not just C++, as initially noted above.

[Bug target/100001] [GCN offloading] Occasional 'libgomp.oacc-c-c++-common/static-variable-1.c' execution failure

2023-03-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100001

--- Comment #2 from Thomas Schwinge  ---
With my recent commit r13-6590-gf8332e52a498df480f72303de32ad0751ad899fe "Use
'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]", the frequency
of those FAILs has increased to (almost) always.

[Bug libgomp/93030] [OpenACC] libgomp.oacc-c-c++-common/deep-copy-10.c FAILS on AMDGCN – invalid 'async' usage?

2023-03-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93030

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed|2021-07-27 00:00:00 |2023-3-14
   Keywords||openacc

--- Comment #3 from Thomas Schwinge  ---
With my recent commit r13-6590-gf8332e52a498df480f72303de32ad0751ad899fe "Use
'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]", the frequency
of those FAILs has increased.


(In reply to Thomas Schwinge from comment #2)
> See  for 
> the latest discussion on this one.

That I still need to look into.

[Bug libgomp/109124] New: 'libgomp.oacc-c-c++-common/data-2-lib.c', 'libgomp.oacc-c-c++-common/data-2.c' GCN offloading execution test FAILs

2023-03-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109124

Bug ID: 109124
   Summary: 'libgomp.oacc-c-c++-common/data-2-lib.c',
'libgomp.oacc-c-c++-common/data-2.c' GCN offloading
execution test FAILs
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, jakub at gcc dot gnu.org, jules at 
gcc dot gnu.org
  Target Milestone: ---

With my recent commit r13-6590-gf8332e52a498df480f72303de32ad0751ad899fe "Use
'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]", the
'libgomp.oacc-c-c++-common/data-2-lib.c', 'libgomp.oacc-c-c++-common/data-2.c'
GCN offloading execution tests started FAILing (almost) always.  (Exception:
(default) '-march=fiji', but probably rather due to differences in
hardware/parallelism capabilities?)


There's been some discussion about those test cases in
"Improve async serialize implementation for AMD GCN libgomp plugin" that I need
to look into.

[Bug middle-end/109128] [Offload][OpenMP][OpenACC] Static linking with unused offload function will lead to mismatch number of offload fn/symbols

2023-03-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109128

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed||2023-03-14
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
 CC||tschwinge at gcc dot gnu.org

--- Comment #1 from Thomas Schwinge  ---
See also
"Allow the accelerator to have more offloaded functions than the host".

[Bug target/100657] [GCN offloading] 'libgomp.c-c++-common/reduction-6.c' execution times out

2022-02-21 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100657

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #1 from Thomas Schwinge  ---
Same timeouts then also seen for 'libgomp.c-c++-common/reduction-5.c', again
only for gfx908.

But now, I'm happy to report that these issues disappeared after we got the
gfx908 GPU cards replaced, after they'd generally gotten very unreliable.

[Bug tree-optimization/104717] [9/10/11/12 Regression] ICE: verify_ssa failed (Error: type mismatch between an SSA_NAME and its symbol)

2022-02-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104717

--- Comment #2 from Thomas Schwinge  ---
Reproduced, thanks for the report.


The problem disappears when adding '-fno-ipa-pta' to '-O1 -fopenacc
-fstack-arrays'.  Not yet analyzed the differences.


The problem does not reproduce with '-O1 -fopenmp -fipa-pta -fstack-arrays' and
code changes as per:

-!$acc parallel copyout(array)
+!$omp target map(from:array)
 array = [(-i, i = 1, nn)]
-!$acc end parallel
+!$omp end target

..., so really points to something specific to OpenACC handling.

[Bug middle-end/104757] [12 Regression] ICE (segfault) GIMPLE pass: walloca - in gimple_range_global

2022-03-02 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104757

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-03-02
 CC||tschwinge at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Thomas Schwinge  ---
Confirmed ever since commit 48c6cac9caea1dc7c5f50ad3a736f6693e74a11b
"Fortran/openmp: Fix '!$omp end'", but only in an offloading-enabled build, not
in one without offloading.  (Specifically nvptx offloading, as you've now
determined.)

[Bug middle-end/104757] [12 Regression][OpenMP] ICE in GIMPLE pass: walloca - in gimple_range_global / segfault as SSA_NAME_DEF_STMT is NULL for 'if' clause arg

2022-03-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104757

Thomas Schwinge  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Thomas Schwinge  ---
(In reply to Jakub Jelinek from comment #12)
> not closing this just yet until confirmed it works for offloading compiler 
> too.

ACK, thanks.  New test cases PASS, and additionally the expected:

[-FAIL: gfortran.dg/gomp/clauses-1.f90   -O  (internal compiler error:
Segmentation fault)-]
[-FAIL:-]{+PASS:+} gfortran.dg/gomp/clauses-1.f90   -O  (test for excess
errors)

[Bug middle-end/104774] New: OpenACC 'kernels' decomposition: internal compiler error: 'verify_gimple' failed, with 'loop' with explicit 'seq' or 'independent'

2022-03-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104774

Bug ID: 104774
   Summary: OpenACC 'kernels' decomposition: internal compiler
error: 'verify_gimple' failed, with 'loop' with
explicit 'seq' or 'independent'
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code, openacc
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: tschwinge at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Another one in addition to PR104132, PR104133, that is *not* fixed by my WIP
patches for those:

int arr_0;

void
foo (void)
{
#pragma acc kernels
  {
    int k;

#pragma acc loop seq
    for (k = 0; k < 2; k++)
      arr_0 = k;

#pragma acc loop independent reduction(+: arr_0)
    for (k = 0; k < 2; k++)
      arr_0 += k;
  }
}

With '-fopenacc --param openacc-kernels=decompose -O0 -g0' (so, not involving
'GIMPLE_DEBUG's), for both C and C++, we run into:

pr.c: In function ‘foo._omp_fn.0’:
pr.c:18:1: error: non-register as LHS of binary operation
   18 | }
  | ^
# .MEM_21 = VDEF <.MEM_3>
k = 0 + .offset.24_2;
pr.c:18:1: error: invalid RHS for gimple memory store: ‘var_decl’
*_23;

k

# .MEM_24 = VDEF <.MEM_21>
*_23 = k;
pr.c:18:1: error: non-register as LHS of binary operation
# .MEM_27 = VDEF <.MEM_4>
k = 0 + 2;
during GIMPLE pass: ssa
pr.c:18:1: internal compiler error: verify_gimple failed

Only with both 'loop's changed to implicit or explicit 'auto' (that is, both
'seq' and 'independent' removed) does compilation succeed (with my PR104132,
PR104133 WIP patches applied).
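
As a sketch of the variant described above, assuming implicit 'auto' (neither
'seq' nor 'independent' given): the function name 'foo_auto' is hypothetical,
and this is not a test case taken from the PR.

```c
int arr_0;

/* Hypothetical variant of the reproducer: both 'loop' directives are left
   without 'seq' or 'independent', so they are implicitly 'auto'.  */
void
foo_auto (void)
{
#pragma acc kernels
  {
    int k;

#pragma acc loop
    for (k = 0; k < 2; k++)
      arr_0 = k;

#pragma acc loop reduction(+: arr_0)
    for (k = 0; k < 2; k++)
      arr_0 += k;
  }
}
```

Without '-fopenacc', the pragmas are ignored and both loops simply run
sequentially on the host, which is a quick way to check the intended result
independently of the 'kernels' decomposition.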

[Bug middle-end/104784] New: OpenACC 'kernels' decomposition: C vs. C++ differences

2022-03-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104784

Bug ID: 104784
   Summary: OpenACC 'kernels' decomposition: C vs. C++ differences
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: openacc
  Severity: minor
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

We have found cases where OpenACC 'kernels' decomposition handles C vs. C++
differently.  That's not a problem per se, but it causes different diagnostics
(if enabled), and we'd generally at least like to understand the reason for the
differences.

I'll push a commit demonstrating this for a few test cases, pointing to this
PR.

[Bug middle-end/104132] OpenACC 'kernels' decomposition: internal compiler error: verify_gimple failed, error: non-register as LHS of binary operation

2022-03-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104132

Thomas Schwinge  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=100280
 Resolution|--- |FIXED

--- Comment #4 from Thomas Schwinge  ---
(In reply to myself from comment #1)
> The following fix fell out of me analyzing PR102330:
> 
> --- gcc/omp-expand.cc
> +++ gcc/omp-expand.cc
> @@ -7776,7 +7776,9 @@ expand_oacc_for (struct omp_region *region, struct 
> omp_for_data *fd)
>  
>expr = build2 (plus_code, iter_type, b,
>  fold_convert (plus_type, offset));
> -  expr = force_gimple_operand_gsi (&gsi, expr, false, NULL_TREE,
> +  expr = force_gimple_operand_gsi (&gsi, expr,
> +  DECL_P (v) && TREE_ADDRESSABLE (v),
> +  NULL_TREE,
>true, GSI_SAME_STMT);
>ass = gimple_build_assign (v, expr);
>gsi_insert_before (&gsi, ass, GSI_SAME_STMT);
> @@ -7966,7 +7968,9 @@ expand_oacc_for (struct omp_region *region, struct 
> omp_for_data *fd)
>expr = fold_build2 (TRUNC_DIV_EXPR, diff_type, expr, s);
>expr = fold_build2 (MULT_EXPR, diff_type, expr, s);
>expr = build2 (plus_code, iter_type, b, fold_convert (plus_type, 
> expr));
> -  expr = force_gimple_operand_gsi (&gsi, expr, false, NULL_TREE,
> +  expr = force_gimple_operand_gsi (&gsi, expr,
> +  DECL_P (v) && TREE_ADDRESSABLE (v),
> +  NULL_TREE,
>true, GSI_SAME_STMT);
>ass = gimple_build_assign (v, expr);
>gsi_insert_before (&gsi, ass, GSI_SAME_STMT);
> 
> ... as used in a number of other similar places (when they're not using
> 'expand_omp_build_assign').
> 
> That change seems correct, right?  (Jakub?)

Such a change actually wasn't necessary, with the underlying problem being
fixed earlier in the pass pipeline.

[Bug middle-end/104133] OpenACC 'kernels' decomposition: internal compiler error: 'verify_gimple' failed, error: invalid operands in binary operation

2022-03-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104133

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |tschwinge at gcc dot 
gnu.org
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=100280

--- Comment #3 from Thomas Schwinge  ---
.

[Bug other/104791] [12 regression] libgomp.oacc-c++/../libgomp.oacc-c-c++-common/kernels-decompose-1.c fails

2022-03-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104791

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |tschwinge at gcc dot 
gnu.org
   Last reconfirmed||2022-03-04
 CC||tschwinge at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from Thomas Schwinge  ---
Mine, obviously, sorry.  Can you please tell what the excess errors are?

[Bug other/104791] [12 regression] libgomp.oacc-c++/../libgomp.oacc-c-c++-common/kernels-decompose-1.c fails

2022-03-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104791

Thomas Schwinge  changed:

   What|Removed |Added

   Keywords||openacc

--- Comment #2 from Thomas Schwinge  ---
OK, I now see:

   
[...]/source-gcc/libgomp/testsuite/libgomp.oacc-c/../libgomp.oacc-c-c++-common/kernels-decompose-1.c:38:9:
note: variable 'f1.1' declared in block isn't candidate for adjusting OpenACC
privatization level: not addressable
   
[...]/source-gcc/libgomp/testsuite/libgomp.oacc-c/../libgomp.oacc-c-c++-common/kernels-decompose-1.c:38:9:
note: variable 'f1.2' declared in block isn't candidate for adjusting OpenACC
privatization level: not addressable

(Why didn't I see that in my testing, huh.)

[Bug testsuite/104791] [12 regression] libgomp.oacc-c++/../libgomp.oacc-c-c++-common/kernels-decompose-1.c fails

2022-03-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104791

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
  Component|other   |testsuite
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=100280

--- Comment #4 from Thomas Schwinge  ---
Fixed, sorry.

[Bug middle-end/102330] [12 Regression] ICE in expand_gimple_stmt_1, at cfgexpand.c:3932 since r12-980-g29a2f51806c

2022-03-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102330

Thomas Schwinge  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=90115
  Component|fortran |middle-end
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #14 from Thomas Schwinge  ---
.

[Bug middle-end/104774] OpenACC 'kernels' decomposition: internal compiler error: 'verify_gimple' failed, with 'loop' with explicit 'seq' or 'independent'

2022-03-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104774

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||jules at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=102330,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=90115
 Resolution|--- |FIXED

--- Comment #3 from Thomas Schwinge  ---
.

[Bug middle-end/90115] OpenACC: predetermined private levels for variables declared in blocks

2022-03-10 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90115

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed||2022-03-10
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #15 from Thomas Schwinge  ---
(In reply to Eric Gallager from comment #12)
> So, is it fixed now?

No, more work to be done here.

[Bug middle-end/104086] ICE in lower_omp_target, at omp-low.c:13075

2022-03-12 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104086

Thomas Schwinge  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
   Assignee|unassigned at gcc dot gnu.org  |tschwinge at gcc dot 
gnu.org
  Component|tree-optimization   |middle-end
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #3 from Thomas Schwinge  ---
.

[Bug middle-end/104892] New: OpenACC 'kernels' decomposition: wrong-code cases unless manually making certain variables addressable

2022-03-12 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104892

Bug ID: 104892
   Summary: OpenACC 'kernels' decomposition: wrong-code cases
unless manually making certain variables addressable
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: openacc, wrong-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: tschwinge at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

Even with the PR100280 etc. ICEs fixed, we still have the problem that we
generate wrong code in certain scenarios.  For example, see the remark in
current 'libgomp.oacc-c-c++-common/kernels-decompose-1.c' that "Without making
'[...]' addressable, [...] we will not see the expected value copied out".
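
To illustrate what "making a variable addressable" can look like, here is a
minimal sketch.  It assumes that taking the variable's address is the
workaround meant by the remark; the function name 'kernels_copyout' and the
value 42 are hypothetical, and this is not the content of
'kernels-decompose-1.c'.

```c
/* Hypothetical sketch; not the contents of 'kernels-decompose-1.c'.  */
int
kernels_copyout (void)
{
  int a = 0;

  /* Assumption: taking the address of 'a' is one way to make the variable
     addressable before it is used in the 'kernels' region.  */
  (void) (volatile int *) &a;

#pragma acc kernels
  {
    a = 42;
  }

  /* The caller can compare this against the value assigned in the region.  */
  return a;
}
```

On the host, without '-fopenacc', the function just returns the value assigned
in the block; the wrong-code cases concern whether the same value is seen after
offloaded execution when the address-taking statement is removed.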
