https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104815
--- Comment #1 from Tom de Vries ---
With the tentative patch, I'm running into:
...
ptxas 2224-1.o, line 72; error : Result discard mode is not allowed for
instruction 'ld'
nvptx-as: ptxas terminated with signal 11 [Segmentation fault], c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104840
Bug ID: 104840
Summary: [nvptx] Can't set predicable attribute to true
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #3 from Tom de Vries ---
Created attachment 52584
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52584&action=edit
Tentative patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104841
Bug ID: 104841
Summary: [nvptx] Multi-version ptx
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #4 from Tom de Vries ---
The patch I have works for target boards unix and unix/-foffload=-mptx=3.1, but
I run into the hang for --target_board=unix/-foffload=-misa=sm_75.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104857
Bug ID: 104857
Summary: [nvptx] Add macro specifying ptx isa version
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104857
--- Comment #1 from Tom de Vries ---
Created attachment 52592
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52592&action=edit
Tentative patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104758
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Status|REOPENED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104815
Tom de Vries changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104840
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
--- Comment #6 from Tom de Vries ---
Created attachment 52593
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52593&action=edit
Tentative patch
(In reply to Tom de Vries from comment #4)
> The patch I have works for target boards unix and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104879
Bug ID: 104879
Summary: [nvptx] Use .common directive (available starting ptx
isa version 5.0)
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
--- Comment #2 from Tom de Vries ---
Created attachment 52606
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52606&action=edit
Tentative patch
With this patch and:
- current trunk
- misa default set to sm_75 (so 3.1 multilib disabled, beca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
--- Comment #3 from Tom de Vries ---
With this additionally:
...
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 1a89c1bc77f..2e1a2dad9fe 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -968,7 +9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104893
Bug ID: 104893
Summary: [nvptx] Handle Independent Thread Scheduling for
sm_70+ with -msoft-stack
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: norm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
Bug ID: 104916
Summary: [nvptx] Handle Independent Thread Scheduling for
sm_70+ with -muniform-simt
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: no
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
--- Comment #1 from Tom de Vries ---
We could try the same solution as for atomic: predicate ld/st to only execute
in lane 0, and propagate ld result.
Another solution might be to wrap each ld/st in two bar.warp.sync.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104925
Bug ID: 104925
Summary: [nvptx] Use "%" as register prefix
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: trivial
Priority: P3
Component: target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104926
Bug ID: 104926
Summary: [nvptx] Use human-readable register names
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
--- Comment #2 from Tom de Vries ---
Created attachment 52629
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52629&action=edit
Attempt, runs into driver internal error
FTR, this is an attempt at a fix.
It does the "predicate ld/st to onl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104932
Bug ID: 104932
Summary: [nvptx] Subreg validation error for V2SI
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
--- Comment #3 from Tom de Vries ---
Anyway, having reread the Volta architecture whitepaper, I think it's ok
to use the solution I already found that does work (see PR104783): add a warp
sync at simt exit.
The tricky bit is that we rely
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104933
Bug ID: 104933
Summary: [nvptx] muniform-simt optimization: determine
inside/outside SIMT region at compile time
Product: gcc
Version: 12.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104893
--- Comment #1 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> The per-thread call stack is handled for .local memory by the CUDA driver.
>
> For the 'soft stack' that's not the case.
Hmm, actually there's .local memory used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104768
--- Comment #1 from Tom de Vries ---
Hmm, reading about it a bit more, it's more about enabling algorithms that were
not possible before, than about performance improvements.
So, we should aim at having test-cases, both openacc and openmp that
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104936
Bug ID: 104936
Summary: [nvptx] Handle weak decl/def distinction in common
code
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Prior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
--- Comment #4 from Tom de Vries ---
This:
...
$ cat alias.c
void __f ()
{
  __builtin_printf ("hello\n");
}
void f () __attribute__ ((alias ("__f")));
int
main (void)
{
  f ();
  return 0;
}
...
works fine at -O0 and -O1:
...
$ ./gcc.sh -O0 ./
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
Bug ID: 104957
Summary: [nvptx] Use .alias directive (available starting ptx
isa version 6.3)
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
--- Comment #1 from Tom de Vries ---
Created attachment 52636
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52636&action=edit
Tentative patch
Patch that I'm currently working on.
Adds -malias, off by default.
It's off by default becaus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
--- Comment #2 from Tom de Vries ---
So, what do we get after specifying -malias -mptx=6.3?
Alias attribute only for functions, not variables.
No support for weak alias (allowing this does compile, but we run into
execution fails in gcc.dg/glo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
--- Comment #3 from Tom de Vries ---
The OvO testsuite, when run at -O2 passes, because it inlines all .alias
instances.
But at -O0, it doesn't. With -foffload=-malias that's fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
Tom de Vries changed:
What|Removed |Added
Keywords||openmp
--- Comment #1 from Tom de Vries
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #2 from Tom de Vries ---
I think the problem can be seen already at omp-lower, in the body of the
butterfly loop.
Let's first look at what we have if we use reduction op '|':
...
D.2173 = .GOMP_SIMT_VF ();
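For readers unfamiliar with the term, a butterfly reduction can be sketched on the host as follows. This is only an illustration under stated assumptions: it assumes the common XOR-partner scheme, simulates the lanes with an array, and is not taken from omp-low.cc. The names `VF`, `combine_fn`, `combine_bit_or`, `combine_logical_or` and `butterfly_reduce` are hypothetical.

```c
#define VF 8 /* hypothetical SIMT vectorization factor, a power of two */

typedef unsigned (*combine_fn) (unsigned, unsigned);

/* Bitwise reduction operator '|'.  */
unsigned
combine_bit_or (unsigned a, unsigned b)
{
  return a | b;
}

/* Logical reduction operator '||', which only ever yields 0 or 1.  */
unsigned
combine_logical_or (unsigned a, unsigned b)
{
  return a || b;
}

/* Simulated butterfly reduction: in each round every lane combines its
   value with that of the lane whose index differs in exactly one bit.
   After log2 (VF) rounds every lane holds the combined result.  */
unsigned
butterfly_reduce (const unsigned in[VF], combine_fn combine)
{
  unsigned cur[VF], next[VF];

  for (int lane = 0; lane < VF; lane++)
    cur[lane] = in[lane];

  for (int offset = VF / 2; offset > 0; offset /= 2)
    {
      for (int lane = 0; lane < VF; lane++)
	next[lane] = combine (cur[lane], cur[lane ^ offset]);
      for (int lane = 0; lane < VF; lane++)
	cur[lane] = next[lane];
    }

  return cur[0];
}
```

The two combiners give different results as soon as an operand is something other than 0 or 1, so the choice of operator inside the loop body is visible in the final value.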
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #3 from Tom de Vries ---
Hmm, that seems to be actually due to:
...
if (sctx.is_simt)
  {
    if (!simt_lane)
      simt_lane = create_tmp_var (u
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #4 from Tom de Vries ---
This fixes it:
...
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index d932d74cb03..f2ac8f98e32 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -6734,7 +6734,21 @@ lower_rec_input_clauses (tree clauses, gi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #7 from Tom de Vries ---
Alternative fix that doesn't require fiddling with the 'code' var:
...
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index d932d74cb03..d0ddd4a6142 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -6734,7 +
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #8 from Tom de Vries ---
(In reply to Jakub Jelinek from comment #6)
> And yes, #c1 is valid.
Thanks for confirming.
> But would be nice to have similar test with && and
> initial result = 2; and arr[] say { 1, 2, 3, 4, 5, 6, 7, ..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
--- Comment #9 from Tom de Vries ---
Created attachment 52647
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52647&action=edit
Tentative patch with test-cases, rationale and changelog
I'll put this through testing, and submit if no proble
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
--- Comment #1 from Tom de Vries ---
Can't reproduce.
Is this not fixed by:
...
commit 7862f6ccd85a001e4d70abb00bb95d8c7846ba80
Author: Tom de Vries
Date: Wed Feb 23 09:33:33 2022 +0100
[nvptx] Fix dummy location in gen_comment
...
?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
--- Comment #2 from Tom de Vries ---
(In reply to Tom de Vries from comment #1)
> Can't reproduce.
>
> Is this not fixed by:
> ...
> commit 7862f6ccd85a001e4d70abb00bb95d8c7846ba80
> Author: Tom de Vries
> Date: Wed Feb 23 09:33:33 2022 +010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
--- Comment #3 from Tom de Vries ---
(In reply to Tom de Vries from comment #2)
> (In reply to Tom de Vries from comment #1)
> > Can't reproduce.
> >
> > Is this not fixed by:
> > ...
> > commit 7862f6ccd85a001e4d70abb00bb95d8c7846ba80
> > Auth
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
--- Comment #4 from Tom de Vries ---
This ( https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591912.html )
proposed patch fixes this ICE, pinged again.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
Tom de Vries changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
Tom de Vries changed:
What|Removed |Added
Severity|normal |enhancement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
--- Comment #6 from Tom de Vries ---
(In reply to Tom de Vries from comment #5)
> This patch fixes the ICE at openmp level:
> ...
> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
> index 139a0de6100..19af384c634 100644
> --- a/gcc/gimplify.cc
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104952
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104968
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104991
Bug ID: 104991
Summary: [nvptx] Simplify muniform-simt transformation
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Componen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104936
Tom de Vries changed:
What|Removed |Added
Severity|normal |enhancement
Keywords|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105001
--- Comment #1 from Tom de Vries ---
Interesting.
Can you compare dump files to see where the difference comes from?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105011
Bug ID: 105011
Summary: [nvptx] FAIL: gcc.dg/atomic/stdatomic-flag-2.c -O1
execution test
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105011
--- Comment #1 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> It should probably do something like:
> ...
> return (woldval & wval) != 0;
> ...
Indeed, that fixes the FAILs.
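To illustrate why the masked comparison matters, here is a minimal, non-atomic sketch. It assumes the flag byte lives inside a wider word at some bit offset, and that the failing variant compared the whole old word against zero; both are assumptions, since the snippet above does not show the surrounding code. The names `set_flag_in_word`, `tas_whole_word`, `tas_masked` and the `shift` parameter are hypothetical, and this is not the libatomic implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the word-sized update: set the flag bit at
   bit offset SHIFT inside *WPTR, store the mask in *WVAL_OUT and return
   the previous value of the word.  Not atomic; illustration only.  */
uint32_t
set_flag_in_word (uint32_t *wptr, unsigned shift, uint32_t *wval_out)
{
  uint32_t wval = (uint32_t) 1 << shift; /* assumed "true" value of 1 */
  uint32_t woldval = *wptr;

  *wptr = woldval | wval;
  *wval_out = wval;
  return woldval;
}

/* Assumed faulty variant: any non-zero neighbouring byte in the word
   makes the flag look as if it had already been set.  */
bool
tas_whole_word (uint32_t *wptr, unsigned shift)
{
  uint32_t wval;
  uint32_t woldval = set_flag_in_word (wptr, shift, &wval);

  return woldval != 0;
}

/* Variant using the expression from the comment: only the flag's own
   bit in the old word is inspected.  */
bool
tas_masked (uint32_t *wptr, unsigned shift)
{
  uint32_t wval;
  uint32_t woldval = set_flag_in_word (wptr, shift, &wval);

  return (woldval & wval) != 0;
}
```

With a word whose neighbouring byte is non-zero but whose flag is clear, `tas_whole_word` reports the flag as previously set while `tas_masked` does not.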
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105011
--- Comment #2 from Tom de Vries ---
Even better:
...
diff --git a/libatomic/tas_n.c b/libatomic/tas_n.c
index d0d8c283b495..65eaa7753a51 100644
--- a/libatomic/tas_n.c
+++ b/libatomic/tas_n.c
@@ -73,7 +73,7 @@ SIZE(libat_test_and_set) (UTYPE *m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
Bug ID: 105014
Summary: [nvptx] FAIL: gcc.dg/pr97459-1.c execution test
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
--- Comment #1 from Tom de Vries ---
First FAIL minimizes to:
...
typedef __uint128_t T;
union u {
  T t;
  struct {
    unsigned long long x;
    unsigned long long y;
  } xy;
};
#define PRINT(VAR) \
do
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105016
Bug ID: 105016
Summary: [libgcc, TARGET_HAS_NO_HW_DIVIDE] Incorrect result for
__udivmodti4
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105016
--- Comment #1 from Tom de Vries ---
Created attachment 52662
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52662&action=edit
test-case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105016
--- Comment #3 from Tom de Vries ---
In libgcc.h, I see:
...
#define __udivmoddi4 __NDW(udivmod,4)
...
and for LIBGCC2_UNITS_PER_WORD == 8 we have:
...
#define __NDW(a,b) __ ## a ## ti ## b
...
So, AFAICT it's possible that __udivmoddi4
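The effect of the token pasting can be checked in isolation with a small sketch. The macro `NDW_TI` is a hypothetical local copy of the macro shape quoted above, and `STR`, `STR_` and `expanded_name` are helper names invented for this illustration; none of them come from libgcc.

```c
/* Hypothetical local copy of the quoted macro shape for the
   LIBGCC2_UNITS_PER_WORD == 8 case.  */
#define NDW_TI(a, b) __ ## a ## ti ## b

/* Two-level stringification, so that the argument is macro-expanded
   before it is turned into a string literal.  */
#define STR_(x) #x
#define STR(x) STR_ (x)

/* Return the identifier that NDW_TI (udivmod, 4) expands to, as text.  */
const char *
expanded_name (void)
{
  return STR (NDW_TI (udivmod, 4));
}
```

Under these assumptions the pasted identifier is `__udivmodti4`, that is, the `di` name from the `#define` ends up referring to the `ti` function.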
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104925
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104957
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
Tom de Vries changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98215
Tom de Vries changed:
What|Removed |Added
Severity|normal |enhancement
Keywords|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97102
Tom de Vries changed:
What|Removed |Added
Target Milestone|--- |12.0
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
Bug 97106 depends on bug 97102, which changed state.
Bug 97102 Summary: [nvptx] PTX JIT compilation failed when using aliases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97102
What|Removed |Added
--
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106
Tom de Vries changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105018
Bug ID: 105018
Summary: [nvptx] Need better alias support
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105018
--- Comment #1 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> Aliases to aliases are not supported (see libgomp.c-c++-common/pr96390.c).
> This is currently not prohibited by the compiler, but with the driver link we
> run in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105018
--- Comment #2 from Tom de Vries ---
As mentioned before by amonakov, a possibility is to add alias support to the
nvptx-tools linker, and use that.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
--- Comment #2 from Tom de Vries ---
(In reply to Tom de Vries from comment #0)
> On a quadro k2000 with driver 470.103.01, I run into:
So, sm_30.
> ...
> FAIL: gcc.dg/pr97459-1.c execution test
Reproduced on geforce gt710 (sm_35), with same
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
--- Comment #3 from Tom de Vries ---
(In reply to Tom de Vries from comment #2)
> (In reply to Tom de Vries from comment #0)
> > On a quadro k2000 with driver 470.103.01, I run into:
>
> So, sm_30.
>
> > ...
> > FAIL: gcc.dg/pr97459-1.c execut
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105019
Bug ID: 105019
Summary: [nvptx] malias in libgomp results in "Internal error:
reference to deleted section"
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Seve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105019
--- Comment #1 from Tom de Vries ---
To trigger:
...
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 87efc23bd96..8bf9ea90a77 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -245,6 +245,9 @@ def
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105019
--- Comment #2 from Tom de Vries ---
Aliases in libgomp.a:
...
$ grep "\.alias"
build-gcc-offload-nvptx-none/nvptx-none/mgomp/libgomp/.libs/libgomp.a
.alias gomp_ialias_GOMP_loop_runtime_next,GOMP_loop_runtime_next;
.alias gomp_ialias_GOMP_loop_
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105019
--- Comment #3 from Tom de Vries ---
Aliases in failing .exe:
...
$ strings declare_target-1.exe | grep "\.alias"
.alias gomp_ialias_GOMP_taskgroup_start,GOMP_taskgroup_start;
.alias gomp_ialias_GOMP_taskgroup_end,GOMP_taskgroup_end;
.alias
gomp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105019
--- Comment #4 from Tom de Vries ---
OK, I think this is the pattern:
...
$ cat gcc/testsuite/gcc.target/nvptx/alias-5.c
/* { dg-do link } */
/* { dg-do run { target runtime_ptx_isa_version_6_3 } } */
/* { dg-options "-save-temps -malias -mptx=6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105019
--- Comment #5 from Tom de Vries ---
Creating a CUDA example is hampered by the fact that there's no symbol alias
support, AFAICT.
I'd like to write something like:
...
__device__ void __foo ()
{
  printf ("__foo\n");
}
__device__ void foo () _
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105019
--- Comment #6 from Tom de Vries ---
(In reply to Tom de Vries from comment #4)
> OK, I think this is the pattern:
> ...
> $ cat gcc/testsuite/gcc.target/nvptx/alias-5.c
FTR, same thing if I use static functions:
...
static void __attribute__((
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105042
Bug ID: 105042
Summary: [libgomp, GOMP_NVPTX_JIT=-O0] Openacc testsuite
failures when X runs on nvidia driver
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Se
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105011
--- Comment #3 from Tom de Vries ---
Submitted fix:
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592211.html
Though without changelog, apparently.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
--- Comment #4 from Tom de Vries ---
(In reply to Tom de Vries from comment #1)
> With -O0 JIT instead:
Also OK with JIT -O1, problems start at JIT -O2.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105042
--- Comment #2 from Tom de Vries ---
(In reply to Richard Biener from comment #1)
> Doesn't whatever driver/library API we use from libgomp to invoke workloads
> report actual errors? Maybe we need to improve there.
Good point, it reported som
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105011
Tom de Vries changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Target Milestone|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
--- Comment #5 from Tom de Vries ---
Minimal test-case:
...
void __attribute__((noinline)) foo (unsigned long long d0) {
  unsigned long long __a;
  __a = 0x38;
  for (; __a > 0; __a -= 8)
    if (((d0 >> __a) & 0xff) != 0)
      break;
__bu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105042
--- Comment #3 from Tom de Vries ---
(In reply to Tom de Vries from comment #2)
> (In reply to Richard Biener from comment #1)
> > Doesn't whatever driver/library API we use from libgomp to invoke workloads
> > report actual errors? Maybe we ne
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105042
--- Comment #4 from Tom de Vries ---
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592275.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105042
--- Comment #5 from Tom de Vries ---
(In reply to Richard Biener from comment #1)
> Doesn't whatever driver/library API we use from libgomp to invoke workloads
> report actual errors? Maybe we need to improve there.
This:
...
libgomp: cuStream
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105075
Bug ID: 105075
Summary: [nvptx] Generate sad insn (sum of absolute
differences)
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105075
--- Comment #1 from Tom de Vries ---
Created attachment 52693
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52693&action=edit
Demonstrator patch
This patch adds:
- modeling of insn sad.u32 in the .md file
- peephole2 to generate it
(wh
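As a reference for the scalar case, the operation can be written in plain C as below. This is a sketch, assuming the usual definition of a 32-bit unsigned sum of absolute differences with an accumulator operand, d = c + |a - b|; the function name `sad_u32` is hypothetical and the code is not part of the patch.

```c
#include <stdint.h>

/* Reference semantics assumed for the insn: d = c + |a - b|, with the
   final addition wrapping modulo 2^32.  */
uint32_t
sad_u32 (uint32_t a, uint32_t b, uint32_t c)
{
  /* Subtract the smaller from the larger so the unsigned difference
     never wraps.  */
  uint32_t diff = a > b ? a - b : b - a;

  return c + diff;
}
```

A source-level expression of this shape is the kind of scalar pattern a peephole or combine pattern would need to recognize.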
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105075
--- Comment #2 from Tom de Vries ---
AFAIU, at gimple level support is limited to vectors, so that doesn't help to
generate the insn for the simple, scalar case.
It would be nice if combine could generate it and we wouldn't have to use a
peepho
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105075
Tom de Vries changed:
What|Removed |Added
Attachment #52693|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105014
--- Comment #6 from Tom de Vries ---
Reproducer filed at https://github.com/vries/nvidia-bugs/tree/master/shift-and
PR filed at nvidia ( https://developer.nvidia.com/nvidia_bug/3585290 ).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105075
--- Comment #6 from Tom de Vries ---
Created attachment 52698
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52698&action=edit
Demonstrator patch with stepping stone patterns for combine
(In reply to Tom de Vries from comment #2)
> Also,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105042
Tom de Vries changed:
What|Removed |Added
Severity|normal |enhancement
--- Comment #8 from Tom de V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104818
Tom de Vries changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81728
Tom de Vries changed:
What|Removed |Added
Resolution|--- |WORKSFORME
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81909
Tom de Vries changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53037
Tom de Vries changed:
What|Removed |Added
CC||vries at gcc dot gnu.org
--- Comment #43
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105096
Bug ID: 105096
Summary: --target-help not an alias for --help=target
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: trivial
Priority: P3
Component: dr