https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474
--- Comment #9 from Andrew Stubbs ---
This patch fixes the -O1 failure, for *this* testcase:
diff --git a/gcc/tree.cc b/gcc/tree.cc
index eccfcc89da40..4bfdb7a938e7 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -7085,11 +7085,8 @@ build_pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119369
--- Comment #5 from Andrew Stubbs ---
A post-linker could be included as part of the mkoffload process (or maybe we
could fix up the weak directives in the assembler as part of the pre-assembler
step we already have).
Either way, there's no mko
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119369
--- Comment #2 from Andrew Stubbs ---
We used to have work-arounds for ROCm runtime linker deficiencies, but these
were removed in 2020, as they were no longer necessary when we moved to
HSACOv3:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474
--- Comment #8 from Andrew Stubbs ---
This patch fixes the ICE and produces working code at -O2 and -O3:
diff --git a/gcc/omp-offload.cc b/gcc/omp-offload.cc
index da2b54b76485..1778a70bf755 100644
--- a/gcc/omp-offload.cc
+++ b/gcc/omp-offload
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474
--- Comment #6 from Andrew Stubbs ---
The address space has to be introduced "late" because it's done in the
accelerator compiler, so post-IPA. The pass is "oaccdevlow" (currently
no.103).
The address space is selected via the TARGET_GOACC_ADJ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474
--- Comment #1 from Andrew Stubbs ---
In the -O1 case, the problem seems to be that the "ivopts" pass has identified
an item-in-an-array-in-a-struct as the IV, and that struct is in a different
address space:
Type: REFERENCE ADDRESS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
--- Comment #20 from Andrew Stubbs ---
I tried the memcpy solution with the following testcase:
v2sf
smaller (v64sf in)
{
v2sf out = RESIZE_VECTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
--- Comment #17 from Andrew Stubbs ---
Oops, that has __to and __from backwards ... you get the idea.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
--- Comment #16 from Andrew Stubbs ---
Perhaps:
asm ("mov %0, %1" : "=v"(__from) : "v"(__to));
or maybe
asm ("; no-op cast %0" : "=v"(__from) : "0"(__to));
Is there a downside to that in the optimizer(s)?
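The second form relies on a matching constraint ("0") to tie the input to the
output register. A minimal sketch of that idiom follows, with the operands in
the intended direction (result as output, source as input, as the follow-up
comment notes). It assumes GCC or Clang extended asm; the "r" constraint and
the names opaque_copy and opaque_copy_demo are illustrative stand-ins, not
taken from the bug report. The template is left empty because ";" starts a
comment only in some assemblers.

```c
#include <assert.h>
#include <stdint.h>

/* Returns its argument unchanged, but routes it through an asm statement.
   "=r" makes OUT an output in a general register; "0" forces IN into the
   same register as operand 0, so no instruction is needed.
   Assumption: "r" stands in for the GCN-specific "v" constraint.  */
static inline uint32_t
opaque_copy (uint32_t in)
{
  uint32_t out;
  __asm__ ("" : "=r" (out) : "0" (in));
  return out;
}

/* Usage check: the value survives the round trip.  */
static inline void
opaque_copy_demo (void)
{
  assert (opaque_copy (42u) == 42u);
}
```

The compiler cannot see through the asm, so constant propagation and similar
optimizations stop at this point; that is the likely cost the question above
is asking about.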
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
--- Comment #10 from Andrew Stubbs ---
The libm vector routines are pretty much just the scalar routine translated
into vector extension statements wrapped in preprocessor macros.
They should be unaffected by the vectorizer (and most of the opt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286
--- Comment #3 from Andrew Stubbs ---
The RDNA consumer devices, such as gfx1100, support permute for V32 and
smaller, but not V64. Gather/scatter should be able to load from arbitrary
addresses, but synthesising a vector with those addresses ma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
--- Comment #3 from Andrew Stubbs ---
Supposedly, the non-openmp equivalent test is gcc.target/gcn/simd-math-1.c, but
that seems to be passing still.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544
--- Comment #15 from Andrew Stubbs ---
BTW, if you're calling "new" in the offload kernel then you're probably "doing
it wrong", even when we do implement full C++ support. Offload kernels are for
hot code, executed many times, and memory alloc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544
--- Comment #14 from Andrew Stubbs ---
"printf" exists and has been working on both AMDGCN and NVPTX devices since
forever. "fputs", "puts", and "write", etc. should all work too.
If the FORTIFY_SOURCE trick doesn't get rid of __printf_chk, or
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117709
--- Comment #6 from Andrew Stubbs ---
Yes, that fixes the issue, thanks.
The only diff in the assembly now, compared to before the "else" patch, is the
zero-initialization is gone. This is good; the mysterious extra code seemed
like a step back
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117709
--- Comment #4 from Andrew Stubbs ---
The mask is a 64-bit integer value in the "exec" register.
I agree that I cannot see the problem staring at it. Like I said, I changed the
backend so that it generated the zero-initializers anyway, and the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657
Andrew Stubbs changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ams at gcc dot gnu.org
CC: rdapp at gcc dot gnu.org
Target Milestone: ---
The following testcase aborts on amdgcn since the maskload else patches were
added (https
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657
--- Comment #9 from Andrew Stubbs ---
That commit should fix the build failure.
However, I'm now seeing a wrong-code regression in gcc.dg/vect/vect-simd-17.c
that I can't prove isn't related. The testcase now aborts, at least on gfx90a,
where i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657
--- Comment #6 from Andrew Stubbs ---
The patch changed the wrong operand on the gen_gather_insn_1offset_exec
call. It sets one of the offsets undefined instead of setting the else value
undefined.
I'm testing a fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657
--- Comment #1 from Andrew Stubbs ---
This appears to have been caused by the recent maskload patches, which is weird
because I thought I already tested the patches that were posted.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116955
--- Comment #2 from Andrew Stubbs ---
Compared to gfx908, gfx1100 lacks 64-lane permute and vector reductions.
Permute works with 32 lanes or fewer, but reductions are unimplemented in the
backend. Otherwise it should vectorize the same.
That m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116571
--- Comment #6 from Andrew Stubbs ---
(In reply to Richard Biener from comment #5)
> (In reply to Thomas Schwinge from comment #4)
> > The GCN target FAILs that I originally had reported here:
> >
> > > [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104
Andrew Stubbs changed:
What|Removed |Added
Resolution|FIXED |---
Status|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103
--- Comment #8 from Andrew Stubbs ---
(In reply to Thomas Schwinge from comment #4)
> (In reply to Richard Biener from comment #2)
> > if (VECTOR_BOOLEAN_TYPE_P (type)
> > && SCALAR_INT_MODE_P (TYPE_MODE (type)))
> > return true;
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104
--- Comment #4 from Andrew Stubbs ---
The problem insn is this:
(insn 31 30 32 2 (set (reg:V2SI 711)
(ashift:V2SI (reg:V2SI 161 v1)
(const_vector:V2SI [
(const_int 3 [0x3]) repeated x2
]))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104
--- Comment #3 from Andrew Stubbs ---
(In reply to Jeffrey A. Law from comment #1)
> So, how am I supposed to reproduce this? I don't have an assembler/binutils
> for amdgcn and thus libgcc won't configure. Thus I can't extract a testcase.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
--- Comment #18 from Andrew Stubbs ---
That should fix the broken validation check. All V32 permutations should work
now on RDNA GPUs, I think. V16 and smaller were already working fine.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
--- Comment #16 from Andrew Stubbs ---
On 26/06/2024 14:41, rguenther at suse dot de wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
>
> --- Comment #15 from rguenther at suse dot de ---
>>> Btw, the above looks quite odd for nelt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
--- Comment #14 from Andrew Stubbs ---
On 26/06/2024 13:34, rguenth at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
>
> --- Comment #13 from Richard Biener ---
> (In reply to Richard Biener from comment #12)
>>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
--- Comment #10 from Andrew Stubbs ---
On 26/06/2024 12:05, rguenth at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
>
> --- Comment #8 from Richard Biener ---
> (In reply to Richard Biener from comment #7)
>> I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640
--- Comment #3 from Andrew Stubbs ---
(In reply to Richard Biener from comment #2)
> If you force GCN to use fixed length vectors (how?), does it work? How's
> it behaving on aarch64 with SVE? (the CI was happy, but maybe doesn't
> enable SVE)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115631
--- Comment #1 from Andrew Stubbs ---
It was writing 0 to s12 (scalar register) and then moving the zero to lane zero
of v0 (vector register).
Now it's writing the 0 directly to v0, of which all but lane zero is masked.
These should be identic
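The two sequences described above can be modelled in plain C to show why they
should leave the vector register in the same state. This is only a sketch:
the arrays stand in for the 64-lane register v0, and the function names
(via_scalar, via_masked_write, sequences_agree) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 64

/* Old sequence: put the value in a scalar, then move it to lane zero.  */
static void
via_scalar (int32_t v0[LANES])
{
  int32_t s12 = 0;              /* models the scalar register */
  v0[0] = s12;
}

/* New sequence: write to the whole vector under a mask that enables
   lane zero only.  */
static void
via_masked_write (int32_t v0[LANES])
{
  const uint64_t exec = 1;      /* bit 0 set: only lane zero is live */
  for (unsigned lane = 0; lane < LANES; lane++)
    if ((exec >> lane) & 1)
      v0[lane] = 0;
}

/* Returns 1 if both sequences leave the vector in the same state.  */
static int
sequences_agree (const int32_t init[LANES])
{
  int32_t a[LANES], b[LANES];
  assert (init != NULL);
  memcpy (a, init, sizeof a);
  memcpy (b, init, sizeof b);
  via_scalar (a);
  via_masked_write (b);
  return memcmp (a, b, sizeof a) == 0;
}
```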
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304
--- Comment #11 from Andrew Stubbs ---
(In reply to rguent...@suse.de from comment #10)
> On Mon, 3 Jun 2024, ams at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304
> >
> > --- Comme
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304
--- Comment #9 from Andrew Stubbs ---
(In reply to Richard Biener from comment #6)
> The best strategy for GCN would be to gather V4QImode aka SImode into the
> V64QImode (or V16SImode) vector. For pix2 we have a gap of 28 elements,
> doing co
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717
--- Comment #3 from Andrew Stubbs ---
Can this be filtered (safely) in mkoffload? That tool is
offload-target-specific, so no problem with "if offload target were to support
it".
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302
--- Comment #4 from Andrew Stubbs ---
Yes, that's what the simd-math-3* tests do.
The simd-math-5* tests are explicitly supposed to be doing this in the context
of the autovectorizer.
If these tests are being compiled as (newly) intended then
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302
--- Comment #2 from Andrew Stubbs ---
The execution test checks that each of the libgcc routines works correctly, and
the scan assembler tests make sure that we're getting coverage of all of them.
In this case, the failure indicates that we're n
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085
--- Comment #8 from Andrew Stubbs ---
(In reply to seurer from comment #7)
> On the BE machine:
>
> seurer@nilram:~/gcc/git/build/gcc-test$ ulimit -a
> real-time non-blocking time (microseconds, -R) unlimited
> ...
> max locked memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085
--- Comment #6 from Andrew Stubbs ---
(In reply to seurer from comment #5)
> I should note that pinned-2 also fails on powerpc64 LE.
>
> make -k check-target-libgomp RUNTESTFLAGS="c.exp=libgomp.c/alloc-pinned-*"
> FAIL: libgomp.c/alloc-pinned-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615
--- Comment #3 from Andrew Stubbs ---
I did see these, but I hadn't had time to chase them up.
The proposed patch is exactly the sort of solution I was expecting to find,
short term. Have you confirmed that it fixes all the cases?
A proper sol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113199
--- Comment #5 from Andrew Stubbs ---
I can confirm that I can now build the amdgcn toolchain once more. :-)
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163
Andrew Stubbs changed:
What|Removed |Added
CC||ams at gcc dot gnu.org
--- Comment #11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085
--- Comment #4 from Andrew Stubbs ---
It's going to be difficult to make this test work when only one page of locked
memory is available. :-(
I will look at making it "unsupported".
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085
--- Comment #1 from Andrew Stubbs ---
That is a typo.
I don't want to make it pass on machines that have insufficient memory
configured because it will mask the case where it fails for another reason.
However, the testcase was originally suppo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113022
--- Comment #1 from Andrew Stubbs ---
This is what I get for trying to get this done before vacation. :(
Yes, there's probably something in mkoffload that has to match the default
change from -mxnack=any to -mxnack=off on the older ISAs.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112937
--- Comment #2 from Andrew Stubbs ---
Flat addressing *should* be the safe option that always works (although using
"global" address space permits slightly more efficient offset options).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481
Andrew Stubbs changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481
--- Comment #7 from Andrew Stubbs ---
Simply changing to OPTAB_WIDEN solves the ICE, but I don't know if it does so
in a sensible way for RISC-V.
@@ -7489,7 +7489,7 @@ store_constructor (tree exp, rtx target, int cleared,
poly_int64 size,
||2023-11-13
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
--- Comment #4 from Andrew Stubbs ---
It fails because optab_handler fails to find an instruction for "and_optab" in
SImode.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112308
Andrew Stubbs changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
|RESOLVED
Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
--- Comment #2 from Andrew Stubbs ---
This is now fixed.
||2023-11-09
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112088
Andrew Stubbs changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
|1
Last reconfirmed||2023-10-27
Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
--- Comment #1 from Andrew Stubbs ---
I'm testing a fix for this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313
--- Comment #5 from Andrew Stubbs ---
One thing that is unusual about the GCN stack pointer is that it's actually two
registers. Could this be breaking some cprop assumptions?
GCN can't fit an address in one (SImode) register so all (DImode) po
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313
--- Comment #3 from Andrew Stubbs ---
It's curious that this affects the Fiji target only, and not the newer targets
at all.
There are some additional register options for multiply instructions, some
differences to atomics, but mostly the diffe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313
--- Comment #1 from Andrew Stubbs ---
This ICE also affects the following standalone test failures (raw amdgcn, no
offloading):
gfortran.dg/assumed_rank_21.f90
gfortran.dg/finalize_38.f90
gfortran.dg/finalize_38a.f90
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108898
--- Comment #4 from Andrew Stubbs ---
I did not know there was a way to do that! I'll add this to my to-do list.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108898
--- Comment #1 from Andrew Stubbs ---
I tested it on i686-pc-linux-gnu before I posted the patch, and it was working
then. Can you be more specific what configuration you were testing, please?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510
Andrew Stubbs changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863
Bug 89863 depends on bug 107510, which changed state.
Bug 107510 Summary: gcc/config/gcn/gcn.cc:4930:9: style: Same expression on
both sides of '||'. [duplicateExpression]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510
What|R
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510
Andrew Stubbs changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107096
--- Comment #4 from Andrew Stubbs ---
I don't understand rgroups, but I can say that GCN masks are very simply
one-bit-one-lane. There are always 64 lanes, regardless of the type, so V64QI
mode has fewer bytes and bits than V64DImode (when writt
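The one-bit-one-lane layout can be illustrated with a small scalar emulation.
This is a sketch under assumptions: the helper names (lane_active, masked_add)
are made up for illustration, and the loop only models what the hardware does
in a single masked instruction.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 64   /* always 64 lanes, whatever the element type */

/* Nonzero if LANE is enabled in the 64-bit mask: bit i controls lane i.  */
static inline int
lane_active (uint64_t mask, unsigned lane)
{
  assert (lane < LANES);
  return (int) ((mask >> lane) & 1u);
}

/* Emulated masked add: only the enabled lanes of DST are written;
   disabled lanes keep their previous contents.  */
static void
masked_add (int32_t *dst, const int32_t *a, const int32_t *b, uint64_t mask)
{
  for (unsigned lane = 0; lane < LANES; lane++)
    if (lane_active (mask, lane))
      dst[lane] = a[lane] + b[lane];
}
```

The mask width depends only on the lane count, so the same 64-bit value would
serve a vector of bytes or a vector of 64-bit elements.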
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107088
--- Comment #9 from Andrew Stubbs ---
I can confirm that the patch fixes the amdgcn build.
-*-*
CC||ams at gcc dot gnu.org
--- Comment #7 from Andrew Stubbs ---
I get the same failure on amdgcn building newlib/libm/math/kf_rem_pio2.c
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ams at gcc dot gnu.org
CC: rguenther at suse dot de
Target Milestone: ---
Target: amdgcn-amdhsa
Commit 8f4d9c1deda "amdgcn: 64-bit not" exposed an ICE in tree-vect_stmts.cc
when
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105873
--- Comment #4 from Andrew Stubbs ---
I think unused threads should be given a no-op function to run, not a null
pointer. The GCN implementation cannot tell the difference between a null
pointer and an unset pointer (which is what happens when t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105246
--- Comment #2 from Andrew Stubbs ---
When we first coded this we only had the GCN3 ISA manual, which says nothing
about the accuracy.
Now I look in the Vega manual (GCN5) I see:
Square root with perhaps not the accuracy you were hoping for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181
--- Comment #13 from Andrew Stubbs ---
I've updated the LLVM version documentation at
https://gcc.gnu.org/wiki/Offloading#For_AMD_GCN:
It's LLVM 9 or 13.0.1 now (nothing in between), and will be 13.0.1+ for the
next release (dropping LLVM 9 bec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104026
Andrew Stubbs changed:
What|Removed |Added
CC||ams at gcc dot gnu.org
--- Comment #6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103396
Andrew Stubbs changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #4 from Andrew Stubbs ---
I think I have a fix for this. It happens when the link register has to be
saved because it is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103201
--- Comment #3 from Andrew Stubbs ---
I did some preliminary testing on your patch: the libgomp.c/target-teams-1.c
testcase runs fine on amdgcn. I presume that that covers most of the existing
features of those runtime calls?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544
--- Comment #8 from Andrew Stubbs ---
Did you get the C version to return anything other than "-1"? (The expected
result is "2".)
I'm still trying to determine if the device is compatible, but the mapping
problem looks like a different issue.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544
--- Comment #5 from Andrew Stubbs ---
Sorry, I should have said to compile with -fopenacc.
If you did do that, please post the GCN_DEBUG output.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544
--- Comment #3 from Andrew Stubbs ---
That output shows that we have the correct libgomp and rocm is installed and
working. Libgomp initialized the GCN plugin, but did not attempt to initialize
the device (the next message in the output should h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544
--- Comment #1 from Andrew Stubbs ---
Please set "export GCN_DEBUG=1", try it again, and post the output.
||2021-09-09
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
--- Comment #1 from Andrew Stubbs ---
In addition to changing the amdgcn_target syntax in LLVM 13, the LLVM GCN guys
have also renamed the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544
--- Comment #5 from Andrew Stubbs ---
[Note: all of my comments refer to the amdgcn case. nvptx has somewhat
different support in this area.]
(In reply to Jonathan Wakely from comment #4)
> But it's a waste of space in the .so to build lots of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100208
Andrew Stubbs changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544
--- Comment #3 from Andrew Stubbs ---
The standalone amdgcn configuration does not support C++. There are a number of
technical reasons why it doesn't Just Work, but basically it comes down to
no-one ever working on it. Our customers were primar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101484
Andrew Stubbs changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827
Andrew Stubbs changed:
What|Removed |Added
CC||xw111luoye at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95023
Andrew Stubbs changed:
What|Removed |Added
CC||ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418
Andrew Stubbs changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418
--- Comment #13 from Andrew Stubbs ---
I found a lot more ICEs when testing my patch. They look to be unrelated
(TImode coming back to haunt us), but it makes it hard to be sure.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418
--- Comment #9 from Andrew Stubbs ---
I found a couple of other places to put force_operand and the full case works
now.
Running more tests
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418
Andrew Stubbs changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418
--- Comment #4 from Andrew Stubbs ---
Alexandre's patch has this:
emit_move_insn (rem, plus_constant (ptr_mode, rem, -blksize));
Is that generally a valid thing to do? It seems like other places do similar
things...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100208
--- Comment #1 from Andrew Stubbs ---
LLVM changed the default parameters, so we either have to change the
expectations in the ".amdgcn_target" string (which is basically an assert), or
set the attributes we want explicitly on the assembler comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521
--- Comment #22 from Andrew Stubbs ---
(In reply to Andrew Stubbs from comment #21)
> (In reply to Richard Biener from comment #19)
> > GCN also uses MODE_INT for the mask mode and thus may be similarly affected.
> > Andrew - are the bits in the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521
--- Comment #21 from Andrew Stubbs ---
(In reply to Richard Biener from comment #19)
> GCN also uses MODE_INT for the mask mode and thus may be similarly affected.
> Andrew - are the bits in the mask dense? Thus for a V4SImode compare
> would th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84958
--- Comment #6 from Andrew Stubbs ---
(In reply to Tom de Vries from comment #5)
> I've removed the xfail for nvptx.
>
> The only remaining xfail is for gcn. Is that one still necessary?
The test still fails for gcn.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97332
Andrew Stubbs changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306
--- Comment #8 from Andrew Stubbs ---
I'm loath to enable TImode if it's going to ICE all over the place, and I can't
just drop everything else and implement working TImode unless there's an easy
solution. It's always been on the nice-to-have lis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95730
--- Comment #4 from Andrew Stubbs ---
In fact default_scalar_mode_supported_p does return *false* for TImode (because
LONG_LONG_TYPE_SIZE == 64, and BITS_PER_WORD == 32).
Therefore int128_t does not exist, as far as users are concerned. I'm not
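The arithmetic behind that result can be sketched as follows. This is a
simplified model of the integer-mode check, not the GCC source: only
LONG_LONG_TYPE_SIZE and BITS_PER_WORD are the values quoted above, the other
type sizes are assumed for illustration, and int_mode_supported is a
hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Sizes in bits.  LONG_LONG_TYPE_SIZE and BITS_PER_WORD come from the
   comment above; the remaining values are assumptions.  */
enum
{
  CHAR_TYPE_SIZE = 8,
  SHORT_TYPE_SIZE = 16,
  INT_TYPE_SIZE = 32,
  LONG_TYPE_SIZE = 64,
  LONG_LONG_TYPE_SIZE = 64,
  BITS_PER_WORD = 32
};

/* Simplified model: an integer mode is supported if its precision matches
   one of the C integer types or is exactly two words wide.  */
static bool
int_mode_supported (int precision)
{
  assert (precision > 0);
  return precision == CHAR_TYPE_SIZE
         || precision == SHORT_TYPE_SIZE
         || precision == INT_TYPE_SIZE
         || precision == LONG_TYPE_SIZE
         || precision == LONG_LONG_TYPE_SIZE
         || precision == 2 * BITS_PER_WORD;
}
```

With these values a 64-bit mode passes (it matches both the 64-bit types and
2 * BITS_PER_WORD), while a 128-bit mode such as TImode matches nothing and is
rejected.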
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306
--- Comment #5 from Andrew Stubbs ---
GCC will automatically generate libgcc calls for types up to 2*BITS_PER_WORD,
but no further. Since BITS_PER_WORD is 32 on GCN this means no automatic TImode
support for anything that would go that route (suc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306
--- Comment #3 from Andrew Stubbs ---
TImode was added for use by a few instructions that take two 64-bit values in
consecutive registers. It's also useful for the SLP fake vectorization stuff.
It wasn't intended for use with user types; I proba
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95864
--- Comment #1 from Andrew Stubbs ---
I'm aware of these issues.
I fixed all the test failures that were definitely bugs in the HSACOv3
implementation, and the ones that remain appear to be either latent bugs
uncovered by the new driver configur