[Bug target/119474] GCN 'libgomp.oacc-c++/pr96835-1.C' ICE 'during GIMPLE pass: ivopts'

2025-03-31 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474 --- Comment #9 from Andrew Stubbs --- This patch fixes the -O1 failure, for *this* testcase: diff --git a/gcc/tree.cc b/gcc/tree.cc index eccfcc89da40..4bfdb7a938e7 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -7085,11 +7085,8 @@ build_pointer

[Bug target/119369] GCN: weak undefined symbols -> execution test FAIL, 'HSA_STATUS_ERROR_VARIABLE_UNDEFINED'

2025-03-31 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119369 --- Comment #5 from Andrew Stubbs --- A post-linker could be included as part of the mkoffload process (or maybe we could fix up the weak directives in the assembler as part of the pre-assembler step we already have). Either way, there's no mko

[Bug target/119369] GCN: weak undefined symbols -> execution test FAIL, 'HSA_STATUS_ERROR_VARIABLE_UNDEFINED'

2025-03-31 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119369 --- Comment #2 from Andrew Stubbs --- We used to have work-arounds for ROCm runtime linker deficiencies, but these were removed in 2020, as they were no longer necessary when we moved to HSACOv3: https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;

[Bug target/119474] GCN 'libgomp.oacc-c++/pr96835-1.C' ICE 'during GIMPLE pass: ivopts'

2025-03-28 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474 --- Comment #8 from Andrew Stubbs --- This patch fixes the ICE and produces working code at -O2 and -O3: diff --git a/gcc/omp-offload.cc b/gcc/omp-offload.cc index da2b54b76485..1778a70bf755 100644 --- a/gcc/omp-offload.cc +++ b/gcc/omp-offload

[Bug target/119474] GCN 'libgomp.oacc-c++/pr96835-1.C' ICE 'during GIMPLE pass: ivopts'

2025-03-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474 --- Comment #6 from Andrew Stubbs --- The address space has to be introduced "late" because it's done in the accelerator compiler, so post-IPA. The pass is "oaccdevlow" (currently no.103). The address space is selected via the TARGET_GOACC_ADJ

[Bug target/119474] GCN 'libgomp.oacc-c++/pr96835-1.C' ICE 'during GIMPLE pass: ivopts'

2025-03-26 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119474 --- Comment #1 from Andrew Stubbs --- In the -O1 case, the problem seems to be that the "ivopts" pass has identified an item-in-an-array-in-a-struct as the IV, and that struct is in a different address space: Type: REFERENCE ADDRESS

[Bug middle-end/119325] [15 Regression] libgomp.c/simd-math-1.c (gcn offloading): timeout (for fmodf, remainderf) since r15-7257-g54bdeca3c62144

2025-03-19 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325 --- Comment #20 from Andrew Stubbs --- I tried the memcpy solution with the following testcase: v2sf smaller (v64sf in) { v2sf out = RESIZE_VECTO

[Bug middle-end/119325] [15 Regression] libgomp.c/simd-math-1.c (gcn offloading): timeout (for fmodf, remainderf) since r15-7257-g54bdeca3c62144

2025-03-18 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325 --- Comment #17 from Andrew Stubbs --- Oops, that has __to and __from backwards ... you get the idea.

[Bug middle-end/119325] [15 Regression] libgomp.c/simd-math-1.c (gcn offloading): timeout (for fmodf, remainderf) since r15-7257-g54bdeca3c62144

2025-03-18 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325 --- Comment #16 from Andrew Stubbs --- Perhaps: asm ("mov %0, %1" : "=v"(__from), "v"(__to)); or maybe asm ("; no-op cast %0" : "=v"(__from), "0"(__to)); Is there a downside to that in the optimizer(s)?
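
A minimal sketch of the second ("no-op cast") form suggested above, assuming a GCC-compatible compiler. It is not the GCN code under discussion: the hypothetical helper opaque_copy works on a scalar, the generic "r" constraint stands in for the GCN vector constraint "v", and the asm template is left empty. The matching constraint "0" ties the input to the same register as output operand 0, so no instruction should be needed for the copy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the bug report): returns its argument
   unchanged, but passes the value through an empty asm statement so the
   compiler treats the result as produced by the asm.  "=r" is a generic
   register output; "0" forces the input into the same register.  */
uint32_t opaque_copy(uint32_t from)
{
    uint32_t to;
    __asm__ ("" : "=r"(to) : "0"(from));
    return to;
}

/* Self-check: the value must survive the pass-through unchanged.  */
void opaque_copy_selfcheck(void)
{
    assert(opaque_copy(7u) == 7u);
}
```

Whether routing values through asm like this costs anything in the optimizers is the open question the comment asks; the sketch only shows the constraint mechanics.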

[Bug middle-end/119325] [15 Regression] libgomp.c/simd-math-1.c (gcn offloading): timeout (for fmodf, remainderf) since r15-7257-g54bdeca3c62144

2025-03-18 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325 --- Comment #10 from Andrew Stubbs --- The libm vector routines are pretty much just the scalar routine translated into vector extension statements wrapped in preprocessor macros. They should be unaffected by the vectorizer (and most of the opt

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-03-18 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 --- Comment #3 from Andrew Stubbs --- The RDNA consumer devices, such as gfx1100, support permute for V32 and smaller, but not V64. Gather/scatter should be able to load from arbitrary addresses, but synthesising a vector with those addresses ma

[Bug middle-end/119325] [15 Regression] libgomp.c/simd-math-1.c (gcn offloading): timeout (for fmodf, remainderf) since r15-7284-g6b56e645a7b481

2025-03-17 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325 --- Comment #3 from Andrew Stubbs --- Supposedly, the non-openmp equivalent test is gcc.target/gcn/simd-math-1.c, but that seems to be passing still.

[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"

2025-01-16 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #15 from Andrew Stubbs --- BTW, if you're calling "new" in the offload kernel then you're probably "doing it wrong", even when we do implement full C++ support. Offload kernels are for hot code, executed many times, and memory alloc

[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"

2025-01-16 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #14 from Andrew Stubbs --- "printf" exists and has been working on both AMDGCN and NVPTX devices since forever. "fputs", "puts", and "write", etc. should all work too. If the FORTIFY_SOURCE trick doesn't get rid of __printf_chk, or

[Bug target/117709] [15 regression] maskload else case generating wrong code

2024-11-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117709 --- Comment #6 from Andrew Stubbs --- Yes, that fixes the issue, thanks. The only diff in the assembly now, compared to before the "else" patch, is the zero-initialization is gone. This is good; the mysterious extra code seemed like a step back

[Bug target/117709] [15 regression] maskload else case generating wrong code

2024-11-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117709 --- Comment #4 from Andrew Stubbs --- The mask is a 64-bit integer value in the "exec" register. I agree that I cannot see the problem staring at it. Like I said, I changed the backend so that it generated the zero-initializers anyway, and the

[Bug target/117657] [15 Regression][gcn] ICE during in-tree newlib build: error: unrecognizable insn

2024-11-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657 Andrew Stubbs changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/117709] New: [15 regression] maskload else case generating wrong code

2024-11-20 Thread ams at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ams at gcc dot gnu.org CC: rdapp at gcc dot gnu.org Target Milestone: --- The following testcase aborts on amdgcn since the maskload else patches were added (https

[Bug target/117657] [15 Regression][gcn] ICE during in-tree newlib build: error: unrecognizable insn

2024-11-19 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657 --- Comment #9 from Andrew Stubbs --- That commit should fix the build failure. However, I'm now seeing a wrong-code regression in gcc.dg/vect/vect-simd-17.c that I can't prove isn't related. The testcase now aborts, at least on gfx90a, where i

[Bug target/117657] [15 Regression][gcn] ICE during in-tree newlib build: error: unrecognizable insn

2024-11-19 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657 --- Comment #6 from Andrew Stubbs --- The patch changed the wrong operand on the gen_gather_insn_1offset_exec call. It sets one of the offsets undefined instead of setting the else value undefined. I'm testing a fix.

[Bug target/117657] [15 Regression][gcn] ICE during in-tree newlib build: error: unrecognizable insn

2024-11-18 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657 --- Comment #1 from Andrew Stubbs --- This appears to have been caused by the recent maskload patches, which is weird because I thought I already tested the patches that were posted.

[Bug target/116955] [15 Regression] GCN '-march=gfx1100': [-PASS:-]{+FAIL:+} gcc.dg/vect/pr81740-2.c execution test

2024-10-04 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116955 --- Comment #2 from Andrew Stubbs --- Compared to gfx908, gfx1100 lacks 64-lane permute and vector reductions. Permute works with 32 lanes or fewer, but reductions are unimplemented in the backend. Otherwise it should vectorize the same. That m

[Bug target/116571] [15 Regression] GCN vs. "lower SLP load permutation to interleaving"

2024-09-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116571 --- Comment #6 from Andrew Stubbs --- (In reply to Richard Biener from comment #5) > (In reply to Thomas Schwinge from comment #4) > > The GCN target FAILs that I originally had reported here: > > > > > [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-11

[Bug target/116104] [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-30 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104 Andrew Stubbs changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103 --- Comment #8 from Andrew Stubbs --- (In reply to Thomas Schwinge from comment #4) > (In reply to Richard Biener from comment #2) > > if (VECTOR_BOOLEAN_TYPE_P (type) > > && SCALAR_INT_MODE_P (TYPE_MODE (type))) > > return true; >

[Bug target/116104] [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104 --- Comment #4 from Andrew Stubbs --- The problem insn is this: (insn 31 30 32 2 (set (reg:V2SI 711) (ashift:V2SI (reg:V2SI 161 v1) (const_vector:V2SI [ (const_int 3 [0x3]) repeated x2 ]))

[Bug target/116104] [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104 --- Comment #3 from Andrew Stubbs --- (In reply to Jeffrey A. Law from comment #1) > So, how am I supposed to reproduce this? I don't have an assembler/binutils > for amdgcn and thus libgcc won't configure. Thus I can't extract a testcase. >

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-28 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #18 from Andrew Stubbs --- That should fix the broken validation check. All V32 permutations should work now on RDNA GPUs, I think. V16 and smaller were already working fine.

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-26 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #16 from Andrew Stubbs --- On 26/06/2024 14:41, rguenther at suse dot de wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 > > --- Comment #15 from rguenther at suse dot de --- >>> Btw, the above looks quite odd for nelt

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-26 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #14 from Andrew Stubbs --- On 26/06/2024 13:34, rguenth at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 > > --- Comment #13 from Richard Biener --- > (In reply to Richard Biener from comment #12) >>

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-26 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #10 from Andrew Stubbs --- On 26/06/2024 12:05, rguenth at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 > > --- Comment #8 from Richard Biener --- > (In reply to Richard Biener from comment #7) >> I

[Bug target/115640] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-25 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640 --- Comment #3 from Andrew Stubbs --- (In reply to Richard Biener from comment #2) > If you force GCN to use fixed length vectors (how?), does it work? How's > it behaving on aarch64 with SVE? (the CI was happy, but maybe doesn't > enable SVE)

[Bug target/115631] [15 Regression] GCN: [-PASS:-]{+FAIL:+} c-c++-common/torture/builtin-arith-overflow-6.c -O2 execution test

2024-06-25 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115631 --- Comment #1 from Andrew Stubbs --- It was writing 0 to s12 (scalar register) and then moving the zero to lane zero of v0 (vector register). Now it's writing the 0 directly to v0, of which all but lane zero is masked. These should be identic

[Bug tree-optimization/115304] gcc.dg/vect/slp-gap-1.c FAILs

2024-06-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304 --- Comment #11 from Andrew Stubbs --- (In reply to rguent...@suse.de from comment #10) > On Mon, 3 Jun 2024, ams at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304 > > > > --- Comme

[Bug tree-optimization/115304] gcc.dg/vect/slp-gap-1.c FAILs

2024-06-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304 --- Comment #9 from Andrew Stubbs --- (In reply to Richard Biener from comment #6) > The best strategy for GCN would be to gather V4QImode aka SImode into the > V64QImode (or V16SImode) vector. For pix2 we have a gap of 28 elements, > doing co

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717 --- Comment #3 from Andrew Stubbs --- Can this be filtered (safely) in mkoffload? That tool is offload-target-specific, so no problem with "if offload target were to support it".

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302 --- Comment #4 from Andrew Stubbs --- Yes, that's what the simd-math-3* tests do. The simd-math-5* tests are explicitly supposed to be doing this in the context of the autovectorizer. If these tests are being compiled as (newly) intended then

[Bug target/114302] [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302 --- Comment #2 from Andrew Stubbs --- The execution test checks that each of the libgcc routines work correctly, and the scan assembler tests make sure that we're getting coverage of all of them. In this case, the failure indicates that we're n

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2024-02-12 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #8 from Andrew Stubbs --- (In reply to seurer from comment #7) > On the BE machine: > > seurer@nilram:~/gcc/git/build/gcc-test$ ulimit -a > real-time non-blocking time (microseconds, -R) unlimited > ... > max locked memory

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2024-02-08 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #6 from Andrew Stubbs --- (In reply to seurer from comment #5) > I should note that pinned-2 also fails on powerpc64 LE. > > make -k check-target-libgomp RUNTESTFLAGS="c.exp=libgomp.c/alloc-pinned-*" > FAIL: libgomp.c/alloc-pinned-

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-01-29 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615 --- Comment #3 from Andrew Stubbs --- I did see these, but I hadn't had time to chase them up. The proposed patch is exactly the sort of solution I was expecting to find, short term. Have you confirmed that it fixes all the cases? A proper sol

[Bug middle-end/113199] [14 Regression][GCN] ICE (segfault) due to invalid 'loop_mask_46 = VEC_PERM_EXPR' when compiling Newlib's wcsftime.c

2024-01-09 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113199 --- Comment #5 from Andrew Stubbs --- I can confirm that I can now build the amdgcn toolchain once more. :-) Thanks.

[Bug middle-end/113163] [14 Regression][GCN] ICE in vect_peel_nonlinear_iv_init, at tree-vect-loop.cc:9420

2024-01-02 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113163 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org --- Comment #11

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2023-12-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #4 from Andrew Stubbs --- It's going to be difficult to make this test work when only one page of locked memory is available. :-( I will look at making it "unsupported".

[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails

2023-12-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #1 from Andrew Stubbs --- That is a typo. I don't want to make it pass on machines that have insufficient memory configured because it will mask the case where it fails for another reason. However, the testcase was originally suppo

[Bug target/113022] GCN offloading bricked by "amdgcn: Work around XNACK register allocation problem"

2023-12-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113022 --- Comment #1 from Andrew Stubbs --- This is what I get for trying to get this done before vacation. :( Yes, there's probably something in mkoffload that has to match the default change from -mxnack=any to -mxnack=off on the older ISAs.

[Bug target/112937] [14 Regression] GCN: FAILs due to unconditional 'f->use_flat_addressing = true;'

2023-12-11 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112937 --- Comment #2 from Andrew Stubbs --- Flat addressing *should* be the safe option that always works (although using "global" address space permits slightly more efficient offset options).

[Bug target/112481] [14 Regression] RISCV: ICE: Segmentation fault when compiling pr110817-3.c

2023-11-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481 Andrew Stubbs changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/112481] [14 Regression] RISCV: ICE: Segmentation fault when compiling pr110817-3.c

2023-11-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481 --- Comment #7 from Andrew Stubbs --- Simply changing to OPTAB_WIDEN solves the ICE, but I don't know if it does so in a sensible way, for RISC V. @@ -7489,7 +7489,7 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,

[Bug target/112481] [14 Regression] RISCV: ICE: Segmentation fault when compiling pr110817-3.c

2023-11-13 Thread ams at gcc dot gnu.org via Gcc-bugs
||2023-11-13 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #4 from Andrew Stubbs --- It fails because optab_handler fails to find an instruction for "and_optab" in SImode.

[Bug target/112308] [14 Regression] GCN: 'error: literal operands are not supported' for 'v_add_co_u32'

2023-11-10 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112308 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/112313] [14 Regression] GCN target 'gcc.dg/pr111082.c' ICE, 'during RTL pass: vregs': 'error: unrecognizable insn'

2023-11-10 Thread ams at gcc dot gnu.org via Gcc-bugs
|RESOLVED Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #2 from Andrew Stubbs --- This is now fixed.

[Bug target/112308] [14 Regression] GCN: 'error: literal operands are not supported' for 'v_add_co_u32'

2023-11-09 Thread ams at gcc dot gnu.org via Gcc-bugs
||2023-11-09 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org

[Bug target/112088] [14 Regression] GCN target testing broken by "amdgcn: add -march=gfx1030 EXPERIMENTAL"

2023-10-27 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112088 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/112088] [14 Regression] GCN target testing broken by "amdgcn: add -march=gfx1030 EXPERIMENTAL"

2023-10-27 Thread ams at gcc dot gnu.org via Gcc-bugs
|1 Last reconfirmed||2023-10-27 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #1 from Andrew Stubbs --- I'm testing a fix for this.

[Bug target/110313] [14 Regression] GCN Fiji reload ICE in 'process_alt_operands'

2023-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313 --- Comment #5 from Andrew Stubbs --- One thing that is unusual about the GCN stack pointer is that it's actually two registers. Could this be breaking some cprop assumptions? GCN can't fit an address in one (SImode) register so all (DImode) po
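
To illustrate the point that a 64-bit (DImode) address does not fit in one 32-bit (SImode) register, here is a small host-side sketch of splitting an address across a register pair. The names reg_pair, split_address and join_address are hypothetical, and which half lives in which hardware register is an assumption of the sketch, not something stated in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a pointer held in two 32-bit registers.  */
struct reg_pair {
    uint32_t lo;  /* low 32 bits of the address (assumed layout)  */
    uint32_t hi;  /* high 32 bits of the address (assumed layout) */
};

/* Split a 64-bit address into two 32-bit halves.  */
struct reg_pair split_address(uint64_t addr)
{
    struct reg_pair p;
    p.lo = (uint32_t)(addr & 0xffffffffu);
    p.hi = (uint32_t)(addr >> 32);
    return p;
}

/* Reassemble the 64-bit address from the two halves.  */
uint64_t join_address(struct reg_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Splitting and rejoining must give back the original address.  */
void check_roundtrip(uint64_t addr)
{
    assert(join_address(split_address(addr)) == addr);
}
```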

[Bug target/110313] [14 Regression] GCN Fiji reload ICE in 'process_alt_operands'

2023-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313 --- Comment #3 from Andrew Stubbs --- It's curious that this affects the Fiji target only, and not the newer targets at all. There are some additional register options for multiply instructions, some differences to atomics, but mostly the diffe

[Bug target/110313] [14 Regression] GCN Fiji reload ICE in 'process_alt_operands'

2023-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110313 --- Comment #1 from Andrew Stubbs --- This ICE also affects the following standalone test failures (raw amdgcn, no offloading): gfortran.dg/assumed_rank_21.f90 gfortran.dg/finalize_38.f90 gfortran.dg/finalize_38a.f90

[Bug testsuite/108898] [13 Regression] Test introduced by r13-6278-g3da77f217c8b2089ecba3eb201e727c3fcdcd19d failed on i386

2023-03-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108898 --- Comment #4 from Andrew Stubbs --- I did not know there was a way to do that! I'll add this to my to-do list.

[Bug testsuite/108898] [13 Regression] Test introduced by r13-6278-g3da77f217c8b2089ecba3eb201e727c3fcdcd19d failed on i386

2023-02-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108898 --- Comment #1 from Andrew Stubbs --- I tested it on i686-pc-linux-gnu before I posted the patch, and it was working then. Can you be more specific what configuration you were testing, please?

[Bug target/107510] gcc/config/gcn/gcn.cc:4930:9: style: Same expression on both sides of '||'. [duplicateExpression]

2022-11-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug other/89863] [meta-bug] Issues in gcc that other static analyzers (cppcheck, clang-static-analyzer, PVS-studio) find that gcc misses

2022-11-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863 Bug 89863 depends on bug 107510, which changed state. Bug 107510 Summary: gcc/config/gcn/gcn.cc:4930:9: style: Same expression on both sides of '||'. [duplicateExpression] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510 What|R

[Bug target/107510] gcc/config/gcn/gcn.cc:4930:9: style: Same expression on both sides of '||'. [duplicateExpression]

2022-11-03 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107510 Andrew Stubbs changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org

[Bug tree-optimization/107096] Fully masking vectorization with AVX512 ICEs gcc.dg/vect/vect-over-widen-*.c

2022-10-10 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107096 --- Comment #4 from Andrew Stubbs --- I don't understand rgroups, but I can say that GCN masks are very simply one-bit-one-lane. There are always 64-lanes, regardless of the type, so V64QI mode has fewer bytes and bits than V64DImode (when writt
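
The "one bit, one lane" description can be modelled on a host with a 64-bit integer mask. This is only a scalar sketch under that assumption; the function names are hypothetical and nothing here is taken from the GCN backend. It shows that the mask has the same width for byte-sized and 64-bit elements, because it counts lanes rather than bytes.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 64  /* lane count given in the comment above */

/* Return 1 if the given lane is enabled in a one-bit-per-lane mask.  */
int lane_active(uint64_t mask, unsigned lane)
{
    assert(lane < LANES);
    return (int)((mask >> lane) & 1u);
}

/* Hypothetical scalar model of a masked store over 64 byte-sized lanes:
   only lanes whose mask bit is set are written.  */
void masked_store_u8(uint8_t dst[LANES], const uint8_t src[LANES],
                     uint64_t mask)
{
    for (unsigned lane = 0; lane < LANES; lane++)
        if (lane_active(mask, lane))
            dst[lane] = src[lane];
}

/* The same model for 64-bit elements: the vector is eight times larger
   in bytes, but the mask is still one bit per lane.  */
void masked_store_u64(uint64_t dst[LANES], const uint64_t src[LANES],
                      uint64_t mask)
{
    for (unsigned lane = 0; lane < LANES; lane++)
        if (lane_active(mask, lane))
            dst[lane] = src[lane];
}
```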

[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64

2022-09-30 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107088 --- Comment #9 from Andrew Stubbs --- I can confirm that the patch fixes the amdgcn build.

[Bug middle-end/107088] [13 Regression] cselib ICE building __trunctfxf2 on ia64

2022-09-30 Thread ams at gcc dot gnu.org via Gcc-bugs
-*-* CC||ams at gcc dot gnu.org --- Comment #7 from Andrew Stubbs --- I get the same failure on amdgcn building newlib/libm/math/kf_rem_pio2.c

[Bug tree-optimization/106476] New: ICE generating FOLD_EXTRACT_LAST

2022-07-29 Thread ams at gcc dot gnu.org via Gcc-bugs
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ams at gcc dot gnu.org CC: rguenther at suse dot de Target Milestone: --- Target: amdgcn-amdhsa Commit 8f4d9c1deda "amdgcn: 64-bit not" exposed an ICE in tree-vect_stmts.cc when

[Bug target/105873] [amdgcn][OpenMP] task reductions fail with "team master not responding; slave thread aborting"

2022-06-08 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105873 --- Comment #4 from Andrew Stubbs --- I think unused threads should be given a no-op function to run, not a null pointer. The GCN implementation cannot tell the difference between a null pointer and an unset pointer (which is what happens when t
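
The suggestion above (hand unused threads a no-op function rather than a null pointer) can be sketched as follows. This is an illustration only: struct worker and the worker_* functions are hypothetical and do not reflect libgomp's actual data structures. The sketch assumes that a null function pointer is the only way a thread can recognise "nothing assigned yet", so parking a thread must use some other value.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*thread_fn)(void *);

/* Hypothetical per-thread slot; fn == NULL means "not assigned yet".  */
struct worker {
    thread_fn fn;
    void *data;
};

/* Function given to threads that have no real work to do.  */
static void noop_fn(void *data)
{
    (void)data;
}

/* A slot is ready to run once any function, real or no-op, is assigned.  */
int worker_ready(const struct worker *w)
{
    return w->fn != NULL;
}

/* Park an unused thread: assign the no-op instead of leaving fn NULL,
   so the slot cannot be mistaken for one that is still unset.  */
void worker_park(struct worker *w)
{
    assert(w != NULL);
    w->fn = noop_fn;
    w->data = NULL;
}

/* Run the assigned function; returns 0 if nothing has been assigned.  */
int worker_run(struct worker *w)
{
    if (!worker_ready(w))
        return 0;
    w->fn(w->data);
    return 1;
}
```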

[Bug target/105246] [amdgcn] Use library call for SQRT with -ffast-math + provide additional option to use single-precsion opcode

2022-04-13 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105246 --- Comment #2 from Andrew Stubbs --- When we first coded this we only had the GCN3 ISA manual, which says nothing about the accuracy. Now I look in the Vega manual (GCN5) I see: Square root with perhaps not the accuracy you were hoping for

[Bug target/100181] hot-cold partitioned code doesn't assemble

2022-02-11 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 --- Comment #13 from Andrew Stubbs --- I've updated the LLVM version documentation at https://gcc.gnu.org/wiki/Offloading#For_AMD_GCN: It's LLVM 9 or 13.0.1 now (nothing in between), and will be 13.0.1+ for the next release (dropping LLVM 9 bec

[Bug middle-end/104026] [12 Regression] ICE in wide_int_to_tree_1, at tree.c:1755 via tree-vect-loop-manip.c:673

2022-01-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104026 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org --- Comment #6

[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821

2021-11-25 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103396 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821

2021-11-24 Thread ams at gcc dot gnu.org via Gcc-bugs
|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #4 from Andrew Stubbs --- I think I have a fix for this. It happens when the link register has to be saved because it is used

[Bug target/103201] [12 Regression] trunk 20211111 ftbfs for amdgcn – libgomp/teams.c:49:6: error: 'struct gomp_thread' has no member named 'num_teams'

2021-11-12 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103201 --- Comment #3 from Andrew Stubbs --- I did some preliminary testing on your patch: the libgomp.c/target-teams-1.c testcase runs fine on amdgcn. I presume that that covers most of the existing features of those runtime calls?

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-10-04 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #8 from Andrew Stubbs --- Did you get the C version to return anything other than "-1"? (The expected result is "2".) I'm still trying to determine if the device is compatible, but the mapping problem looks like a different issue.

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-10-01 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #5 from Andrew Stubbs --- Sorry, I should have said to compile with -fopenacc. If you did do that, please post the GCN_DEBUG output.

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-10-01 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #3 from Andrew Stubbs --- That output shows that we have the correct libgomp and rocm is installed and working. Libgomp initialized the GCN plugin, but did not attempt to initialize the device (the next message in the output should h

[Bug target/102544] GCN offloading not working for 'amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-'

2021-09-30 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102544 --- Comment #1 from Andrew Stubbs --- Please set "export GCN_DEBUG=1", try it again, and post the output.

[Bug target/102260] amdgcn offload compiler fails to configure, not matching target directive's target id

2021-09-09 Thread ams at gcc dot gnu.org via Gcc-bugs
||2021-09-09 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org --- Comment #1 from Andrew Stubbs --- In addition to changing the amdgcn_target syntax in LLVM 13, the LLVM GCN guys have also renamed the

[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"

2021-07-21 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #5 from Andrew Stubbs --- [Note: all of my comments refer to the amdgcn case. nvptx has somewhat different support in this area.] (In reply to Jonathan Wakely from comment #4) > But it's a waste of space in the .so to build lots of

[Bug target/100208] amdgcn fails to build with llvm-mc from llvm12

2021-07-21 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100208 Andrew Stubbs changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/101544] [OpenMP][AMDGCN][nvptx] C++ offloading: unresolved _Znwm = "operator new(unsigned long)"

2021-07-21 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101544 --- Comment #3 from Andrew Stubbs --- The standalone amdgcn configuration does not support C++. There are a number of technical reasons why it doesn't Just Work, but basically it comes down to no-one ever working on it. Our customers were primar

[Bug target/101484] [12 Regression] trunk 20210717 ftbfs for amdgcn-amdhsa (gcn offload)

2021-07-17 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101484 Andrew Stubbs changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug target/97827] bootstrap error building the amdgcn-amdhsa offload compiler with LLVM 11

2021-07-02 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827 Andrew Stubbs changed: What|Removed |Added CC||xw111luoye at gmail dot com --- Comment

[Bug target/95023] Offloading AMD GCN wiki cannot be followed

2021-07-02 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95023 Andrew Stubbs changed: What|Removed |Added CC||ams at gcc dot gnu.org

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-14 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-06 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 --- Comment #13 from Andrew Stubbs --- I found a lot more ICEs when testing my patch. They look to be unrelated (TImode coming back to haunt us), but it makes it hard to be sure.

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-05 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 --- Comment #9 from Andrew Stubbs --- I found a couple of other places to put force_operand and the full case works now. Running more tests

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-05 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 Andrew Stubbs changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed|

[Bug target/100418] [12 Regression][gcn] since r12-397 bootstrap fails: error: unrecognizable insn: in extract_insn, at recog.c:2770

2021-05-05 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100418 --- Comment #4 from Andrew Stubbs --- Alexandre's patch has this: emit_move_insn (rem, plus_constant (ptr_mode, rem, -blksize)); Is that generally a valid thing to do? It seems like other places do similar things...

[Bug target/100208] amdgcn fails to build with llvm-mc from llvm12

2021-04-22 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100208 --- Comment #1 from Andrew Stubbs --- LLVM changed the default parameters, so we either have to change the expectations in the ".amdgcn_target" string (which is basically an assert), or set the attributes we want explicitly on the assembler comm

[Bug target/97521] [11 Regression] wrong code with -mno-sse2 since r11-3394

2020-10-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521 --- Comment #22 from Andrew Stubbs --- (In reply to Andrew Stubbs from comment #21) > (In reply to Richard Biener from comment #19) > > GCN also uses MODE_INT for the mask mode and thus may be similarly affected. > > Andrew - are the bits in the

[Bug target/97521] [11 Regression] wrong code with -mno-sse2 since r11-3394

2020-10-23 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521 --- Comment #21 from Andrew Stubbs --- (In reply to Richard Biener from comment #19) > GCN also uses MODE_INT for the mask mode and thus may be similarly affected. > Andrew - are the bits in the mask dense? Thus for a V4SImode compare > would th

[Bug tree-optimization/84958] int loads not eliminated against larger stores

2020-10-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84958 --- Comment #6 from Andrew Stubbs --- (In reply to Tom de Vries from comment #5) > I've removed the xfail for nvptx. > > The only remaining xfail is for gcn. Is that one still necessary? The test still fails for gcn.

[Bug libgomp/97332] [gcn] GCN_NUM_GANGS/GCN_NUM_WORKERS override compile-time constants

2020-10-08 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97332 Andrew Stubbs changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed|

[Bug target/96306] gcn libgomp build broken after "libomp: Add omp_depend_kind to omp_lib.{f90,h}"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306 --- Comment #8 from Andrew Stubbs --- I'm loath to enable TImode if it's going to ICE all over the place, and I can't just drop everything else and implement working TImode unless there's an easy solution. It's always been on the nice-to-have lis

[Bug target/95730] GCN offloading ICEs after commit fe7ebef7fe4f9acb79658ed9db0749b07efc3105 "Add support for __builtin_bswap128"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95730 --- Comment #4 from Andrew Stubbs --- In fact default_scalar_mode_supported_p does return *false* for TImode (because LONG_LONG_TYPE_SIZE == 64, and BITS_PER_WORD == 32). Therefore int128_t does not exist, as far as users are concerned. I'm not

[Bug target/96306] gcn libgomp build broken after "libomp: Add omp_depend_kind to omp_lib.{f90,h}"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306 --- Comment #5 from Andrew Stubbs --- GCC will automatically generate libgcc calls for types up to 2*BITS_PER_WORD, but no further. Since BITS_PER_WORD is 32 on GCN this means no automatic TImode support for anything that would go that route (suc

[Bug target/96306] gcn libgomp build broken after "libomp: Add omp_depend_kind to omp_lib.{f90,h}"

2020-07-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96306 --- Comment #3 from Andrew Stubbs --- TImode was added for use by a few instructions that take two 64-bit values in consecutive registers. It's also useful for the SLP fake vectorization stuff. It wasn't intended for use with user types; I proba

[Bug target/95864] [11 Regression] GCN offloading execution regressions after commit f062c3f11505b70c5275e5bc0e52f3e441f8afbc "amdgcn: Switch to HSACO v3 binary format"

2020-06-24 Thread ams at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95864 --- Comment #1 from Andrew Stubbs --- I'm aware of these issues. I fixed all the test failures that were definitely bugs in the HSACOv3 implementation, and the ones that remain appear to be either latent bugs uncovered by the new driver configur
