[committed, amdgcn] Fix conditional add LRA failure

2020-01-31 Thread Andrew Stubbs
d LRA failure Fix ICE in testcase gfortran.dg/assumed_rank_bounds_3.f90. 2020-01-31 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (addv64di3_exec): Allow one '0' in each alternative only. diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md index 4aad835b2ef..ecdd60b8

[committed, amdgcn] Remove gfx801 "carrizo" support

2020-02-03 Thread Andrew Stubbs
back if and when somebody volunteers to fix and maintain it. Andrew Remove gfx801 "carrizo" support 2020-02-03 Andrew Stubbs gcc/ * config.gcc: Remove "carrizo" support. * config/gcn/gcn-opts.h (processor_type): Likewise. * config/gcn/gcn.c (gcn_omp_device_kind_arch_is

Re: [PATCH, v3] wwwdocs: e-mail subject lines for contributions

2020-02-04 Thread Andrew Stubbs
On 03/02/2020 18:09, Michael Matz wrote: But suggesting that using the subject line for tagging is recommended can lead to subjects like [PATCH][GCC][Foo][component] Fix foo component bootstrap failure in an e-mail directed to gcc-patches@gcc.gnu.org (from somewhen last year, where Foo/foo wa

[committed] amdgcn: Remove redundant multilib

2020-02-05 Thread Andrew Stubbs
This patch removes a redundant "gfx900/gfx906" multilib that was added by accident. We need those options independently, but not together. Andrew amdgcn: Remove redundant multilib 2020-02-05 Andrew Stubbs gcc/ * config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Use / not space. diff -

Re: Host and offload targets have no common meaning of address spaces (was: [ping] Re-unify 'omp_build_component_ref' and 'oacc_build_component_ref')

2021-09-03 Thread Andrew Stubbs
On 24/08/2021 12:43, Richard Biener via Gcc-patches wrote: On Tue, Aug 24, 2021 at 12:23 PM Thomas Schwinge wrote: Hi! On 2021-08-19T22:13:56+0200, I wrote: On 2021-08-16T10:21:04+0200, Jakub Jelinek wrote: On Mon, Aug 16, 2021 at 10:08:42AM +0200, Thomas Schwinge wrote: |> Concerning the

Re: [PATCH][AMDGCN] Skip test gcc/testsuite/gcc.dg/asm-4.c

2019-12-05 Thread Andrew Stubbs
On 05/12/2019 07:05, Harwath, Frederik wrote: Hi, the inline assembly "p" modifier ("An operand that is a valid memory address is allowed", cf. https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints) is not supported on AMD GCN. This causes an ICE during the compilation o

Re: [RFC] Characters per line: from punch card (80) to line printer (132)

2019-12-05 Thread Andrew Stubbs
On 05/12/2019 16:17, Joseph Myers wrote: Longer lines mean less space for multiple terminal / editor windows side-by-side to look at different pieces of code. I don't think that's an improvement. Here's a data-point. My 1920 pixel-wide screen, in the default font, allows 239 columns; not

Re: [RFC] Characters per line: from punch card (80) to line printer (132)

2019-12-06 Thread Andrew Stubbs
On 05/12/2019 18:21, Robin Curtis wrote: My IBM Selectric golfball electronic printer only does 90 characters on A4 in portrait mode………(at 10 cps) (as for my all electric TELEX Teleprinter machine !) Is this debate for real ?! - or is this a Christmas spoof ? I can't speak for the debate, b

[committed, amdgcn] Enable QI/HImode vector moves

2019-12-06 Thread Andrew Stubbs
w passes in the vect.exp (there's also 41 new fails, but those are exposed bugs I'll fix shortly). Some of these were internal compiler errors that did not exist in older compilers. -- Andrew Stubbs Mentor Graphics / CodeSourcery Enable QI/HImode vector moves 2019-12-06 Andrew Stubbs

[committed, amdgcn] Fix unrecognised instruction

2019-12-06 Thread Andrew Stubbs
which didn't assemble well. E.g. it had 'flat_load_short', instead of 'flat_load_ustore'. This fixes about 39 tests in vect.exp. -- Andrew Stubbs Mentor Graphics / CodeSourcery Fix unrecognised GCN instruction. 2019-12-06 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md

Re: [committed, amdgcn] Enable QI/HImode vector moves

2019-12-09 Thread Andrew Stubbs
On 06/12/2019 18:21, Richard Sandiford wrote: Andrew Stubbs writes: Hi all, This patch re-enables the V64QImode and V64HImode for GCN. GCC does not make these easy to work with because there is (was?) an assumption that vector registers do not have excess bits in vector registers, and

Re: [committed, amdgcn] Enable QI/HImode vector moves

2019-12-09 Thread Andrew Stubbs
Oops, please consider this patch as submitted from my @codesourcery.com address, for copyright assignment purposes. Andrew On 06/12/2019 17:31, Andrew Stubbs wrote: Hi all, This patch re-enables the V64QImode and V64HImode for GCN. GCC does not make these easy to work with because there is

Re: [committed, amdgcn] Fix unrecognised instruction

2019-12-09 Thread Andrew Stubbs
On 06/12/2019 17:57, Andrew Stubbs wrote: Hi all, I've committed the attached to fix a failure-to-assemble bug that can occur in some vectorized code.  This has been hidden for a long time because sub-word vectors were disabled on GCN, but this is no longer the case. The gather

[RFC, vectorizer] Fix ICE with masked vectors

2019-12-09 Thread Andrew Stubbs
Hi, This patch fixes an ICE in testcase gcc.dg/vect/vect-ctor-1.c: during GIMPLE pass: vect dump file: vect-ctor-1.c.159t.vect .../gcc.dg/vect/vect-ctor-1.c: In function 'intrapred_luma_16x16': .../gcc.dg/vect/vect-ctor-1.c:9:6: internal compiler error: in exact_div, at poly-int.h:2162 0xdf845f

Re: [RFC, vectorizer] Fix ICE with masked vectors

2019-12-10 Thread Andrew Stubbs
On 09/12/2019 15:59, Richard Sandiford wrote: No, the assumption's correct even there. The assert usually triggers because something elsewhere is getting confused about the vector types. The attached patch fixes the ICE in the testcase, but I suspect does not go far enough. Can it happen that

[committed, amdgcn] Add sub-dword vector extend and truncate insns

2019-12-13 Thread Andrew Stubbs
amdgcn 2019-12-13 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (sdwa): New mode attribute. (VCVT_FROM_MODE): Rename to ... (VCVT_MODE): ... this. (VCVT_TO_MODE): Delete mode iterator. (VCVT_FMODE): New mode iterator. (VCVT_IMODE): Likewise. (2): Change ... (2): ... to this. (2): New.

[committed, amdgcn] Add sub-dword vector multiply

2019-12-13 Thread Andrew Stubbs
I've committed this patch to add v64qi and v64hi multiply patterns. This is slowly working toward full char and short vectorization. Andrew Sub-dword vector multiply for amdgcn 2019-12-13 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (mulv64si3): Rename to ... (mul3): ... this

Re: [patch, openacc] Adjust tests for amdgcn offloading

2019-12-13 Thread Andrew Stubbs
On 19/11/2019 12:21, Andrew Stubbs wrote: This patch adds GCN special casing for most of the OpenACC libgomp tests that require it. It also disables one testcase that explicitly uses CUDA. The patches aren't all that controversial, should only change the results on amdgcn, and Tobias al

Re: [PATCH] Add OpenACC 2.6 `acc_get_property' support

2019-12-17 Thread Andrew Stubbs
On 16/12/2019 23:00, Thomas Schwinge wrote: There is no AMD GCN support yet. This will be added later on. ACK, just to note that there now is a 'libgomp/plugin/plugin-gcn.c' that at least needs to get a stub implementation (can mostly copy from 'libgomp/plugin/plugin-hsa.c'?) as otherwise the b

[committed, amdgcn] Implement clz and ctz

2019-12-17 Thread Andrew Stubbs
This patch implements the count leading and trailing zeros instruction patterns in the AMD GCN backend. This is prerequisite for implementing the extract_last patterns. Andrew Stubbs Mentor Graphics / CodeSourcery Add clz and ctz for amdgcn 2019-12-17 Andrew Stubbs gcc/ * config/gcn

[committed, amdgcn] Implement extract_last and fold_extract_last

2019-12-17 Thread Andrew Stubbs
ect.exp to name them all individually, but includes vect-cond_reduc-* and pr65947-10.c. Andrew Stubbs Mentor Graphics / CodeSourcery Add extract_last for amdgcn 2019-12-17 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (extract_last_): New expander. (fold_extract_last_): New expander. gcc

[committed, pr92772] Mention bug in comment

2019-12-17 Thread Andrew Stubbs
m and hopefully the pointer will save future readers some confusion. Andrew Add pointer to PR92772 2019-12-17 Andrew Stubbs * tree-vect-loop.c (vect_create_epilog_for_reduction): Mention pr92772 in the comments. diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 353a5ff06e1..6869

[committed, amdgcn] Fix vect/pr65947-8.c testcase for amdgcn

2019-12-18 Thread Andrew Stubbs
tions expect that it will not. I fixed it by special-casing GCN. There might be a more general way, but apparently this does happen for other architectures (?) Andrew Fix vect/pr65947-8.c testcase for amdgcn. 2019-12-18 Andrew Stubbs gcc/testsuite/ * gcc.dg/vect/pr65947-8.c: C

[committed, amdgcn] Add sub-dword add/sub patterns

2019-12-19 Thread Andrew Stubbs
elsewhere. This results in 80 new test passes. There are a few regressions from vectorization tests that took a different code path and encountered another missing instruction. Andrew Implement sub-dword add/sub on amdgcn 2019-12-19 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (addv64si3

[committed, amdgcn] Use V64SI for all remaining add-with-carry insns

2019-12-19 Thread Andrew Stubbs
not interesting for those modes (being mostly used to implement DImode splitters), so we can dispense with the notional iterator. Andrew Use V64SI for all amdgcn add-with-carry insns 2019-12-19 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (*plus_carry_dpp_shr_): Rename to

[committed, amdgcn] Allow constants in vector extends and truncates

2019-12-19 Thread Andrew Stubbs
drew Allow constants in amdgcn extends and truncates 2019-12-19 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (2): Change input predicate to gcn_alu_operand. (extend2): Likewise. (truncv64di2): Likewise. (truncv64di2_exec): Likewise. (v64di2): Likewise. (v64di2_exec): Likewise. diff -

[committed, amdgcn] Fix inline immediate range

2020-01-06 Thread Andrew Stubbs
Inline immediates for AMD GCN instructions are supposed to be in the range -16..64 inclusive, but the implementation had the upper bound exclusive. This patch fixes the error. Andrew Fix amdgcn inline immediate range 2020-01-06 Andrew Stubbs gcc/ * config/gcn/gcn.c
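
The off-by-one described in this message can be sketched in a few lines of C. This is only an illustration of the inclusive versus exclusive upper bound, not the code from gcc/config/gcn/gcn.c; both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not the real gcn.c function): true when VALUE lies
   in the inline-immediate range -16..64, with both bounds inclusive.  */
static bool
inline_immediate_p (int64_t value)
{
  return value >= -16 && value <= 64;
}

/* The kind of bug the message describes: the upper bound is treated as
   exclusive, so 64 is wrongly rejected.  */
static bool
inline_immediate_exclusive_p (int64_t value)
{
  return value >= -16 && value < 64;
}
```

With the inclusive check, 64 is accepted as an inline immediate; with the exclusive one it is rejected and would presumably need to be encoded some other way.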

[committed, amdgcn] Fix early-clobber in vec_extract

2020-01-06 Thread Andrew Stubbs
utput register pairs. Other patterns use '0' to allow exact matches, but the input and outputs here are different size, and I'm not sure what happens there. Anyway, this is safe. Andrew Fix early-clobber in amdgcn vec_extract 2020-01-06 Andrew Stubbs gcc/ * config/gcn/gcn-

[committed, amdgcn] Fix issue with '0' constraints

2020-01-06 Thread Andrew Stubbs
but just didn't produce bad code?) Adding an alternative for each permutation fixes the problem. This has already been done for many other patterns. Andrew Fix amdgcn issue with '0' constraints 2020-01-06 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (subv64di3): Use separa

[committed, amdgcn] Disallow 'B' constraints on addc/subb

2020-01-07 Thread Andrew Stubbs
ork has addressed that issue too. Andrew Disallow 'B' constraints on amdgcn addc/subb 2020-01-07 Andrew Stubbs gcc/ * config/gcn/constraints.md (DA): Update description and match. (DB): Likewise. (Db): New constraint. * config/gcn/gcn-protos.h (gcn_inline_constant64_p): Ad

[committed, amdgcn] Add more modes for vector comparisons

2020-01-07 Thread Andrew Stubbs
d 3 loops" 1 FAIL: gcc.dg/vect/vect-cond-reduc-4.c scan-tree-dump-times vect "LOOP VECTORIZED" 2 FAIL: gcc.dg/vect/vect-cselim-1.c scan-tree-dump-times vect "vectorized 2 loops" 1 FAIL: gcc.dg/vect/vect-version-1.c scan-tree-dump vect "applying loop versioning to oute

Re: [PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap operations

2020-01-08 Thread Andrew Stubbs
On 08/01/2020 11:07, Kwok Cheung Yeung wrote: +#define __sync_subword_compare_and_swap(type, size)    \ Macro parameters are conventionally upper case. +    \ +type    \ +__sync_val_compare_and_swap_##size (ty

Re: [PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap operations

2020-01-09 Thread Andrew Stubbs
On 08/01/2020 18:18, Kwok Cheung Yeung wrote: Is this version okay for trunk? OK, thanks. Andrew

[PATCH] OpenMP: Ensure that offloaded variables are public

2021-11-16 Thread Andrew Stubbs
Hi, This patch is needed for AMD GCN offloading when we use the assembler from LLVM 13+. The GCN runtime (libgomp+ROCm) requires that the location of all variables in the offloaded variables table are discoverable at runtime (using the "hsa_executable_symbol_get_info" API), and this only wor

Re: [Patch?][RFC][RTL] clobber handling & buildin expansion - missing insn_invalid_p call [PR100418]

2021-06-02 Thread Andrew Stubbs
On 30/05/2021 19:51, Jeff Law wrote: On 5/5/2021 7:50 AM, Tobias Burnus wrote: Hi Eric, hi all, currently, gcn (amdgcn-amdhsa) bootstrapping fails as Alexandre's patch to __builtin_memset (applied yesterday) now does more expansions. The problem is [→ PR100418]   (set(reg:DI)(plus:DI(reg:DI)

[committed] amdgcn: Support LLVM 13 assembler syntax

2021-10-07 Thread Andrew Stubbs
I've committed this patch to allow GCC to adapt to the different variants of the LLVM amdgcn assembler. Unfortunately they keep making changes without maintaining backwards compatibility. GCC should now work with LLVM 9, LLVM 12, and LLVM 13 in terms of CLI usage, however only LLVM 9 is well t

[committed] amdgcn: Implement -msram-ecc=any

2021-10-07 Thread Andrew Stubbs
I've committed this patch to implement the -msram-ecc=any feature that has been stubbed out awaiting LLVM support for a while now. When the LLVM assembler supports the "any" feature (v13+) GCC will now make use of it. Otherwise, GCC will continue to treat "any" the same as "on". Using the "a

[committed] amdgcn: Fix assembler version incompatibility

2021-10-07 Thread Andrew Stubbs
I've committed this patch to fix another case of LLVM assembler incompatibility. Marcel previously posted a patch to fix up the global_load and global_store instructions, following a non-backwards-compatible change in the assembler. https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572987.ht

[committed] amdgcn: fix up offload debug linking with LLVM 13

2021-10-15 Thread Andrew Stubbs
This is a follow-up to my previous LLVM13 support patches (the amdgcn port uses the LLVM assembler) to fix up a corner case. With this patch one can now enable debug information in LLVM 13 offload binaries. This was trickier than you'd think because the different LLVM versions have different a

Re: [Patch][GCN] [GCC 11] Backport GCN with LLVM-MC 13 linker fixes to GCC 11

2021-10-18 Thread Andrew Stubbs
This is fine by me. As I said in my email on the 15th, LLVM 13 is still not considered safe to use. The ICE you encountered is a real problem that will affect real users. I expect to work on a solution for that soon. Andrew On 16/10/2021 21:41, Tobias Burnus wrote: This patch is mostly mot

Re: [PATCH libatomic/arm] avoid warning on constant addresses (PR 101379)

2021-07-17 Thread Andrew Stubbs
On 16/07/2021 18:42, Thomas Schwinge wrote: Of course, we may simply re-work the libgomp/GCN code -- but don't we first need to answer the question whether the current code is actually "bad"? Aren't we going to get a lot of similar reports from kernel/embedded/other low-level software developers

Re: [gcn] Work-around libgomp 'error: array subscript 0 is outside array bounds of ‘__lds struct gomp_thread * __lds[0]’ [-Werror=array-bounds]' (was: [PATCH libatomic/arm] avoid warning on constant a

2021-07-19 Thread Andrew Stubbs
On 19/07/2021 09:46, Thomas Schwinge wrote: GCN already uses address 4 for this value because address 0 caused problems with null-pointer checks. Ugh. How much wasted bytes per what is that? (I haven't looked yet; hopefully not per GPU thread?) Because: It's 4 bytes per gang. And that poin

[committed] amdgcn: Add -mxnack and -msram-ecc [PR 100208]

2021-07-19 Thread Andrew Stubbs
This patch adds two new GCN-specific options: -mxnack and -msram-ecc={on,off,any}. The primary purpose is to ensure that we have an explicit default setting for these features and that this is passed to the assembler. This will ensure that if LLVM defaults change, again, GCC won't get caught

[og11][committed] amdgcn: Add -mxnack and -msram-ecc [PR 100208]

2021-07-20 Thread Andrew Stubbs
This is now backported to devel/omp/gcc-11. Andrew On 19/07/2021 17:49, Andrew Stubbs wrote: This patch adds two new GCN-specific options: -mxnack and -msram-ecc={on,off,any}. The primary purpose is to ensure that we have an explicit default setting for these features and that this is

[committed] amdgcn: Fix attributes for LLVM-12 [PR 100208]

2021-07-28 Thread Andrew Stubbs
This patch follows up my previous patch and supports more variants of LLVM 12. There are still other incompatibilities with LLVM 12, but at least the ELF attributes should now automatically tune to any LLVM 9, 10, or 12 assembler (It would be nice if one set of options would just work ev

[OG11, committed] amdgcn: Fix attributes for LLVM-12 [PR 100208]

2021-07-29 Thread Andrew Stubbs
Now backported to devel/omp/gcc-11. Andrew On 28/07/2021 14:03, Andrew Stubbs wrote: This patch follows up my previous patch and supports more variants of LLVM 12. There are still other incompatibilities with LLVM 12, but at least the ELF attributes should now automatically tune to any

Re: [committed] amdgcn: Fix attributes for LLVM-12 [PR 100208]

2021-07-29 Thread Andrew Stubbs
On 29/07/2021 08:34, Richard Biener wrote: On Wed, Jul 28, 2021 at 3:04 PM Andrew Stubbs wrote: This patch follows up my previous patch and supports more variants of LLVM 12. There are still other incompatibilities with LLVM 12, but at least the ELF attributes should now automatically

[PATCH] builtins.c: Ensure emit_move_insn operands are valid (PR100418)

2021-05-07 Thread Andrew Stubbs
A recent patch from Alexandre added new calls to emit_move_insn with PLUS expressions in the operands. Apparently this works fine on (at least) x86_64, but fails on (at least) amdgcn, where the adddi3 pattern has clobbers that the movdi3 does not. This results in ICEs in recog. This patch inser

[committed] amdgcn: disable TImode

2021-05-07 Thread Andrew Stubbs
TImode has always been a problem on amdgcn, and now it is causing many new test failures, so I'm disabling it. The mode only has move instructions defined, which was enough for SLP, but any other code trying to use it without checking the optabs is a problem. The mode remains available for u

Re: [committed] amdgcn: disable TImode

2021-05-07 Thread Andrew Stubbs
On 07/05/2021 18:11, Tobias Burnus wrote: On 07.05.21 18:35, Andrew Stubbs wrote: TImode has always been a problem on amdgcn, and now it is causing many new test failures, so I'm disabling it. Does it still work with libgomp? The patch sounds as if it might cause problems, but o

Re: [committed] amdgcn: disable TImode

2021-05-07 Thread Andrew Stubbs
in will at least build. I suspect we'll see some real failures here soon though. Andrew On 07/05/2021 23:45, Andrew Stubbs wrote: On 07/05/2021 18:11, Tobias Burnus wrote: On 07.05.21 18:35, Andrew Stubbs wrote: TImode has always been a problem on amdgcn, and now it is causing many new te

Re: [PATCH] vect: Fix integer overflow calculating mask

2024-03-04 Thread Andrew Stubbs
On 23/02/2024 15:13, Richard Biener wrote: On Fri, 23 Feb 2024, Jakub Jelinek wrote: On Fri, Feb 23, 2024 at 02:22:19PM +, Andrew Stubbs wrote: On 23/02/2024 13:02, Jakub Jelinek wrote: On Fri, Feb 23, 2024 at 12:58:53PM +, Andrew Stubbs wrote: This is a follow-up to the previous

Re: Stabilize flaky GCN target/offloading testing

2024-03-06 Thread Andrew Stubbs
On 06/03/2024 12:09, Thomas Schwinge wrote: Hi! On 2024-02-21T17:32:13+0100, Richard Biener wrote: Am 21.02.2024 um 13:34 schrieb Thomas Schwinge : [...] per my work on "libgomp make check time is excessive", all execution testing in libgomp is serialized in 'lib

Re: amdgcn: additional gfx1030/gfx1100 support: adjust test cases

2024-03-06 Thread Andrew Stubbs
On 06/03/2024 13:49, Thomas Schwinge wrote: Hi! On 2024-01-24T12:43:04+, Andrew Stubbs wrote: This [...] ... became commit 99890e15527f1f04caef95ecdd135c9f1a077f08 "amdgcn: additional gfx1030/gfx1100 support", and included the following: --- a/gcc/config/gcn/gcn-valu.md

Re: GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to 'init_hsa_runtime_functions' is not fatal

2024-03-07 Thread Andrew Stubbs
On 07/03/2024 11:29, Thomas Schwinge wrote: Hi! On 2019-11-12T13:29:16+, Andrew Stubbs wrote: This patch contributes the GCN libgomp plugin, with the various configure and make bits to go with it. An issue with libgomp GCN plugin 'GCN_SUPPRESS_HOST_FALLBACK' (which is differen

Re: GCN: Even with 'GCN_SUPPRESS_HOST_FALLBACK' set, failure to 'init_hsa_runtime_functions' is not fatal

2024-03-07 Thread Andrew Stubbs
On 07/03/2024 13:37, Thomas Schwinge wrote: Hi Andrew! On 2024-03-07T11:38:27+, Andrew Stubbs wrote: On 07/03/2024 11:29, Thomas Schwinge wrote: On 2019-11-12T13:29:16+, Andrew Stubbs wrote: This patch contributes the GCN libgomp plugin, with the various configure and make bits to

Re: GCN: The original meaning of 'GCN_SUPPRESS_HOST_FALLBACK' isn't applicable (non-shared memory system)

2024-03-08 Thread Andrew Stubbs
On 08/03/2024 10:16, Thomas Schwinge wrote: Hi! So, attached here is now a different patch "GCN: The original meaning of 'GCN_SUPPRESS_HOST_FALLBACK' isn't applicable (non-shared memory system)", that takes a different approach re clarifying the two orthogonal aspects that the 'GCN_SUPPRESS_HOS

[PATCH] vect: Use xor to invert oversized vector masks

2024-03-14 Thread Andrew Stubbs
Don't enable excess lanes when inverting vector bit-masks smaller than the integer mode. This is yet another case of wrong-code due to mishandling of oversized bitmasks. This issue shows up in vect/tsvc/vect-tsvc-s278.c and vect/tsvc/vect-tsvc-s279.c if I set the preferred vector size to V32 (dow
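
The difference between the two ways of inverting a mask can be sketched in plain C. This is only an illustration under assumed values (a 32-lane mask held in a 64-bit integer); the names are hypothetical and it is not the vectorizer code from the patch.

```c
#include <stdint.h>

/* Assumed for illustration: 32 active lanes stored in a 64-bit mask.  */
#define ACTIVE_LANES 32
#define LANE_MASK ((UINT64_C (1) << ACTIVE_LANES) - 1)

/* Bitwise NOT flips all 64 bits, so the 32 unused high bits become set
   and the excess lanes appear enabled.  */
static uint64_t
invert_with_not (uint64_t mask)
{
  return ~mask;
}

/* XOR with a constant covering only the active lanes flips those lanes
   and leaves the excess bits as they were.  */
static uint64_t
invert_with_xor (uint64_t mask)
{
  return mask ^ LANE_MASK;
}
```

For a mask whose high bits are clear, the XOR form keeps them clear, which is the property the subject line refers to.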

Re: [PATCH] vect: Use xor to invert oversized vector masks

2024-03-15 Thread Andrew Stubbs
On 15/03/2024 03:45, Hongtao Liu wrote: On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs wrote: Don't enable excess lanes when inverting vector bit-masks smaller than the integer mode. This is yet another case of wrong-code due to mishandling of oversized bitmasks. This issue shows up in

Re: [PATCH] vect: Use xor to invert oversized vector masks

2024-03-15 Thread Andrew Stubbs
On 15/03/2024 07:35, Richard Biener wrote: On Fri, Mar 15, 2024 at 4:35 AM Hongtao Liu wrote: On Thu, Mar 14, 2024 at 11:42 PM Andrew Stubbs wrote: Don't enable excess lanes when inverting vector bit-masks smaller than the integer mode. This is yet another case of wrong-code d

Re: [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it

2024-03-15 Thread Andrew Stubbs
On 15/03/2024 12:21, Tobias Burnus wrote: Given the large number of AMD GPU ISAs and the number of files which have to be adapted, I wonder whether it makes sense to consolidate this a bit, especially in the light that we may want to support more in the future. Besides using some macros, I al

Re: [Patch][RFC] GCN: Define ISA archs in gcn-devices.def and use it

2024-03-15 Thread Andrew Stubbs
On 15/03/2024 13:56, Tobias Burnus wrote: Hi Andrew, Andrew Stubbs wrote: This is more-or-less what I was planning to do myself, but as I want to include all the other features that get parametrized in gcn.cc, gcn.h, gcn-hsa.h, gcn-opts.h, I hadn't got around to it yet. Unfortunate

Re: GCN: Enable effective-target 'vect_early_break', 'vect_early_break_hw'

2024-03-21 Thread Andrew Stubbs
On 21/03/2024 10:41, Thomas Schwinge wrote: Hi! On 2024-01-12T15:02:35+0100, I wrote: OK to push the attached "GCN: Enable effective-target 'vect_early_break', 'vect_early_break_hw'"? Ping. (Or is that not what you'd expect to see for GCN? I haven't checked the actual back end code...) So

[committed] amdgcn: Clean up device memory in gcn-run

2024-03-21 Thread Andrew Stubbs
There are some stability issues in the ROC runtime or drivers when we run too many tests in quick succession. I was hoping this patch might fix it, but no; still good to fix the omissions though. Committed to mainline. gcc/ChangeLog: * config/gcn/gcn-run.cc (main): Add an hsa_memory_fre

[committed] amdgcn: Comment correction

2024-03-21 Thread Andrew Stubbs
The location of the marker was changed, but the comment wasn't updated. Fixed now. Committed to mainline gcc/ChangeLog: * config/gcn/gcn.cc (gcn_expand_builtin_1): Comment correction. --- gcc/config/gcn/gcn.cc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/c

[committed] amdgcn: Ensure gfx11 is running in cumode

2024-03-21 Thread Andrew Stubbs
CUmode "on" is the setting for compatibility with GCN and CDNA devices. Committed to mainline. gcc/ChangeLog: * config/gcn/gcn-hsa.h (ASM_SPEC): Pass -mattr=+cumode. --- gcc/config/gcn/gcn-hsa.h | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/config/gcn/gcn-hsa.h b/gcc/config/gc

[PATCH] vect: more oversized bitmask fixups

2024-03-21 Thread Andrew Stubbs
My previous patch to fix this problem with xor was rejected because we want to fix these issues only at the point of use. That patch produced slightly better code, in this example, but this works too. These patches fix up a failure in testcase vect/tsvc/vect-tsvc-s278.c when configured to use

Re: [PATCH] vect: more oversized bitmask fixups

2024-03-21 Thread Andrew Stubbs
On 21/03/2024 15:18, Richard Biener wrote: On Thu, Mar 21, 2024 at 3:23 PM Andrew Stubbs wrote: My previous patch to fix this problem with xor was rejected because we want to fix these issues only at the point of use. That patch produced slightly better code, in this example, but this works

Re: [committed] amdgcn: Ensure gfx11 is running in cumode

2024-03-22 Thread Andrew Stubbs
On 22/03/2024 11:56, Thomas Schwinge wrote: Hi Andrew! On 2024-03-21T13:39:53+, Andrew Stubbs wrote: CUmode "on" is the setting for compatibility with GCN and CDNA devices. --- a/gcc/config/gcn/gcn-hsa.h +++ b/gcc/config/gcn/gcn-hsa.h @@ -107,6 +107,7 @@ extern un

Re: [PATCH] vect: more oversized bitmask fixups

2024-03-22 Thread Andrew Stubbs
On 22/03/2024 08:43, Richard Biener wrote: I'll note that we don't pass 'val' there and 'val' is unfortunately not documented - what's it supposed to be? I think I placed the original fix in do_compare_and_jump because we have the full info available there. So what's the do_compare_rtx_and_j

[committed] amdgcn: Add gfx1103 target

2024-03-22 Thread Andrew Stubbs
This patch adds support for the gfx1103 RDNA3 APU integrated graphics devices. The ROCm documentation warns that these may not be supported, but it seems to work at least partially. This device should be considered "Experimental" at this point, although so far it seems to be at least as functiona

[committed] amdgcn: Prefer V32 on RDNA devices

2024-03-22 Thread Andrew Stubbs
This patch alters the default (preferred) vector size to 32 on RDNA devices to better match the actual hardware. 64-lane vectors will continue to be used where they are hard-coded (such as function prologues). We run these devices in wavefrontsize64 for compatibility, but they actually only have

[committed] amdgcn: Adjust GFX10/GFX11 cache coherency

2024-03-22 Thread Andrew Stubbs
The RDNA devices have different cache architectures to the CDNA devices, and the differences go deeper than just the assembler mnemonics, so we probably need to generate different code to maintain coherency across the whole device. I believe this patch is correct according to the documentation in

[wwwdocs, committed] gcc-14: amdgcn: Add gfx1103

2024-03-22 Thread Andrew Stubbs
I added a note about gfx1103 to the existing text for gfx1100. Andrew --- htdocs/gcc-14/changes.html | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index d88fbc96..880b9195 100644 --- a/htdocs/gcc-14/changes.htm

Re: GCN: Enable effective-target 'vect_hw_misalign'

2024-03-25 Thread Andrew Stubbs
On 21/03/2024 10:41, Thomas Schwinge wrote: Hi! OK to push the attached "GCN: Enable effective-target 'vect_hw_misalign'"? (Or is that not what you'd expect to see for GCN? I haven't checked the actual back end code...) OK. Andrew.

Re: GCN: Enable effective-target 'vect_long_mult'

2024-03-25 Thread Andrew Stubbs
On 21/03/2024 10:41, Thomas Schwinge wrote: Hi! OK to push the attached "GCN: Enable effective-target 'vect_long_mult'"? (Or is that not what you'd expect to see for GCN? I haven't checked the actual back end code...) OK. Andrew

Re: [PATCH] amdgcn: Add gfx1036 target

2024-03-25 Thread Andrew Stubbs
On 25/03/2024 11:27, Richard Biener wrote: Add support for the gfx1036 RDNA2 APU integrated graphics devices. The ROCm documentation warns that these may not be supported, but it seems to work at least partially. x86 host bootstrap/regtest running, target-libgomp testing for the offload produce

Re: [Patch] GCN: Fix --with-arch= handling in mkoffload [PR111966]

2024-04-03 Thread Andrew Stubbs
On 03/04/2024 10:05, Tobias Burnus wrote: This patch handles --with-arch= in GCN's mkoffload.cc While mkoffload mostly does not know this and passes it through to the GCN lto1 compiler, it writes an .o file with debug information - and here the -march= in the ELF flags must agree with the one

Re: [Patch] GCN: install.texi update for Newlib change and LLVM 18 release

2024-04-03 Thread Andrew Stubbs
On 03/04/2024 10:27, Jakub Jelinek wrote: On Wed, Apr 03, 2024 at 11:09:19AM +0200, Tobias Burnus wrote: @@ -3954,8 +3956,8 @@ on the GPU. To enable support for GCN3 Fiji devices (gfx803), GCC has to be configured with @option{--with-arch=@code{fiji}} or @option{--with-multilib-list=@code

Re: GCN: '--param=gcn-preferred-vector-lane-width=[default,32,64]'

2024-04-08 Thread Andrew Stubbs
On 08/04/2024 11:45, Thomas Schwinge wrote: Hi! On 2024-03-28T08:00:50+0100, I wrote: On 2024-03-22T15:54:48+, Andrew Stubbs wrote: This patch alters the default (preferred) vector size to 32 on RDNA devices to better match the actual hardware. 64-lane vectors will continue to be used

Re: GCN, RDNA 3: Adjust 'sync_compare_and_swap_lds_insn'

2024-02-01 Thread Andrew Stubbs
On 01/02/2024 11:36, Thomas Schwinge wrote: Hi! On 2024-01-31T11:31:00+, Andrew Stubbs wrote: On 31/01/2024 10:36, Thomas Schwinge wrote: OK to push "GCN, RDNA 3: Adjust 'sync_compare_and_swap_lds_insn'", see attached? In pre-RDNA 3 ISA manuals, there are

Re: GCN: Don't hard-code number of SGPR/VGPR/AVGPR registers

2024-02-01 Thread Andrew Stubbs
On 01/02/2024 13:49, Thomas Schwinge wrote: Hi! On 2018-12-12T11:52:52+, Andrew Stubbs wrote: This patch contains the major part of the GCN back-end. [...] --- /dev/null +++ b/gcc/config/gcn/gcn.c +void +gcn_hsa_declare_function_name (FILE *file, const char *name, tree

Re: [PATCH] libgomp: testsuite: Don't XPASS libgomp.c/alloc-pinned-1.c etc. on non-Linux targets [PR113448]

2024-02-12 Thread Andrew Stubbs
On 05/02/2024 13:04, Rainer Orth wrote: Two libgomp tests XPASS on Solaris (any non-Linux target actually) since their introduction: XPASS: libgomp.c/alloc-pinned-1.c execution test XPASS: libgomp.c/alloc-pinned-2.c execution test The problem is that the test just prints OS unsupported and ex

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Andrew Stubbs
On 13/02/2024 08:26, Richard Biener wrote: On Mon, 12 Feb 2024, Thomas Schwinge wrote: Hi! On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: I've committed this patch ... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691 "amdgcn: add -march=gfx1030 EXPERIMENTAL". The RD

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Andrew Stubbs
On 14/02/2024 13:27, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 13/02/2024 08:26, Richard Biener wrote: On Mon, 12 Feb 2024, Thomas Schwinge wrote: Hi! On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: I've committed this patch ... as c

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Andrew Stubbs
On 14/02/2024 13:43, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 14/02/2024 13:27, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 13/02/2024 08:26, Richard Biener wrote: On Mon, 12 Feb 2024, Thomas Schwinge wrote: Hi! On 2023-10-20T12:51:03

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-15 Thread Andrew Stubbs
On 15/02/2024 07:49, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 14/02/2024 13:43, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 14/02/2024 13:27, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 13/02/2024 08:26, Richard

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-15 Thread Andrew Stubbs
On 15/02/2024 10:21, Richard Biener wrote: [snip] I suppose if RDNA really only has 32 lane vectors (it sounds like it, even if it can "simulate" 64 lane ones?) then it might make sense to vectorize for 32 lanes? That said, with variable-length it likely doesn't matter but I'd not expose fixed-si

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-15 Thread Andrew Stubbs
On 15/02/2024 10:23, Thomas Schwinge wrote: Hi! On 2024-02-15T08:49:17+0100, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 14/02/2024 13:43, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 14/02/2024 13:27, Richard Biener wrote: On Wed, 14 Feb 2024

Re: GCN RDNA2+ vs. GCC SLP vectorizer

2024-02-16 Thread Andrew Stubbs
On 16/02/2024 10:17, Richard Biener wrote: On Fri, 16 Feb 2024, Thomas Schwinge wrote: Hi! On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: I've committed this patch ... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691 "amdgcn: add -march=gfx1030 EXPERIMENTAL", which

Re: GCN RDNA2+ vs. GCC SLP vectorizer

2024-02-16 Thread Andrew Stubbs
On 16/02/2024 12:26, Richard Biener wrote: On Fri, 16 Feb 2024, Andrew Stubbs wrote: On 16/02/2024 10:17, Richard Biener wrote: On Fri, 16 Feb 2024, Thomas Schwinge wrote: Hi! On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: I've committed this patch ... as c

Re: GCN: Conditionalize 'define_expand "reduc_<reduc_op>_scal_<mode>"' on '!TARGET_RDNA2_PLUS' [PR113615]

2024-02-16 Thread Andrew Stubbs
On 16/02/2024 14:34, Thomas Schwinge wrote: Hi! On 2024-01-29T11:34:05+0100, Tobias Burnus wrote: Andrew wrote off list: "Vector reductions don't work on RDNA, as is, but they're supposed to be disabled by the insn condition" This patch disables "fold_left_plus_<mode>", which is about vect

[PATCH] vect: Fix integer overflow calculating mask

2024-02-23 Thread Andrew Stubbs
This is a follow-up to the previous patch to ensure that integer vector bit-masks do not have excess bits set. It fixes a bug, observed on amdgcn, in which the mask could be incorrectly set to zero, resulting in wrong-code. The mask was broken when nunits==32. The patched version will probably be
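A minimal sketch of the kind of overflow the snippet above describes, assuming the mask is meant to be "all ones in the low nunits bits". The helper name lane_mask is hypothetical and is not taken from the patch. In C, shifting a 32-bit value by 32 is undefined, so a naive (1u << nunits) - 1 cannot be relied on when nunits == 32; widening to 64 bits before the shift is one way to sidestep that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the GCC patch): return a mask with the
   low NUNITS bits set, for 0 <= NUNITS <= 64.  */
uint64_t
lane_mask (unsigned nunits)
{
  assert (nunits <= 64);

  /* A shift by the full width of the type is undefined, so the
     64-lane case is handled separately.  */
  if (nunits == 64)
    return UINT64_MAX;

  /* The shift is done in 64 bits, so nunits == 32 is well defined.  */
  return (UINT64_C (1) << nunits) - 1;
}
```

With this shape, lane_mask (32) yields 0xFFFFFFFF rather than depending on what a 32-bit shift by 32 happens to produce on a given host.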

Re: [PATCH] vect: Fix integer overflow calculating mask

2024-02-23 Thread Andrew Stubbs
On 23/02/2024 13:02, Jakub Jelinek wrote: On Fri, Feb 23, 2024 at 12:58:53PM +, Andrew Stubbs wrote: This is a follow-up to the previous patch to ensure that integer vector bit-masks do not have excess bits set. It fixes a bug, observed on amdgcn, in which the mask could be incorrectly set

Re: [Patch] xfail libgomp.c/declare-variant-4-{fiji,gfx803}.c

2024-01-22 Thread Andrew Stubbs
On Fri, 19 Jan 2024 at 18:27, Tobias Burnus wrote: > The problem is as described at > https://gcc.gnu.org/install/specific.html#amdgcn-x-amdhsa > > "Note that support for Fiji devices has been removed in ROCm 4.0 and > support in LLVM is deprecated and will be removed in LLVM 18." > > Therefore,

Re: [PATCH] gcn: Fix a warning

2024-01-23 Thread Andrew Stubbs
On Tue, 23 Jan 2024 at 10:01, Jakub Jelinek wrote: > Hi! > > I see > ../../gcc/config/gcn/gcn.cc: In function ‘void > gcn_hsa_declare_function_name(FILE*, const char*, tree)’: > ../../gcc/config/gcn/gcn.cc:6568:67: warning: unused parameter ‘decl’ > [-Wunused-parameter] > 6568 | gcn_hsa_declare_

[PATCH] Update my email in MAINTAINERS

2024-01-23 Thread Andrew Stubbs
I've moved to BayLibre and don't have access to my codesourcery.com address, at least for a while. ChangeLog: * MAINTAINERS: Update Signed-off-by: Andrew Stubbs --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAI

[PATCH] amdgcn: additional gfx1100 support

2024-01-24 Thread Andrew Stubbs
cn/time.c (RTC_TICKS): Configure RDNA3. (omp_get_wtime): Add RDNA3-compatible variant. * plugin/plugin-gcn.c (max_isa_vgprs): Tune for gfx1030 and gfx1100. Signed-off-by: Andrew Stubbs --- gcc/config/gcn/gcn-opts.h | 2 +- gcc/config/gcn/gcn

Re: [patch] gcn/mkoffload.cc: Fix SRAM_ECC and XNACK handling [PR111966]

2024-01-25 Thread Andrew Stubbs
On 24/01/2024 22:12, Tobias Burnus wrote: This patch fixes "-g" debug compilation for gfx1100 and gfx1030, which fail to link when "-g" is specified. The reason is: When using gfx1100 and compiling with '-g' I was running into an error because the eflags used for the debugger file has additional

Re: [patch] gcn: Add missing space to ASM_SPEC in gcn-hsa.h

2024-01-25 Thread Andrew Stubbs
On 25/01/2024 12:44, Tobias Burnus wrote: This patch avoids assembler warnings for gfx908 and gfx90a such as '-xnack-mattr=-sramecc' is not a recognized feature for this target(ignoring feature) as we pass -mattr=-xnack-mattr=-sramecc to the llvm-mc assembler. Solution: Add a space before
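A minimal sketch of how a missing space can fuse two options into the string quoted in the warning above, assuming the options are built from adjacent string fragments. The macro and function names are hypothetical, and the real ASM_SPEC in gcn-hsa.h is more involved than this.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option fragments, for illustration only.  */
#define XNACK_OFF   "-mattr=-xnack"
#define SRAMECC_OFF "-mattr=-sramecc"

/* Adjacent string literals are concatenated with nothing between them,
   so this yields one fused, unrecognized option.  */
const char *
joined_without_space (void)
{
  return XNACK_OFF SRAMECC_OFF;
}

/* An explicit separator keeps the two options distinct.  */
const char *
joined_with_space (void)
{
  return XNACK_OFF " " SRAMECC_OFF;
}

/* Return nonzero if SPEC contains the fused form seen in the warning.  */
int
has_merged_option (const char *spec)
{
  assert (spec != NULL);
  return strstr (spec, "-xnack-mattr=") != NULL;
}
```

Here joined_without_space () returns "-mattr=-xnack-mattr=-sramecc", matching the text passed to the assembler in the report, while joined_with_space () returns two separate options.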
