This patch is a prerequisite for some amdgcn patches I'm working on to
support shorter vector lengths (having fixed 64 lanes tends to miss
optimizations, and masking is not supported everywhere yet).
The problem is that, unlike AArch64, I'm not using different mask modes
for different sized ve
On 29/09/2022 08:52, Richard Biener wrote:
On Wed, Sep 28, 2022 at 5:06 PM Andrew Stubbs wrote:
This patch is a prerequisite for some amdgcn patches I'm working on to
support shorter vector lengths (having fixed 64 lanes tends to miss
optimizations, and masking is not supported everywher
On 29/09/2022 10:24, Richard Sandiford wrote:
Otherwise:
operand0[0] = operand1 < operand2;
for (i = 1; i < operand3; i++)
operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
looks like a "length and mask" operation, which IIUC is also what
RVV wanted? (Wasn't at the Cauldro
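The loop quoted above can be written out as a small scalar model. This is only a sketch of the semantics as stated in the message: the names `while_ult_len` and `count_active`, and the use of a `bool` array for the mask, are assumptions made for illustration and are not GCC's internal representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of the quoted loop: lane I is active only if
   (start + I < limit) holds for it and for every earlier lane, and only
   the first LEN lanes are written.  (Hypothetical helper.)  */
void
while_ult_len (bool *mask, uint64_t start, uint64_t limit, unsigned len)
{
  assert (len >= 1);
  mask[0] = start < limit;
  for (unsigned i = 1; i < len; i++)
    mask[i] = mask[i - 1] && (start + i < limit);
}

/* Count the active lanes among the first LEN entries.  */
unsigned
count_active (const bool *mask, unsigned len)
{
  unsigned n = 0;
  for (unsigned i = 0; i < len; i++)
    n += mask[i] ? 1 : 0;
  return n;
}
```

With start 5, limit 8 and a length of 8, the first three lanes come out active and the remaining five inactive.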
On 27/09/2022 14:16, Tobias Burnus wrote:
@@ -422,6 +428,12 @@ struct agent_info
if it has been. */
bool initialized;
+ /* Flag whether the HSA program that consists of all the modules has been
+ finalized. */
+ bool prog_finalized;
+ /* Flag whether the HSA OpenMP's requires
I've committed this small clean up. It silences a warning.
Andrew
amdgcn: remove unused variable
This was left over from a previous version of the SIMD clone patch.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen):
Remove unused elt_bits variable.
On 29/09/2022 14:46, Richard Biener wrote:
It's not the nicest way of carrying the information but short of inventing
new modes I can't see something better (well, another optab). I see
the GCN backend expects a constant in operand 3 but the docs don't
specify the operand has to be a CONST_INT,
On 10/10/2022 12:03, Richard Biener wrote:
The following picks up the prototype by Ju-Zhe Zhong for vectorizing
first order recurrences. That solves two TSVC missed optimization PRs.
There's a new scalar cycle def kind, vect_first_order_recurrence
and its handling of the backedge value vectori
ssues, but rather existing
problems that did not show up because the code did not previously
vectorize. Expanding the testcase to allow 64-lane vectors shows the
same problems there.
I shall backport these patches to the OG12 branch shortly.
Andrew
Andrew Stubbs (6):
amdgcn: add multiple ve
GET_MODE_NUNITS isn't a compile time constant, so we end up with many
impossible insns in the machine description. Adding MODE_VF allows the insns
to be eliminated completely.
gcc/ChangeLog:
* config/gcn/gcn-valu.md
(2): Use MODE_VF.
(2): Likewise.
* config/gcn/g
Add vec_extract expanders for all valid pairs of vector types.
gcc/ChangeLog:
* config/gcn/gcn-protos.h (get_exec): Add prototypes for two variants.
* config/gcn/gcn-valu.md
(vec_extract): New define_expand.
* config/gcn/gcn.cc (get_exec): Export the existing func
Implements vec_init when the input is a vector of smaller vectors, or of
vector MEM types, or a smaller vector duplicated several times.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (vec_init): New.
* config/gcn/gcn.cc (GEN_VN): Add andvNsi3, subvNsi3.
(GEN_VNM): Add gathervNm
The vector sizes are simulated using implicit masking, but they make life
easier for the autovectorizer and SLP passes.
gcc/ChangeLog:
* config/gcn/gcn-modes.def (VECTOR_MODE): Add new modes
V32QI, V32HI, V32SI, V32DI, V32TI, V32HF, V32SF, V32DF,
V16QI, V16HI, V16SI, V16
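As a rough illustration of what simulating shorter vectors with implicit masking could look like, here is a scalar model. It assumes one mask bit per lane of a 64-lane register; `lane_mask` and `masked_add` are hypothetical names, and whether the backend derives its mask this way is not stated in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a 64-bit mask with the low NLANES bits set,
   i.e. bit I set means lane I is enabled.  */
uint64_t
lane_mask (unsigned nlanes)
{
  assert (nlanes >= 1 && nlanes <= 64);
  /* Shifting a 64-bit value by 64 is undefined, so handle it separately.  */
  return nlanes == 64 ? ~UINT64_C (0) : (UINT64_C (1) << nlanes) - 1;
}

/* Model of a 64-lane add in which only the lanes enabled in MASK are
   written; disabled lanes of DST keep their old contents.  */
void
masked_add (int32_t *dst, const int32_t *a, const int32_t *b, uint64_t mask)
{
  for (unsigned i = 0; i < 64; i++)
    if (mask & (UINT64_C (1) << i))
      dst[i] = a[i] + b[i];
}
```

In this model a V32SI operation is simply a 64-lane operation run under `lane_mask (32)`.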
Another example of the vectorizer needing explicit insns where the scalar
expander just works.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (neg2): New define_expand.
---
gcc/config/gcn/gcn-valu.md | 13 +
1 file changed, 13 insertions(+)
diff --git a/gcc/config/gcn/gcn-valu.md
The testsuite needs a few tweaks following my patches to add multiple vector
sizes for amdgcn.
gcc/testsuite/ChangeLog:
* gcc.dg/pr104464.c: Xfail on amdgcn.
* gcc.dg/signbit-2.c: Likewise.
* gcc.dg/signbit-5.c: Likewise.
* gcc.dg/vect/bb-slp-68.c: Likewise.
On 11/10/2022 12:29, Richard Biener wrote:
On Tue, Oct 11, 2022 at 1:03 PM Andrew Stubbs wrote:
This patch series adds additional vector sizes for the amdgcn backend.
The hardware supports any arbitrary vector length up to 64 lanes via
masking, but GCC cannot (yet) make full use of them due
On 09/03/2022 16:29, Tobias Burnus wrote:
This shows up with OpenMP offloading, as libgomp has for a couple
of months used __atomic_compare_exchange (see PR for details),
causing link errors when the gcn libgomp.a is linked.
It also shows up with sollve_vv.
The implementation does a bit copy'n
On 13/01/2021 15:07, Chung-Lin Tang wrote:
We currently emit errors, but do not fatally cause exit of the program
if those are not met. We're still unsure if complete block-out of program
execution is the right thing for the user. This can be discussed later.
After the Unified Shared Memory p
On 08/03/2022 11:30, Hafiz Abid Qadeer wrote:
gcc/ChangeLog:
* omp-low.cc (omp_enable_pinned_mode): New function.
(execute_lower_omp): Call omp_enable_pinned_mode.
This worked for x86_64, but I needed to make the attached adjustment to
work on powerpc without a linker error.
On 08/03/2022 11:30, Hafiz Abid Qadeer wrote:
This patch changes calls to malloc/free/calloc/realloc and operator new to
memory allocation functions in libgomp with
allocator=ompx_unified_shared_mem_alloc.
This additional patch adds transformation for omp_target_alloc. The
OpenMP 5.0 documen
On 02/04/2022 13:04, Andrew Stubbs wrote:
This additional patch adds transformation for omp_target_alloc. The
OpenMP 5.0 document says that addresses allocated this way need to work
without is_device_ptr. The easiest way to make that work is to make them
USM addresses.
Actually, reading on
This patch adjusts the testcases, previously proposed, to allow for
testing on machines with varying page sizes and default amounts of
lockable memory. There turns out to be more variation than I had thought.
This should go on mainline at the same time as the previous patches in
this thread.
This patch adds enough support for "requires unified_address" to make
the sollve_vv testcases pass. It implements unified_address as a synonym
of unified_shared_memory, which is both valid and the only way I know of
to unify addresses with CUDA (could be wrong).
This patch should be applied on
Sorry, I had not seen that this was entirely within my amdgcn remit
On 12/01/2022 09:43, Marcel Vollweiler wrote:
Hi,
Currently omp_get_device_num does not work on gcn targets with more than
one offload device. The reason is that GOMP_DEVICE_NUM_VAR is static in
icv-device.c and thus "__gom
On 18/01/2022 12:25, Thomas Schwinge wrote:
Hi!
Maybe I'm just totally confused -- as so often ;-) -- but things seem
strange here:
On 2022-01-12T10:43:05+0100, Marcel Vollweiler wrote:
Currently omp_get_device_num does not work on gcn targets with more than
one offload device. The reason is
This patch adds a new predefined allocator named ompx_pinned_mem_alloc
as an extension to the OpenMP standard. It is intended as a convenient
way to allocate pinned memory using the Linux support patch I posted
recently. I anticipate it being used by compiler internals in future as
part of a pr
This patch adjusts the NVPTX low-latency allocator that I have
previously posted (awaiting re-review). The patch assumes that all my
previously posted patches are applied already.
Given that any memory allocated from the low-latency memory space cannot
support the "access=all" allocator trait
On 02/02/2022 15:39, Tobias Burnus wrote:
On 09.08.21 15:55, Tobias Burnus wrote:
Now that the GCN/OpenACC patches for this have been committed today,
I think it makes sense to add it to the documentation.
(I was told that some follow-up items are still pending, but as
the feature does work ...)
I've committed this fix for an ICE compiling sollve_vv testcase
test_target_teams_distribute_defaultmap.c.
Somehow the optimizers result in a vector reduction on a vector of
duplicated constants. This was a case the backend didn't handle, so we
ended up with an unrecognised instruction ICE.
On 14/02/2022 14:13, Andrew Stubbs wrote:
I've committed this fix for an ICE compiling sollve_vv testcase
test_target_teams_distribute_defaultmap.c.
Somehow the optimizers result in a vector reduction on a vector of
duplicated constants. This was a case the backend didn't handle, so
I've committed this patch to fix the amdgcn ICE reported in PR103396.
The problem was that it was mis-counting the number of registers to save
when the link register was only clobbered implicitly by calls. The issue
is easily fixed by adjusting the condition to match elsewhere in the
same functi
On 30/11/2021 16:54, Jakub Jelinek wrote:
Why does the GCN plugin or runtime need to know those vars?
It needs to know the single array that contains their addresses of course...
With older LLVM there were issues with relocations that made it
impossible to link the offload_var_table. This
On 02/12/2021 12:58, Jakub Jelinek wrote:
I've tried modifying offload_handle_link_vars but that spot doesn't catch
the omp_data_sizes variables emitted by libgomp.c-c++-common/target_42.c,
which was one of the motivating examples.
Why doesn't it catch it? Is the variable created only post-IPA?
I
On 02/12/2021 16:05, Andrew Stubbs wrote:
On 02/12/2021 12:58, Jakub Jelinek wrote:
I've tried modifying offload_handle_link_vars but that spot doesn't catch
the omp_data_sizes variables emitted by libgomp.c-c++-common/target_42.c,
which was one of the motivating examples.
Why doe
On 02/12/2021 16:43, Jakub Jelinek wrote:
On Thu, Dec 02, 2021 at 04:31:36PM +, Andrew Stubbs wrote:
On 02/12/2021 16:05, Andrew Stubbs wrote:
On 02/12/2021 12:58, Jakub Jelinek wrote:
I've tried modifying offload_handle_link_vars but that spot doesn't catch
the omp_data_sizes
On 12/10/2022 15:29, Tobias Burnus wrote:
On 29.09.22 18:24, Andrew Stubbs wrote:
On 27/09/2022 14:16, Tobias Burnus wrote:
Andrew did suggest a while back to piggyback on the console_output handling,
avoiding another atomic access. - If this is still wanted, I'd like to have
some guidance
On 14/10/2022 08:07, Richard Biener wrote:
On Tue, 11 Oct 2022, Richard Sandiford wrote:
Richard Biener writes:
On Mon, 10 Oct 2022, Andrew Stubbs wrote:
On 10/10/2022 12:03, Richard Biener wrote:
The following picks up the prototype by Ju-Zhe Zhong for vectorizing
first order recurrences
This patch fixes a problem in which fatal errors inside mutex-locked
regions (i.e. basically anything in the plugin) will cause it to hang up
trying to take the lock to clean everything up.
Using abort() instead of exit(1) bypasses the atexit handlers and solves
the problem.
OK for mainline?
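The difference between the two calls can be checked in isolation. The sketch below is not the plugin's code: it forks a child that registers an atexit handler and then terminates either way, and the parent reports whether the handler ran. The names `cleanup_handler` and `atexit_handler_ran` are hypothetical, and POSIX fork/pipe are assumed to be available.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int cleanup_fd = -1;

/* Stand-in for a cleanup routine: it reports that it ran by writing one
   byte to the pipe.  */
static void
cleanup_handler (void)
{
  if (cleanup_fd >= 0)
    {
      ssize_t n = write (cleanup_fd, "c", 1);
      (void) n;
    }
}

/* Run a child that registers cleanup_handler and then terminates with
   abort () (USE_ABORT != 0) or exit (1).  Returns 1 if the handler ran,
   0 if it did not, and -1 on a setup error.  */
int
atexit_handler_ran (int use_abort)
{
  assert (use_abort == 0 || use_abort == 1);

  int fds[2];
  if (pipe (fds) != 0)
    return -1;

  pid_t pid = fork ();
  if (pid < 0)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  if (pid == 0)
    {
      /* Child: keep only the write end and register the handler.  */
      close (fds[0]);
      cleanup_fd = fds[1];
      if (atexit (cleanup_handler) != 0)
        _exit (2);
      if (use_abort)
        abort ();
      exit (1);
    }

  /* Parent: a byte arrives only if the handler ran before the child died.  */
  close (fds[1]);
  char c;
  ssize_t n = read (fds[0], &c, 1);
  close (fds[0]);

  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return n == 1;
}
```

The handler runs in the exit (1) case and is skipped in the abort () case, which is why a handler that needs a lock the failing code already holds is only a problem on the exit path.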
I've committed this patch to the devel/omp/gcc-12 branch. I will have to
fold it into my previous OpenMP memory management patch series when I
repost it.
The patch changes the internal memory allocation method such that memory
is allocated in the regular heap and then marked as "coarse-grained
I've committed this patch to the devel/omp/gcc-12 branch. I will have to
fold it into my previous OpenMP memory management patch series when I
repost it.
The GFX908 (MI100) devices only partially support the Unified Shared
Memory model that we have, and only then with additional kernel boot
p
I've committed this to the OG12 branch to remove some test failures. We
probably ought to have something on mainline also, but a proper fix
would be better.
Without this, the libgomp.oacc-c-c++-common/private-variables.c testcase
fails to compile due to an ICE. The OpenACC worker broadcasting
On 24/10/2022 19:06, Richard Biener wrote:
Am 24.10.2022 um 18:51 schrieb Andrew Stubbs :
I've committed this to the OG12 branch to remove some test failures. We
probably ought to have something on mainline also, but a proper fix would be
better.
Without this, the libgomp.oac
A function parameter was left over from a previous draft of my
multiple-vector-length patch. This patch silences the harmless warning.
Andrew
amdgcn: Silence unused parameter warning
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen):
Set base_type a
This patch adds patterns for the fmin and fmax operators, for scalars,
vectors, and vector reductions.
The compiler uses smin and smax for most floating-point optimizations,
etc., but not where the user calls fmin/fmax explicitly. On amdgcn the
hardware min/max instructions are already IEEE c
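To make the distinction concrete, here is a plain C comparison between a compare-and-select minimum and the NaN rule that the C library's fmin follows (if exactly one argument is NaN, the other is returned). Both function names are hypothetical, and the compare-and-select version is only one way a minimum without NaN guarantees can behave; it is not a statement about what smin expands to on any target.

```c
#include <assert.h>
#include <math.h>

/* Compare-and-select minimum: the result depends on operand order when
   one input is NaN, because any comparison with NaN is false.  */
double
select_min (double a, double b)
{
  return a < b ? a : b;
}

/* Minimum with the fmin NaN rule written out by hand: a single NaN input
   is ignored in favour of the other operand.  */
double
fmin_like (double a, double b)
{
  if (isnan (a))
    return b;
  if (isnan (b))
    return a;
  return a < b ? a : b;
}

/* Small self-check of the behaviour described above.  */
void
check_min_rules (void)
{
  assert (fmin_like (NAN, 1.0) == 1.0);
  assert (fmin_like (1.0, NAN) == 1.0);
  assert (select_min (NAN, 1.0) == 1.0);
  assert (isnan (select_min (1.0, NAN)));
}
```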
My recent patch to add additional vector lengths didn't address the
vector reductions yet.
This patch adds the missing support. Shorter vectors use fewer reduction
steps, and the means to extract the final value has been adjusted.
Lacking from this is any useful costs, so for loops the vect p
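The claim that shorter vectors need fewer reduction steps is easy to see in a scalar model of a pairwise (halving) reduction, which takes log2(N) steps for N lanes. This is a sketch under that assumption; the message does not say which reduction scheme the backend uses, and `tree_reduce_add` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Sum the first NLANES entries of V in place by repeatedly adding the
   upper half onto the lower half.  NLANES must be a power of two between
   1 and 64.  The total ends up in v[0]; the return value is the number of
   halving steps that were needed.  */
unsigned
tree_reduce_add (int64_t *v, unsigned nlanes)
{
  assert (nlanes >= 1 && nlanes <= 64 && (nlanes & (nlanes - 1)) == 0);

  unsigned steps = 0;
  for (unsigned half = nlanes / 2; half >= 1; half /= 2)
    {
      for (unsigned i = 0; i < half; i++)
        v[i] += v[i + half];
      steps++;
    }
  return steps;
}
```

In this model a 64-lane vector needs six steps and a 16-lane vector needs four.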
On 03/11/2022 17:47, Kwok Cheung Yeung wrote:
Hello
This patch fixes a bug introduced in a previous patch adding support for
generating native instructions for the exp2 and log2 patterns. The
problem is that the name of the instruction implementing the exp2
operation is v_exp (and not v_exp2)
On 26/08/2022 12:04, Jakub Jelinek wrote:
gcc/ChangeLog:
* doc/tm.texi: Regenerate.
* omp-simd-clone.cc (simd_clone_adjust_return_type): Allow zero
vecsize.
(simd_clone_adjust_argument_types): Likewise.
* target.def (compute_vecsize_and_simdlen): Document
On 09/08/2022 14:23, Andrew Stubbs wrote:
Enable and configure SIMD clones for amdgcn. This affects both the __simd__
function attribute, and the OpenMP "declare simd" directive.
Note that the masked SIMD variants are generated, but the middle end doesn't
actually support c
On 31/08/2022 09:29, Jakub Jelinek wrote:
On Tue, Aug 30, 2022 at 06:54:49PM +0200, Rainer Orth wrote:
--- a/gcc/omp-simd-clone.cc
+++ b/gcc/omp-simd-clone.cc
@@ -504,7 +504,10 @@ simd_clone_adjust_return_type (struct cgraph_node *node)
veclen = node->simdclone->vecsize_int;
else
On 08/09/2022 21:38, Kwok Cheung Yeung wrote:
Hello
This patch adds support for some additional floating-point operations,
in scalar and vector modes, which are natively supported by the AMD GCN
instruction set, but haven't been implemented in GCC yet. With the
exception of frexp, these imple
On 09/09/2022 13:20, Tobias Burnus wrote:
However, the pre-existing 'sqrt' problem still is real. It also applies
to reciprocal sqrt ("v_rsq"), but that's for whatever reason not used for GCN.
This patch now adds a command-line flag - off by default - to choose
whether this behavior is wanted. I d
On 06/04/2022 11:02, Thomas Schwinge wrote:
Hi!
On 2021-01-14T15:50:23+0100, I wrote:
I'm raising here an issue with HSA libgomp plugin code changes from a
while ago. While HSA is now no longer relevant for GCC master branch,
the same code has also been copied into the GCN libgomp plugin.
He
On 16/05/2022 11:28, Tobias Burnus wrote:
While 'vendor' and 'kind' are well defined, 'arch' and 'isa' aren't.
When looking at a 'metadirective' testcase (which oddly uses 'arch(amd)'),
I noticed that LLVM uses 'arch(amdgcn)' while we use 'gcn', cf. e.g.
'clang/lib/Headers/openmp_wrappers/math.h'
On 19/05/2022 17:00, Jakub Jelinek wrote:
Without requires dynamic_allocators, there are various extra restrictions
imposed:
1) omp_init_allocator/omp_destroy_allocator may not be called (except for
implicit calls to it from uses_allocators) in a target region
I interpreted that more like "
I've committed this patch to set the minimum required LLVM version, for
the assembler and linker, to 13.0.1. An upgrade from LLVM 9 is a
prerequisite for the gfx90a support, and 13.0.1 is now the oldest
version not known to have compatibility issues.
The patch removes all the obsolete feature
I've committed this patch to add support for gfx90a AMD GPU devices.
The patch updates all the places that have architecture/ISA specific
code, tidies up the ISA naming and handling in the backend, and adds a
new multilib.
This is just lightly tested at this point, but there are no known issu
On 04/11/2019 15:37, Jakub Jelinek wrote:
My preference would be that arch on amdgcn is something like amdgcn or gcn.
I hope the general distinction between arch and isa will be something that
will be discussed next Tuesday on the language committee, so hopefully we'll
know more afterwards and ca
shared with the GCN port, thus preventing much of
the duplication.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* configure.tgt (nvptx*-*-*): Add "accel" directory.
* config/nvptx/libgomp-plugin.c: Move ...
* config/accel/libgomp-plugin.c: ..
to forward-port. Otherwise that will have to wait for GCC 11.
Andrew
Andrew Stubbs (7):
Move generic libgomp files from nvptx to accel
GCN mkoffload
Add device number to GOMP_OFFLOAD_openacc_async_construct
GCN libgomp port
Optimize GCN OpenMP malloc performance
Use a single worker for Ope
This patch adds the mkoffload tool to the amdgcn backend. It's similar to,
but not quite the same as, that on the openacc-gcc-9-branch.
I will commit this patch when the others in this series are approved.
Andrew
2019-11-12 Andrew Stubbs
gcc/
* config/gcn/mkoffload.c: New
e the queue is intended, so this simply provides that
information to the queue constructor.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_construct): Add int
parameter.
* oacc-as
new target-specific
symbols to be added to libgomp. I couldn't find an existing way to do
this without adding a new top-level file also, so there's an empty
placeholder too. (The OG9 branch has this symbol in libgcc, but that
seems wrong.)
OK to commit?
Thanks
Andrew
2019-11-12
This patch contributes the GCN libgomp plugin, with the various
configure and make bits to go with it.
This implementation is a much-cleaned-up version of the one present on the
openacc-gcc-9-branch.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* plugin
l
search and replace.
Dummy pass-through definitions are provided for other targets.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* config/gcn/team.c (gomp_gcn_enter_kernel): Set up the team arena
and use team_malloc variants.
(gomp_gcn_exit_ke
This patch prevents the compiler using multiple workers in a gang. This
should be reverted when worker support is committed.
I will commit this with the rest of the series.
Andrew
2019-11-12 Andrew Stubbs
Julian Brown
gcc/
* config/gcn/gcn.c
On 12/11/2019 13:46, Jakub Jelinek wrote:
On Tue, Nov 12, 2019 at 01:29:13PM +, Andrew Stubbs wrote:
2019-11-12 Andrew Stubbs
include/
* gomp-constants.h (GOMP_DEVICE_GCN): Define.
(GOMP_VERSION_GCN): Define.
Perhaps this could be 0, but not a big deal.
OG9
On 12/11/2019 14:01, Jakub Jelinek wrote:
On Tue, Nov 12, 2019 at 01:29:16PM +, Andrew Stubbs wrote:
2019-11-12 Andrew Stubbs
libgomp/
* plugin/Makefrag.am: Add amdgcn plugin support.
* plugin/configfrag.ac: Likewise.
* plugin/plugin-gcn.c: New file
e that the loop would vectorize inline, but I don't
think it was doing so anyway. I need to look at that, but how is this,
for now?
Andrew
Optimize GCN OpenMP malloc performance
2019-11-12 Andrew Stubbs
libgomp/
* config/gcn/team.c (gomp_gcn_enter_kernel): Set up t
fload kernels will experience, and therefore make
standalone testing more meaningful.
Andrew
Move gcn-run heap into GPU memory.
2019-11-13 Andrew Stubbs
gcc/
* config/gcn/gcn-run.c (heap_region): New global variable.
(struct hsa_runtime_fn_info): Add hsa_memory_assign_age
These patches are now all committed. I've adjusted the changelogs to
list all the proper authors (apologies if I missed anyone).
Thank you for the quick reviews, Jakub. :-)
Andrew
On 12/11/2019 13:29, Andrew Stubbs wrote:
Hi all,
This patch series contributes initial OpenMP and Op
On 11/11/2020 11:16, Richard Sandiford wrote:
[Andrew: cc:ing you in case this affects/helps GCN.]
The vcond code requires the compared vectors and the selected
vectors to have both the same size and the same number of elements
as each other. But the operation makes logical sense even for
diffe
On 10/01/2020 14:21, Kwok Cheung Yeung wrote:
The patch for sub-word atomics support added an include of stdint.h for
the definition of uintptr_t, but this can result in GCC compilation
failing if the stdint.h header has not been installed (from newlib in
the case of AMD GCN).
I have fixed th
Ping.
On 22/11/2019 11:06, Andrew Stubbs wrote:
This test case causes an ICE (reformatted for email):
void test(int k)
{
unsigned int x = 1;
#pragma acc parallel loop async(x)
for (int i = 0; i < k; i++) { }
}
t.c: In function 'test':
t.c:4:9: e
On 19/12/2019 17:39, Richard Sandiford wrote:
Andrew Stubbs writes:
This patch changes the operand predicates such that vector constants are
permitted during compilation. This prevents ICEs caused by the compiler
trying to emit such instructions without checking.
That sounds like a target
y existing
code will use, if anything, so we ought to be compatible.
There's no official release using the "wrong" name, so I don't believe
we need to retain that name for any reason.
I've tested that there are no regressions.
Andrew
Rename acc_device_gcn to acc_devic
Hi Frederik,
On 20/01/2020 06:57, Harwath, Frederik wrote:
Hi,
this patch implements a runtime ISA check for amdgcn offloading.
The check verifies that the ISA of the GPU to which we try to offload matches
the ISA for which the code to be offloaded has been compiled. If it detects
a mismatch, it
On 20/01/2020 10:08, Jakub Jelinek wrote:
On Mon, Jan 20, 2020 at 10:00:09AM +, Andrew Stubbs wrote:
@@ -396,6 +396,88 @@ struct gcn_image_desc
struct global_var_info *global_variables;
};
+/* This enum mirrors the corresponding LLVM enum's values for all ISAs that we
+ su
On 20/01/2020 10:42, Jakub Jelinek wrote:
:(
Another option would be to build offloading code by GCN multiple times, once
for each incompatible ISA the user is asking for, so that one can have then
binaries that will work on different hw.
Because e.g. with the distro vendor hat, it is hard to gue
On 20/01/2020 11:07, Jakub Jelinek wrote:
On Mon, Jan 20, 2020 at 11:00:58AM +, Andrew Stubbs wrote:
Indeed, fat binaries would be a good solution.
Presumably it's possible, but I'm not sure how we'd go about getting the
offload mechanism to launch the backend multiple ti
tests for amdgcn
2020-01-20 Andrew Stubbs
libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Skip test on gcn.
* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c (main):
Adjust test dimensions for amdgcn.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c (main): A
On 20/01/2020 16:42, Harwath, Frederik wrote:
Hi Andrew,
Thanks for the review! I have attached a revised patch containing the changes
that you suggested.
On 20.01.20 11:00, Andrew Stubbs wrote:
On 20/01/2020 06:57, Harwath, Frederik wrote:
Is it ok to commit this patch to the master branch
theory it could attempt to read any unhandled argument as the
thread limit.
Andrew
Fix libgomp plugin-gcn bug
2020-01-23 Andrew Stubbs
libgomp/
* plugin/plugin-gcn.c (parse_target_attributes): Use correct mask for
the device id.
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plu
ld have been rejected, but the predicates were too
loose.
Andrew
Fix ICE on unsupported FP comparison
2020-01-24 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (vec_cmpdi): Use
gcn_fp_compare_operator.
(vec_cmpudi): Use gcn_compare_operator.
(vec_cmpv64qidi): Use gcn_compare_op
On 24/01/2020 14:59, Tobias Burnus wrote:
As reported in PR93409, the build of libgomp/plugin/plugin-gcn.c fails
with a bunch of error messages when building with
--with-multilib-list=m32,m64,mx32
The reason is that the GCN plugin assumes 64bit pointers. As with HSA,
the build is only enabled
On 28/01/2020 14:55, Harwath, Frederik wrote:
Hi,
this patch adds full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin. This replaces
the existing stub in libgomp/plugin-gcn.c.
Andrew: The value returned for acc_property_memory ("size
On 24/01/2020 14:58, Andrew Stubbs wrote:
I've committed this patch to fix an ICE building the
gcc.dg/vect/fast-math-pr55281.c testcase.
Oops, I got that crossed. This was the fix for gcc.dg/pr50310-2.c. The
fast-math-pr55281.c fix will be posted shortly.
The problem was that the co
tand why I only see this issue on amdgcn, but it
might be because the pointer in question is in a MASK_LOAD which is
perhaps not that commonly used?
I've tested this on amdgcn, and done a full bootstrap and test on x86_64
also.
OK to commit?
Thanks
Andrew
Fix fast-math-pr55281.c ICE.
On 29/01/2020 09:52, Harwath, Frederik wrote:
@@ -1513,6 +1518,23 @@ init_hsa_context (void)
GOMP_PLUGIN_error ("Failed to list all HSA runtime agents");
}
+ uint16_t minor, major;
+ status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MINOR,
&minor);
+ if (status
On 29/01/2020 12:30, Thomas Schwinge wrote:
Hi Andrew!
On 2019-11-22T11:06:14+, Andrew Stubbs wrote:
This test case causes an ICE (reformatted for email):
void test(int k)
{
unsigned int x = 1;
#pragma acc parallel loop async(x)
for (int i = 0; i < k
On 29/01/2020 12:53, Tobias Burnus wrote:
Cf. PR93409 comments 4 and later. The comments 1–3 of the PR are covered
by patch https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01663.html (skip
building libgomp's HSA/GCN plugin with -mx32).
For AMDGCN, the LLVM assembler is used. While for LLVM 7+8,
On 29/01/2020 15:40, Tobias Burnus wrote:
Hi Andrew,
On 1/29/20 2:01 PM, Andrew Stubbs wrote:
On 29/01/2020 12:53, Tobias Burnus wrote:
With LLVM 9, the old variant is only accepted when also passing
"-mattr=-code-object-v3" to the compiler; that's a "-" after the "
On 29/01/2020 17:44, Thomas Schwinge wrote:
@@ -1513,6 +1518,23 @@ init_hsa_context (void)
+ size_t len = sizeof hsa_context.driver_version_s;
+ int printed = snprintf (hsa_context.driver_version_s, len,
+ "HSA Runtime %hu.%hu", (unsigned short int)major,
+
On 30/01/2020 09:20, Jakub Jelinek wrote:
On Fri, Jan 24, 2020 at 03:59:28PM +0100, Tobias Burnus wrote:
As reported in PR93409, the build of libgomp/plugin/plugin-gcn.c fails with
a bunch of error messages when building with
--with-multilib-list=m32,m64,mx32
The reason is that the GCN plugin a
On 29/01/2020 08:24, Richard Biener wrote:
On Tue, Jan 28, 2020 at 5:53 PM Andrew Stubbs wrote:
This patch fixes an ICE compiling fast-math-pr55281.c for amdgcn.
The problem is that an "iv" is created in which both base and step are
pointer types,
How did you get a POINTER
On 30/01/2020 13:49, Richard Biener wrote:
On Thu, Jan 30, 2020 at 2:04 PM Bin.Cheng wrote:
On Thu, Jan 30, 2020 at 8:53 PM Andrew Stubbs wrote:
On 29/01/2020 08:24, Richard Biener wrote:
On Tue, Jan 28, 2020 at 5:53 PM Andrew Stubbs wrote:
This patch fixes an ICE compiling fast-math
On 30/01/2020 16:08, Thomas Schwinge wrote:
Hi!
Andrew and Frederik, thanks for your emails reminding/educating me about
'snprintf' as well as this HSA fixed-size buffer API. There doesn't
happen to be something available in the HSA API so that we
could use 'sizeof [something]' instea
fixes an ICE in testcase
gcc.dg/pr81228.c.
Andrew
Add LTGT operator support for amdgcn
Fixes ICE in testcase gcc.dg/pr81228.c
2020-01-30 Andrew Stubbs
gcc/
* config/gcn/gcn.c (print_operand): Handle LTGT.
* config/gcn/predicates.md (gcn_fp_compare_operator): Allow ltgt.
diff --git a/gcc/con
e PR, but I've not verified whether "that covers *all* relevant code
paths". This should then be backported to all GCC release branches; I
can easily test the backports for you, if you're not already set up to do
such testing.
How's this?
Andrew
Normalize GOACC_parallel_k
ailure in testcase gfortran.dg/assumed_rank_1.f90.
2020-01-30 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (gather_exec): Move contents ...
(mask_gather_load): ... here, and zero-initialize the
destination.
(maskloaddi): Zero-initialize the destination.
* config/gcn/gcn.c:
diff --git a/gc
On 31/01/2020 13:56, Kwok Cheung Yeung wrote:
The GCN architecture has 4 SIMD units per compute unit, with 256 VGPRs
per SIMD unit. OpenMP threads or OpenACC workers must be distributed
across the SIMD units, with each thread/worker fitting entirely within a
single SIMD unit. VGPRs are shared b
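The arithmetic implied by these figures can be sketched as follows. This assumes the 256 VGPRs of a SIMD unit are divided evenly among the threads placed on it, and it ignores allocation granularity and every other occupancy limit; those are simplifying assumptions, and `max_threads_per_cu` is a hypothetical name rather than anything in the patch.

```c
#include <assert.h>

#define SIMD_UNITS_PER_CU 4u   /* from the message above */
#define VGPRS_PER_SIMD 256u    /* from the message above */

/* Upper bound on the threads/workers one compute unit can hold when each
   needs VGPRS_PER_THREAD registers and must fit within one SIMD unit.  */
unsigned
max_threads_per_cu (unsigned vgprs_per_thread)
{
  assert (vgprs_per_thread >= 1 && vgprs_per_thread <= VGPRS_PER_SIMD);
  return SIMD_UNITS_PER_CU * (VGPRS_PER_SIMD / vgprs_per_thread);
}
```

Under these assumptions a kernel using 64 VGPRs per thread could place 16 threads on a compute unit, while one using all 256 could place only 4.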
On 31/01/2020 08:09, Richard Biener wrote:
On Thu, Jan 30, 2020 at 3:09 PM Andrew Stubbs wrote:
How about this?
I've only tested it on the one testcase, so far, but it works for that.
OK to commit (following a full test)?
OK.
X86_64 bootstrap and test showed no issues. Nor amdgcn