On 02/06/2025 15:40, Tobias Burnus wrote:
Hi Andrew,
Andrew Stubbs wrote:
The hsa_memory_copy API is known to be slow, so for smaller data sizes
it's probably better to have one hsa_memory_copy replace the whole
memset than use three API calls, even with setting up some host-side
memory to co
On 30/05/2025 23:36, Tobias Burnus wrote:
The attached patch adds omp_target_memset and omp_target_memset_async,
which set (potentially large) data on the device to a
certain value - in particular to '\0'.
It uses 'memset' on the host (and for shared memory, e.g. via
requires unified_shared_m
On 5/30/25 16:36, Tobias Burnus wrote:
The attached patch adds omp_target_memset and omp_target_memset_async,
which set (potentially large) data on the device to a
certain value - in particular to '\0'.
It uses 'memset' on the host (and for shared memory, e.g. via
requires unified_shared_memo
The attached patch adds omp_target_memset and omp_target_memset_async,
which set (potentially large) data on the device to a
certain value - in particular to '\0'.
It uses 'memset' on the host (and for shared memory, e.g. via
requires unified_shared_memory/self_maps). For nvptx, cuMemsetD8
is