https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121813
Bug ID: 121813
Summary: [OpenMP] omp_target_is_accessible too simplistic -
false positive and false negative
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: openmp
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
Target Milestone: ---
The current code assumes that memory is accessible on a device if and only if
the device has GOMP_OFFLOAD_CAP_SHARED_MEM set.
However, that's not universally true. I think we have effectively required a
unified address space for it (currently the case with GCC), although that is
not strictly required. Counterexamples:
* Managed/pinned memory might be accessible even though no USM support exists.
* Memory allocated on one device might not be accessible on another one,
even though USM support is enabled.
NOTE: It is permitted to report 'not accessible' for memory that is accessible
if accessibility cannot be determined, but always returning 'false' is kind
of lame. However, a false 'true' should be avoided.
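One way to honour both constraints is to let the per-device query report three
states and to collapse 'unknown' to 'not accessible'. The following is only a
minimal sketch: enum access_state, is_accessible_hook_t, and
conservative_is_accessible are hypothetical names for illustration, not
existing libgomp interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tri-state result of a per-device query.  */
enum access_state
{
  ACCESS_NO,      /* Known to be inaccessible.  */
  ACCESS_YES,     /* Known to be accessible.  */
  ACCESS_UNKNOWN  /* The plugin could not determine it.  */
};

/* Hypothetical plugin hook type; NULL means the plugin offers no query.  */
typedef enum access_state (*is_accessible_hook_t) (const void *ptr,
                                                   size_t size);

/* Return true only for a definite ACCESS_YES, so a wrong 'true' cannot
   occur; 'unknown' and 'no hook' both collapse to false.  */
static bool
conservative_is_accessible (is_accessible_hook_t hook, const void *ptr,
                            size_t size)
{
  if (hook == NULL)
    return false;
  return hook (ptr, size) == ACCESS_YES;
}

/* Stand-in hooks for the usage example below.  */
static enum access_state
hook_unknown (const void *ptr, size_t size)
{
  (void) ptr;
  (void) size;
  return ACCESS_UNKNOWN;
}

static enum access_state
hook_yes (const void *ptr, size_t size)
{
  (void) ptr;
  (void) size;
  return ACCESS_YES;
}

/* Usage example: only a definite 'yes' is reported as accessible.  */
static void
example (void)
{
  int x = 0;
  assert (!conservative_is_accessible (NULL, &x, sizeof x));
  assert (!conservative_is_accessible (hook_unknown, &x, sizeof x));
  assert (conservative_is_accessible (hook_yes, &x, sizeof x));
}
```

With this shape, a plugin that cannot answer simply returns ACCESS_UNKNOWN and
the caller never has to guess.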
NOTE: Later versions of OpenMP permit any pointer, i.e. also memory that was
allocated on one device and is accessed on the host or on another device
(Issue #4569),
before
int
omp_target_is_accessible (const void *ptr, size_t size, int device_num)
{
  if (device_num == omp_initial_device
      || device_num == gomp_get_num_devices ())
    return true;

  struct gomp_device_descr *devicep = resolve_device (device_num, false);
  if (devicep == NULL)
    return false;

  /* TODO: Unified shared memory must be handled when available.  */
  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;
}
EXPECTED:
* An nvptx + gcn plugin function that can be called for checking.
→ cudaPointerGetAttributes
"Returns in *attributes the attributes of the pointer ptr. If pointer was
not allocated in, mapped by or registered with context supporting unified
addressing cudaErrorInvalidValue is returned."
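As a rough illustration of how such a plugin function might interpret the
result for one device, consider the sketch below. To stay self-contained it
uses a stand-in struct whose fields mirror 'type', 'device', 'devicePointer',
and 'hostPointer' of cudaPointerAttributes; the struct, the enum, and
device_can_access are hypothetical, and a real plugin would fill the fields
from the CUDA query. The sketch assumes that managed memory is usable from
every device and deliberately treats peer access between GPUs and pageable
host memory as 'not accessible', which errs on the side of a false negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for enum cudaMemoryType (hypothetical, for illustration).  */
enum mem_type { MEM_UNREGISTERED, MEM_HOST, MEM_DEVICE, MEM_MANAGED };

/* Stand-in for the fields of struct cudaPointerAttributes used here.  */
struct ptr_attrs
{
  enum mem_type type;
  int device;            /* Device the memory belongs to.  */
  void *device_pointer;  /* Device-side address, or NULL.  */
  void *host_pointer;    /* Host-side address, or NULL.  */
};

/* Decide whether DEVICE may access the memory described by ATTRS.  */
static bool
device_can_access (const struct ptr_attrs *attrs, int device)
{
  assert (attrs != NULL);
  switch (attrs->type)
    {
    case MEM_MANAGED:
      /* Assumption: managed memory is reachable from every device.  */
      return true;
    case MEM_HOST:
      /* Pinned/registered host memory: usable if it has a device address.  */
      return attrs->device_pointer != NULL;
    case MEM_DEVICE:
      /* Device memory: only the owning device (peer access is ignored).  */
      return attrs->device == device;
    case MEM_UNREGISTERED:
    default:
      /* Unknown to the runtime: conservatively not accessible.  */
      return false;
    }
}
```

This covers both cases from the list above: managed or pinned memory is
reported as accessible without USM, and memory owned by one device is not
reported as accessible on another.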
For the host, we may need to iterate through all devices to check whether,
e.g., cudaPointerGetAttributes returns attributes for the pointer but with the
'hostPointer' struct member == NULL (or hostPointer != original pointer),
which would suggest that the memory is not host accessible.
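The host-side check could be sketched roughly as follows. The struct
host_view and the function host_can_access are hypothetical; each array
element stands for what one device's runtime reported for the pointer. The
sketch assumes that a pointer which no device recognises is ordinary host
memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device query result.  */
struct host_view
{
  bool known;          /* The device runtime recognised the pointer.  */
  void *host_pointer;  /* Host address reported for it, or NULL.  */
};

/* Return false if any device knows PTR but reports no host address or a
   host address that differs from PTR; otherwise assume host memory.  */
static bool
host_can_access (const void *ptr, const struct host_view *views,
                 size_t num_devices)
{
  assert (views != NULL || num_devices == 0);
  for (size_t i = 0; i < num_devices; i++)
    if (views[i].known
        && (views[i].host_pointer == NULL || views[i].host_pointer != ptr))
      return false;
  return true;
}
```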
For AMD, it seems to be more complex unless HIP is used (→
hipPointerGetAttributes), but I might have missed some feature in hsa.h or
hsa_ext_amd.h.
It seems as if hsa_region_get_info with HSA_AMD_REGION_INFO_HOST_ACCESSIBLE
might provide the host-access data. For the other direction (device access),
I wonder whether HSA_REGION_INFO_SEGMENT + HSA_REGION_SEGMENT_GLOBAL will
work, but this seems to assume that all memory falls into a GPU category. My
feeling is that it will fail when passing an nvptx GPU address there, but I
might be wrong.