https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120216
Bug ID: 120216
Summary: openmp unified shared memory currently requires pageableMemoryAccess; perhaps managedMemory would suffice
Product: gcc
Version: 15.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: schulz.benjamin at googlemail dot com
Target Milestone: ---

Hi there,

as per the GCC 15.1 documentation,
https://gcc.gnu.org/onlinedocs/gcc-15.1.0/libgomp/nvptx.html:

"OpenMP code that has a requires directive with self_maps or unified_shared_memory runs on nvptx devices if and only if all of those support the pageableMemoryAccess property; otherwise, all nvptx devices are removed from the list of available devices (“host fallback”)."

However, there are devices, like the Nvidia GTX 1660 Super, that have CUDA compute capability 7.5 and report the CUDA device attributes concurrentManagedAccess and managedMemory, but not pageableMemoryAccess. For that case, the Nvidia documentation,
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-cc60, says:

"For devices with compute capability 6.x or higher but without pageable memory access, CUDA Managed Memory is fully supported and coherent. The programming model and performance tuning of unified memory is largely similar to the model as described in Unified memory on devices with full CUDA Unified Memory support, with the notable exception that system allocators cannot be used to allocate memory. Thus, the following list of sub-sections do not apply:
  - System-Allocated Memory: in-depth examples
  - Hardware/Software Coherency"

So, is pageableMemoryAccess really needed for the OpenMP "requires unified_shared_memory" directive? If I understand the Nvidia documentation correctly, when managedMemory and concurrentManagedAccess are present, the compiler could check whether a pointer is needed on the device, replace malloc with cudaMallocManaged, and thereby obtain a pointer that is shared between host and device.
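
To illustrate the device-capability distinction I am talking about, here is a minimal sketch (file name and output format are my own, not from any GCC or Nvidia source) that queries the three CUDA device attributes with the CUDA runtime API. On my GTX 1660 Super under Linux I would expect managedMemory and concurrentManagedAccess to be 1 while pageableMemoryAccess is 0.

// attr_check.cpp -- sketch: query the CUDA attributes that separate "full"
// unified memory (pageableMemoryAccess) from plain managed memory
// (managedMemory + concurrentManagedAccess).
// Compile e.g. with: nvcc attr_check.cpp -o attr_check
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int ndev = 0;
    if (cudaGetDeviceCount(&ndev) != cudaSuccess || ndev == 0) {
        std::printf("no CUDA devices found\n");
        return 1;
    }
    for (int dev = 0; dev < ndev; ++dev) {
        int pageable = 0, managed = 0, concurrent = 0;
        cudaDeviceGetAttribute(&pageable, cudaDevAttrPageableMemoryAccess, dev);
        cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, dev);
        cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, dev);
        std::printf("device %d: pageableMemoryAccess=%d managedMemory=%d "
                    "concurrentManagedAccess=%d\n",
                    dev, pageable, managed, concurrent);
    }
    return 0;
}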
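
For reference, a minimal reproducer sketch of the situation described above (array size and file name are my own choice): a plain host allocation is used directly in a target region under "requires unified_shared_memory". On a device without pageableMemoryAccess, libgomp currently drops the nvptx device, so omp_get_num_devices() reports 0 and the loop runs on the host.

// usm_repro.cpp -- sketch: host malloc used inside a target region under
// unified_shared_memory; currently triggers host fallback on devices
// without pageableMemoryAccess.
// Compile e.g. with: g++ -fopenmp -foffload=nvptx-none usm_repro.cpp
#include <cstdio>
#include <cstdlib>
#include <omp.h>

#pragma omp requires unified_shared_memory

int main() {
    const int n = 1000;
    // Plain malloc: usable on the device only if unified shared memory
    // (or a managed-memory substitute, as suggested above) is available.
    double *a = static_cast<double *>(std::malloc(n * sizeof(double)));

    #pragma omp target teams distribute parallel for
    for (int i = 0; i < n; ++i)
        a[i] = 2.0 * i;

    std::printf("omp_get_num_devices() = %d, a[10] = %g\n",
                omp_get_num_devices(), a[10]);
    std::free(a);
    return 0;
}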
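
And this is, roughly, the substitution I have in mind, written as plain CUDA just to show the idea (kernel, file name and sizes are only illustrative; whether libgomp could do such a replacement transparently is exactly the question): on a managedMemory/concurrentManagedAccess device, allocating through cudaMallocManaged instead of malloc already yields a single pointer that both host and device code can dereference, even without pageableMemoryAccess.

// managed_alloc.cu -- sketch: cudaMallocManaged in place of malloc gives a
// pointer valid on both host and device (compute capability >= 6.x).
// Compile e.g. with: nvcc managed_alloc.cu -o managed_alloc
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(double *a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= 2.0;
}

int main() {
    const int n = 1000;
    double *a = nullptr;
    // The managed allocation replaces what would otherwise be a plain malloc.
    if (cudaMallocManaged(&a, n * sizeof(double)) != cudaSuccess) {
        std::printf("cudaMallocManaged failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) a[i] = i;     // host writes
    scale<<<(n + 255) / 256, 256>>>(a, n);    // device uses the same pointer
    cudaDeviceSynchronize();
    std::printf("a[10] = %g\n", a[10]);       // host reads the result back
    cudaFree(a);
    return 0;
}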