https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120216
--- Comment #4 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
I ran a benchmark with my new card here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120679

On many Nvidia graphics cards attached via a PCI bus, unified shared memory in the form of HMM is currently not very suitable because of its low speed. Perhaps this is an issue with Nvidia's CUDA implementation? So the recommendation, for now, is to work with mapping macros.
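
To illustrate the recommendation, here is a minimal sketch of explicit mapping. It assumes that "mapping macros" refers to OpenMP `map` clauses on a `target` construct and that the code is built with offloading enabled (for example `-fopenmp` plus an nvptx offload target). The function name `saxpy_mapped` and the SAXPY kernel are hypothetical and not taken from the benchmark in the linked report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: y[i] = a * x[i] + y[i], offloaded with explicit
// map clauses instead of relying on unified shared memory.
void saxpy_mapped(float a, const std::vector<float>& x, std::vector<float>& y)
{
    assert(x.size() == y.size());
    const std::size_t n = x.size();

    // Raw pointers are used so that array sections can be named in map().
    const float* xp = x.data();
    float* yp = y.data();

    // map(to: ...)     copies x to the device before the region runs.
    // map(tofrom: ...) copies y to the device and back to the host afterwards.
    // Without OpenMP enabled the pragma is ignored and the loop runs on the host.
    #pragma omp target teams distribute parallel for \
        map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (std::size_t i = 0; i < n; ++i)
        yp[i] = a * xp[i] + yp[i];
}
```

Calling `saxpy_mapped(2.0f, x, y)` transfers each array once per region. A unified-shared-memory variant would instead declare `#pragma omp requires unified_shared_memory` at file scope and drop the `map` clauses, leaving page migration to the driver, which is the path the comment reports as slow on these cards.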