On Wed, Nov 02, 2016 at 12:34:47PM -0700, Cesar Philippidis wrote:
> @@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs,
> void **devaddrs,
>
>    if (seen_zero)
>      {
> +      /* See if the user provided GOMP_OPENACC_DIM environment
> +         variable to specify runtime defaults. */
> +      static int default_dims[GOMP_DIM_MAX];
> +
> +      if (!default_dims[0])
> +        {

Is this guarded by some lock, or is it just racy if multiple nvptx_execs
are done at the same time?

> +          /* We only read the environment variable once.  You can't
> +             change it in the middle of execution.  The sytntax is

Typo: syntax

> +             the same as for the -fopenacc-dim compilation option.  */
> +          const char *env_var = getenv ("GOMP_OPENACC_DIM");
> +
> +          if (CUDA_SUCCESS == cuDeviceGetAttribute
> +              (&block_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev)
> +              && CUDA_SUCCESS == cuDeviceGetAttribute
> +              (&warp_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev)
> +              && CUDA_SUCCESS == cuDeviceGetAttribute
> +              (&dev_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev)
> +              && CUDA_SUCCESS == cuDeviceGetAttribute
> +              (&cpu_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR,
> dev))

The formatting is wrong.
1) the call should be on the lhs of ==, not the rhs
2) ( should be after cuDeviceGetAttribute, not on the next line
3) even then the lines are too long.

          if (cuDeviceGetAttribute (&block_size,
                                    CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
                                    dev) == CUDA_SUCCESS
              && cuDeviceGetAttribute (...

CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR is still way too long;
perhaps initialize a temporary const var to that, or use some macro like
DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR), where

#define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x

Otherwise LGTM.

        Jakub
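
For illustration only, here is a minimal sketch of how the DEV_ATTR
suggestion could look once the three formatting points are applied.  It is
not the patch itself: the enum values, the CUresult/CUdevice typedefs and
stub_get_attribute are hypothetical stand-ins for the CUDA driver API (so
the example builds without cuda.h), and query_device_limits is an invented
helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the CUDA driver API so the sketch compiles without
   cuda.h.  The numeric values are arbitrary, not the real ones.  */
typedef enum
{
  CUDA_SUCCESS = 0,
  CUDA_ERROR_INVALID_VALUE = 1
} CUresult;

typedef enum
{
  CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK = 100,
  CU_DEVICE_ATTRIBUTE_WARP_SIZE,
  CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT,
  CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR
} CUdevice_attribute;

typedef int CUdevice;

/* Hypothetical stub standing in for cuDeviceGetAttribute; it returns
   made-up numbers instead of querying a device.  */
static CUresult
stub_get_attribute (int *pi, CUdevice_attribute attrib, CUdevice dev)
{
  (void) dev;
  assert (pi != NULL);

  switch (attrib)
    {
    case CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK:
      *pi = 1024;
      return CUDA_SUCCESS;
    case CU_DEVICE_ATTRIBUTE_WARP_SIZE:
      *pi = 32;
      return CUDA_SUCCESS;
    case CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT:
      *pi = 16;
      return CUDA_SUCCESS;
    case CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR:
      *pi = 2048;
      return CUDA_SUCCESS;
    default:
      return CUDA_ERROR_INVALID_VALUE;
    }
}

/* Token pasting shortens the attribute names at the call sites.  */
#define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x

/* Return 1 if all four queries succeed, 0 otherwise.  The call is on
   the lhs of ==, and ( follows the function name.  */
static int
query_device_limits (CUdevice dev, int *block_size, int *warp_size,
                     int *dev_size, int *cpu_size)
{
  return (stub_get_attribute (block_size, DEV_ATTR (MAX_THREADS_PER_BLOCK),
                              dev) == CUDA_SUCCESS
          && stub_get_attribute (warp_size, DEV_ATTR (WARP_SIZE),
                                 dev) == CUDA_SUCCESS
          && stub_get_attribute (dev_size, DEV_ATTR (MULTIPROCESSOR_COUNT),
                                 dev) == CUDA_SUCCESS
          && stub_get_attribute (cpu_size,
                                 DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR),
                                 dev) == CUDA_SUCCESS);
}
```

With the macro, every call fits within 80 columns.  The sketch does not
address the locking question above; that depends on how the plugin
serializes calls to nvptx_exec.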