On Wed, Nov 02, 2016 at 12:34:47PM -0700, Cesar Philippidis wrote:
> @@ -932,9 +933,84 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
>  
>    if (seen_zero)
>      {
> +      /* See if the user provided GOMP_OPENACC_DIM environment
> +      variable to specify runtime defaults. */
> +      static int default_dims[GOMP_DIM_MAX];
> +
> +      if (!default_dims[0])
> +     {

Is this guarded by some lock, or is it just racy if multiple
nvptx_execs are done at the same time?
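
To illustrate the concern, here is a minimal sketch of one way to guard
the one-time initialization with a lock.  It is not taken from the patch:
DIM_MAX, default_dims_lock, default_dims_ready, env_reads and
get_default_dims are hypothetical names, and the actual parsing of the
variable is left out.

```c
#include <pthread.h>
#include <stdlib.h>

#define DIM_MAX 3   /* Hypothetical stand-in for GOMP_DIM_MAX.  */

static int default_dims[DIM_MAX];
static int default_dims_ready;  /* Set once the defaults are filled in.  */
static int env_reads;           /* Only here to show getenv runs once.  */
static pthread_mutex_t default_dims_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the cached defaults, reading the environment at most once
   even if several threads get here at the same time.  */
static const int *
get_default_dims (void)
{
  pthread_mutex_lock (&default_dims_lock);
  if (!default_dims_ready)
    {
      const char *env_var = getenv ("GOMP_OPENACC_DIM");
      env_reads++;

      /* Assumption: parsing of env_var is omitted in this sketch, so
         every dimension is simply left at zero.  */
      (void) env_var;
      for (int i = 0; i < DIM_MAX; i++)
        default_dims[i] = 0;

      default_dims_ready = 1;
    }
  pthread_mutex_unlock (&default_dims_lock);
  return default_dims;
}
```

The separate ready flag avoids relying on default_dims[0] being nonzero
as the "already initialized" marker.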

> +       /* We only read the environment variable once.  You can't
> +          change it in the middle of execution.  The sytntax  is

syntax

> +          the same as for the -fopenacc-dim compilation option.  */
> +       const char *env_var = getenv ("GOMP_OPENACC_DIM");

> +
> +       if (CUDA_SUCCESS == cuDeviceGetAttribute
> +           (&block_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev)
> +           && CUDA_SUCCESS == cuDeviceGetAttribute
> +           (&warp_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev)
> +           && CUDA_SUCCESS == cuDeviceGetAttribute
> +           (&dev_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev)
> +           && CUDA_SUCCESS == cuDeviceGetAttribute
> +           (&cpu_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, dev))

The formatting is wrong.  1) the call should be on the lhs of ==, not the
rhs 2) ( should be on the same line as cuDeviceGetAttribute, not on the
next line 3) the lines are still too long.

          if (cuDeviceGetAttribute (&block_size,
                                    CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
                                    dev) == CUDA_SUCCESS
              && cuDeviceGetAttribute (...

CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR
is still way too long, perhaps initialize a temporary const var
to that, or use some macro like
DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR)
where
#define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x
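
A rough sketch of how the macro might look in use, so that it compiles
without <cuda.h>: attr_t, its enumerator values, get_attr_stub and
query_sizes are hypothetical stand-ins, not the CUDA driver API; only the
DEV_ATTR definition and the attribute names come from the discussion
above.

```c
/* Stand-ins so the sketch builds without <cuda.h>; the enumerator
   values are arbitrary, not the real CUDA ones.  */
typedef enum
{
  CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK = 1,
  CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR = 2
} attr_t;

enum { CUDA_SUCCESS = 0 };

/* Hypothetical replacement for cuDeviceGetAttribute.  */
static int
get_attr_stub (int *out, attr_t attr, int dev)
{
  (void) dev;
  *out = (int) attr * 100;
  return CUDA_SUCCESS;
}

/* Token pasting shortens the attribute names at the call sites.  */
#define DEV_ATTR(x) CU_DEVICE_ATTRIBUTE_##x

/* Return 1 if both queries succeed, 0 otherwise.  */
static int
query_sizes (int dev, int *block_size, int *cpu_size)
{
  if (get_attr_stub (block_size, DEV_ATTR (MAX_THREADS_PER_BLOCK),
                     dev) == CUDA_SUCCESS
      && get_attr_stub (cpu_size, DEV_ATTR (MAX_THREADS_PER_MULTIPROCESSOR),
                        dev) == CUDA_SUCCESS)
    return 1;
  return 0;
}
```

With the macro, each call fits in 80 columns with the call on the lhs
of == and the ( on the same line as the function name.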

Otherwise LGTM.

        Jakub
