On 08/13/2018 08:08 AM, Tom de Vries wrote:
> Do you have a branch available on github containing the patch series
> you've submitted?

Yes, https://g...
On 08/13/2018 04:54 PM, Cesar Philippidis wrote:
> Going forward, how would you like to proceed with the nvptx BE vector
> length changes?
Do you have a branch available on github containing the patch series
you've submitted?
Thanks,
- Tom
On 08/13/2018 05:04 AM, Tom de Vries wrote:
> On 08/10/2018 08:39 PM, Cesar Philippidis wrote:
>> is that I modified the default value for vectors as follows
>>
>> +  int vectors = default_dim_p[GOMP_DIM_VECTOR]
>> +    ? 0 : dims[GOMP_DIM_VECTOR];
>>
>> Technically, trunk only supports

[...] vector, otherwise the code will become too convoluted.

Btw, I've also noticed that we don't handle a too-high
GOMP_OPENACC_DIM[GOMP_DIM_WORKER]; I've added a TODO comment for this.

> If you want, I can resubmit a patch without that change though.
> 0001-nvptx-Use-CUDA-dr...
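For context, here is a minimal standalone sketch of the defaulting scheme
in the snippet quoted above. Only the ternary and the GOMP_DIM_* indices
come from the patch; select_vectors is a hypothetical helper, and the enum
values follow libgomp's gomp-constants.h. Passing 0 for a defaulted
dimension leaves the runtime free to choose a vector length later, while
an explicit user request is passed through.

  /* Sketch of the defaulting scheme in the snippet above; the enum
     values match libgomp's gomp-constants.h, select_vectors is a
     hypothetical helper.  */
  #include <stdbool.h>

  enum { GOMP_DIM_GANG, GOMP_DIM_WORKER, GOMP_DIM_VECTOR, GOMP_DIM_MAX };

  static int
  select_vectors (const bool default_dim_p[GOMP_DIM_MAX],
                  const int dims[GOMP_DIM_MAX])
  {
    /* 0 means "no explicit request": the runtime remains free to pick
       a vector length later.  An explicit request is passed through.  */
    return default_dim_p[GOMP_DIM_VECTOR] ? 0 : dims[GOMP_DIM_VECTOR];
  }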
On 08/08/2018 08:19 AM, Tom de Vries wrote:
> On Wed, Aug 08, 2018 at 07:09:16AM -0700, Cesar Philippidis wrote:
>> On 08/07/2018 06:52 AM, Cesar Philippidis wrote:
Thanks for the review. This version should address all of the following
remarks. However, one thing to note ...
>> [nvptx] Use CUDA driv...
On 08/07/2018 06:52 AM, Cesar Philippidis wrote:
> I attached an updated version of the CUDA driver patch, although I
> haven't rebased it against your changes yet. It still needs to be tested
> against CUDA 5.5 using the system's/Nvidia's cuda.h. But I wanted to give
> you an update.
>
> Does this patch look OK, at least after testing completes? I removed the
> tests for CUDA_ONE_CALL_MAYBE_NULL, because the newer CUDA API isn't
> supported in the o...
On 08/01/2018 12:18 PM, Tom de Vries wrote:
> I think we need to add and handle:
> ...
> CUDA_ONE_CALL_MAYBE_NULL (cuOccupancyMaxPotentialBlockSize)
> ...
>
I realized that the patch I posted introducing CUDA_ONE_CALL_MAYBE_NULL
was incomplete, and needed to use the weak attribute in case of l...
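To illustrate the weak-attribute idea mentioned above, here is a sketch,
not the actual libgomp patch, under the assumption that the cuda.h being
built against (e.g. CUDA 5.5's) does not declare the occupancy API itself;
the declaration mirrors the CUDA 6.5+ driver API.

  /* Sketch only, not the actual libgomp patch.  Assumes a cuda.h old
     enough (e.g. CUDA 5.5) not to declare the occupancy API, so we
     supply the CUDA 6.5+ declaration ourselves and mark it weak; the
     plugin then links even against a libcuda lacking the function and
     can test for it at run time.  */
  #include <stddef.h>
  #include <stdbool.h>
  #include <cuda.h>

  typedef size_t (*CUoccupancyB2DSize) (int block_size);

  extern CUresult cuOccupancyMaxPotentialBlockSize
    (int *min_grid_size, int *block_size, CUfunction func,
     CUoccupancyB2DSize dyn_smem_per_block, size_t dyn_smem_bytes,
     int block_size_limit)
    __attribute__ ((weak));

  static bool
  have_occupancy_calculator (void)
  {
    /* An unresolved weak symbol has address NULL at run time.  */
    return cuOccupancyMaxPotentialBlockSize != NULL;
  }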
On 08/03/2018 05:37 PM, Cesar Philippidis wrote:
>> But I still see no rationale why blocks is used here, and I wonder
>> whether something like num_gangs = grids * 64 would give similar results.
> My original intent was to keep the load proportional to the block size.
> So, in the case where a blo...
On 08/03/2018 08:22 AM, Tom de Vries wrote:
> On 08/01/2018 09:11 PM, Cesar Philippidis wrote:
>> On 08/01/2018 07:12 AM, Tom de Vries wrote:
>>
>> +      gangs = grids * (blocks / warp_size);
>
> So, we launch with gangs == grids * workers? Is that intentional?

Yes. At least that's what I've been using in og8. Setting num_gangs =
grids alone caused significant slowdowns.
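To make the quoted computation concrete, here is an illustrative helper,
not from the patch: grids and blocks stand for the minGridSize/blockSize
outputs of cuOccupancyMaxPotentialBlockSize, and one OpenACC worker
corresponds to one warp (warp_size is 32 on nvptx).

  /* Illustrative helper, not from the patch.  'grids' and 'blocks'
     stand for the minGridSize/blockSize outputs of
     cuOccupancyMaxPotentialBlockSize.  */
  static void
  choose_launch_dims (int grids, int blocks, int warp_size,
                      int *gangs, int *workers)
  {
    /* One OpenACC worker is one warp, so a block of 'blocks' threads
       holds blocks / warp_size workers...  */
    *workers = blocks / warp_size;
    /* ...and launching grids * workers gangs keeps the gang count
       proportional to the block size, per the rationale above.  */
    *gangs = grids * *workers;
  }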
The attached patch teaches libgomp how to use the CUDA thread occupancy
calculator built into the CUDA driver. Despite both being based off the
CUDA thread occupancy spreadsheet distributed with CUDA, the built-in
occupancy calculator differs from the occupancy calculator in og8 in two
key ways. Fi...
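For readers unfamiliar with the driver-side calculator, here is a minimal
hedged example of the API involved; query_occupancy and its error handling
are placeholders, and the CUfunction handle is assumed to come from the
usual cuModuleGetFunction path.

  /* Self-contained sketch of the driver's built-in calculator; the
     CUfunction handle and the error handling are placeholders.  */
  #include <stdio.h>
  #include <cuda.h>

  static void
  query_occupancy (CUfunction kernel)
  {
    int min_grid_size, block_size;

    /* Ask the driver for the block size that maximizes occupancy and
       the minimum grid size needed to reach that occupancy.  */
    if (cuOccupancyMaxPotentialBlockSize (&min_grid_size, &block_size,
                                          kernel,
                                          NULL /* no dyn smem callback */,
                                          0 /* dyn smem bytes */,
                                          0 /* no block size limit */)
        != CUDA_SUCCESS)
      {
        fprintf (stderr, "cuOccupancyMaxPotentialBlockSize failed\n");
        return;
      }

    printf ("suggested blockSize %d, minGridSize %d\n",
            block_size, min_grid_size);
  }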