On 07/20/2017 03:04 PM, Tom de Vries wrote:
On 07/13/2017 06:53 PM, Cesar Philippidis wrote:
Similarly, for nvptx vector reductions, when it comes time to initialize
the reduction variable, the nvptx BE constructs a branch so that only
vector lanes 1 to vector_length-1 are initialized to the de
On 07/20/2017 04:12 PM, Cesar Philippidis wrote:
> Would you like to take over this patch?
> I saw that you started working on this issue (or a similar one to it) in
> PR81442.
Sure.
Thanks,
- Tom
On 07/20/2017 06:04 AM, Tom de Vries wrote:
> On 07/13/2017 06:53 PM, Cesar Philippidis wrote:
>> Similarly, for nvptx vector reductions, when it comes time to initialize
>> the reduction variable, the nvptx BE constructs a branch so that only
>> vector lanes 1 to vector_length-1 are initialized th
On 07/13/2017 06:53 PM, Cesar Philippidis wrote:
Similarly, for nvptx vector reductions, when it comes time to initialize
the reduction variable, the nvptx BE constructs a branch so that only
vector lanes 1 to vector_length-1 are initialized to the default value
for a given reduction type, where
The recent basic block profiling changes broke a couple of libgomp
OpenACC execution tests involving reductions with nvptx offloading. For
gang and worker reductions, the nvptx BE updates the original reduction
variable using a lock-free atomic algorithm. This lock-free algorithm
utilizes a polling
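
For readers unfamiliar with the pattern, a lock-free update of a shared reduction variable can be sketched roughly as below. This is only an illustration in portable C11, not the code the nvptx BE emits: the helper name reduction_update_add, the choice of a sum reduction, and the use of <stdatomic.h> are all assumptions made for the example.

```c
#include <stdatomic.h>

/* Hypothetical helper (not from the patch): fold `partial` into the
   shared reduction variable `*target` using a sum reduction, without
   taking a lock. */
void reduction_update_add(_Atomic int *target, int partial)
{
    /* Read the value we expect to find in the shared variable. */
    int expected = atomic_load(target);

    /* Poll until the compare-and-swap succeeds. When it fails,
       atomic_compare_exchange_weak writes the value currently stored
       in *target back into `expected`, so every retry recomputes
       expected + partial from fresh data. */
    while (!atomic_compare_exchange_weak(target, &expected,
                                         expected + partial))
        ;
}
```

Each gang or worker would call such a helper once with its partial result; the retry loop is the polling part, and it is the kind of loop whose control flow depends on block structure being preserved.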