This makes a lot of sense.
Reviewed-by: Jason Ekstrand
On Tue, Aug 16, 2016 at 1:54 PM, Francisco Jerez wrote:
> ANY4H is more efficient than ANY8H and ANY16H because it makes sure
> that whenever a whole subspan hits a discard statement it gets
> disabled by the EU until the end of the program, regardless of whether
> the discard condition is uniform across all channels of the SIMD8-16
> thread. OTOH ANY8H/ANY16H w