On 01/29/16 10:18, Jakub Jelinek wrote:
On Thu, Jan 28, 2016 at 10:38:51AM -0500, Nathan Sidwell wrote:
This patch adds default compute dimension handling.  Users rarely specify
compute dimensions, expecting the toolchain to DTRT.  More savvy users would
like to specify global defaults.  This patch permits both.

Isn't it better to be able to override the defaults on the library side?
I mean, when somebody is compiling the code, often he doesn't know the
exact properties of the hw it will be run on; if he does, I think it is
better to specify them explicitly in the code.  But if he doesn't, one just
has to hope libgomp will figure out the best defaults.
So, wouldn't it be better to add some env var that would allow controlling
this instead?

You have anticipated part 2 of this patch, which would allow a default to be deferred to runtime in the manner you describe.

Generally, one can know at compile time the upper bound on workers (it's part of the chip specification), but the number of physical gangs depends on the accelerator card. (That is true for PTX and IIUC for other GPGPUs too.) So, you may want to defer num gangs to runtime -- but of course then you lose constant folding opportunities.
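
As a rough illustration of that split, here is a minimal sketch in C. It is not the patch's code and not libgomp's interface: the function names, the worker bound of 32, and the convention that 0 means "deferred to runtime" are all assumptions made for the example. Workers are clamped against a bound known at compile time, while the gang count falls back to an environment-supplied string and then to a device-reported default.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical compile-time upper bound on workers (assumed value). */
enum { WORKERS_MAX = 32 };

/* The worker bound is a constant, so this clamp can fold at compile time
   whenever the requested value is itself a constant. */
int clamp_workers(int requested)
{
    return requested > WORKERS_MAX ? WORKERS_MAX : requested;
}

/* Parse a positive decimal integer; return 0 if text is missing or invalid. */
int parse_positive(const char *text)
{
    if (text == NULL || *text == '\0')
        return 0;

    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX)
        return 0;
    return (int) value;
}

/* Hypothetical resolution order for the gang count:
   1. a positive compile-time default wins (and stays constant-foldable);
   2. otherwise the value is deferred: use the environment text if valid;
   3. otherwise fall back to what the device reports. */
int resolve_num_gangs(int compile_time_gangs, const char *env_text,
                      int device_default)
{
    if (compile_time_gangs > 0)
        return compile_time_gangs;

    int from_env = parse_positive(env_text);
    return from_env > 0 ? from_env : device_default;
}
```

A caller would pass something like getenv("EXAMPLE_NUM_GANGS") as env_text; that variable name is made up for this sketch. Only the first branch keeps the value visible to the compiler, which is the constant folding trade-off mentioned above.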

nathan
