> I'd like to explore distributing threads across a heterogeneous NUMA
> architecture.  I.e. input/output data would have to be transferred
> explicitly, and the compiler would have to have more than one backend.

I'm currently working on something that looks quite similar, in the
"streamization" branch. The gist is that tasks (or threads), with no
access to shared memory, communicate through streams (or input/output
channels). I'm using OpenMP annotations to help in the analysis, but
they are not a requirement.
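
A minimal sketch of the stream idea, not the code in the branch: two
tasks that exchange data only through explicit channels. The names
`producer`, `square`, `run_pipeline` and the `END` marker are
hypothetical, and Python threads do share an address space, so the
"no shared memory" rule is enforced here only by convention.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker


def producer(out_stream, n):
    # Emits n values; touches nothing but its output stream.
    for i in range(n):
        out_stream.put(i)
    out_stream.put(END)


def square(in_stream, out_stream):
    # Reads from one stream and writes to another; no shared variables.
    while True:
        item = in_stream.get()
        if item is END:
            out_stream.put(END)
            return
        out_stream.put(item * item)


def run_pipeline(n):
    a = queue.Queue(maxsize=4)  # bounded stream between the two tasks
    b = queue.Queue()           # unbounded, so results can accumulate
    tasks = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=square, args=(a, b)),
    ]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    # Drain the output stream once both tasks have finished.
    results = []
    while True:
        item = b.get()
        if item is END:
            return results
        results.append(item)
```

Because each task sees only its input and output streams, either one
could in principle be placed on a node with its own memory, with the
streams mapped to explicit transfers.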

> Would such work be appropriate for an existing branch, or should I better
> work on my own branch for that?

Compiling for multiple backends is not directly related to that work,
so you should use a separate branch. It does make sense to go in that
direction.

> And do the current autoparallelization algorithms find or propagate
> sufficient alias information (not always, obviously, but at least
> sometimes) to determine if offloading a job to another processor with
> separate memories is safe and likely to be worthwhile?

For safety, what matters is that no data dependences are violated.
Alias analysis is used to determine whether such dependences exist.
The analysis cannot always give a definite yes or no, but it is
conservative: if it says there are no dependences, then you're safe.
If the code is nasty, it will probably just decide that the code
clobbers memory and reject it.
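
To illustrate what "conservative" means here, a toy sketch, assuming
each memory access is summarized as a named object plus a half-open
index range. The `Access` class, `may_depend` and `safe_to_offload` are
hypothetical names, and this is far simpler than what the compiler
actually does. The key property is that "independent" is reported only
when it can be proven; anything unknown counts as a dependence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Access:
    base: Optional[str]  # name of the accessed object; None if unknown
    start: int
    end: int             # half-open range [start, end)
    is_write: bool


def may_depend(a, b):
    """Return False only when a and b are provably independent."""
    if not (a.is_write or b.is_write):
        return False  # two reads never conflict
    if a.base is None or b.base is None:
        return True   # unknown target: assume it clobbers memory
    if a.base != b.base:
        return False  # assumption: distinct named objects never overlap
    return a.start < b.end and b.start < a.end  # do the ranges overlap?


def safe_to_offload(task_accesses, rest_accesses):
    # Safe only if no access in the task may conflict with the rest.
    return not any(
        may_depend(a, b) for a in task_accesses for b in rest_accesses
    )
```

A write through an unknown pointer makes every pairing with it a
possible dependence, so the task is rejected rather than risk a wrong
answer.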

For the worthwhile part, it depends on many things: the communication
latencies and bandwidths, each node's computational capabilities, and
the task's or thread's workload (or rather its arithmetic intensity).
I would tend to believe that such a profitability analysis is not
available, and it would probably be a most interesting addition.
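
As a back-of-the-envelope illustration of how those factors interact,
here is a deliberately naive cost model. All function and parameter
names are hypothetical, and it assumes one round trip of latency, no
overlap of transfer with computation, and a fixed throughput per node.
Nothing like it exists in the compiler as far as this message states.

```python
def local_time(work_flops, local_flops_per_s):
    # Time to run the task where the data already lives.
    return work_flops / local_flops_per_s


def offload_time(work_flops, bytes_moved, remote_flops_per_s,
                 latency_s, bandwidth_bytes_per_s):
    # One latency each way, plus the time to move inputs and outputs.
    transfer = 2 * latency_s + bytes_moved / bandwidth_bytes_per_s
    return transfer + work_flops / remote_flops_per_s


def worth_offloading(work_flops, bytes_moved, local_flops_per_s,
                     remote_flops_per_s, latency_s,
                     bandwidth_bytes_per_s):
    # Offload only if the remote node wins after paying for the transfer.
    remote = offload_time(work_flops, bytes_moved, remote_flops_per_s,
                          latency_s, bandwidth_bytes_per_s)
    return remote < local_time(work_flops, local_flops_per_s)
```

With high arithmetic intensity (many operations per byte moved) the
transfer cost is amortized and offloading tends to win; with low
intensity the transfer dominates and the task is better left where it
is.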

Antoniu
