> I'd like to explore distributing threads across a heterogenous NUMA
> architecture. I.e. input/output data would have to be transferred
> explicitly, and the compiler would have to have more than one backend.

I'm currently working on something quite similar in the "streamization"
branch. The gist is that tasks (or threads), with no access to shared
memory, communicate through streams (input/output channels). I'm using
OpenMP annotations to help the analysis, but they are not a requirement.

> Would such work be appropriate for an existing branch, or should I better
> work on my own branch for that?

Multiple-backend compilation is not directly related, so you should use a
separate branch. It makes sense to go in that direction.

> And do the current autoparallelization algorithms find or propagate
> sufficient alias information (not always, obviously, but at least
> sometimes) to determine if offloading a job to another processor with
> separate memories is safe and likely to be worthwhile?

As for safety, what matters is that no data dependences are violated. Alias
analysis is used to determine whether such dependences exist. It cannot
always give a definite yes or no, but it is conservative: if it says there
are no dependences, you're safe. If the code is nasty, it will probably
just decide that it clobbers memory and reject it.

As for whether offloading is worthwhile, that depends on many things: the
communication latencies and bandwidths, each node's computational
capabilities, the workload of the task or thread (or rather its arithmetic
intensity), and so on. I would tend to believe that no such profitability
estimate is available, and it would probably be a most interesting
addition.

Antoniu
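
P.S. To make the "worthwhile" part a bit more concrete, here is a rough
sketch of the kind of estimate I have in mind. It is only an illustration
under simplifying assumptions: one message in each direction, no overlap
between communication and computation, and a single throughput figure per
node. All struct and function names are hypothetical, not existing GCC
interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the link to a remote node.  */
struct link_cost
{
  double latency_s;      /* one-way message latency, in seconds */
  double bandwidth_Bps;  /* sustained bandwidth, in bytes per second */
};

/* Hypothetical summary of a candidate task.  */
struct task_cost
{
  double flops;          /* arithmetic operations performed by the task */
  double bytes_in;       /* input data copied to the remote node */
  double bytes_out;      /* output data copied back */
};

/* Estimated time to run the task on a node with the given throughput.  */
static double
compute_time (const struct task_cost *t, double flops_per_s)
{
  assert (flops_per_s > 0.0);
  return t->flops / flops_per_s;
}

/* Estimated time for the explicit input and output transfers:
   one latency per direction plus the volume over the bandwidth.  */
static double
transfer_time (const struct task_cost *t, const struct link_cost *l)
{
  assert (l->bandwidth_Bps > 0.0);
  return 2.0 * l->latency_s
         + (t->bytes_in + t->bytes_out) / l->bandwidth_Bps;
}

/* Offloading pays off only if transfer plus remote execution is
   estimated to be faster than running the task locally.  */
static bool
offload_is_worthwhile (const struct task_cost *t,
                       const struct link_cost *l,
                       double local_flops_per_s,
                       double remote_flops_per_s)
{
  double local = compute_time (t, local_flops_per_s);
  double remote = transfer_time (t, l)
                  + compute_time (t, remote_flops_per_s);
  return remote < local;
}
```

With a 10 us latency, 1 GB/s link and a remote node ten times faster than
a 1 GFLOP/s local one, a task doing 1e9 operations on 16 MB of data would
be offloaded under this model, while one doing only 1e6 operations on the
same data would not. That is the arithmetic intensity effect mentioned
above.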