On 09/29/11 23:32, Vladimir Makarov wrote:
> Bernd, sorry for the delay.

No problem.

> I thought for long time about this approach because we already have
> selective scheduler which with some modifications could be used for
> this. Selective scheduler was implemented for Itanium, designed to work
> after RA (although it can work before too) and it implements a general
> software pipelining (not modulo scheduling) which could be used for
> loops with conditionals and with varying II (if we speak in terms of
> modulo scheduling).

Yes, but that's somewhat orthogonal to what we need for C6X - the
hardware support is really designed for modulo scheduling. For more
complicated or bigger loops, using more general software pipelining
could be a win, but benchmark results seem to suggest that a large part
of the possible gains on C6X can be achieved without it.

The other problem with sel-sched is that, as you said, it's quite
complicated - the code is still rather opaque to me. Also, to use it on
C6X, the support for delay slots that now exists in haifa-sched would
have to be added to it as well. In general I'm not too keen on
supporting two different schedulers side-by-side and duplicating
features in each.

> On the other hand, your changes to haifa-scheduler are small, so I
> concluded it might be ok (I hope the coming changes with register
> renaming which selective scheduler already does will be not big too).

The register renaming changes will be localized to c6x.c (and a few
small changes in regrename.c) - their purpose is to ensure that the
instructions present in a loop are balanced across the two halves of
the machine. I don't think there's another target with similar enough
requirements to try to add some form of general support for this kind
of thing yet.

What I do have coming up are some more haifa-sched changes to make it
predicate insns when that allows them to be moved across jumps. Just
bootstrapping that on ia64 now...

> Now we have more complex selective scheduler with general software
> pipelining and simpler haifa scheduler with modulo scheduling. So I
> think we could look at selective scheduler for servers with VLIW and
> in-order pipelined processors where code expansion is not so important
> and haifa-scheduler with modulo scheduler for embedded VLIW processors.

Note that the C6X modulo-scheduling really relies on the exposed
pipeline to produce sensible code; you'd probably have to be more
clever with the unrolling to make this work reasonably on a different
target. Still, I think the haifa-sched code could in principle be
extended to also be interesting for other CPUs, maybe even ia64 -
doesn't that have rotating registers for modulo scheduling?

> As for the patch itself (only scheduler parts in 1/4 and 2/4), it is
> ok to me to commit this. I did not find anything which should be changed.

Thanks!


Bernd