On 09/29/11 23:32, Vladimir Makarov wrote:
> Bernd, sorry for the delay.

No problem.

>   I thought for long time about this approach because we already have
> selective scheduler which with some modifications could be used for
> this.  Selective scheduler was implemented for Itanium, designed to work
> after RA (although it can work before too) and it implements a general
> software pipelining (not modulo scheduling) which could be used for
> loops with conditionals and with varying II (if we speak in terms of
> modulo scheduling).

Yes, but that's somewhat orthogonal to what we need for C6X - the
hardware support is really designed for modulo scheduling. For more
complicated or bigger loops, using more general software pipelining
could be a win, but benchmark results seem to suggest that a large part
of the possible gains on C6X can be achieved without it.

The other problem with sel-sched is that, as you said, it's quite
complicated - the code is still rather opaque to me. Also, to use it on
C6X, the support for delay slots that now exists in haifa-sched would
have to be added to it as well. In general I'm not too keen on
supporting two different schedulers side-by-side and duplicating
features in each.

>   On the other hand, your changes to haifa-scheduler are small, so I
> concluded it might be ok (I hope the coming changes with register
> renaming which selective scheduler already does will be not big too). 

The register renaming changes will be localized to c6x.c (and a few
small changes in regrename.c) - their purpose is to ensure that the
instructions present in a loop are balanced across the two halves of the
machine. I don't think any other target has requirements similar enough
to justify adding general support for this kind of thing yet.

What I do have coming up are some more haifa-sched changes to make it
predicate insns when that allows them to be moved across jumps. Just
bootstrapping that on ia64 now...

> Now we have more complex selective scheduler with general software
> pipelining and simpler haifa scheduler with modulo scheduling.  So I
> think we could look at selective scheduler for servers with VLIW and
> in-order pipelined processors where code expansion is not so important
> and haifa-scheduler with modulo scheduler for embedded VLIW processors.

Note that the C6X modulo scheduling really relies on the exposed
pipeline to produce sensible code; you'd probably have to be more clever
with the unrolling to make this work reasonably on a different target.
Still, I think the haifa-sched code could in principle be extended to
also be interesting for other CPUs, maybe even ia64 - doesn't that have
rotating registers for modulo scheduling?

>   As for the patch itself (only scheduler parts in 1/4 and 2/4), it is
> ok to me to commit this.  I did not find anything which should be changed.

Thanks!


Bernd
