Re: [RFC] sched: Do not move expensive insns speculatively (PR68664)

Jeff Law Fri, 27 Jan 2017 10:04:58 -0800

On 01/27/2017 07:19 AM, Segher Boessenkool wrote:

On Fri, Jan 27, 2017 at 02:30:49PM +0100, Richard Biener wrote:

Ok, maybe with -fno-trapping-math we don't consider that case but even
then generating a NaN is usually dreadfully slow so avoiding speculation
of such insns looks good in any case (w/o considering its cost).


And -ffast-math includes -ffinite-math-only.  No, the testcase never
takes the square root of a number smaller than zero, it isn't *that* slow ;-)


Well, the testcase as written doesn't but if you speculate the sqrt it might?


Yeah true.  Except we have -ffast-math so we told the compiler that is
just fine to do.

Things slow down so much because there is a loop immediately followed
by a square root insn, and sched-rgn decides it is a good idea to move
it to inside the loop.  Which is a bad idea no matter what the frequency
of the loop is because 1) we do not get such profiles very correct, and
2) sqrt is really expensive.


I understood that but then moving sth inside a loop is almost never a win.


It defaults to moving something if it has space for it in the schedule
and it is executed at least 40% of the time (I think).

Can't "not modeled" insns not be marked somehow in the pipeline description?


Well, the only thing from the pipeline description that is used here is
the insn latency, which isn't all that much higher than "normal" FP insns.
And simply "not described properly" won't do much good -- if we could
(without blowing up the automata) we would, and sched-rgn would then
still speculate this.

And I think this is the core of the issue.  We have multiple ports that
don't necessarily fully describe the latency, issue rates, etc. of
certain insns like div/sqrt/rsqrt.  There are good reasons for doing that.

Because of the partial description, the scheduler may think those insns
fit into a pipeline bubble within the loop, when in reality they do not.

The scheduler currently has no way of knowing what insns have this
property.  While there are cases where we'd like to speculate a div or
sqrt to give it more time to complete without stalls -- there's no good
way to do that without fully describing them to the scheduler.

My preference would be to either somehow mark those insns as not fully
modeled and avoid speculating on them, or invent a target hook to allow
the scheduler to query the backend.
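
To make the hook idea concrete, here is a rough standalone sketch in C.
None of it is GCC code: the struct, the "fully_modeled" flag, the hook
type and the "fits in free slots" test are all invented for illustration,
and the probability threshold is a parameter because the 40% figure
mentioned earlier in the thread was itself only a guess.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical insn record; not GCC's rtx_insn.  */
struct insn
{
  int latency;          /* Latency as the pipeline description states it.  */
  bool fully_modeled;   /* False if the description understates the cost.  */
};

/* Hypothetical target hook: may the scheduler speculate this insn?  */
typedef bool (*speculation_ok_hook) (const struct insn *);

/* Example backend answer: refuse anything that is not fully modeled,
   e.g. div/sqrt on a port with a partial description.  */
static bool
default_speculation_ok (const struct insn *insn)
{
  return insn->fully_modeled;
}

/* Scheduler-side check.  FREE_SLOTS stands in for "there is space in
   the schedule", PROB_PERCENT for the estimated execution frequency.
   A NULL hook models today's behavior, where the backend is not asked.  */
static bool
can_speculate (const struct insn *insn, int free_slots, int prob_percent,
               int min_prob_percent, speculation_ok_hook hook)
{
  if (hook != NULL && !hook (insn))
    return false;
  return insn->latency <= free_slots && prob_percent >= min_prob_percent;
}
```

With a NULL hook, a sqrt whose described latency is small still passes
the check; with the example hook installed it is rejected regardless of
how much room the schedule appears to have.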

Note that these could be used elsewhere -- for example delay slot
scheduling and predication.  Delay slot scheduling does speculation and
there's ports that simply refuse to allow certain instructions (div/sqrt
on one port, I think all FP stuff on another) to avoid these kinds of
problems.

Similarly nullification/predication often work by wiping out the final
posting of results into the register file.  So imagine a non-pipelined
div/sqrt.  Predicating a div/sqrt instruction will actually keep the
pipeline busy computing results that will be thrown away and preventing
other useful work from occurring.  And, yes, this really does happen.
The PA suffered from these problems.


jeff
