On 06/11/2018 20:18, Segher Boessenkool wrote:
> On Tue, Nov 06, 2018 at 07:43:36PM +0000, Richard Earnshaw (lists) wrote:
>> On 06/11/2018 18:18, Segher Boessenkool wrote:
>>> On Tue, Nov 06, 2018 at 11:46:53AM +0000, Richard Earnshaw (lists) wrote:
>>>> Well it generates new 'light-weight' prologue and epilogue sequences for
>>>> the 'shrunk' code path that lack the establishment of the tracker
>>>> register and doesn't know how to move the existing sequence to the new
>>>> entry sequence.
>>>
>>> Ah, so the shrink-wrapping code is not deleting anything at all (just
>>> not adding it). Gotcha :-)
>>
>> Well.... you could argue that it deleted the tracker update for the case
>> where the branch was not taken, and it also deleted the part of the
>> prologue where the tracker state was restored into SP before the return.
>> But I'm being picky... :-)
>
> When I say "deleted" I mean "deleted RTL code that was actually there".
> You seem to mean "prevented it from being created later"?
>
> What I'm after is, if the shrink-wrapping code is deleting RTL it has
> no business touching, that sounds like a serious bug.

Well, it has 'deleted' the update of the tracker register after the
conditional branch leading directly to the return insn. But it's
possible that the use of the tracker variable has been deleted (not
re-emitted for the shrink-wrapped return sequence) and that another
optimization has then deleted the update as dead. I haven't checked
the RTL output directly to see how this is happening.

R.

>
>>> [ snip example code; thanks, that helped ]
>>>
>>>> I'm not asking that shrink wrapping be updated to handle all this; in
>>>> fact, I'm not sure it's that easy to do as the branch patterns and
>>>> simple-return patterns aren't set up to handle this.
>>>
>>> One thing you could do is make shrink-wrap aware what part of the code
>>> needs the speculation tracking parts of the prologue. You could do this
>>> by making a separate shrink-wrapping component for it, or you can do it
>>> by marking the places needing it as needing the full prologue, e.g. by
>>> emitting a fake call into it (and not outputting any code for that call).
>>> The latter does cause a stack frame to be emitted even when it wouldn't
>>> otherwise, unfortunately. The separate shrink-wrapping approach should
>>> work beautifully as far as I see.
>>
>> There are a number of optimizations that are worth investigation with the
>> tracking support; but whether they'll notably improve performance I'm
>> not sure. Tracking is just expensive and the main problem is the
>> serialization of the state, which limits the core's ability to reorder
>> stuff internally.
>
> Yeah, it will be seriously expensive always. If people still use this
> in production code you really _do_ want to optimise it. If that helps
> measurably, anyway.
>
>
> Segher
>