On Wed, Aug 9, 2017 at 4:22 AM, Richard Biener <richard.guent...@gmail.com> wrote:
> On August 9, 2017 9:53:05 AM GMT+02:00, Richard Sandiford <richard.sandif...@linaro.org> wrote:
>>Richard Biener <richard.guent...@gmail.com> writes:
>>> On August 8, 2017 7:36:35 PM GMT+02:00, Richard Sandiford <richard.sandif...@linaro.org> wrote:
>>>>Richard Sandiford <richard.sandif...@linaro.org> writes:
>>>>> Richard Biener <richard.guent...@gmail.com> writes:
>>>>>> On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu" <hjl.to...@gmail.com> wrote:
>>>>>>>On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford <richard.sandif...@linaro.org> wrote:
>>>>>>>> Arjan van de Ven <ar...@linux.intel.com> writes:
>>>>>>>>> On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
>>>>>>>>>> On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
>>>>>>>>>>> When Linux/x86-64 kernel is compiled with -fno-omit-frame-pointer.
>>>>>>>>>>> this optimization removes more than 730
>>>>>>>>>>>
>>>>>>>>>>> pushq %rbp
>>>>>>>>>>> movq %rsp, %rbp
>>>>>>>>>>> popq %rbp
>>>>>>>>>>
>>>>>>>>>> If you don't want the frame pointer, why are you compiling with
>>>>>>>>>> -fno-omit-frame-pointer? Are you going to add
>>>>>>>>>> -fforce-no-omit-frame-pointer or something similar so that people can
>>>>>>>>>> actually get what they are asking for? This doesn't really make sense.
>>>>>>>>>> It is perfectly fine to omit frame pointer by default, when it isn't
>>>>>>>>>> required for something, but if the user asks for it, we shouldn't ignore his
>>>>>>>>>> request.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> wanting a framepointer is very nice and desired... ... but if the
>>>>>>>>> optimizer/ins scheduler moves instructions outside of the frame'd
>>>>>>>>> portion, (it does it for cases like below as well), the value is
>>>>>>>>> already negative for these functions that don't have stack use.
>>>>>>>>>
>>>>>>>>> <MPIDU_Sched_are_pending@@Base>:
>>>>>>>>> mov all_schedules@@Base-0x38460,%rax
>>>>>>>>> push %rbp
>>>>>>>>> mov %rsp,%rbp
>>>>>>>>> pop %rbp
>>>>>>>>> cmpq $0x0,(%rax)
>>>>>>>>> setne %al
>>>>>>>>> movzbl %al,%eax
>>>>>>>>> retq
>>>>>>>>
>>>>>>>> Yeah, and it could be even weirder for big single-block functions.
>>>>>>>> I think GCC has been doing this kind of scheduling of prologue and
>>>>>>>> epilogue instructions for a while, so there hasn't really been a
>>>>>>>> guarantee which parts of the function will have a new FP and which
>>>>>>>> will still have the old one.
>>>>>>>>
>>>>>>>> Also, with an arbitrarily-picked host compiler (GCC 6.3.1), shrink-wrapping
>>>>>>>> kicks in when the following is compiled with -O3 -fno-omit-frame-pointer:
>>>>>>>>
>>>>>>>> void f (int *);
>>>>>>>> void
>>>>>>>> g (int *x)
>>>>>>>> {
>>>>>>>>   for (int i = 0; i < 1000; ++i)
>>>>>>>>     x[i] += 1;
>>>>>>>>   if (x[0])
>>>>>>>>     {
>>>>>>>>       int temp;
>>>>>>>>       f (&temp);
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> so only the block with the call to f sets up FP. The relatively
>>>>>>>> long-running loop runs with the caller's FP.
>>>>>>>>
>>>>>>>> I hope we can go for a target-independent position that what HJ's
>>>>>>>> patch does is OK...
>>>>>>>>
>>>>>>>
>>>>>>>In light of this, I am resubmitting my patch. I added 3 more testcases
>>>>>>>and also handle:
>>>>>>>
>>>>>>>typedef int v8si __attribute__ ((vector_size (32)));
>>>>>>>
>>>>>>>void
>>>>>>>foo (v8si *out_start, v8si *out_end, v8si *regions)
>>>>>>>{
>>>>>>>  v8si base = regions[3];
>>>>>>>  *out_start = base;
>>>>>>>  *out_end = base;
>>>>>>>}
>>>>>>>
>>>>>>>OK for trunk?
>>>>>>
>>>>>> The invoker specified -fno-omit-frame-pointer, why did you eliminate it?
>>>>>> I'd argue it's OK when neither -f nor -fno- is explicitly specified
>>>>>> irrespective of the default in case we document the change but an
>>>>>> explicit -fno- is pretty clear.
>>>>>
>>>>> I don't buy that we're ignoring the user. -fomit-frame-pointer says
>>>>> that, when you're creating a frame, it's OK not to set up the frame
>>>>> pointer. Forcing it off means that if you create a frame, you need
>>>>> to set up the frame pointer too. But it doesn't say anything about
>>>>> whether the frame itself is needed. I.e. it's -fno-omit-frame*-pointer*
>>>>> rather than -fno-omit-frame.
>>>
>>> Isn't that a bit splitting hairs if you look at (past) history?
>>
>>I guess it would have been splitting hairs in the days when they
>>amounted to the same thing, i.e. when there was no behaviour that
>>would match "-fomit-frame" and when the prologue and epilogue were
>>glued to the start and end of the function. But that was quite a
>>long time ago. Shrink-wrapping at least means that omitting the frame
>>and omitting the frame pointer are different things, and it seems
>>fair that -fomit-frame-pointer has followed the natural meaning.
>>
>>> You could also interpret -fno-omit-frame-pointer as obviously forcing a
>>> frame as otherwise there's nothing to omit...
>>
>>But applying that kind of interpretation to something like
>>-maccumulate-outgoing-args would make inlining all calls within a
>>function invalid, since there'd no longer be arguments to accumulate.
>>
>>I think this kind of disagreement just emphasises that if we really
>>need a "always emit a prologue at the very start, an epilogue at the
>>very end, and always use a frame pointer" option, we should add it
>>and document exactly what the guarantees are. I don't think
>>-fno-omit-frame-pointer should be it, since as the replies earlier in
>>the thread said, the natural meaning of that option has its uses too.
>
> OK, but then both -f[no-]omit-frame-pointer do not have clearly defined
> semantics and thus shouldn't be exposed to the user?
>

-f[no-]omit-frame-pointer applies only to cases where a new stack frame
is needed. -fno-omit-frame-pointer allows you to unwind each stack frame,
not necessarily each function, via the frame pointer. It may not create a
new stack frame for each function, much as with LTO or function inlining,
but you can still unwind via the frame pointer.

We have never guaranteed that we will create a new stack frame for each
function. Some functions are inlined completely. Some just use the
caller's stack frame.

--
H.J.
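
To illustrate the "each stack frame, not necessarily each function" point, here is a minimal sketch in C. It does not read the real stack: the frames are simulated as ordinary data, and `struct frame`, `unwind`, and `demo` are hypothetical names made up for this illustration. It assumes the usual layout in which the slot at the frame pointer holds the caller's saved frame pointer.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of one stack frame that set up a frame pointer:
   the slot at the frame pointer holds the caller's saved frame pointer. */
struct frame {
    const struct frame *saved_fp; /* caller's frame, NULL at the outermost frame */
    const char *owner;            /* function that created the frame (illustration only) */
};

/* Walk the saved-frame-pointer chain and return the number of frames visited. */
static size_t unwind(const struct frame *fp)
{
    size_t depth = 0;
    for (; fp != NULL; fp = fp->saved_fp) {
        printf("frame %zu: %s\n", depth, fp->owner);
        ++depth;
    }
    return depth;
}

/* Simulated call chain main -> a -> b -> c, where b was inlined into a
   and therefore never created a frame of its own. */
static size_t demo(void)
{
    struct frame f_main = { NULL, "main" };
    struct frame f_a = { &f_main, "a (b inlined here)" };
    struct frame f_c = { &f_a, "c" };
    return unwind(&f_c);
}
```

Calling `demo()` walks three frames for a chain of four functions: the chain stays intact and unwindable even though one function has no frame.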