On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu" <hjl.to...@gmail.com> wrote: >On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford ><richard.sandif...@linaro.org> wrote: >> Arjan van de Ven <ar...@linux.intel.com> writes: >>> On 8/7/2017 8:43 AM, Jakub Jelinek wrote: >>>> On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote: >>>>> When Linux/x86-64 kernel is compiled with -fno-omit-frame-pointer. >>>>> this optimization removes more than 730 >>>>> >>>>> pushq %rbp >>>>> movq %rsp, %rbp >>>>> popq %rbp >>>> >>>> If you don't want the frame pointer, why are you compiling with >>>> -fno-omit-frame-pointer? Are you going to add >>>> -fforce-no-omit-frame-pointer or something similar so that people >can >>>> actually get what they are asking for? This doesn't really make >sense. >>>> It is perfectly fine to omit frame pointer by default, when it >isn't >>>> required for something, but if the user asks for it, we shouldn't >ignore his >>>> request. >>>> >>> >>> >>> wanting a framepointer is very nice and desired... ... but if the >>> optimizer/ins scheduler moves instructions outside of the frame'd >>> portion, (it does it for cases like below as well), the value is >>> already negative for these functions that don't have stack use. >>> >>> <MPIDU_Sched_are_pending@@Base>: >>> mov all_schedules@@Base-0x38460,%rax >>> push %rbp >>> mov %rsp,%rbp >>> pop %rbp >>> cmpq $0x0,(%rax) >>> setne %al >>> movzbl %al,%eax >>> retq >> >> Yeah, and it could be even weirder for big single-block functions. >> I think GCC has been doing this kind of scheduling of prologue and >> epilogue instructions for a while, so there hasn*t really been a >> guarantee which parts of the function will have a new FP and which >> will still have the old one. >> >> Also, with an arbitrarily-picked host compiler (GCC 6.3.1), >shrink-wrapping >> kicks in when the following is compiled with -O3 >-fno-omit-frame-pointer: >> >> void f (int *); >> void >> g (int *x) >> { >> for (int i = 0; i < 1000; ++i) >> x[i] += 1; >> if (x[0]) >> { >> int temp; >> f (&temp); >> } >> } >> >> so only the block with the call to f sets up FP. The relatively >> long-running loop runs with the caller's FP. >> >> I hope we can go for a target-independent position that what HJ*s >> patch does is OK... >> > >In light of this, I am resubmitting my patch. I added 3 more >testcases >and also handle: > >typedef int v8si __attribute__ ((vector_size (32))); > >void >foo (v8si *out_start, v8si *out_end, v8si *regions) >{ > v8si base = regions[3]; > *out_start = base; > *out_end = base; >} > >OK for trunk?
The invoker specified -fno-omit-frame-pointer, why did you eliminate it? I'd argue it's OK when neither -f nor -fno- is explicitly specified irrespective of the default in case we document the change but an explicit -fno- is pretty clear. And yes, unfortunate placement of frame pointer init/de-init should be fixed. Richard. > >Thanks.