Hi, I'd like to ping for this patch: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02362.html
Thanks, Bernd. On 09/30/16 15:33, Bernd Edlinger wrote: > On 09/30/16 12:14, Bernd Edlinger wrote: >> Eric Botcazou wrote: >>>> A comment before the SETs and a testcase would be nice. IIRC >>>> we do have stack size testcases via using -fstack-usage. >>> >>> Or -Wstack-usage, which might be more appropriate here. >> >> Yes. good idea. I was not aware that we already have that kind of >> tests. >> >> When trying to write this test. I noticed, that I did not try -Os so far. >> But for -Os the stack is still the unchanged 3500 bytes. >> >> However for embedded targets I am often inclined to use -Os, and >> would certainly not expect the stack to explode... >> >> I see in arm.md there are places like >> >> /* If we're optimizing for size, we prefer the libgcc calls. */ >> if (optimize_function_for_size_p (cfun)) >> FAIL; >> > > Oh, yeah. The comment is completely misleading. > > If this pattern fails, expmed.c simply expands some > less efficient rtl, which also results in two shifts > and one or-op. No libgcc calls at all. > > So in simple cases without spilling the resulting > assembler is the same, regardless if this pattern > fails or not. But the half-defined out registers > make a big difference when it has to be spilled. > >> /* Expand operation using core-registers. >> 'FAIL' would achieve the same thing, but this is a bit >> smarter. */ >> scratch1 = gen_reg_rtx (SImode); >> scratch2 = gen_reg_rtx (SImode); >> arm_emit_coreregs_64bit_shift (LSHIFTRT, operands[0], operands[1], >> operands[2], scratch1, scratch2); >> >> >> .. that explains why this happens. I think it would be better to >> use the emit_coreregs for shift count >= 32, because these are >> effectively 32-bit shifts. >> >> Will try if that can be improved, and come back with the >> results. >> > > The test case with -Os has 3520 bytes stack usage. > When only shift count >= 32 are handled we > have still 3000 bytes stack usage. > And when arm_emit_coreregs_64bit_shift is always > allowed to run, we have 2360 bytes stack usage. > > Also for the code size it is better not to fail this > pattern. So I propose to remove this exception in all > three expansions. > > Here is an improved patch with the test case from the PR. > And a comment on the redundant SET why it is better to clear > the out register first. > > > Bootstrap and reg-testing on arm-linux-gnueabihf. > Is it OK for trunk? > > > Thanks > Bernd.