Re: stack-protection vs alloca vs dwarf2
> While debugging some gdb-related FAILs, I discovered that gcc's > -fstack-check option effectively calls alloca() to adjust the stack > pointer. Note that the subject is misleading, -fstack-check is stack checking and not stack protection, which is -fstack-protector. > However, it doesn't mark the stack adjustment as FRAME_RELATED even > when it's setting up the local variables for the function. > > In the case of rx-elf, for this testcase, the CFA for the function is > defined in terms of the stack pointer - and thus is incorrect after > the alloca call. That's the bug, any function using alloca must use the frame pointer as CFA. > My question is: who's fault is this? Should alloca() tell the debug > stuff that the stack pointer has changed? Should it tell it to not > use $sp at all? Should the debug stuff "just know" that $sp isn't a > valid choice for the CFA? Presumably the rx back-end and more precisely TARGET_FRAME_POINTER_REQUIRED, which needs to return true if cfun->calls_alloca. -- Eric Botcazou
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
Hi, the opt_random.h header includes unconditionally and breaks crytopp build (redefinition of _mm_shuffle_epi8 in cpu.h). could you please add #ifdef __SSSE3__ around this include? BR, Paweł. cpu.h @ cryptopp: (...) #if !defined(__GNUC__) || defined(__SSSE3__) || defined(__INTEL_COMPILER) #include #else __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_shuffle_epi8 (__m128i a, __m128i b) { asm ("pshufb %1, %0" : "+x"(a) : "xm"(b)); return a; } #endif (...)
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
On Thu, Apr 17, 2014 at 11:38:18AM +0200, Paweł Sikora wrote: > Hi, > > the opt_random.h header includes unconditionally and > breaks crytopp build > (redefinition of _mm_shuffle_epi8 in cpu.h). > could you please add #ifdef __SSSE3__ around this include? No, just fix cryptopp. The *intrin.h headers have been redesigned to use #pragma GCC target, so that they are usable regardless of the command line options used. So, unlike in 4.8, you can #include and similar headers and then have say __attribute__((target ("ssse3"))) function which can use _mm_shuffle_epi8 and other SSSE3 intrinsics (similarly for lots of other ISA sets), which wasn't possible before. So, in this case you'd supposedly change the #ifdef to #if !defined(__GNUC__) || defined(__SSSE3__) || (__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ > 8) || defined(__INTEL_COMPILER) or so. Of course, you can use _mm_shuffle_epi8 only from functions compiled within #pragma GCC target ("ssse3") (or anything higher) or __attribute__((target ("ssse3"))). If you don't, cryptopp is even more buggy. Jakub
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
On 17 April 2014 10:38, Paweł Sikora wrote: > Hi, > > the opt_random.h header includes unconditionally and breaks > crytopp build > (redefinition of _mm_shuffle_epi8 in cpu.h). > could you please add #ifdef __SSSE3__ around this include? Do you mean __SSE3__ not __SSSE3__? That's the macro that controls whether the config/cpu/i486/opt/bits/opt_random.h header actually uses the intrinsics or not: namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION #ifdef __SSE3__ template<> template void normal_distribution::
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
W dniu 2014-04-17 12:13, Jonathan Wakely napisał(a): On 17 April 2014 10:38, Paweł Sikora wrote: Hi, the opt_random.h header includes unconditionally and breaks crytopp build (redefinition of _mm_shuffle_epi8 in cpu.h). could you please add #ifdef __SSSE3__ around this include? Do you mean __SSE3__ not __SSSE3__? yes, __SSE3__. That's the macro that controls whether the config/cpu/i486/opt/bits/opt_random.h header actually uses the intrinsics or not: namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION #ifdef __SSE3__ template<> template void normal_distribution::
Multiple cc's, one gcc -- is it possible?
Hi, I want to support, say arch1 and arch2 in custom gcc in the way gcc -march1 test.c calls ${INSTALL}/libexec/gcc/arch1/4.8.2/cc1 and gcc -march2 test.c calls ${INSTALL}/libexec/gcc/arch2/4.8.2/cc1 Are there any way to do it? Maybe not exactly as I outlined, but the whole idea is clear, I think. I looked through multilib and multiarch options, but it seems to be nothing like this. I may correct anything in build system and arch1 and arch2 backends, but I don't want to change any core gcc code, say inside gcc.c. Is it possible? Are there examples how is it possible? --- With best regards, Konstantin
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
Is it ok to port this patch to 4.9 branch: commit 15bee5d49b1c746fd3e784432d7e4988941a671e Author: bviyer Date: Fri Apr 11 19:56:42 2014 + Fix for PR other/60644. +2014-04-11 Barry Tannenbaum + + PR other/60644 + * runtime/os-unix.c: Replaced all occurrances of ANDROID with + __ANDROID__. + * runtime/bug.h: Likewise. + * include/cilk/metaprogramming.h: Likewise. + * include/cilk/reducer_min_max.h: Likewise. + git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@209324 138bc75d-0d04-0410-961f-82ee72b054a4 to fix http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60644 and cure i686-pc-linux-android build? thanks, Alexander 2014-04-17 14:25 GMT+04:00 Paweł Sikora : > W dniu 2014-04-17 12:13, Jonathan Wakely napisał(a): > >> On 17 April 2014 10:38, Paweł Sikora wrote: >>> >>> Hi, >>> >>> the opt_random.h header includes unconditionally and breaks >>> crytopp build >>> (redefinition of _mm_shuffle_epi8 in cpu.h). >>> could you please add #ifdef __SSSE3__ around this include? >> >> >> Do you mean __SSE3__ not __SSSE3__? > > > yes, __SSE3__. > > >> That's the macro that controls whether the >> config/cpu/i486/opt/bits/opt_random.h header actually uses the >> intrinsics or not: >> >> namespace std _GLIBCXX_VISIBILITY(default) >> { >> _GLIBCXX_BEGIN_NAMESPACE_VERSION >> >> #ifdef __SSE3__ >> template<> >> template >> void >> normal_distribution:: > >
Re: Multiple cc's, one gcc -- is it possible?
On Thu, Apr 17, 2014 at 12:39 PM, Konstantin Vladimirov wrote: > Hi, > > I want to support, say arch1 and arch2 in custom gcc in the way > > gcc -march1 test.c > > calls > > ${INSTALL}/libexec/gcc/arch1/4.8.2/cc1 > > and > > gcc -march2 test.c > > calls > > ${INSTALL}/libexec/gcc/arch2/4.8.2/cc1 > > Are there any way to do it? Maybe not exactly as I outlined, but the > whole idea is clear, I think. I looked through multilib and multiarch > options, but it seems to be nothing like this. > > I may correct anything in build system and arch1 and arch2 backends, > but I don't want to change any core gcc code, say inside gcc.c. Is it > possible? Are there examples how is it possible? Only via -B. We used to have -V to select between different versions (that was removed), a similar switch to select between different target triplets could in theory be added. But note that the driver needs to understand a (subset) of target specific options for specs processing so even the driver is somewhat target specific and cannot really be shared. Richard. > --- > With best regards, Konstantin
Re: LRA Stuck in a loop until aborting
pshor...@dataworx.com.au writes: > On 17.04.2014 13:00, Jeff Law wrote: >> On 04/16/14 16:19, Richard Henderson wrote: >>> >>> The register allocators only select an alternative for a move. They >>> do not >>> choose between N different patterns, separately describing loads, >>> stores, and >>> register-to-register movement. >>> >>> I'm fairly sure the documentation is quite clear on this, and GCC had >>> required >>> this since the beginning of time. >> Correct on all counts; many an hour was spent reimplementing the PA >> movXX patterns to satisfy that requirement. >> >> jeff > I'm convinced :-) but... > > gcc internals info about movm is fairly comprehensive and I had taken > care to ensure that I satisfied ... > > "The constraints on a ‘movm’ must permit moving any hard register to > any other hard register provided..." > > by providing a define_expand that assigns from a general_operand to a > nonimmediate_operand and ... > > *ldsi instruction that can load from a general_operand to a > nonimmediate_operand > and a > *storesi instruction that can store a register_operand to a > memory_operand Must admit I can't find where this is documented from a quick look. Like Jeff and Richard, I could swear it was documented somewhere... > In any case, out of curiosity and to convice myself I hadn't imagined > the old reload pass handling this I reverted my recent fixes so that > ldsi and storesi were once again as described above then repeated the > exercise with full rtl dumping on and compared the rtl generated both > with and without LRA enabled. > > In both cases the *.ira dmp produced the triggering ... > > (insn 57 61 58 5 (set (reg/v:SI 46 [orig:31 s ] [31]) > (reg/v:SI 31 [ s ])) 48 {*ldsi} > (expr_list:REG_DEAD (reg/v:SI 31 [ s ]) > (nil))) > > The non-LRA reload rtl produced .. > > (insn 57 61 67 3 (set (reg:SI 1 r1) > (mem/c:SI (plus:HI (reg/f:HI 3 r3) > (const_int 4 [0x4])) [4 %sfp+4 S4 A16])) 48 {*ldsi} > (nil)) > (insn 67 57 58 3 (set (mem/c:SI (plus:HI (reg/f:HI 3 r3) > (const_int 4 [0x4])) [4 %sfp+4 S4 A16]) > (reg:SI 1 r1)) 47 {*storesi} > (nil)) > > While the LRA just got stuck in a loop unable to perform the reload of > insn 57 that the old reload pass handled (or more correctly didn't choke > over - it seems to be a redundant load/store). > > I'm really just highlighting this because I know the LRA is quite young > and this might be a hint towards a deeper/other issues. I think LRA just exploits the rule more than reload did. Reload would look at the original instruction as a move between two MEMs, pick a hard register for the destination of the *ldsi, then emit a store from the hard register to the original MEM destination, on the assumption that that move must be possible. (It would never actually match against *storesi contraints, or even know that *storesi is the instruction that would be used.) It was then down to pot luck whether that hard register gets rused ("inherited") by later instructions that need (reg/v:SI 46 [orig:31 s ] [31]). LRA tries to take a more global view by introducing new pseudo registers to represent as-yet unallocated reload registers. As you say, a consequence of this is that you can see moves between two pseudo registers that match the load pattern predicates but in which the destination is destined to be a MEM. Thanks, Richard
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
On Thu, Apr 17, 2014 at 02:47:50PM +0400, Alexander Ivchenko wrote: > Is it ok to port this patch to 4.9 branch: If it always fails to bootstrap with cilkrts on Android right now, then the patch can't do more harm, so ok. > > commit 15bee5d49b1c746fd3e784432d7e4988941a671e > Author: bviyer > Date: Fri Apr 11 19:56:42 2014 + > > Fix for PR other/60644. > +2014-04-11 Barry Tannenbaum > + > + PR other/60644 > + * runtime/os-unix.c: Replaced all occurrances of ANDROID with > + __ANDROID__. > + * runtime/bug.h: Likewise. > + * include/cilk/metaprogramming.h: Likewise. > + * include/cilk/reducer_min_max.h: Likewise. Jakub
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
That fixes "--disable-shared" case only, which is important for NDK build. Without "--disable-shared" build fails because there is no -lpthread on Android and pthreads are in libc there. Apparently, cilkrts configure does not check that. I can fix that in 4.9, but I'm ok with fixing it in trunk later as well, depending on your decision. 2014-04-17 15:31 GMT+04:00 Jakub Jelinek : > On Thu, Apr 17, 2014 at 02:47:50PM +0400, Alexander Ivchenko wrote: >> Is it ok to port this patch to 4.9 branch: > > If it always fails to bootstrap with cilkrts on Android right now, then > the patch can't do more harm, so ok. >> >> commit 15bee5d49b1c746fd3e784432d7e4988941a671e >> Author: bviyer >> Date: Fri Apr 11 19:56:42 2014 + >> >> Fix for PR other/60644. >> +2014-04-11 Barry Tannenbaum >> + >> + PR other/60644 >> + * runtime/os-unix.c: Replaced all occurrances of ANDROID with >> + __ANDROID__. >> + * runtime/bug.h: Likewise. >> + * include/cilk/metaprogramming.h: Likewise. >> + * include/cilk/reducer_min_max.h: Likewise. > > Jakub
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
On Thu, Apr 17, 2014 at 03:40:10PM +0400, Alexander Ivchenko wrote: > That fixes "--disable-shared" case only, which is important for NDK build. > > Without "--disable-shared" build fails because there is no -lpthread > on Android and pthreads are in libc there. > Apparently, cilkrts configure does not check that. > > I can fix that in 4.9, but I'm ok with fixing it in trunk later as > well, depending on your decision. Ugh, I guess depends on how the patch looks like. In any case, I think libgomp contains pretty much the same hunk as cilkrts, so why you don't run into issues in libgomp and only in libcilkrts? # Check to see if -pthread or -lpthread is needed. Prefer the former. # In case the pthread.h system header is not found, this test will fail. XPCFLAGS="" CFLAGS="$CFLAGS -pthread" AC_LINK_IFELSE( [AC_LANG_PROGRAM( [#include void *g(void *d) { return NULL; }], [pthread_t t; pthread_create(&t,NULL,g,NULL);])], [XPCFLAGS=" -Wc,-pthread"], [CFLAGS="$save_CFLAGS" LIBS="-lpthread $LIBS" AC_LINK_IFELSE( [AC_LANG_PROGRAM( [#include void *g(void *d) { return NULL; }], [pthread_t t; pthread_create(&t,NULL,g,NULL);])], [], [AC_MSG_ERROR([Pthreads are required to build libgomp])])]) Jakub
Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org
On Thu, Apr 17, 2014 at 02:04:03PM +0200, Jakub Jelinek wrote: > Ugh, I guess depends on how the patch looks like. > In any case, I think libgomp contains pretty much the same hunk as cilkrts, > so why you don't run into issues in libgomp and only in libcilkrts? > > # Check to see if -pthread or -lpthread is needed. Prefer the former. > # In case the pthread.h system header is not found, this test will fail. > XPCFLAGS="" > CFLAGS="$CFLAGS -pthread" > AC_LINK_IFELSE( > [AC_LANG_PROGRAM( > [#include >void *g(void *d) { return NULL; }], > [pthread_t t; pthread_create(&t,NULL,g,NULL);])], > [XPCFLAGS=" -Wc,-pthread"], > [CFLAGS="$save_CFLAGS" LIBS="-lpthread $LIBS" > AC_LINK_IFELSE( >[AC_LANG_PROGRAM( > [#include > void *g(void *d) { return NULL; }], > [pthread_t t; pthread_create(&t,NULL,g,NULL);])], >[], >[AC_MSG_ERROR([Pthreads are required to build libgomp])])]) Ah, I see, while it checks for that, it then happily ignores it. libgomp instead propagates it into XCFLAGS and LIBS vars that are substituted in the Makefile and used for the build, libcilkrts hardcodes -lpthread. I guess I'd prefer to see this fixed on the trunk first and then backported to 4.9.1. Jakub
Re: LRA Stuck in a loop until aborting
Hi, On Thu, 17 Apr 2014, Richard Sandiford wrote: > > "The constraints on a ‘movm’ must permit moving any hard register to > > any other hard register provided..." > > > > by providing a define_expand that assigns from a general_operand to a > > nonimmediate_operand and ... > > > > *ldsi instruction that can load from a general_operand to a > > nonimmediate_operand > > and a > > *storesi instruction that can store a register_operand to a > > memory_operand > > Must admit I can't find where this is documented from a quick look. > Like Jeff and Richard, I could swear it was documented somewhere... --- snip -- `movM' ... This class of patterns is special in several ways. First of all, each of these names up to and including full word size _must_ be defined, because there is no other way to copy a datum from one place to another. If there are patterns accepting operands in larger modes, `movM' must be defined for integer modes of those sizes. Second, these patterns are not used solely in the RTL generation pass. Even the reload pass can generate move insns to copy values from stack slots into temporary registers. When it does so, one of the operands is a hard register and the other is an operand that can need to be reloaded into a register. Therefore, when given such a pair of operands, the pattern must generate RTL which needs no reloading and needs no temporary registers--no registers other than the operands. For example, if you support the pattern with a `define_expand', then in such a case the `define_expand' mustn't call `force_reg' or any other such function which might generate new pseudo registers. --- snap -- So, what's actually documented is a bit more relaxed than "movM must be able to move around Mmode operands of all forms". And it's restricted to patterns named 'movM', while in reality all patterns that look like (and would match) a move are so constrained. Ciao, Michael.
Re: [RFC] Detect most integer overflows.
On Sat, Apr 12, 2014 at 12:53:45AM +0200, Hannes Frederic Sowa wrote: > Hi! > > On Tue, Oct 29, 2013 at 10:41:56AM +0100, Richard Biener wrote: > > For a "quick" GCC implementation of the builtins you could expand > > them to a open-coded sequence during gimplification. But due to > > the issues pointed out above I'm not sure it is the best interface > > to support (though now the names are taken). > > I played around with gcc internals for the first time today and came > up with this. As this is my first patch to gcc I am very happy to hear > feedback. Thanks! > Did you looked at resulting assembly for simple expressions? Also Paul Eggert suggested at another list[1] to implement these with 128bit arithmetic which gcc can optimize quite well, it uses overflow flag as check. Could these builtins use a 128bit arithmetic as well? P.S. A generated code is affected by generic gcc bug that gcc uses conditional move instruction even when branch is very unlikely and jump would be faster. [1] https://sourceware.org/ml/libc-alpha/2013-12/msg00084.html
Re: stack-protection vs alloca vs dwarf2
> Presumably the rx back-end and more precisely TARGET_FRAME_POINTER_REQUIRED, > which needs to return true if cfun->calls_alloca. The rx back-end doesn't define TARGET_FRAME_POINTER_REQUIRED, as the documentation says the compiler handles target-independent reasons why there needs to be a frame pointer. But, the default TARGET_FRAME_POINTER_REQUIRED just returns false - shouldn't it, by default, check for calls_alloca ? Also, I added that hook and set it to return true always, and it didn't fix the bug. There is a frame pointer (there was before, too), but there's also a stack adjustment after the pseudo-alloca which the dwarf2 stuff doesn't know about. The last stack adjustment it sees is the rx backend's adjustment to allocate the frame: _medium_frame: pushm r6-r12 add #-4, r0, r6 ; marked frame-related (fp = sp - 4) mov.L r6, r0 ; marked frame-related (sp = fp) . . .; stack checking code goes here add #0xc000, r0 ; not marked frame-related <_medium_frame>: 0: 6e 6c pushm r6-r12 2: 71 06 fcadd #-4, r0, r6 5: ef 60 mov.l r6, r0 7: 2e: 72 00 00 c0 add #0xc000, r0, r0 0014 0030 FDE cie= pc=..0043 DW_CFA_advance_loc4: 2 to 0002 DW_CFA_def_cfa_offset: 32 DW_CFA_offset: r12 at cfa-8 . . . DW_CFA_offset: r6 at cfa-32 DW_CFA_advance_loc4: 3 to 0005 DW_CFA_def_cfa: r6 ofs 36 DW_CFA_advance_loc4: 2 to 0007 DW_CFA_def_cfa_register: r0 ( that's it for debug info ) Perhaps the stack-check code should set FRAME_RELATED on any stack adjustment insn?
Re: stack-protection vs alloca vs dwarf2
On Thu, Apr 17, 2014 at 10:14 AM, DJ Delorie wrote: > > The rx back-end doesn't define TARGET_FRAME_POINTER_REQUIRED, as the > documentation says the compiler handles target-independent reasons why > there needs to be a frame pointer. But, the default > TARGET_FRAME_POINTER_REQUIRED just returns false - shouldn't it, by > default, check for calls_alloca ? calls_alloce is checked where frame_pointer_needed is set, in ira_setup_eliminable_regset in ira.c. TARGET_FRAME_POINTER_REQUIRED only needs to handle target-specific conditions. Ian
gcc-4.8-20140417 is now available
Snapshot gcc-4.8-20140417 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140417/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.8 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 209500 You'll find: gcc-4.8-20140417.tar.bz2 Complete GCC MD5=15f73bb81261a6f25eeb211b740a7aa5 SHA1=c4ac3f3b7564a38e60ad2efcb0898499e31f7c2d Diffs from 4.8-20140410 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.8 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: stack-protection vs alloca vs dwarf2
> The last stack adjustment it sees is > the rx backend's adjustment to allocate the frame: > > _medium_frame: > pushm r6-r12 > add #-4, r0, r6 ; marked frame-related (fp = sp - > 4) mov.L r6, r0 ; marked frame-related (sp = fp) . . > . ; stack checking code goes here > add #0xc000, r0 ; not marked frame-related > > <_medium_frame>: >0: 6e 6c pushm r6-r12 >2: 71 06 fcadd #-4, r0, r6 >5: ef 60 mov.l r6, r0 >7: I gather that r0 is the stack pointer and r6 the frame pointer? > 0014 0030 FDE cie= pc=..0043 > DW_CFA_advance_loc4: 2 to 0002 > DW_CFA_def_cfa_offset: 32 > DW_CFA_offset: r12 at cfa-8 > . . . > DW_CFA_offset: r6 at cfa-32 > DW_CFA_advance_loc4: 3 to 0005 > DW_CFA_def_cfa: r6 ofs 36 > DW_CFA_advance_loc4: 2 to 0007 > DW_CFA_def_cfa_register: r0 > ( that's it for debug info ) If so, the above DW_CFA_def_cfa_register doesn't make sense, it should be r6 once the frame is established. What does the CIE contain exactly? > Perhaps the stack-check code should set FRAME_RELATED on any stack > adjustment insn? No, the design is that stack checking or alloca force the use of the frame pointer, which thus becomes the CFA register, which means that subsequent stack adjustments are irrelevant for the CFI. -- Eric Botcazou
Re: stack-protection vs alloca vs dwarf2
> I gather that r0 is the stack pointer and r6 the frame pointer? Yes. > > 0014 0030 FDE cie= pc=..0043 > > DW_CFA_advance_loc4: 2 to 0002 > > DW_CFA_def_cfa_offset: 32 > > DW_CFA_offset: r12 at cfa-8 > > . . . > > DW_CFA_offset: r6 at cfa-32 > > DW_CFA_advance_loc4: 3 to 0005 > > DW_CFA_def_cfa: r6 ofs 36 > > DW_CFA_advance_loc4: 2 to 0007 > > DW_CFA_def_cfa_register: r0 > > ( that's it for debug info ) > > If so, the above DW_CFA_def_cfa_register doesn't make sense, it should be r6 > once the frame is established. What does the CIE contain exactly? 0010 CIE Version: 3 Augmentation: "" Code alignment factor: 1 Data alignment factor: -4 Return address column: 17 DW_CFA_def_cfa: r0 ofs 4 DW_CFA_offset: r17 at cfa-4 DW_CFA_nop DW_CFA_nop 0014 0030 FDE cie= pc=..0043 DW_CFA_advance_loc4: 2 to 0002 DW_CFA_def_cfa_offset: 32 DW_CFA_offset: r12 at cfa-8 DW_CFA_offset: r11 at cfa-12 DW_CFA_offset: r10 at cfa-16 DW_CFA_offset: r9 at cfa-20 DW_CFA_offset: r8 at cfa-24 DW_CFA_offset: r7 at cfa-28 DW_CFA_offset: r6 at cfa-32 DW_CFA_advance_loc4: 3 to 0005 DW_CFA_def_cfa: r6 ofs 36 DW_CFA_advance_loc4: 2 to 0007 DW_CFA_def_cfa_register: r0 > > Perhaps the stack-check code should set FRAME_RELATED on any stack > > adjustment insn? > > No, the design is that stack checking or alloca force the use of the frame > pointer, which thus becomes the CFA register, which means that subsequent > stack adjustments are irrelevant for the CFI. Does the backend have to *not* mark further changes to the stack pointer in the prologue as frame related, if the function calls alloca? This is the RL expand_prologue() is emitting: (insn/f 42 5 43 2 (parallel [ (set/f (reg/f:SI 0 r0) (minus:SI (reg/f:SI 0 r0) (const_int 28 [0x1c]))) (set/f (mem:SI (minus:SI (reg/f:SI 0 r0) (const_int 4 [0x4])) [0 S4 A8]) (reg:SI 12 r12)) (set/f (mem:SI (minus:SI (reg/f:SI 0 r0) (const_int 8 [0x8])) [0 S4 A8]) (reg:SI 11 r11)) (set/f (mem:SI (minus:SI (reg/f:SI 0 r0) (const_int 12 [0xc])) [0 S4 A8]) (reg:SI 10 r10)) (set/f (mem:SI (minus:SI (reg/f:SI 0 r0) (const_int 16 [0x10])) [0 S4 A8]) (reg:SI 9 r9)) (set/f (mem:SI (minus:SI (reg/f:SI 0 r0) (const_int 20 [0x14])) [0 S4 A8]) (reg:SI 8 r8)) (set/f (mem:SI (minus:SI (reg/f:SI 0 r0) (const_int 24 [0x18])) [0 S4 A8]) (reg/f:SI 7 r7)) (set/f (mem:SI (minus:SI (reg/f:SI 0 r0) (const_int 28 [0x1c])) [0 S4 A8]) (reg/f:SI 6 r6)) ]) dj.c:2 -1 (nil)) (insn/f 43 42 44 2 (parallel [ (set (reg/f:SI 6 r6) (plus:SI (reg/f:SI 0 r0) (const_int -4 [0xfffc]))) (clobber (reg:CC 16 cc)) ]) dj.c:2 -1 (nil)) (insn/f 44 43 45 2 (set (reg/f:SI 0 r0) (reg/f:SI 6 r6)) dj.c:2 -1 (nil))
Re: stack-protection vs alloca vs dwarf2
> 0010 CIE > Version: 3 > Augmentation: "" > Code alignment factor: 1 > Data alignment factor: -4 > Return address column: 17 > > DW_CFA_def_cfa: r0 ofs 4 > DW_CFA_offset: r17 at cfa-4 > DW_CFA_nop > DW_CFA_nop OK so r0 (sp) is the CFA register upon entering the function. Then... > 0014 0030 FDE cie= pc=..0043 > DW_CFA_advance_loc4: 2 to 0002 > DW_CFA_def_cfa_offset: 32 > DW_CFA_offset: r12 at cfa-8 > DW_CFA_offset: r11 at cfa-12 > DW_CFA_offset: r10 at cfa-16 > DW_CFA_offset: r9 at cfa-20 > DW_CFA_offset: r8 at cfa-24 > DW_CFA_offset: r7 at cfa-28 > DW_CFA_offset: r6 at cfa-32 > DW_CFA_advance_loc4: 3 to 0005 > DW_CFA_def_cfa: r6 ofs 36 ... r6 (fp) briefly becomes the CFA register but... > DW_CFA_advance_loc4: 2 to 0007 > DW_CFA_def_cfa_register: r0 ... r0 (sp) is restored almost immediately as the CFA register. That's the bug, r6 (fp) must be kept as the CFA register throughout the function. > Does the backend have to *not* mark further changes to the stack > pointer in the prologue as frame related, if the function calls > alloca? I think that the bug is unrelated to alloca or stack checking: _medium_frame: pushm r6-r12 add #-4, r0, r6 ; marked frame-related (fp = sp - 4) mov.L r6, r0 ; marked frame-related (sp = fp) . . .; stack checking code goes here add #0xc000, r0 ; not marked frame-related The "mov.L r6, r0" instruction must never be marked as frame-related, for any function. -- Eric Botcazou
Re: stack-protection vs alloca vs dwarf2
> The "mov.L r6, r0" instruction must never be marked as frame-related, for any > function. Is this documented somewhere?
Re: stack-protection vs alloca vs dwarf2
> The "mov.L r6, r0" instruction must never be marked as frame-related, for any > function. Also, is that rule true if we *don't* have a frame pointer? That is, when we add a constant to the stack to allocate the frame, should that function be marked as frame-related? Or is it just the fp->sp move (or potentially an add, if there's outgoing args) that shouldn't be marked?