Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread Eric Botcazou
> While debugging some gdb-related FAILs, I discovered that gcc's
> -fstack-check option effectively calls alloca() to adjust the stack
> pointer.

Note that the subject is misleading, -fstack-check is stack checking and not 
stack protection, which is -fstack-protector.

> However, it doesn't mark the stack adjustment as FRAME_RELATED even
> when it's setting up the local variables for the function.
> 
> In the case of rx-elf, for this testcase, the CFA for the function is
> defined in terms of the stack pointer - and thus is incorrect after
> the alloca call.

That's the bug, any function using alloca must use the frame pointer as CFA.

> My question is: who's fault is this?  Should alloca() tell the debug
> stuff that the stack pointer has changed?  Should it tell it to not
> use $sp at all?  Should the debug stuff "just know" that $sp isn't a
> valid choice for the CFA?

Presumably the rx back-end and more precisely TARGET_FRAME_POINTER_REQUIRED, 
which needs to return true if cfun->calls_alloca.

-- 
Eric Botcazou


Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Paweł Sikora

Hi,

the opt_random.h header includes  unconditionally and 
breaks crytopp build

(redefinition of _mm_shuffle_epi8 in cpu.h).
could you please add #ifdef __SSSE3__ around this include?

BR,
Paweł.

cpu.h @ cryptopp:

(...)
#if !defined(__GNUC__) || defined(__SSSE3__) || 
defined(__INTEL_COMPILER)

#include 
#else
__inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))

_mm_shuffle_epi8 (__m128i a, __m128i b)
{
asm ("pshufb %1, %0" : "+x"(a) : "xm"(b));
return a;
}
#endif
(...)



Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Jakub Jelinek
On Thu, Apr 17, 2014 at 11:38:18AM +0200, Paweł Sikora wrote:
> Hi,
> 
> the opt_random.h header includes  unconditionally and
> breaks crytopp build
> (redefinition of _mm_shuffle_epi8 in cpu.h).
> could you please add #ifdef __SSSE3__ around this include?

No, just fix cryptopp.
The *intrin.h headers have been redesigned to use #pragma GCC target,
so that they are usable regardless of the command line options used.
So, unlike in 4.8, you can
#include 
and similar headers and then have say
__attribute__((target ("ssse3"))) function which can use _mm_shuffle_epi8
and other SSSE3 intrinsics (similarly for lots of other ISA sets),
which wasn't possible before.

So, in this case you'd supposedly change the #ifdef to
#if !defined(__GNUC__) || defined(__SSSE3__) || (__GNUC__ > 4) || (__GNUC__ == 
4 && __GNUC_MINOR__ > 8) || defined(__INTEL_COMPILER)
or so.  Of course, you can use _mm_shuffle_epi8 only from functions
compiled within #pragma GCC target ("ssse3") (or anything higher)
or __attribute__((target ("ssse3"))).  If you don't, cryptopp
is even more buggy.

Jakub


Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Jonathan Wakely
On 17 April 2014 10:38, Paweł Sikora wrote:
> Hi,
>
> the opt_random.h header includes  unconditionally and breaks
> crytopp build
> (redefinition of _mm_shuffle_epi8 in cpu.h).
> could you please add #ifdef __SSSE3__ around this include?

Do you mean __SSE3__ not __SSSE3__?
That's the macro that controls whether the
config/cpu/i486/opt/bits/opt_random.h header actually uses the
intrinsics or not:

namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION

#ifdef __SSE3__
  template<>
template
  void
  normal_distribution::


Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Paweł Sikora

W dniu 2014-04-17 12:13, Jonathan Wakely napisał(a):

On 17 April 2014 10:38, Paweł Sikora wrote:

Hi,

the opt_random.h header includes  unconditionally and 
breaks

crytopp build
(redefinition of _mm_shuffle_epi8 in cpu.h).
could you please add #ifdef __SSSE3__ around this include?


Do you mean __SSE3__ not __SSSE3__?


yes, __SSE3__.


That's the macro that controls whether the
config/cpu/i486/opt/bits/opt_random.h header actually uses the
intrinsics or not:

namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION

#ifdef __SSE3__
  template<>
template
  void
  normal_distribution::




Multiple cc's, one gcc -- is it possible?

2014-04-17 Thread Konstantin Vladimirov
Hi,

I want to support, say arch1 and arch2 in custom gcc in the way

gcc -march1 test.c

calls

${INSTALL}/libexec/gcc/arch1/4.8.2/cc1

and

gcc -march2 test.c

calls

${INSTALL}/libexec/gcc/arch2/4.8.2/cc1

Are there any way to do it? Maybe not exactly as I outlined, but the
whole idea is clear, I think. I looked through multilib and multiarch
options, but it seems to be nothing like this.

I may correct anything in build system and arch1 and arch2 backends,
but I don't want to change any core gcc code, say inside gcc.c. Is it
possible? Are there examples how is it possible?

---
With best regards, Konstantin


Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Alexander Ivchenko
Is it ok to port this patch to 4.9 branch:

commit 15bee5d49b1c746fd3e784432d7e4988941a671e
Author: bviyer 
Date:   Fri Apr 11 19:56:42 2014 +

Fix for PR other/60644.
+2014-04-11  Barry Tannenbaum  
+
+   PR other/60644
+   * runtime/os-unix.c: Replaced all occurrances of ANDROID with
+   __ANDROID__.
+   * runtime/bug.h: Likewise.
+   * include/cilk/metaprogramming.h: Likewise.
+   * include/cilk/reducer_min_max.h: Likewise.
+



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@209324
138bc75d-0d04-0410-961f-82ee72b054a4

to fix http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60644 and cure
i686-pc-linux-android build?

thanks,
Alexander

2014-04-17 14:25 GMT+04:00 Paweł Sikora :
> W dniu 2014-04-17 12:13, Jonathan Wakely napisał(a):
>
>> On 17 April 2014 10:38, Paweł Sikora wrote:
>>>
>>> Hi,
>>>
>>> the opt_random.h header includes  unconditionally and breaks
>>> crytopp build
>>> (redefinition of _mm_shuffle_epi8 in cpu.h).
>>> could you please add #ifdef __SSSE3__ around this include?
>>
>>
>> Do you mean __SSE3__ not __SSSE3__?
>
>
> yes, __SSE3__.
>
>
>> That's the macro that controls whether the
>> config/cpu/i486/opt/bits/opt_random.h header actually uses the
>> intrinsics or not:
>>
>> namespace std _GLIBCXX_VISIBILITY(default)
>> {
>> _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>
>> #ifdef __SSE3__
>>   template<>
>> template
>>   void
>>   normal_distribution::
>
>


Re: Multiple cc's, one gcc -- is it possible?

2014-04-17 Thread Richard Biener
On Thu, Apr 17, 2014 at 12:39 PM, Konstantin Vladimirov
 wrote:
> Hi,
>
> I want to support, say arch1 and arch2 in custom gcc in the way
>
> gcc -march1 test.c
>
> calls
>
> ${INSTALL}/libexec/gcc/arch1/4.8.2/cc1
>
> and
>
> gcc -march2 test.c
>
> calls
>
> ${INSTALL}/libexec/gcc/arch2/4.8.2/cc1
>
> Are there any way to do it? Maybe not exactly as I outlined, but the
> whole idea is clear, I think. I looked through multilib and multiarch
> options, but it seems to be nothing like this.
>
> I may correct anything in build system and arch1 and arch2 backends,
> but I don't want to change any core gcc code, say inside gcc.c. Is it
> possible? Are there examples how is it possible?

Only via -B.  We used to have -V to select between different versions
(that was removed), a similar switch to select between different
target triplets could in theory be added.

But note that the driver needs to understand a (subset) of target specific
options for specs processing so even the driver is somewhat target
specific and cannot really be shared.

Richard.

> ---
> With best regards, Konstantin


Re: LRA Stuck in a loop until aborting

2014-04-17 Thread Richard Sandiford
pshor...@dataworx.com.au writes:
> On 17.04.2014 13:00, Jeff Law wrote:
>> On 04/16/14 16:19, Richard Henderson wrote:
>>> 
>>> The register allocators only select an alternative for a move.  They 
>>> do not
>>> choose between N different patterns, separately describing loads, 
>>> stores, and
>>> register-to-register movement.
>>> 
>>> I'm fairly sure the documentation is quite clear on this, and GCC had 
>>> required
>>> this since the beginning of time.
>> Correct on all counts; many an hour was spent reimplementing the PA
>> movXX patterns to satisfy that requirement.
>> 
>> jeff
> I'm convinced :-) but...
>
> gcc internals info about movm is fairly comprehensive and I had taken 
> care to ensure that I satisfied ...
>
> "The constraints on a ‘movm’ must permit moving any hard register to 
> any other hard register provided..."
>
> by providing a define_expand that assigns from a general_operand to a 
> nonimmediate_operand and ...
>
> *ldsi instruction that can load from a general_operand to a 
> nonimmediate_operand
> and a
> *storesi instruction that can store a register_operand to a 
> memory_operand

Must admit I can't find where this is documented from a quick look.
Like Jeff and Richard, I could swear it was documented somewhere...

> In any case, out of curiosity and to convice myself I hadn't imagined 
> the old reload pass handling this I reverted my recent fixes so that 
> ldsi and storesi were once again as described above then repeated the 
> exercise with full rtl dumping on and compared the rtl generated both 
> with and without LRA enabled.
>
> In both cases the *.ira dmp produced the triggering ...
>
> (insn 57 61 58 5 (set (reg/v:SI 46 [orig:31 s ] [31])
>  (reg/v:SI 31 [ s ])) 48 {*ldsi}
>   (expr_list:REG_DEAD (reg/v:SI 31 [ s ])
>  (nil)))
>
> The non-LRA reload rtl produced ..
>
> (insn 57 61 67 3 (set (reg:SI 1 r1)
>  (mem/c:SI (plus:HI (reg/f:HI 3 r3)
>  (const_int 4 [0x4])) [4 %sfp+4 S4 A16])) 48 {*ldsi}
>   (nil))
> (insn 67 57 58 3 (set (mem/c:SI (plus:HI (reg/f:HI 3 r3)
>  (const_int 4 [0x4])) [4 %sfp+4 S4 A16])
>  (reg:SI 1 r1)) 47 {*storesi}
>   (nil))
>
> While the LRA just got stuck in a loop unable to perform the reload of 
> insn 57 that the old reload pass handled (or more correctly didn't choke 
> over - it seems to be a redundant load/store).
>
> I'm really just highlighting this because I know the LRA is quite young 
> and this might be a hint towards a deeper/other issues.

I think LRA just exploits the rule more than reload did.  Reload would
look at the original instruction as a move between two MEMs, pick a hard
register for the destination of the *ldsi, then emit a store from the
hard register to the original MEM destination, on the assumption that
that move must be possible.  (It would never actually match against
*storesi contraints, or even know that *storesi is the instruction
that would be used.)  It was then down to pot luck whether that hard
register gets rused ("inherited") by later instructions that need
(reg/v:SI 46 [orig:31 s ] [31]).

LRA tries to take a more global view by introducing new pseudo registers
to represent as-yet unallocated reload registers.  As you say, a consequence
of this is that you can see moves between two pseudo registers that match the
load pattern predicates but in which the destination is destined to be a MEM.

Thanks,
Richard



Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Jakub Jelinek
On Thu, Apr 17, 2014 at 02:47:50PM +0400, Alexander Ivchenko wrote:
> Is it ok to port this patch to 4.9 branch:

If it always fails to bootstrap with cilkrts on Android right now, then
the patch can't do more harm, so ok.
> 
> commit 15bee5d49b1c746fd3e784432d7e4988941a671e
> Author: bviyer 
> Date:   Fri Apr 11 19:56:42 2014 +
> 
> Fix for PR other/60644.
> +2014-04-11  Barry Tannenbaum  
> +
> +   PR other/60644
> +   * runtime/os-unix.c: Replaced all occurrances of ANDROID with
> +   __ANDROID__.
> +   * runtime/bug.h: Likewise.
> +   * include/cilk/metaprogramming.h: Likewise.
> +   * include/cilk/reducer_min_max.h: Likewise.

Jakub


Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Alexander Ivchenko
That fixes "--disable-shared" case only, which is important for NDK build.

Without "--disable-shared" build fails because there is no -lpthread
on Android and pthreads are in libc there.
Apparently, cilkrts configure does not check that.

I can fix that in 4.9, but I'm ok with fixing it in trunk later as
well, depending on your decision.

2014-04-17 15:31 GMT+04:00 Jakub Jelinek :
> On Thu, Apr 17, 2014 at 02:47:50PM +0400, Alexander Ivchenko wrote:
>> Is it ok to port this patch to 4.9 branch:
>
> If it always fails to bootstrap with cilkrts on Android right now, then
> the patch can't do more harm, so ok.
>>
>> commit 15bee5d49b1c746fd3e784432d7e4988941a671e
>> Author: bviyer 
>> Date:   Fri Apr 11 19:56:42 2014 +
>>
>> Fix for PR other/60644.
>> +2014-04-11  Barry Tannenbaum  
>> +
>> +   PR other/60644
>> +   * runtime/os-unix.c: Replaced all occurrances of ANDROID with
>> +   __ANDROID__.
>> +   * runtime/bug.h: Likewise.
>> +   * include/cilk/metaprogramming.h: Likewise.
>> +   * include/cilk/reducer_min_max.h: Likewise.
>
> Jakub


Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Jakub Jelinek
On Thu, Apr 17, 2014 at 03:40:10PM +0400, Alexander Ivchenko wrote:
> That fixes "--disable-shared" case only, which is important for NDK build.
> 
> Without "--disable-shared" build fails because there is no -lpthread
> on Android and pthreads are in libc there.
> Apparently, cilkrts configure does not check that.
> 
> I can fix that in 4.9, but I'm ok with fixing it in trunk later as
> well, depending on your decision.

Ugh, I guess depends on how the patch looks like.
In any case, I think libgomp contains pretty much the same hunk as cilkrts,
so why you don't run into issues in libgomp and only in libcilkrts?

# Check to see if -pthread or -lpthread is needed.  Prefer the former.
# In case the pthread.h system header is not found, this test will fail.
XPCFLAGS=""
CFLAGS="$CFLAGS -pthread"
AC_LINK_IFELSE(
 [AC_LANG_PROGRAM(
  [#include 
   void *g(void *d) { return NULL; }],
  [pthread_t t; pthread_create(&t,NULL,g,NULL);])],
 [XPCFLAGS=" -Wc,-pthread"],
 [CFLAGS="$save_CFLAGS" LIBS="-lpthread $LIBS"
  AC_LINK_IFELSE(
   [AC_LANG_PROGRAM(
[#include 
 void *g(void *d) { return NULL; }],
[pthread_t t; pthread_create(&t,NULL,g,NULL);])],
   [],
   [AC_MSG_ERROR([Pthreads are required to build libgomp])])])

Jakub


Re: GCC 4.9.0 Release Candidate available from gcc.gnu.org

2014-04-17 Thread Jakub Jelinek
On Thu, Apr 17, 2014 at 02:04:03PM +0200, Jakub Jelinek wrote:
> Ugh, I guess depends on how the patch looks like.
> In any case, I think libgomp contains pretty much the same hunk as cilkrts,
> so why you don't run into issues in libgomp and only in libcilkrts?
> 
> # Check to see if -pthread or -lpthread is needed.  Prefer the former.
> # In case the pthread.h system header is not found, this test will fail.
> XPCFLAGS=""
> CFLAGS="$CFLAGS -pthread"
> AC_LINK_IFELSE(
>  [AC_LANG_PROGRAM(
>   [#include 
>void *g(void *d) { return NULL; }],
>   [pthread_t t; pthread_create(&t,NULL,g,NULL);])],
>  [XPCFLAGS=" -Wc,-pthread"],
>  [CFLAGS="$save_CFLAGS" LIBS="-lpthread $LIBS"
>   AC_LINK_IFELSE(
>[AC_LANG_PROGRAM(
> [#include 
>  void *g(void *d) { return NULL; }],
> [pthread_t t; pthread_create(&t,NULL,g,NULL);])],
>[],
>[AC_MSG_ERROR([Pthreads are required to build libgomp])])])

Ah, I see, while it checks for that, it then happily ignores it.
libgomp instead propagates it into XCFLAGS and LIBS vars that are
substituted in the Makefile and used for the build, libcilkrts
hardcodes -lpthread.

I guess I'd prefer to see this fixed on the trunk first and then backported
to 4.9.1.

Jakub


Re: LRA Stuck in a loop until aborting

2014-04-17 Thread Michael Matz
Hi,

On Thu, 17 Apr 2014, Richard Sandiford wrote:

> > "The constraints on a ‘movm’ must permit moving any hard register to 
> > any other hard register provided..."
> >
> > by providing a define_expand that assigns from a general_operand to a 
> > nonimmediate_operand and ...
> >
> > *ldsi instruction that can load from a general_operand to a 
> > nonimmediate_operand
> > and a
> > *storesi instruction that can store a register_operand to a 
> > memory_operand
> 
> Must admit I can't find where this is documented from a quick look.
> Like Jeff and Richard, I could swear it was documented somewhere...

--- snip --
`movM'
 ...
 This class of patterns is special in several ways.  First of all,
 each of these names up to and including full word size _must_ be
 defined, because there is no other way to copy a datum from one
 place to another.  If there are patterns accepting operands in
 larger modes, `movM' must be defined for integer modes of those
 sizes.

 Second, these patterns are not used solely in the RTL generation
 pass.  Even the reload pass can generate move insns to copy values
 from stack slots into temporary registers.  When it does so, one
 of the operands is a hard register and the other is an operand
 that can need to be reloaded into a register.

 Therefore, when given such a pair of operands, the pattern must
 generate RTL which needs no reloading and needs no temporary
 registers--no registers other than the operands.  For example, if
 you support the pattern with a `define_expand', then in such a
 case the `define_expand' mustn't call `force_reg' or any other such
 function which might generate new pseudo registers.
--- snap --

So, what's actually documented is a bit more relaxed than "movM must be 
able to move around Mmode operands of all forms".  And it's restricted to 
patterns named 'movM', while in reality all patterns that look like (and 
would match) a move are so constrained.


Ciao,
Michael.

Re: [RFC] Detect most integer overflows.

2014-04-17 Thread Ondřej Bílka
On Sat, Apr 12, 2014 at 12:53:45AM +0200, Hannes Frederic Sowa wrote:
> Hi!
> 
> On Tue, Oct 29, 2013 at 10:41:56AM +0100, Richard Biener wrote:
> > For a "quick" GCC implementation of the builtins you could expand
> > them to a open-coded sequence during gimplification.  But due to
> > the issues pointed out above I'm not sure it is the best interface
> > to support (though now the names are taken).
> 
> I played around with gcc internals for the first time today and came
> up with this. As this is my first patch to gcc I am very happy to hear
> feedback. Thanks!
> 
Did you looked at resulting assembly for simple expressions?

Also Paul Eggert suggested at another list[1] to implement these with
128bit arithmetic which gcc can optimize quite well, it uses overflow
flag as check.

Could these builtins use a 128bit arithmetic as well?

P.S.

A generated code is affected by generic gcc bug that gcc uses
conditional move instruction even when branch is very unlikely and 
jump would be faster.

[1] https://sourceware.org/ml/libc-alpha/2013-12/msg00084.html


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> Presumably the rx back-end and more precisely TARGET_FRAME_POINTER_REQUIRED, 
> which needs to return true if cfun->calls_alloca.

The rx back-end doesn't define TARGET_FRAME_POINTER_REQUIRED, as the
documentation says the compiler handles target-independent reasons why
there needs to be a frame pointer.  But, the default
TARGET_FRAME_POINTER_REQUIRED just returns false - shouldn't it, by
default, check for calls_alloca ?

Also, I added that hook and set it to return true always, and it
didn't fix the bug.  There is a frame pointer (there was before, too),
but there's also a stack adjustment after the pseudo-alloca which the
dwarf2 stuff doesn't know about.  The last stack adjustment it sees is
the rx backend's adjustment to allocate the frame:

_medium_frame:   
pushm   r6-r12   
add #-4, r0, r6  ; marked frame-related (fp = sp - 4)
mov.L   r6, r0   ; marked frame-related (sp = fp)
. . .; stack checking code goes here
add #0xc000, r0  ; not marked frame-related

 <_medium_frame>:
   0:   6e 6c   pushm   r6-r12
   2:   71 06 fcadd #-4, r0, r6
   5:   ef 60   mov.l   r6, r0
   7:

  2e:   72 00 00 c0 add #0xc000, r0, r0

0014 0030  FDE cie= pc=..0043
  DW_CFA_advance_loc4: 2 to 0002
  DW_CFA_def_cfa_offset: 32
  DW_CFA_offset: r12 at cfa-8
  . . .
  DW_CFA_offset: r6 at cfa-32
  DW_CFA_advance_loc4: 3 to 0005
  DW_CFA_def_cfa: r6 ofs 36
  DW_CFA_advance_loc4: 2 to 0007
  DW_CFA_def_cfa_register: r0
  ( that's it for debug info )


Perhaps the stack-check code should set FRAME_RELATED on any stack
adjustment insn?


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread Ian Lance Taylor
On Thu, Apr 17, 2014 at 10:14 AM, DJ Delorie  wrote:
>
> The rx back-end doesn't define TARGET_FRAME_POINTER_REQUIRED, as the
> documentation says the compiler handles target-independent reasons why
> there needs to be a frame pointer.  But, the default
> TARGET_FRAME_POINTER_REQUIRED just returns false - shouldn't it, by
> default, check for calls_alloca ?

calls_alloce is checked where frame_pointer_needed is set, in
ira_setup_eliminable_regset in ira.c.

TARGET_FRAME_POINTER_REQUIRED only needs to handle target-specific
conditions.

Ian


gcc-4.8-20140417 is now available

2014-04-17 Thread gccadmin
Snapshot gcc-4.8-20140417 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140417/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch 
revision 209500

You'll find:

 gcc-4.8-20140417.tar.bz2 Complete GCC

  MD5=15f73bb81261a6f25eeb211b740a7aa5
  SHA1=c4ac3f3b7564a38e60ad2efcb0898499e31f7c2d

Diffs from 4.8-20140410 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread Eric Botcazou
> The last stack adjustment it sees is
> the rx backend's adjustment to allocate the frame:
> 
> _medium_frame:
> pushm   r6-r12
> add #-4, r0, r6  ; marked frame-related (fp = sp -
> 4) mov.L   r6, r0   ; marked frame-related (sp = fp) . .
> .  ; stack checking code goes here
> add #0xc000, r0  ; not marked frame-related
>
>  <_medium_frame>:
>0:   6e 6c   pushm   r6-r12
>2:   71 06 fcadd #-4, r0, r6
>5:   ef 60   mov.l   r6, r0
>7:

I gather that r0 is the stack pointer and r6 the frame pointer?

> 0014 0030  FDE cie= pc=..0043
>   DW_CFA_advance_loc4: 2 to 0002
>   DW_CFA_def_cfa_offset: 32
>   DW_CFA_offset: r12 at cfa-8
>   . . .
>   DW_CFA_offset: r6 at cfa-32
>   DW_CFA_advance_loc4: 3 to 0005
>   DW_CFA_def_cfa: r6 ofs 36
>   DW_CFA_advance_loc4: 2 to 0007
>   DW_CFA_def_cfa_register: r0
>   ( that's it for debug info )

If so, the above DW_CFA_def_cfa_register doesn't make sense, it should be r6 
once the frame is established.  What does the CIE contain exactly?

> Perhaps the stack-check code should set FRAME_RELATED on any stack
> adjustment insn?

No, the design is that stack checking or alloca force the use of the frame 
pointer, which thus becomes the CFA register, which means that subsequent 
stack adjustments are irrelevant for the CFI.

-- 
Eric Botcazou


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> I gather that r0 is the stack pointer and r6 the frame pointer?

Yes.

> > 0014 0030  FDE cie= pc=..0043
> >   DW_CFA_advance_loc4: 2 to 0002
> >   DW_CFA_def_cfa_offset: 32
> >   DW_CFA_offset: r12 at cfa-8
> >   . . .
> >   DW_CFA_offset: r6 at cfa-32
> >   DW_CFA_advance_loc4: 3 to 0005
> >   DW_CFA_def_cfa: r6 ofs 36
> >   DW_CFA_advance_loc4: 2 to 0007
> >   DW_CFA_def_cfa_register: r0
> >   ( that's it for debug info )
> 
> If so, the above DW_CFA_def_cfa_register doesn't make sense, it should be r6 
> once the frame is established.  What does the CIE contain exactly?


 0010  CIE
  Version:   3
  Augmentation:  ""
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 17

  DW_CFA_def_cfa: r0 ofs 4
  DW_CFA_offset: r17 at cfa-4
  DW_CFA_nop
  DW_CFA_nop

0014 0030  FDE cie= pc=..0043
  DW_CFA_advance_loc4: 2 to 0002
  DW_CFA_def_cfa_offset: 32
  DW_CFA_offset: r12 at cfa-8
  DW_CFA_offset: r11 at cfa-12
  DW_CFA_offset: r10 at cfa-16
  DW_CFA_offset: r9 at cfa-20
  DW_CFA_offset: r8 at cfa-24
  DW_CFA_offset: r7 at cfa-28
  DW_CFA_offset: r6 at cfa-32
  DW_CFA_advance_loc4: 3 to 0005
  DW_CFA_def_cfa: r6 ofs 36
  DW_CFA_advance_loc4: 2 to 0007
  DW_CFA_def_cfa_register: r0


> > Perhaps the stack-check code should set FRAME_RELATED on any stack
> > adjustment insn?
> 
> No, the design is that stack checking or alloca force the use of the frame 
> pointer, which thus becomes the CFA register, which means that subsequent 
> stack adjustments are irrelevant for the CFI.

Does the backend have to *not* mark further changes to the stack
pointer in the prologue as frame related, if the function calls
alloca?  This is the RL expand_prologue() is emitting:

(insn/f 42 5 43 2 (parallel [
(set/f (reg/f:SI 0 r0)
(minus:SI (reg/f:SI 0 r0)
(const_int 28 [0x1c])))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 4 [0x4])) [0  S4 A8])
(reg:SI 12 r12))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 8 [0x8])) [0  S4 A8])
(reg:SI 11 r11))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 12 [0xc])) [0  S4 A8])
(reg:SI 10 r10))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 16 [0x10])) [0  S4 A8])
(reg:SI 9 r9))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 20 [0x14])) [0  S4 A8])
(reg:SI 8 r8))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 24 [0x18])) [0  S4 A8])
(reg/f:SI 7 r7))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 28 [0x1c])) [0  S4 A8])
(reg/f:SI 6 r6))
]) dj.c:2 -1
 (nil))

(insn/f 43 42 44 2 (parallel [
(set (reg/f:SI 6 r6)
(plus:SI (reg/f:SI 0 r0)
(const_int -4 [0xfffc])))
(clobber (reg:CC 16 cc))
]) dj.c:2 -1
 (nil))

(insn/f 44 43 45 2 (set (reg/f:SI 0 r0)
(reg/f:SI 6 r6)) dj.c:2 -1
 (nil))


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread Eric Botcazou
>  0010  CIE
>   Version:   3
>   Augmentation:  ""
>   Code alignment factor: 1
>   Data alignment factor: -4
>   Return address column: 17
> 
>   DW_CFA_def_cfa: r0 ofs 4
>   DW_CFA_offset: r17 at cfa-4
>   DW_CFA_nop
>   DW_CFA_nop

OK so r0 (sp) is the CFA register upon entering the function.  Then...

> 0014 0030  FDE cie= pc=..0043
>   DW_CFA_advance_loc4: 2 to 0002
>   DW_CFA_def_cfa_offset: 32
>   DW_CFA_offset: r12 at cfa-8
>   DW_CFA_offset: r11 at cfa-12
>   DW_CFA_offset: r10 at cfa-16
>   DW_CFA_offset: r9 at cfa-20
>   DW_CFA_offset: r8 at cfa-24
>   DW_CFA_offset: r7 at cfa-28
>   DW_CFA_offset: r6 at cfa-32
>   DW_CFA_advance_loc4: 3 to 0005
>   DW_CFA_def_cfa: r6 ofs 36

... r6 (fp) briefly becomes the CFA register but...

>   DW_CFA_advance_loc4: 2 to 0007
>   DW_CFA_def_cfa_register: r0

... r0 (sp) is restored almost immediately as the CFA register.  That's the 
bug, r6 (fp) must be kept as the CFA register throughout the function.

> Does the backend have to *not* mark further changes to the stack
> pointer in the prologue as frame related, if the function calls
> alloca?

I think that the bug is unrelated to alloca or stack checking:

_medium_frame:   
pushm   r6-r12   
add #-4, r0, r6  ; marked frame-related (fp = sp - 4)
mov.L   r6, r0   ; marked frame-related (sp = fp)
. . .; stack checking code goes here
add #0xc000, r0  ; not marked frame-related

The "mov.L r6, r0" instruction must never be marked as frame-related, for any 
function.

-- 
Eric Botcazou


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> The "mov.L r6, r0" instruction must never be marked as frame-related, for any 
> function.

Is this documented somewhere?


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> The "mov.L r6, r0" instruction must never be marked as frame-related, for any 
> function.

Also, is that rule true if we *don't* have a frame pointer?  That is,
when we add a constant to the stack to allocate the frame, should that
function be marked as frame-related?  Or is it just the fp->sp move
(or potentially an add, if there's outgoing args) that shouldn't be
marked?