Re: Stack offset computation for incoming arguments.

2014-05-30 Thread Eric Botcazou
> ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM need to be pseudo-registers if
> they do not represent real registers.

The wording "pseudo registers" is obviously a bit confusing in this context...

If ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM do not represent real registers 
then they need to be fake hard registers, i.e. hard registers according to the 
FIRST_PSEUDO_REGISTER macro but with an arbitrary REGNUM (typically just below 
the FIRST_PSEUDO_REGISTER macro).  See the numerous examples in the tree.
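As an illustration (register counts and numbers here are hypothetical, purely to show the layout):

```c
/* 16 real hard registers (0-15), plus two fake hard registers placed
   just below the FIRST_PSEUDO_REGISTER boundary.  Reload eliminates
   them, so they never appear in the final code.  */
#define FIRST_PSEUDO_REGISTER 18

#define FRAME_POINTER_REGNUM  16  /* fake hard reg */
#define ARG_POINTER_REGNUM    17  /* fake hard reg */
```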

-- 
Eric Botcazou


Re: Stack offset computation for incoming arguments.

2014-05-30 Thread Umesh Kalappa
Dear Eric,

Really Appreciate your reply here and made the following changes like

#define ARG_POINTER_REGNUM    8   /* fake hard reg */
#define FRAME_POINTER_REGNUM  9   /* fake hard reg */
#define STACK_POINTER_REGNUM  10

#define ELIMINABLE_REGS { {ARG_POINTER_REGNUM,STACK_POINTER_REGNUM},\
  {ARG_POINTER_REGNUM, FRAME_POINTER_REGNUM},   \
  {FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM} }
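
For reference, the eliminations listed in ELIMINABLE_REGS only happen when the target allows them; below is a hedged sketch of the usual companion hook (names are placeholders, not working code for any particular port):

```c
/* Sketch: eliminating to the stack pointer is only valid when no frame
   pointer is needed, which is what -fomit-frame-pointer influences.
   Eliminating to the frame pointer is always allowed here.  */
static bool
my_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
{
  return to == STACK_POINTER_REGNUM ? !frame_pointer_needed : true;
}

#undef  TARGET_CAN_ELIMINATE
#define TARGET_CAN_ELIMINATE my_can_eliminate
```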


The ARG and FRAME registers are not marked as fixed regs, but they are
marked as call-used regs.


The reload pass eliminates the arg and fp regs to sp when the sample
is compiled with the -fomit-frame-pointer option, i.e.

$ t-gcc -S -fomit-frame-pointer sample.c

But without the -fomit-frame-pointer option the arg and fp regs are
not replaced with sp.

Could you please help us with any hints?

Thank you
~Umesh

On Fri, May 30, 2014 at 4:24 PM, Eric Botcazou  wrote:
>> ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM need to be pseudo-registers if
>> they do not represent real registers.
>
> The wording "pseudo registers" is obviously a bit confusing in this context...
>
> If ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM do not represent real registers
> then they need to be fake hard registers, i.e. hard registers according to the
> FIRST_PSEUDO_REGISTER macro but with an arbitrary REGNUM (typically just below
> the FIRST_PSEUDO_REGISTER macro).  See the numerous examples in the tree.
>
> --
> Eric Botcazou


Re: [RFC] PR61300 K&R incoming args

2014-05-30 Thread Jeff Law

On 05/26/14 01:38, Alan Modra wrote:

PR61300 shows a need to differentiate between incoming and outgoing
REG_PARM_STACK_SPACE for the PowerPC64 ELFv2 ABI, due to code like
function.c:assign_parm_is_stack_parm determining that a stack home
is available for incoming args if REG_PARM_STACK_SPACE is non-zero.

Background: The ELFv2 ABI requires a parameter save area only when
stack is actually used to pass parameters, and since varargs are
passed on the stack, unprototyped calls must pass both on the stack
and in registers.  OK, easy you say, !prototype_p(fun) means a
parameter save area is needed.  However, a prototype might not be in
scope when compiling an old K&R style C function body, but this does
*not* mean a parameter save area has necessarily been allocated.  A
caller may well have a prototype in scope at the point of the call.
Ugh.  This reminds me a lot of the braindamage we had to deal with in 
the original PA abi's handling of FP values.


In the general case, how can any function ever be sure as to whether or 
not its prototype was in scope at a call site?  Yea, we can know for 
things with restricted scope, but if it's externally visible, I don't 
see how we're going to know the calling context with absolute certainty.


What am I missing here?

jeff




Re: RFA: [VAX] SUBREG of MEM with a mode dependent address

2014-05-30 Thread Jeff Law

On 05/25/14 18:19, Matt Thomas wrote:


But even if movhi is a define_expand, as far as I can tell there
isn't enough info to know whether that is possible.  At that time,
how can I tell that operands[0] will be a hard reg or operands[1]
will be subreg of a mode dependent memory access?
At that time, you can't know those things.  Not even close ;-)  You 
certainly don't want to try and rewrite the insn to just use SImode. 
This is all an indication something has gone wrong elsewhere and this 
would just paper over the problem.




I've tried using secondary_reload and it is called with

(subreg:HI (reg:SI 113 [ MEM[base: _154, offset: 0B] ]) 0)

but it dies in change_address_1 before invoking the code returned in
sri.
I suspect if you dig deep enough, you can make a secondary reload do 
what you want.  It's just amazingly painful.


You want to allocate an SImode temporary, do the load of the SI memory 
location into that SImode temporary, then use (subreg:HI (tempreg:SI) 0). 
Your best bet is going to be to look at how some other ports handle 
their secondary reloads.  But I warn you, it's going to be painful.








I've tracked this down to reload replacing (reg:SI 113) with
reg_equiv_mem (113) in the rtx.  However, it doesn't verify the rtx
is actually valid.  I added a gcc_assert to trap this and got:
Right.  reload will make that replacement and it's not going to do any 
verification at that point.  Verification would have happened earlier.


You have to look at the beginning of the main reload loop and poke at 
that for a while:


 /* For each pseudo register that has an equivalent location defined,
    try to eliminate any eliminable registers (such as the frame pointer)
    assuming initial offsets for the replacement register, which
    is the normal case.

    If the resulting location is directly addressable, substitute
    the MEM we just got directly for the old REG.

    If it is not addressable but is a constant or the sum of a hard reg
    and constant, it is probably not addressable because the constant is
    out of range, in that case record the address; we will generate
    hairy code to compute the address in a register each time it is
    needed.  Similarly if it is a hard register, but one that is not
    valid as an address register.

    If the location is not addressable, but does not have one of the
    above forms, assign a stack slot.  We have to do this to avoid the
    potential of producing lots of reloads if, e.g., a location involves
    a pseudo that didn't get a hard register and has an equivalent memory
    location that also involves a pseudo that didn't get a hard register.

    Perhaps at some point we will improve reload_when_needed handling
    so this problem goes away.  But that's very hairy.  */


Jeff


Re: [RFC] PR61300 K&R incoming args

2014-05-30 Thread Florian Weimer

On 05/26/2014 09:38 AM, Alan Modra wrote:


Background: The ELFv2 ABI requires a parameter save area only when
stack is actually used to pass parameters, and since varargs are
passed on the stack, unprototyped calls must pass both on the stack
and in registers.  OK, easy you say, !prototype_p(fun) means a
parameter save area is needed.  However, a prototype might not be in
scope when compiling an old K&R style C function body, but this does
*not* mean a parameter save area has necessarily been allocated.


It's fine to change ABI when compiling an old-style function definition 
for which a prototype exists (relative to the non-prototype case).  It 
happens on i386, too.


--
Florian Weimer / Red Hat Product Security Team


PowerPC IEEE 128-bit floating point: Meta discussion

2014-05-30 Thread Michael Meissner
I'm sorry for a wide angle shotgun approach, but I wanted to discuss with all
of the stakeholders how to phase in IEEE 128-bit floating point to the PowerPC
toolchain.  For those of you who are not on the gcc@gcc.gnu.org mailing list,
this thread will be archived at: https://gcc.gnu.org/ml/gcc/

What I'm going to do is break this into several followups that each cover one
topic, so that it is easier to address issues without having to requote the
whole article.

What I want to do in the GCC 4.10 time frame is add the ability for users to
use IEEE 128-bit extended precision, with as little disruption as possible to
existing code.  It is unfortunate that we could not have squeezed the support
into the PowerPC little-endian ELFv2 work, so that we would not have backwards
compatibility issues on that platform, but alas it did not happen.

Before IEEE 128-bit can be fully implemented, we will need to modify the
compiler, libgcc, glibc, the debugger, and possibly other tools as well.
Because different teams work on different schedules, it will likely be some
time before all of the pieces are in place.  However, we likely will need to
make sure the compiler piece is in place before work can start on the libraries
and debuggers.

In terms of user impact, I really don't know how many people are using long
double, or want to use IEEE 128-bit floating point.  I would hope that if a
user program does not use long double, there is not a flag day where they have
to move to a compiler/library combination for a feature they don't use.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: PowerPC IEEE 128-bit floating point: Where we are

2014-05-30 Thread Michael Meissner
I'm going to try and summarize the current state of 128-bit floating point on
the PowerPC here.

There are 2 switches that control long double support in the compiler, but
without supporting libraries, it isn't useful to users:

-mabi=ieeelongdouble vs. -mabi=ibmlongdouble:

These switches control which 128-bit format to use.  If you use either
switch, you get two warning messages (one from the gcc driver, one from
the compiler proper).

-mlong-double-128 vs. -mlong-double-64

These switches control whether long double is 128-bits (either ibm/ieee
formats), or 64-bits.

AIX and Darwin hardwire the choice to -mabi=ibmlongdouble, and you cannot use
the switch to override it.  Linux and FreeBSD set the default to
-mabi=ibmlongdouble.  Any PowerPC system that is not AIX, Darwin, Linux, nor
FreeBSD appears to default to IEEE 128-bit (VxWorks?).

In terms of places where TFmode is mentioned in GCC, it is the following files:

predicates.md, rs6000.c, rs6000.h, rs6000.md, rs6000-modes.def, spe.md

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: PowerPC IEEE 128-bit floating point: Emulation functions

2014-05-30 Thread Michael Meissner
Right now the rs6000 backend of the GCC compiler is rather inconsistent in the
names used for the IBM extended double floating point support.

For the basic operations it uses the __gcc_q prefix:

__gcc_qadd
__gcc_qsub
__gcc_qmul
__gcc_qdiv
__gcc_qneg
__gcc_qne
__gcc_qgt
__gcc_qlt
__gcc_qle
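
Since IBM extended double is represented as a (high, low) pair of doubles, these helpers conceptually have signatures along the following lines; this is a sketch, and the authoritative prototypes live in libgcc's rs6000 support code:

```c
/* IBM long double is a pair of doubles; each operand is passed as its
   high and low halves.  Sketch only -- see libgcc for the real code.  */
long double __gcc_qadd (double ah, double al, double bh, double bl);
long double __gcc_qsub (double ah, double al, double bh, double bl);
long double __gcc_qmul (double ah, double al, double bh, double bl);
long double __gcc_qdiv (double ah, double al, double bh, double bl);
```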

In theory the conversions also use the __gcc_ naming, but while the names are
set in rs6000.c, the compiler ignores them and uses:

__dpd_extendddtf
__dpd_extendsdtf
__dpd_extendtftd
__dpd_trunctdtf
__dpd_trunctfdd
__dpd_trunctfsd
__fixtfdi
__fixtfti
__fixunstfdi
__fixunstfti
__floatditf
__floattitf
__floatunditf
__floatuntitf
__powitf2

This means that if we have a flag day and change long double to IEEE 128-bit,
we run the risk of users inadvertently calling the wrong function.

As I see it, we have a choice: either something like multilibs, where you
select which library to use, or alternate names for all of the IEEE 128-bit
emulation functions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: PowerPC IEEE 128-bit floating point: Two 128-bit floating point types

2014-05-30 Thread Michael Meissner
I assume we do not want a flag day, where the user flips a switch, and long
double is now IEEE 128-bit.  That would involve having 2 sets of libraries,
etc. to work with existing programs and new programs.  Even if the user does
not directly use long double, it still may be visible with the switch, since
many of the math libraries use long double internally to get more precision.  I
assume that the current IBM extended double format gives better performance
than the emulation functions.

I assume the way forward is to initially have a __float128 type that gives IEEE
128-bit support for those who need/want it, and keep long double as the
current format.  When all the bits and pieces are in place, we can think about
flipping the switch.  However, there we have to think about appropriate times
for distributions to change over.

In terms of calling sequence, there are 2 ways to go: Either pass/return the
IEEE 128-bit value in 2 registers (like long double is now) or treat it like a
128-bit vector.  The v2 ELF abi explicitly says that it is treated like a
vector object, and I would prefer the v1 ELF on big endian server PowerPC's
also treat it like a vector.  If we are building a compiler for a server
target, to prevent confusion, I think it should be a requirement that you must
have -mvsx (or at least -maltivec -mabi=altivec) to use __float128.  Or do I
need to implement two sets of conversion functions, one if the user builds
his/her programs with -mcpu=power5 and the other for more recent CPUs?

I don't have a handle on the need for IEEE 128-bit floating point in non-server
platforms.  I assume in these environments, if we need IEEE 128-bit, it will be
passed as two floating point values.  Do we need this support?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: PowerPC IEEE 128-bit floating point: Internal GCC types

2014-05-30 Thread Michael Meissner
Currently within the GCC rs6000 backend, the type TFmode is the default 128-bit
floating point type.  Depending on TARGET_IEEEQUAD (the internal name for the
-mabi=ieeelongdouble option), this gets mapped into either the IEEE 128-bit
format or the IBM extended double format.  In theory, any place that checks for
TFmode should also check for TARGET_IEEEQUAD to determine which type of
floating point format is used.  Most places do the check, but I have to
assume that a few places might not do it properly.

One question for GCC folk is whether we want to simplify this, and make TFmode
always be IEEE 128-bit, adding another floating-point type for the IBM extended
double format.  This will involve fixing up both the compiler and libgcc.  I
tried to do this in one set of changes, and the compiler itself was fairly
straight forward.  However, it broke libgcc, due to the use of the mode
attribute to get to TFmode instead of using long double.  While this too can be
fixed (as well as the uses in glibc), I wonder whether users in general use the
mode attribute declaration to explicitly request TFmode, assuming they will get
the IBM extended double support.  For now, I think it is better not to change
TFmode, and just invent a new mode for IEEE 128-bit support.

I've looked at code that adds two new modes, such as JFmode for explicitly IBM
extended double and KFmode mode for the IEEE 128-bit support, and TFmode would
be the same as either JFmode or KFmode.  These changes were getting a little
complex, so I've also looked at using a single mode for IEEE 128-bit (JFmode),
and leaving most of the TFmode support alone.

Do people have a preference for names?  I think using XFmode may cause
confusion with x86.

One issue with the current mode setup is that when you create new floating
point modes, the widening system kicks in and the compiler will generate all
sorts of widenings from one 128-bit floating point format to another (because
internally the precision of IBM extended double is less than the precision of
IEEE 128-bit, due to the size of the mantissas).  Ideally we need a different
way to create an alternate floating point mode than FRACTIONAL_FLOAT_MODE that
does no automatic widening.  If there is a way under the current system, I am
not aware of it.
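
For reference, the relevant machmode.def idiom looks roughly like the following; the JF/KF mode names are the hypothetical ones from above, and the real entries would live in rs6000-modes.def:

```c
/* A plain FLOAT_MODE gets the full widening treatment.
   FRACTIONAL_FLOAT_MODE declares a mode whose precision is smaller
   than its storage size, which is what IBM extended double
   (106-bit mantissa in 16 bytes) needs.  */
FLOAT_MODE (KF, 16, ieee_quad_format);                    /* IEEE 128-bit */
FRACTIONAL_FLOAT_MODE (JF, 106, 16, ibm_extended_format); /* IBM 128-bit  */
```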

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [RFC] PR61300 K&R incoming args

2014-05-30 Thread Alan Modra
On Fri, May 30, 2014 at 11:27:52AM -0600, Jeff Law wrote:
> On 05/26/14 01:38, Alan Modra wrote:
> >PR61300 shows a need to differentiate between incoming and outgoing
> >REG_PARM_STACK_SPACE for the PowerPC64 ELFv2 ABI, due to code like
> >function.c:assign_parm_is_stack_parm determining that a stack home
> >is available for incoming args if REG_PARM_STACK_SPACE is non-zero.
> >
> >Background: The ELFv2 ABI requires a parameter save area only when
> >stack is actually used to pass parameters, and since varargs are
> >passed on the stack, unprototyped calls must pass both on the stack
> >and in registers.  OK, easy you say, !prototype_p(fun) means a
> >parameter save area is needed.  However, a prototype might not be in
> >scope when compiling an old K&R style C function body, but this does
> >*not* mean a parameter save area has necessarily been allocated.  A
> >caller may well have a prototype in scope at the point of the call.
> Ugh.  This reminds me a lot of the braindamage we had to deal with
> in the original PA abi's handling of FP values.
> 
> In the general case, how can any function ever be sure as to whether
> or not its prototype was in scope at a call site?  Yea, we can know
> for things with restricted scope, but if it's externally visible, I
> don't see how we're going to know the calling context with absolute
> certainty.
> 
> What am I missing here?

When compiling the function body you don't need to know whether a
prototype was in scope at the call site.  You just need to know the
rules.  :)  For functions with variable argument lists, you'll always
have a parameter save area.  For other functions, whether or not you
have a parameter save area just depends on the number of arguments and
their types (ie. whether you run out of registers for parameter
passing), and you have that whether or not the function is
prototyped.

A simple example might help clear up any confusion.

Given
 void fun1(int a, int b, double c);
 void fun2(int a, ...);
  ...
 fun1 (1, 2, 3.0);
 fun2 (1, 2, 3.0);

A call to fun1 with a prototype in scope won't allocate a parameter
save area, and will pass the first arg in r3, the second in r4, and
the third in f1.

A call to fun2 with a prototype in scope will allocate a parameter
save area of 64 bytes (the minimum size of a parameter save area), and
will pass the first arg in r3, the second in the second slot of the
parameter save area, and the third in the third slot of the parameter
save area.  Now the first eight slots/double-words of the parameter
save area are passed in r3 thru r10, so this means the second arg is
actually passed in r4 and the third in r5, not the stack!

A call to fun1 or fun2 without a prototype in scope will allocate a
parameter save area, and pass the first arg in r3, the second in r4,
and the third in both f1 and r5.

When compiling fun1 body, the first arg is known to be in r3, the
second in r4, and the third in f1, and we don't use the parameter save
area for storing incoming args to a stack slot.  (At least, after
PR61300 is fixed.)  It doesn't matter if the parameter save area was
allocated or not, we just don't use it.

When compiling fun2 body, the first arg is known to be in r3, the
second in r4 and the third in r5.  Since the function has a variable
argument list, registers r4 thru r10 are saved to the parameter
save area stack, and we set up our va_list pointer to the second
double-word of the parameter save area stack.  Of course, code
optimisation might lead to removing the saves and using the args
in their incoming regs, but this is conceptually what happens.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RFC] PR61300 K&R incoming args

2014-05-30 Thread Alan Modra
On Fri, May 30, 2014 at 09:22:30PM +0200, Florian Weimer wrote:
> On 05/26/2014 09:38 AM, Alan Modra wrote:
> 
> >Background: The ELFv2 ABI requires a parameter save area only when
> >stack is actually used to pass parameters, and since varargs are
> >passed on the stack, unprototyped calls must pass both on the stack
> >and in registers.  OK, easy you say, !prototype_p(fun) means a
> >parameter save area is needed.  However, a prototype might not be in
> >scope when compiling an old K&R style C function body, but this does
> >*not* mean a parameter save area has necessarily been allocated.
> 
> It's fine to change ABI when compiling an old-style function
> definition for which a prototype exists (relative to the
> non-prototype case).  It happens on i386, too.

That might be so, but when compiling the function body you must assume
the worst case, whatever that might be, at the call site.  For K&R
code, our error was to assume the call was unprototyped (which
paradoxically is the best case) when compiling the function body.

-- 
Alan Modra
Australia Development Lab, IBM