Re: Stack offset computation for incoming arguments.
> ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM need to be pseudo-registers if
> they do not represent real registers.

The wording "pseudo registers" is obviously a bit confusing in this context...

If ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM do not represent real registers, then they need to be fake hard registers, i.e. hard registers according to the FIRST_PSEUDO_REGISTER macro but with an arbitrary REGNUM (typically just below FIRST_PSEUDO_REGISTER). See the numerous examples in the tree.

-- 
Eric Botcazou
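Eric's advice can be sketched as a fragment of a hypothetical target header (the register numbers and register count below are illustrative, not from any real port):

```
/* Hypothetical target .h sketch: the fake arg/frame pointer registers
   are hard registers as far as FIRST_PSEUDO_REGISTER is concerned,
   numbered just below it, and typically marked fixed so the register
   allocator never hands them out.  */

#define FIRST_PSEUDO_REGISTER 18

#define STACK_POINTER_REGNUM 15
#define FRAME_POINTER_REGNUM 16   /* fake hard register */
#define ARG_POINTER_REGNUM   17   /* fake hard register */

/* 1 = fixed: the stack pointer and the two fake registers.  */
#define FIXED_REGISTERS \
  { 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,1, 1,1 }

#define ELIMINABLE_REGS                                  \
  { { ARG_POINTER_REGNUM,   STACK_POINTER_REGNUM },      \
    { ARG_POINTER_REGNUM,   FRAME_POINTER_REGNUM },      \
    { FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM } }
```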
Re: Stack offset computation for incoming arguments.
Dear Eric,

Really appreciate your reply. I made the following changes:

  #define ARG_POINTER_REGNUM   8   /* fake hard reg */
  #define FRAME_POINTER_REGNUM 9   /* fake hard reg */
  #define SP_REG               10

  #define ELIMINABLE_REGS \
    { {ARG_POINTER_REGNUM,   STACK_POINTER_REGNUM}, \
      {ARG_POINTER_REGNUM,   FRAME_POINTER_REGNUM}, \
      {FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM} }

The ARG and FRAME registers are not marked as fixed regs, but are marked as call-used regs. The reload pass eliminates the arg and frame pointer regs to sp when the sample is compiled with the -fomit-frame-pointer option, i.e.

  $ t-gcc -S -fomit-frame-pointer sample.c

But without -fomit-frame-pointer, the arg and frame pointers are not replaced with sp. Please help us with any hints?

Thank you,
~Umesh

On Fri, May 30, 2014 at 4:24 PM, Eric Botcazou wrote:
>> ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM need to be pseudo-registers if
>> they do not represent real registers.
>
> The wording "pseudo registers" is obviously a bit confusing in this context...
>
> If ARG_POINTER_REGNUM and FRAME_POINTER_REGNUM do not represent real registers
> then they need to be fake hard registers, i.e. hard registers according to the
> FIRST_PSEUDO_REGISTER macro but with an arbitrary REGNUM (typically just below
> the FIRST_PSEUDO_REGISTER macro). See the numerous examples in the tree.
>
> --
> Eric Botcazou
Re: [RFC] PR61300 K&R incoming args
On 05/26/14 01:38, Alan Modra wrote:
> PR61300 shows a need to differentiate between incoming and outgoing
> REG_PARM_STACK_SPACE for the PowerPC64 ELFv2 ABI, due to code like
> function.c:assign_parm_is_stack_parm determining that a stack home
> is available for incoming args if REG_PARM_STACK_SPACE is non-zero.
>
> Background: The ELFv2 ABI requires a parameter save area only when
> stack is actually used to pass parameters, and since varargs are
> passed on the stack, unprototyped calls must pass both on the stack
> and in registers. OK, easy you say, !prototype_p(fun) means a
> parameter save area is needed. However, a prototype might not be in
> scope when compiling an old K&R style C function body, but this does
> *not* mean a parameter save area has necessarily been allocated. A
> caller may well have a prototype in scope at the point of the call.

Ugh. This reminds me a lot of the braindamage we had to deal with in the original PA ABI's handling of FP values.

In the general case, how can any function ever be sure whether or not its prototype was in scope at a call site? Yea, we can know for things with restricted scope, but if it's externally visible, I don't see how we're going to know the calling context with absolute certainty.

What am I missing here?

jeff
Re: RFA: [VAX] SUBREG of MEM with a mode dependent address
On 05/25/14 18:19, Matt Thomas wrote:
> But even if movhi is a define_expand, as far as I can tell there isn't
> enough info to know whether that is possible. At that time, how can I
> tell that operands[0] will be a hard reg or operands[1] will be a
> subreg of a mode dependent memory access?

At that time, you can't know those things. Not even close ;-) You certainly don't want to try and rewrite the insn to just use SImode. This is all an indication something has gone wrong elsewhere and this would just paper over the problem.

> I've tried using secondary_reload and it gets called with
>   (subreg:HI (reg:SI 113 [ MEM[base: _154, offset: 0B] ]) 0)
> but it dies in change_address_1 before invoking the code returned
> in sri.

I suspect if you dig deep enough, you can make a secondary reload do what you want. It's just amazingly painful. You want to allocate an SImode temporary, do the load of the SI memory location into that SImode temporary, then (subreg:SI (tempreg:SI)). Your best bet is going to be to look at how some other ports handle their secondary reloads. But I warn you, it's going to be painful.

> I've tracked this down to reload replacing (reg:SI 113) with
> reg_equiv_mem (113) in the rtx. However, it doesn't verify the rtx
> is actually valid. I added a gcc_assert to trap this and got:

Right. reload will make that replacement and it's not going to do any verification at that point. Verification would have happened earlier. You have to look at the beginning of the main reload loop and poke at that for a while:

  /* For each pseudo register that has an equivalent location defined,
     try to eliminate any eliminable registers (such as the frame pointer)
     assuming initial offsets for the replacement register, which is the
     normal case.

     If the resulting location is directly addressable, substitute the MEM
     we just got directly for the old REG.

     If it is not addressable but is a constant or the sum of a hard reg
     and constant, it is probably not addressable because the constant is
     out of range, in that case record the address; we will generate hairy
     code to compute the address in a register each time it is needed.
     Similarly if it is a hard register, but one that is not valid as an
     address register.

     If the location is not addressable, but does not have one of the
     above forms, assign a stack slot. We have to do this to avoid the
     potential of producing lots of reloads if, e.g., a location involves
     a pseudo that didn't get a hard register and has an equivalent memory
     location that also involves a pseudo that didn't get a hard register.

     Perhaps at some point we will improve reload_when_needed handling so
     this problem goes away. But that's very hairy.  */

Jeff
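Jeff's suggestion (allocate an SImode scratch, load the full word, then take the subreg) would conceptually be wired through the target's secondary reload hook. A very rough, untested sketch follows; the function name and the reload expander CODE_FOR_reload_insi_subreg are hypothetical, and a real port would pair the hook with a matching (define_expand "reload_insi_subreg" ...) that performs the SImode load into the scratch and extracts the HImode part:

```
/* Untested sketch of a secondary reload for (subreg:HI (mem:SI ...))
   where the SImode address is mode-dependent.  Returning NO_REGS while
   setting sri->icode asks reload to emit the custom reload pattern
   instead of a plain copy.  */

static reg_class_t
example_secondary_reload (bool in_p, rtx x, reg_class_t rclass,
                          machine_mode mode, secondary_reload_info *sri)
{
  if (in_p
      && mode == HImode
      && GET_CODE (x) == SUBREG
      && MEM_P (SUBREG_REG (x))
      && GET_MODE (SUBREG_REG (x)) == SImode)
    {
      /* Hypothetical expander: loads the SImode MEM into an SImode
         scratch, then moves the low HImode part to the destination.  */
      sri->icode = CODE_FOR_reload_insi_subreg;
      return NO_REGS;
    }
  return NO_REGS;
}
```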
Re: [RFC] PR61300 K&R incoming args
On 05/26/2014 09:38 AM, Alan Modra wrote:
> Background: The ELFv2 ABI requires a parameter save area only when
> stack is actually used to pass parameters, and since varargs are
> passed on the stack, unprototyped calls must pass both on the stack
> and in registers. OK, easy you say, !prototype_p(fun) means a
> parameter save area is needed. However, a prototype might not be in
> scope when compiling an old K&R style C function body, but this does
> *not* mean a parameter save area has necessarily been allocated.

It's fine to change ABI when compiling an old-style function definition for which a prototype exists (relative to the non-prototype case). It happens on i386, too.

-- 
Florian Weimer / Red Hat Product Security Team
PowerPC IEEE 128-bit floating point: Meta discussion
I'm sorry for the wide-angle shotgun approach, but I wanted to discuss with all of the stakeholders how to phase in IEEE 128-bit floating point to the PowerPC toolchain. For those of you who are not on the gcc@gcc.gnu.org mailing list, this thread will be archived at: https://gcc.gnu.org/ml/gcc/

What I'm going to do is break this into several followups that each cover one topic, so that it is easier to address issues without having to requote the whole article.

What I want to do in the GCC 4.10 time frame is add the ability for users to use IEEE 128-bit extended precision support, with as minimal disruption as possible to existing code. It is unfortunate that we could not have squeezed the support into the PowerPC little-endian support with ELF v2, so that we would not have backwards compatibility issues on that platform, but alas it did not happen.

Before IEEE 128-bit can be fully implemented, we will need to modify the compiler, libgcc, glibc, the debugger, and possibly other tools as well. Because different teams work on different schedules, it will likely be some time before all of the pieces are in place. However, we likely will need to make sure the compiler piece is in place before work can start on the libraries and debuggers.

In terms of user impact, I really don't know how many people are using long double, or want to use IEEE 128-bit floating point. I would hope that if a user program does not use long double, there is not a flag day where they have to move to a compiler/library combination for a feature they don't use.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: PowerPC IEEE 128-bit floating point: Where we are
I'm going to try and summarize the current state of 128-bit floating point on the PowerPC here.

There are 2 switches that control long double support in the compiler, but without supporting libraries, it isn't useful to users:

-mabi=ieeelongdouble vs. -mabi=ibmlongdouble:
    These switches control which 128-bit format to use. If you use
    either switch, you get two warning messages (one from the gcc
    driver, one from the compiler proper).

-mlong-double-128 vs. -mlong-double-64:
    These switches control whether long double is 128 bits (either
    ibm/ieee format), or 64 bits.

AIX and Darwin hardwire the choice to -mabi=ibmlongdouble, and you cannot use the switch to override things. Linux and FreeBSD set the default to -mabi=ibmlongdouble. Any PowerPC system that is not AIX, Darwin, Linux, nor FreeBSD appears to default to IEEE 128-bit (vxworks?).

In terms of places where TFmode is mentioned in GCC, it is the following files: predicates.md, rs6000.c, rs6000.h, rs6000.md, rs6000-modes.def, spe.md

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: PowerPC IEEE 128-bit floating point: Emulation functions
Right now the rs6000 backend of the GCC compiler is rather inconsistent in the names used for the IBM extended double floating point. For the basic operations it uses the __gcc_q prefix:

    __gcc_qadd __gcc_qsub __gcc_qmul __gcc_qdiv __gcc_qneg
    __gcc_qne __gcc_qgt __gcc_qlt __gcc_qle

In theory the conversions also use the __gcc_ format, but while it is set in rs6000.c, the compiler ignores these names and uses:

    __dpd_extendddtf __dpd_extendsdtf __dpd_extendtftd
    __dpd_trunctdtf __dpd_trunctfdd __dpd_trunctfsd
    __fixtfdi __fixtfti __fixunstfdi __fixunstfti
    __floatditf __floattitf __floatunditf __floatuntitf
    __powitf2

This means if we have a flag day and change all of long double to IEEE 128-bit, we run the risk of the user inadvertently calling the wrong function. As I see it, we have a choice: either have something like multilibs where you select which library to use, or use alternate names for all of the IEEE 128-bit emulation functions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: PowerPC IEEE 128-bit floating point: Two 128-bit floating point types
I assume we do not want a flag day, where the user flips a switch and long double is now IEEE 128-bit. That would involve having 2 sets of libraries, etc. to work with existing programs and new programs. Even if the user does not directly use long double, it still may be visible with the switch, since many of the math libraries use long double internally to get more precision. I assume that the current IBM extended double format gives better performance than the emulation functions.

I assume the way forward is to initially have a __float128 type that gives IEEE 128-bit support for those that need/want it, and keep long double in the current format. When all the bits and pieces are in place, we can think about flipping the switch. However, there we have to think about appropriate times for distributions to change over.

In terms of calling sequence, there are 2 ways to go: either pass/return the IEEE 128-bit value in 2 registers (like long double is now) or treat it like a 128-bit vector. The ELF v2 ABI explicitly says that it is treated like a vector object, and I would prefer that the ELF v1 ABI on big-endian server PowerPCs also treat it like a vector.

If we are building a compiler for a server target, to prevent confusion, I think it should be a requirement that you must have -mvsx (or at least -maltivec -mabi=altivec) to use __float128. Or do I need to implement two sets of conversion functions, one if the user builds his/her programs with -mcpu=power5 and the other for more recent customers?

I don't have a handle on the need for IEEE 128-bit floating point in non-server platforms. I assume in these environments, if we need IEEE 128-bit, it will be passed as two floating point values. Do we need this support?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: PowerPC IEEE 128-bit floating point: Internal GCC types
Currently within the GCC rs6000 backend, the type TFmode is the default 128-bit floating point type. Depending on TARGET_IEEEQUAD (the internal name for the -mabi=ieeelongdouble option), this gets mapped into either the IEEE 128-bit format or the IBM extended double format. In theory, any place that checks for TFmode should also check TARGET_IEEEQUAD to determine which floating point format is used. Most places do perform the check, but I have to assume that a few places might not do it properly.

One question for GCC folk is whether we want to simplify this, and make TFmode always be IEEE 128-bit, adding another floating type for the IBM extended double format. This would involve fixing up both the compiler and libgcc. I tried to do this in one set of changes, and the compiler itself was fairly straightforward. However, it broke libgcc, due to the use of the mode attribute to get to TFmode instead of using long double. While this too can be fixed (as well as uses in glibc), I wonder whether users in general use the mode attribute declaration to explicitly ask for TFmode, assuming they will get the IBM extended double support.

For now, I think it is better to not change TFmode, and just invent a new mode for IEEE 128-bit support. I've looked at code that adds two new modes, such as JFmode for explicitly IBM extended double and KFmode for the IEEE 128-bit support, where TFmode would be the same as either JFmode or KFmode. These changes were getting a little complex, so I've also looked at using a single mode for IEEE 128-bit (JFmode), and leaving most of the TFmode support alone. Do people have a preference for names? I think using XFmode may cause confusion with x86.

One issue with the current mode setup is that when you create new floating point types, the widening system kicks in and the compiler will generate all sorts of widenings from one 128-bit floating point format to another (because internally the precision for IBM extended double is less than the precision of IEEE 128-bit, due to the size of the mantissas). Ideally we need a different way to create an alternate floating point mode than FRACTIONAL_FLOAT_MODE, one that does no automatic widening. If there is a way under the current system, I am not aware of it.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: [RFC] PR61300 K&R incoming args
On Fri, May 30, 2014 at 11:27:52AM -0600, Jeff Law wrote:
> On 05/26/14 01:38, Alan Modra wrote:
> > PR61300 shows a need to differentiate between incoming and outgoing
> > REG_PARM_STACK_SPACE for the PowerPC64 ELFv2 ABI, due to code like
> > function.c:assign_parm_is_stack_parm determining that a stack home
> > is available for incoming args if REG_PARM_STACK_SPACE is non-zero.
> >
> > Background: The ELFv2 ABI requires a parameter save area only when
> > stack is actually used to pass parameters, and since varargs are
> > passed on the stack, unprototyped calls must pass both on the stack
> > and in registers. OK, easy you say, !prototype_p(fun) means a
> > parameter save area is needed. However, a prototype might not be in
> > scope when compiling an old K&R style C function body, but this does
> > *not* mean a parameter save area has necessarily been allocated. A
> > caller may well have a prototype in scope at the point of the call.
>
> Ugh. This reminds me a lot of the braindamage we had to deal with
> in the original PA abi's handling of FP values.
>
> In the general case, how can any function ever be sure as to whether
> or not its prototype was in scope at a call site? Yea, we can know
> for things with restricted scope, but if it's externally visible, I
> don't see how we're going to know the calling context with absolute
> certainty.
>
> What am I missing here?

When compiling the function body you don't need to know whether a prototype was in scope at the call site. You just need to know the rules. :) For functions with variable argument lists, you'll always have a parameter save area. For other functions, whether or not you have a parameter save area just depends on the number of arguments and their types (ie. whether you run out of registers for parameter passing), and you have that whether or not the function is prototyped.

A simple example might help clear up any confusion. Given

  void fun1 (int a, int b, double c);
  void fun2 (int a, ...);
  ...
  fun1 (1, 2, 3.0);
  fun2 (1, 2, 3.0);

A call to fun1 with a prototype in scope won't allocate a parameter save area, and will pass the first arg in r3, the second in r4, and the third in f1.

A call to fun2 with a prototype in scope will allocate a parameter save area of 64 bytes (the minimum size of a parameter save area), and will pass the first arg in r3, the second in the second slot of the parameter save area, and the third in the third slot of the parameter save area. Now the first eight slots/double-words of the parameter save area are passed in r3 thru r10, so this means the second arg is actually passed in r4 and the third in r5, not the stack!

A call to fun1 or fun2 without a prototype in scope will allocate a parameter save area, and pass the first arg in r3, the second in r4, and the third in both f1 and r5.

When compiling the fun1 body, the first arg is known to be in r3, the second in r4, and the third in f1, and we don't use the parameter save area for storing incoming args to a stack slot. (At least, after PR61300 is fixed..) It doesn't matter if the parameter save area was allocated or not, we just don't use it.

When compiling the fun2 body, the first arg is known to be in r3, the second in r4 and the third in r5. Since the function has a variable argument list, registers r4 thru r10 are saved to the parameter save area stack, and we set up our va_list pointer to the second double-word of the parameter save area. Of course, code optimisation might lead to removing the saves and using the args in their incoming regs, but this is conceptually what happens.

-- 
Alan Modra
Australia Development Lab, IBM
Re: [RFC] PR61300 K&R incoming args
On Fri, May 30, 2014 at 09:22:30PM +0200, Florian Weimer wrote:
> On 05/26/2014 09:38 AM, Alan Modra wrote:
> > Background: The ELFv2 ABI requires a parameter save area only when
> > stack is actually used to pass parameters, and since varargs are
> > passed on the stack, unprototyped calls must pass both on the stack
> > and in registers. OK, easy you say, !prototype_p(fun) means a
> > parameter save area is needed. However, a prototype might not be in
> > scope when compiling an old K&R style C function body, but this does
> > *not* mean a parameter save area has necessarily been allocated.
>
> It's fine to change ABI when compiling an old-style function
> definition for which a prototype exists (relative to the
> non-prototype case). It happens on i386, too.

That might be so, but when compiling the function body you must assume the worst case, whatever that might be, at the call site. For K&R code, our error was to assume the call was unprototyped (which paradoxically is the best case) when compiling the function body.

-- 
Alan Modra
Australia Development Lab, IBM