Re: Function Multiversioning Usability.
On Wed, Aug 17, 2011 at 6:37 PM, Xinliang David Li wrote: > On Wed, Aug 17, 2011 at 8:12 AM, Richard Guenther > wrote: >> On Wed, Aug 17, 2011 at 4:52 PM, Xinliang David Li >> wrote: >>> The gist of previous discussion is to use function overloading instead >>> of exposing underlying implementation such as builtin_dispatch to the >>> user. This new refined proposal has not changed in that, but is more >>> elaborate on various use cases which has been carefully thought out. >>> Please be specific on which part needs to improvement. >> >> See below ... >> >>> Thanks, >>> >>> David >>> >>> On Wed, Aug 17, 2011 at 12:29 AM, Richard Guenther >>> wrote: On Tue, Aug 16, 2011 at 10:37 PM, Sriraman Tallam wrote: > Hi, > > I am working on supporting function multi-versioning in GCC and here > is a write-up on its usability. > > Multiversioning Usability > > > For a simple motivating example, > > int > find_popcount(unsigned int i) > { > return __builtin_popcount(i); > } > > Currently, compiling this with -mpopcnt will result in the “popcnt” > instruction being used and otherwise call a built-in generic > implementation. It is desirable to have two versions of this function > so that it can be run both on targets that support the popcnt insn and > those that do not. > > > * Case I - User Guided Versioning where only one function body is > provided by the user. > > This case addresses a use where the user wants multi-versioning but > provides only one function body. I want to add a new attribute called > “mversion” which will be used like this: > > int __attribute__(mversion(“popcnt”)) > find_popcount(unsigned int i) > { > return __builtin_popcount(i); > } > > With the attribute, the compiler should understand that it should > generate two versions for this function. The user makes a call to this > function like a regular call but the code generated would call the > appropriate function at run-time based on a check to determine if that > instruction is supported or not. 
>> >> The example seems to be particularly ill-suited. Trying to 2nd guess you >> here I think you want to direct the compiler to emit multiple versions >> with different target capabilities enabled, probably for elaborate code that >> _doesn't_ use any fancy builtins, right? It seems this is a shortcut for >> >> static inline __attribute__((always_iniline)) implementation () { ... } >> >> symbol __attribute__((target("msse2"))) { implementation(); } >> symbol __attribute__((target("msse3"))) { implementation(); } >> ... >> >> and so should be fully handled by the frontend (if at all, it seems to >> be purely syntactic sugar). > > Yes, it is a handy short cut -- I don't see the base for objection to > this convenience. And I don't see why we need to discuss it at this point. It also seems severely limited considering when I want to version for -msse2 -mpopcount and -msse4a - that doesn't look expressible. A more elaborate variant would be, say, foo () { ... }; foo __attribute__((version("sse2","popcount"))); foo __attribute__((version("sse4a"))); thus trigger a overload clone by a declaration as well, not just by a definition, similar to an explicit template instantiation. That sounds more scalable to me. >> > The attribute can be scaled to support many versions but allowing a > comma separated list of values for the mversion attribute. For > instance, “__attribute__(mversion(“sse3”, “sse4”, ...)) will provide a > version for each. For N attributes, N clones plus one clone for the > default case will have to be generated by the compiler. The arguments > to the "mversion" attribute will be similar to the arguments supported > by the "target" attribute. > > This attribute is useful if the same source is going to be used to > generate the different versions. If this has to be done manually, the > user has to duplicate the body of the function and specify a target > attribute of “popcnt” on one clone. 
Then, the user has to use > something like IFUNC support or manually write code to call the > appropriate version. All of this will be done automatically by the > compiler with this new attribute. > > * Case II - User Guided Versioning where the function bodies for each > version differ and is provided by the user. > > This case pertains to multi-versioning when the source bodies of the > two or more versions are different and are provided by the user. Here > too, I want to use a new attribute, “version”. Now, the user can > specify versioning intent like this: > > int __attribute__((version(“popcnt”)) > find_popcnt(unsigned int i) > { > // inline assembly of the popcnt instruction, specialized version. > asm(“popcnt ….”); > }
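For comparison, here is a rough sketch of what the "manual" route described above looks like today: the user writes both versions plus the dispatch by hand. It is only an illustration, not the proposed implementation. The names popcount_generic, popcount_popcnt, cpu_has_popcnt and resolve_popcount are made up for this example, and the x86 branch assumes a compiler that provides __builtin_cpu_supports and the "target" function attribute (later GCC releases do; on anything else the sketch falls back to the generic version).

```c
#include <assert.h>

/* Generic version: portable bit counting, no special instructions. */
static int popcount_generic(unsigned int i)
{
    int n = 0;
    while (i) {
        i &= i - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Specialized version: popcnt is enabled for this one function only. */
static int __attribute__((target("popcnt")))
popcount_popcnt(unsigned int i)
{
    return __builtin_popcount(i);
}

/* Hypothetical helper: run-time check for the popcnt instruction. */
static int cpu_has_popcnt(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("popcnt");
}
#else
/* Assumption: on other targets there is no specialized version to pick. */
static int popcount_popcnt(unsigned int i) { return popcount_generic(i); }
static int cpu_has_popcnt(void) { return 0; }
#endif

typedef int (*popcount_fn)(unsigned int);

/* Pick the version once, based on what the running CPU supports. */
static popcount_fn resolve_popcount(void)
{
    return cpu_has_popcnt() ? popcount_popcnt : popcount_generic;
}

/* The entry point callers use; it looks like a regular function.
   Note: the lazy initialization here is not synchronized between threads. */
int find_popcount(unsigned int i)
{
    static popcount_fn impl;
    if (!impl)
        impl = resolve_popcount();
    assert(impl != 0);
    return impl(i);
}
```

Callers just write find_popcount(x). The boilerplate around it (two bodies, a feature test, a resolver) is what the attribute is meant to generate automatically.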
Re: Just what are rtx costs?
Thanks for the feedback.

Paolo Bonzini writes:
> On 08/17/2011 07:52 AM, Richard Sandiford wrote:
>>    cost = rtx_cost (SET_SRC (set), SET, speed);
>>    return cost > 0 ? cost : COSTS_N_INSNS (1);
>>
>> This ignores SET_DEST (the problem I'm trying to fix).  It also means
>> that constants that are slightly more expensive than a register --
>> somewhere in the range [0, COSTS_N_INSNS (1)] -- end up seeming
>> cheaper than registers.
>
> This can be fixed by doing
>
>    return cost >= COSTS_N_INSNS (1) ? cost : COSTS_N_INSNS (1);

Is that really a fix though?  Those sorts of constant are supposed to be
more expensive than a normal register move, not the same cost.

To put it another way: for real operations like PLUS, you have full
control.  If something is slightly more expensive than a single move,
but not as expensive as two moves, you can return a cost in the range
[COSTS_N_INSNS (1), COSTS_N_INSNS (2)].  But constants are generally
given relative to the cost of a register, so the corresponding range
would be [0, COSTS_N_INSNS (1)] instead.

>> One approach I'm trying is to make sure that every target that doesn't
>> explicitly handle SET does nothing with it.  (Targets that do handle
>> SET remain unchanged.)  Then, if we see a SET whose SET_SRC is a
>> register, constant, memory or subreg, we give it cost:
>>
>>    COSTS_N_INSNS (1)
>>    + rtx_cost (SET_DEST (x), SET, speed)
>>    + rtx_cost (SET_SRC (x), SET, speed)
>>
>> as now.  In other cases we give it a cost of:
>>
>>    rtx_cost (SET_DEST (x), SET, speed)
>>    + rtx_cost (SET_SRC (x), SET, speed)
>>
>> But that hardly seems clean either.  Perhaps we should instead make
>> the SET_SRC always include the cost of the SET, even for registers,
>> constants and the like.  Thoughts?
>
> Similarly, this becomes
>
>    dest_cost = rtx_cost (SET_DEST (x), SET, speed);
>    src_cost = MAX (rtx_cost (SET_SRC (x), SET, speed),
>                    COSTS_N_INSNS (1));
>    return dest_cost + src_cost;
>
> How does this look?

This has the same problem as above.  There's a second problem though:
what about register stores?  Say a load is as expensive as a store
(often true, when optimising for size).  The formula above would give
the cost of a load as being:

    cost(REG) + MAX(cost(MEM), CNI1) == cost(MEM)

whereas the cost of a store would be:

    cost(MEM) + MAX(cost(REG), CNI1) == cost(MEM) + CNI1

On RISCy machines you could try to compensate by making the cost of
a MEM smaller for lvalues, but that doesn't seem very clean.  It would
also skew the costs of MEM-to-MEM moves on CISCy machines, where the
MAX would have no effect.

Richard
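The load/store asymmetry can be checked with a small stand-alone model. This is only a sketch under stated assumptions: it does not call the real rtx_cost. operand_cost and set_cost_max are hypothetical names, a register operand is assumed to cost 0, and a MEM operand is assumed to cost the same whether it is read or written. COSTS_N_INSNS is defined with the same scaling as in GCC's rtl.h.

```c
#include <assert.h>

/* Same scaling as GCC's rtl.h: one simple instruction costs 4 units. */
#define COSTS_N_INSNS(N) ((N) * 4)
#define MAX(A, B) ((A) > (B) ? (A) : (B))

enum operand_kind { OP_REG, OP_MEM };

/* Hypothetical per-operand cost: registers are free, and a memory
   reference costs mem_cost regardless of direction. */
static int operand_cost(enum operand_kind kind, int mem_cost)
{
    assert(mem_cost >= COSTS_N_INSNS(1));
    return kind == OP_MEM ? mem_cost : 0;
}

/* The suggested formula: dest_cost + MAX (src_cost, COSTS_N_INSNS (1)). */
static int set_cost_max(enum operand_kind dest, enum operand_kind src,
                        int mem_cost)
{
    int dest_cost = operand_cost(dest, mem_cost);
    int src_cost = MAX(operand_cost(src, mem_cost), COSTS_N_INSNS(1));
    return dest_cost + src_cost;
}
```

With mem_cost = COSTS_N_INSNS (2), set_cost_max (OP_REG, OP_MEM, ...) models a load and comes out as cost(MEM), while set_cost_max (OP_MEM, OP_REG, ...) models a store and comes out one COSTS_N_INSNS (1) higher, even though the two were assumed equally expensive. A MEM-to-MEM move is simply twice cost(MEM), since the MAX never kicks in.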
Re: Function Multiversioning Usability.
On Thu, Aug 18, 2011 at 12:51 AM, Richard Guenther wrote: > On Wed, Aug 17, 2011 at 6:37 PM, Xinliang David Li wrote: >> On Wed, Aug 17, 2011 at 8:12 AM, Richard Guenther >> wrote: >>> On Wed, Aug 17, 2011 at 4:52 PM, Xinliang David Li >>> wrote: The gist of previous discussion is to use function overloading instead of exposing underlying implementation such as builtin_dispatch to the user. This new refined proposal has not changed in that, but is more elaborate on various use cases which has been carefully thought out. Please be specific on which part needs to improvement. >>> >>> See below ... >>> Thanks, David On Wed, Aug 17, 2011 at 12:29 AM, Richard Guenther wrote: > On Tue, Aug 16, 2011 at 10:37 PM, Sriraman Tallam > wrote: >> Hi, >> >> I am working on supporting function multi-versioning in GCC and here >> is a write-up on its usability. >> >> Multiversioning Usability >> >> >> For a simple motivating example, >> >> int >> find_popcount(unsigned int i) >> { >> return __builtin_popcount(i); >> } >> >> Currently, compiling this with -mpopcnt will result in the “popcnt” >> instruction being used and otherwise call a built-in generic >> implementation. It is desirable to have two versions of this function >> so that it can be run both on targets that support the popcnt insn and >> those that do not. >> >> >> * Case I - User Guided Versioning where only one function body is >> provided by the user. >> >> This case addresses a use where the user wants multi-versioning but >> provides only one function body. I want to add a new attribute called >> “mversion” which will be used like this: >> >> int __attribute__(mversion(“popcnt”)) >> find_popcount(unsigned int i) >> { >> return __builtin_popcount(i); >> } >> >> With the attribute, the compiler should understand that it should >> generate two versions for this function. 
The user makes a call to this >> function like a regular call but the code generated would call the >> appropriate function at run-time based on a check to determine if that >> instruction is supported or not. >>> >>> The example seems to be particularly ill-suited. Trying to 2nd guess you >>> here I think you want to direct the compiler to emit multiple versions >>> with different target capabilities enabled, probably for elaborate code that >>> _doesn't_ use any fancy builtins, right? It seems this is a shortcut for >>> >>> static inline __attribute__((always_iniline)) implementation () { ... } >>> >>> symbol __attribute__((target("msse2"))) { implementation(); } >>> symbol __attribute__((target("msse3"))) { implementation(); } >>> ... >>> >>> and so should be fully handled by the frontend (if at all, it seems to >>> be purely syntactic sugar). >> >> Yes, it is a handy short cut -- I don't see the base for objection to >> this convenience. > > And I don't see why we need to discuss it at this point. It also seems > severely limited considering when I want to version for -msse2 -mpopcount > and -msse4a - that doesn't look expressible. A more elaborate variant > would be, say, > > foo () { ... }; > foo __attribute__((version("sse2","popcount"))); > foo __attribute__((version("sse4a"))); > > thus trigger a overload clone by a declaration as well, not just by a > definition, similar to an explicit template instantiation. That sounds more > scalable to me. This is a good point. And the support of multiple feature list is also nice. Thanks, David > > I don't see how auto-MV has any impact on the infrastructure, so we might > as well postpone any discussion until the infrastructure is set. > > Richard. > >> Thanks, >> >> David >> >>> This will be a lot of work if it shouldn't be very inefficient. >>> >>> Richard. >>> >> be that while “-m” generates only the specialized version, “-mversion” >> will generate both the specialized and the generic versions. 
There is >> no need to explicity mark any function for versioning, no source >> changes. >> >> The compiler will decide if it is beneficial to multi-version a >> function based on heuristics using hotness information, code size >> growth, etc. >> >> >> Runtime support >> === >> >> In order for the compiler to generate multi-versioned code, it needs >> to call functions that would test if a particular feature exists or >> not at run-time. For example, IsPopcntSupported() would be one such >> function. I have prepared a patch to do this which adds the runtime >> support in libgcc and supports new builtins to test the various >> features. I will send the patch separately to keep the dicussions >> focused. >> >> >> Thoughts? > > Please focus on one mechanism and re-use existin
Re: regrename and odd behaviour with early clobber operands
On 16 August 2011 16:24, Richard Sandiford wrote:
> Ramana Radhakrishnan writes:
>> I can't see how it is right to construct essentially 2 chains for the
>> same register that have overlapping live ranges without an intervening
>> conditional branch and since regrename sort of works inside a bb .
>> Ideally the chain for 122 should have been terminated at the end of
>> 123 rather than allowing this to remain open and have the use in insn
>> 141 available for use in both chains starting at 122 and 140 . What
>> I'm not sure is which part of regrename makes sure that this part of
>> the comment for Stage 5 is ensured.
>>
>> `and earlier
>> chains they would overlap with must have been closed at
>> the previous insn at the latest, as such operands cannot
>> possibly overlap with any input operands. */'
>
> Just to summarise on-list what we talked about on IRC: this is supposed
> to happen through REG_DEAD notes.  The bug in this case appears to be
> that the required note is missing.
>
> The patch below seems to fix things.  If it's right, I'm very surprised
> we hadn't noticed until now.  There must be something else going on...

I did some digging yesterday afternoon, and from reading the code it
appears that to check for multiword-register uses you do need to test
the macro DF_MWS_REG_USE_P (mws).  I suspect that if you don't look at
that, you really aren't looking at which multiword registers an
instruction actually uses.

Interestingly, your patch managed to survive a bootstrap and testrun on
x86 with no regressions.

Ramana

>
> Richard
>
>
> Index: gcc/df-problems.c
> ===
> --- gcc/df-problems.c 2011-07-11 12:21:33.0 +0100
> +++ gcc/df-problems.c 2011-08-16 16:18:52.333237669 +0100
> @@ -3376,7 +3376,7 @@ df_note_bb_compute (unsigned int bb_inde
>        while (*mws_rec)
>         {
>           struct df_mw_hardreg *mws = *mws_rec;
> -         if ((DF_MWS_REG_DEF_P (mws))
> +         if ((DF_MWS_REG_USE_P (mws))
>               && !df_ignore_stack_reg (mws->start_regno))
>             {
>               bool really_add_notes = debug_insn != 0;
>
Re: Just what are rtx costs?
Hans-Peter Nilsson writes:
> On Wed, 17 Aug 2011, Richard Sandiford wrote:
>> It also means
>> that constants that are slightly more expensive than a register --
>> somewhere in the range [0, COSTS_N_INSNS (1)] -- end up seeming
>> cheaper than registers.
>
> Yes, perhaps some scale factor has to be applied to get
> reasonable cost granularity in an improved cost model for the
> job...  Some constants *are* more expensive (extra words and/or
> extra cycles), yet preferable to registers for one (or maybe two
> or...) insns.  You don't want to find that all insns except
> constant-loads suddenly use register arguments and no
> port-specific metric way to tweak it.  Mentioned for the record.

I was hoping that if the costs of SETs were "fixed", then that kind
of situation should work without any new scale factors.  E.g. two
constant moves of equal execution frequency that cost
COSTS_N_INSNS (1) + 1 (total COSTS_N_INSNS (2) + 2) shouldn't be
CSEd (COSTS_N_INSNS (3) + 1) unless the single constant move can go
in a less frequently-executed block.

> I don't think you can get into trouble for trying to improve
> this area for consistency: between releases a port already
> usually arbitrarily loses on some type of codes and costs have
> to be re-tweaked, unless performance is closely tracked.

Yeah, probably true.  Certainly a good excuse for me to use. :-)

> Aiming for traceability can only help (like, "read the added doc
> blurb on how to define the port RTX costs instead of gdb
> stepping and code inspection").

Yeah, that'd be nice.  In practice there's probably always going to be
a bit of experimentation needed.  E.g. getting the cost of multiplication
correct wrt shifts and adds can be tricky on a superscalar target.
(At least, it was when I last tried it too many years ago.)

Richard
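To spell out the CSE arithmetic in the example above, here is a minimal sketch. It reflects one reading of the numbers: without CSE each use performs its own constant move, and with CSE the constant is loaded into a register once and each use then needs a plain register move costing COSTS_N_INSNS (1). cost_without_cse and cost_with_cse are hypothetical helpers, not GCC functions, and execution frequency is ignored.

```c
#include <assert.h>

/* Same scaling as GCC's rtl.h: one simple instruction costs 4 units. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Each of the n_uses sites materializes the constant itself. */
static int cost_without_cse(int n_uses, int const_move_cost)
{
    assert(n_uses > 0);
    return n_uses * const_move_cost;
}

/* The constant is loaded once, then copied by a register move per use. */
static int cost_with_cse(int n_uses, int const_move_cost)
{
    assert(n_uses > 0);
    return const_move_cost + n_uses * COSTS_N_INSNS(1);
}
```

For two uses and a constant move costing COSTS_N_INSNS (1) + 1, the first helper gives COSTS_N_INSNS (2) + 2 and the second gives COSTS_N_INSNS (3) + 1, so under these assumptions CSE only pays off if the single constant load can be moved somewhere executed less often.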
PATCH for Re: GCC 4.6.2 Status Report (2011-08-17)
And here is the web page patch...

Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.809
diff -u -r1.809 index.html
--- index.html 2 Aug 2011 17:02:57 - 1.809
+++ index.html 18 Aug 2011 08:55:18 -
@@ -133,7 +133,7 @@
 Status:
- http://gcc.gnu.org/ml/gcc/2011-06/msg00341.html";>2011-06-27
+ http://gcc.gnu.org/ml/gcc/2011-08/msg00320.html";>2011-08-17
 (regression fixes and docs only).
[PATCH] for Re: New mirror
On Mon, 8 Aug 2011, Sergey Kutserey wrote:
> Hopefully you can add this mirror into public mirror list for GCC project.

Thanks, Sergey.  This is how I added your mirror to our list.

Gerald

Index: mirrors.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v
retrieving revision 1.210
diff -u -r1.210 mirrors.html
--- mirrors.html 25 Jul 2011 23:06:43 - 1.210
+++ mirrors.html 18 Aug 2011 09:07:44 -
@@ -52,6 +52,7 @@
 UK: ftp://ftp.mirrorservice.org/sites/sourceware.org/pub/gcc/";>ftp://ftp.mirrorservice.org/sites/sourceware.org/pub/gcc/, thanks to mirror at mirrorservice dot org
 UK, London: http://gcc-uk.internet.bs";>http://gcc-uk.internet.bs, thanks to Internet.bs (info at internet dot bs)
 US, Phoenix: http://fileboar.com/gcc/";>fileboar.com, thanks to Grigory Rayskin (rayskin73 at gmail dot com)
+US, Saint Louis: http://gcc.petsads.us";>http://gcc.petsads.us, thanks to Sergey Kutserey (s.kutserey at gmail dot com)
 US, Tampa: http://mirrors-us.seosue.com/gcc/";>http://mirrors-us.seosue.com/gcc/, thanks to Peter Vrzak (petervrzak at gmail dot com)
Lack of libstdc++ compatibility (was: Revision 176335)
On Tue, 2 Aug 2011, Jakub Jelinek wrote:
>> Revisions 176335 removed the traditional "#include " from
>> gthr-posix.h.  This breaks the build of many programs (Firefox, Chromium,
>> etc.) that implicitly rely on it.
> This isn't the first time the libstdc++ headers were cleaned up, and
> each time there are dozens of programs that need to be fixed up.  Each
> time they just were fixed.

I think that is a bit too much of a cavalier approach.

For example, right now I am in the process of changing the compiler
FreeBSD uses for Fortran packages from GCC 4.5 to 4.6 and 100% of
the issues we run into are not related to Fortran but...C++.

Breaking source code compatibility with a step like GCC 3 to GCC 4
is one thing, but constantly doing so with every release is just
painful and makes us look bad.  And it hinders the adoption of
newer releases.

Gerald
Re: Lack of libstdc++ compatibility (was: Revision 176335)
On Thu, Aug 18, 2011 at 11:34 AM, Gerald Pfeifer wrote:
> On Tue, 2 Aug 2011, Jakub Jelinek wrote:
>>> Revisions 176335 removed the traditional "#include " from
>>> gthr-posix.h.  This breaks the build of many programs (Firefox, Chromium,
>>> etc.) that implicitly rely on it.
>> This isn't the first time the libstdc++ headers were cleaned up, and
>> each time there are dozens of programs that need to be fixed up.  Each
>> time they just were fixed.
>
> I think that is a bit too much of a cavalier approach.

Because you ignore the fact that this change fixed a bug.

> For example, right now I am in the process of changing the compiler
> FreeBSD uses for Fortran packages from GCC 4.5 to 4.6 and 100% of
> the issues we run into are not related to Fortran but...C++.
>
> Breaking source code compatibility with a step like GCC 3 to GCC 4
> is one thing, but constantly doing so with every release is just
> painful and makes us look bad.  And it hinders the adoption of
> newer releases.

Such visible changes are less of an issue when adopting a new release
compared to hidden issues that only show up during testing.

Richard.

> Gerald
Re: Lack of libstdc++ compatibility
On 08/18/2011 11:42 AM, Richard Guenther wrote:
> On Thu, Aug 18, 2011 at 11:34 AM, Gerald Pfeifer wrote:
>> On Tue, 2 Aug 2011, Jakub Jelinek wrote:
>>>> Revisions 176335 removed the traditional "#include" from
>>>> gthr-posix.h.  This breaks the build of many programs (Firefox, Chromium,
>>>> etc.) that implicitly rely on it.
>>> This isn't the first time the libstdc++ headers were cleaned up, and
>>> each time there are dozens of programs that need to be fixed up.  Each
>>> time they just were fixed.
>>
>> I think that is a bit too much of a cavalier approach.
>
> Because you ignore the fact that this change fixed a bug.

Indeed, and actually a rather serious one, already reported multiple
times.  When, thanks to Jon's idea and Jakub's refinements, we were able
to fix it I was very happy!

Paolo.
Re: Lack of libstdc++ compatibility
On 18 August 2011 11:19, Paolo Carlini wrote:
> On 08/18/2011 11:42 AM, Richard Guenther wrote:
>>
>> On Thu, Aug 18, 2011 at 11:34 AM, Gerald Pfeifer wrote:
>>>
>>> On Tue, 2 Aug 2011, Jakub Jelinek wrote:
>>>>>
>>>>> Revisions 176335 removed the traditional "#include" from
>>>>> gthr-posix.h.  This breaks the build of many programs (Firefox,
>>>>> Chromium, etc.) that implicitly rely on it.
>>>>
>>>> This isn't the first time the libstdc++ headers were cleaned up, and
>>>> each time there are dozens of programs that need to be fixed up.  Each
>>>> time they just were fixed.
>>>
>>> I think that is a bit too much of a cavalier approach.
>>
>> Because you ignore the fact that this change fixed a bug.
>
> Indeed and actually a rather serious one, already reported multiple times.
> When thanks to Jon's idea and Jakub's refinements we have been able to fix it
> I was very happy!

As I've already explained, we rejected certain valid programs, and
accepted certain other invalid ones.  Now we accept those valid ones
and reject the invalid ones.  Which do you prefer?

Working around the GCC bug required renaming functions or variables to
avoid collisions, which is not even possible for third-party code.
Working around the new compiler failures involves adding a missing
header; even if it's third-party code that's wrong, you can include the
missing header before including the third-party ones.  Which is easier?

In many cases those invalid programs would have been rejected by other
compilers, which don't have the same namespace pollution problem that
GCC did.  If the change makes us look so bad that users switch to a
different compiler then they'll probably still have to fix those
invalid programs anyway.
Re: Lack of libstdc++ compatibility (was: Revision 176335)
On 18 August 2011 10:34, Gerald Pfeifer wrote:
> On Tue, 2 Aug 2011, Jakub Jelinek wrote:
>>> Revisions 176335 removed the traditional "#include " from
>>> gthr-posix.h.  This breaks the build of many programs (Firefox, Chromium,
>>> etc.) that implicitly rely on it.
>> This isn't the first time the libstdc++ headers were cleaned up, and
>> each time there are dozens of programs that need to be fixed up.  Each
>> time they just were fixed.
>
> I think that is a bit too much of a cavalier approach.
>
> For example, right now I am in the process of changing the compiler
> FreeBSD uses for Fortran packages from GCC 4.5 to 4.6 and 100% of
> the issues we run into are not related to Fortran but...C++.
>
> Breaking source code compatibility with a step like GCC 3 to GCC 4
> is one thing, but constantly doing so with every release is just
> painful and makes us look bad.  And it hinders the adoption of
> newer releases.

Personally I *like* it when a new release identifies portability
problems such as missing includes.  I consider it an advantage, and an
improvement in the compiler.
Re: i370 port
Hi Ulrich. I put in the following debug: op0 = find_replacement (&XEXP (in, 0)); op1 = find_replacement (&XEXP (in, 1)); /* Since constraint checking is strict, commutativity won't be checked, so we need to do that here to avoid spurious failure if the add instruction is two-address and the second operand of the add is the same as the reload reg, which is frequently the case. If the insn would be A = B + A, rearrange it so it will be A = A + B as constrain_operands expects. */ fprintf(stderr, "REGNO(out) is %d\n", REGNO(out)); fprintf(stderr, " REG in 1 is %d\n", REGNO(XEXP(in,1))); if (GET_CODE (XEXP (in, 1)) == REG && REGNO (out) == REGNO (XEXP (in, 1))) tem = op0, op0 = op1, op1 = tem; And it produced this output (for exactly the same code I showed you previously): C:\devel\pdos\s370>\devel\gcc\gcc\gccmvs -da -DUSE_MEMMGR -Os -DS390 -S -I . -I ../pdpclib pdos.c REGNO(out) is 3 REG in 1 is 32880 REGNO(out) is 2 REG in 1 is 32880 REGNO(out) is 2 REG in 1 is 32880 REGNO(out) is 2 REG in 1 is 112 REGNO(out) is 3 REG in 1 is 32880 REGNO(out) is 4 REG in 1 is 112 REGNO(out) is 2 REG in 1 is 112 which looks to me like it is not seeing a register, only a constant, so cannot perform a swap. Let me know if that is not the debugging required. Thanks. Paul. -Original Message- From: Ulrich Weigand Sent: Tuesday, August 16, 2011 11:25 PM To: Paul Edwards Cc: gcc@gcc.gnu.org Subject: Re: i370 port Paul Edwards wrote: >> Unfortunately it's not quite right, seemingly not loading R9 properly: >> >> LR9,13 >> AR9,13 >> MVC 0(10,9),0(2) > That's weird. What does the reload dump (.greg) say? I have trimmed the code down to a reasonably small size so that I could provide the .greg file (below) from the "-da" option. I don't know how to read it so I don't know if I've provided everything required. 
Here is the current problematic generated code: * Function pdosLoadExe code L 2,4(11) MVC 88(4,13),=A(LC0) ST2,92(13) LA1,88(,13) L 15,=V(PRINTF) BALR 14,15 LR3,13 <= probably wrong AR3,13 <= else this is wrong MVC 0(10,3),0(2) Reload decides on the following actions: Reloads for insn # 38 Reload 0: reload_in (SI) = (const_int 32880 [0x8070]) ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0) reload_in_reg: (const_int 32880 [0x8070]) reload_reg_rtx: (reg:SI 3 3) Reload 1: reload_in (SI) = (plus:SI (reg/f:SI 13 13) (const_int 32880 [0x8070])) ADDR_REGS, RELOAD_FOR_INPUT (opnum = 0) reload_in_reg: (plus:SI (reg/f:SI 13 13) (const_int 32880 [0x8070])) reload_reg_rtx: (reg:SI 3 3) That is, first: load the constant 32880 into register 3, and second: using that reloaded constant, compute the sum of register 13 plus 32880 and load the result also into register 3. Then, use that register for addressing. This leads to the following generated code: (insn 271 37 273 0 (set (reg:SI 3 3) (const_int 32880 [0x8070])) 15 {movsi} (nil) (nil)) Load constant into register 3. (insn 273 271 274 0 (set (reg:SI 3 3) (reg/f:SI 13 13)) 15 {movsi} (nil) (nil)) (insn 274 273 38 0 (set (reg:SI 3 3) (plus:SI (reg:SI 3 3) (reg:SI 3 3))) 41 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 13 13) (reg:SI 3 3)) (nil))) Compute the sum. Note that this code is wrong. (insn 38 274 41 0 (parallel [ (set (mem/s:BLK (reg:SI 3 3) [6 srchprog+0 S10 A64]) (mem:BLK (reg/v/f:SI 2 2 [orig:27 prog ] [27]) [0 S10 A8])) (use (const_int 10 [0xa])) ]) 25 {*i370.md:1623} (insn_list 37 (nil)) (nil)) Use register 3 for adressing. The wrong code comes in when generating the sum (insns 273/274). 
I would have expected this to be a simple addsi3 instruction, along the lines of (set (reg:SI 3 3) (plus:SI (reg:SI 3 3) (reg:SI 13 13))) Note that the incoming pattern: (set (reg:SI 3 3) (plus:SI (reg:SI 13 13) (reg:SI 3 3))) cannot be immediately resolved, since addsi3 requires the first operand of the plus to match the result. However, this could have been fixed by just swapping the operands. Instead, the code attempts to create the match by reloading the first operand (reg 13) into the output (reg 3) -- this is bogus, since it thereby clobbers the *second* input operand, which happens to match the output. The code that generates these insns is in reload1.c:gen_reload /* We need to compute the sum of a register or a MEM and another register, constant, or MEM, and put it into the reload register. The best possible way of doing this is if the machine has a three-operand ADD insn that accepts the required operands. The simplest a
Build report gcc 4.6.1 on Sparc Solaris 10
Hello, I picked up gcc-4.6.1 and startet a build process on a sparc-solaris10 box with /opt/sfw and SolStudio 12.2 installed. I am using mpc-0.9, mpfr-3.0.1 and gmp-5.0.2 which I extracted in the gcc source directory and created a link as stated in the installation instruction/prerequisites. I also made sure the /usr/ucb is NOT in my path! Here are my results: Trying to build using the installed gcc - ssol10% /opt/sfw/bin/gcc --version gcc (GCC) 3.4.2 Copyright (C) 2004 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. failed for various reasons, so I tried to use the native cc: ssol10% /usr/bin/cc -V cc: Sun C 5.11 SunOS_sparc 2010/08/13 I ran into huge problems, so I installed binutils 2.21.1, make-3.82 and some other stuff like flex, bison, less all in the latest versions and compiled with native cc. Finally the build of the multilibs failed so I decided to give GNU as/ld from binutils a chance, so I configured with the following command: ../configure --prefix=/usr2/gnu --enable-languages=c,c++ CC=/usr/bin/cc --with-gnu-as --with-as=/usr2/gnu/bin/gas --with-gnu-ld --with-ld=/usr2/gnu/bin/gld Do not be irritated when this detects executables in $prefix/libexec/sparc=sun-solaris2.10/bin (as, ld, nm,...) - they have been installed there from binutils. Doing a make stops when building in lto-plugin: make[4]: Entering directory `/archive/sparc-solaris/gcc-4.6.1/ssol/lto-plugin' /bin/bash ./libtool --tag=CC --tag=disable-static --mode=compile /usr/bin/cc -DHAVE_CONFIG_H -I. -I../../lto-plugin -I../../lto-plugin/../include -DHAVE_CONFIG_H -Wall -Werror -g -c -o lto-plugin.lo ../../lto-plugin/lto-plugin.c libtool: compile: /usr/bin/cc -DHAVE_CONFIG_H -I. 
-I../../lto-plugin -I../../lto-plugin/../include -DHAVE_CONFIG_H -Wall -Werror -g -c ../../lto-plugin/lto-plugin.c -KPIC -DPIC -o .libs/lto-plugin.o cc: -W option with unknown program all To fix this, I edited lto-plugin/Makefile - yes this is generated. I changed AM_CFLAGS = -Wall -Werror into AM_CFLAGS = Indeed, make it empty since SUN ld used by /usr/bin/cc) does not understand these options. Restarting the make will take some time and give another error in subdirectory gcc: /usr/bin/cc -g -DIN_GCC -DHAVE_CONFIG_H -o cc1 c-lang.o c-family/stub-objc.o attribs.o c-errors.o c-decl.o c-typeck.o c-convert.o c-aux-info.o c-objc-common.o c-parser.o tree-mudflap.o c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o sol2-c.o \ cc1-checksum.o main.o tree-browser.o libbackend.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a ../libcpp/libcpp.a ./../intl/libintl.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -L/archive/sparc-solaris/gcc-4.6.1/ssol/./gmp/.libs -L/archive/sparc-solaris/gcc-4.6.1/ssol/./mpfr/.libs -L/archive/sparc-solaris/gcc-4.6.1/ssol/./mpc/src/.libs -lmpc -lmpfr -lgmp -L../zlib -lz Undefined first referenced symbol in file cimag /archive/sparc-solaris/gcc-4.6.1/ssol/./mpc/src/.libs/libmpc.a(set_x.o) creal /archive/sparc-solaris/gcc-4.6.1/ssol/./mpc/src/.libs/libmpc.a(set_x.o) cimagl /archive/sparc-solaris/gcc-4.6.1/ssol/./mpc/src/.libs/libmpc.a(set_x.o) creall /archive/sparc-solaris/gcc-4.6.1/ssol/./mpc/src/.libs/libmpc.a(set_x.o) ld: fatal: Symbol referencing errors. 
No output written to cc1 make[3]: *** [cc1] Error 2 make[3]: Leaving directory `/archive/sparc-solaris/gcc-4.6.1/ssol/gcc' make[2]: *** [all-stage1-gcc] Error 2 make[2]: Leaving directory `/archive/sparc-solaris/gcc-4.6.1/ssol' make[1]: *** [stage1-bubble] Error 2 make[1]: Leaving directory `/archive/sparc-solaris/gcc-4.6.1/ssol' make: *** [all] Error 2 This can be fixed by adding '-lm' to HOST_LIBS in gcc/Makefile: # Libraries to use on the host. #HOST_LIBS = HOST_LIBS = -lm Then I restarted the 'make' again and it took quite some time (hours on my machine) I ran into another problem with mpc-.0: libtool: compile: /archive/sparc-solaris/gcc-4.6.1/ssol/./prev-gcc/xgcc -B/archive/sparc-solaris/gcc-4.6.1/ssol/./prev-gcc/ -B/usr2/gnu/sparc-sun-solaris2.10/bin/ -B/usr2/gnu/sparc-sun-solaris2.10/bin/ -B/usr2/gnu/sparc-sun-solaris2.10/lib/ -isystem /usr2/gnu/sparc-sun-solaris2.10/include -isystem /usr2/gnu/sparc-sun-solaris2.10/sys-include -DHAVE_CONFIG_H -I. -I../../../mpc/src -I.. -I/archive/sparc-solaris/gcc-4.6.1/ssol/./gmp -I/archive/sparc-solaris/gcc-4.6.1/mpfr -g -O2 -MT get.lo -MD -MP -MF .deps/get.Tpo -c ../../../mpc/src/get.c -o get.o ../../../mpc/src/get.c: In function ‘mpc_get_dc’: ../../../mpc/src/get.c:33:11: error: ‘I’ undeclared (first use in this function) ../../../mpc/src/get.c:33:11: note: each undeclared identifier is reported only once for each function it appears in ../../../mpc/src/ge
Re: i370 port
Paul Edwards wrote:

> Hi Ulrich. I put in the following debug:
>
>       op0 = find_replacement (&XEXP (in, 0));
>       op1 = find_replacement (&XEXP (in, 1));
>
>       /* Since constraint checking is strict, commutativity won't be
>          checked, so we need to do that here to avoid spurious failure
>          if the add instruction is two-address and the second operand
>          of the add is the same as the reload reg, which is frequently
>          the case. If the insn would be A = B + A, rearrange it so
>          it will be A = A + B as constrain_operands expects. */
>
>       fprintf(stderr, "REGNO(out) is %d\n", REGNO(out));
>       fprintf(stderr, " REG in 1 is %d\n", REGNO(XEXP(in,1)));
>       if (GET_CODE (XEXP (in, 1)) == REG
>           && REGNO (out) == REGNO (XEXP (in, 1)))
>         tem = op0, op0 = op1, op1 = tem;
>
> And it produced this output (for exactly the same code I showed
> you previously):
>
> C:\devel\pdos\s370>\devel\gcc\gcc\gccmvs -da -DUSE_MEMMGR -Os -DS390 -S -I
> . -I ../pdpclib pdos.c
> REGNO(out) is 3
> REG in 1 is 32880
> REGNO(out) is 2
> REG in 1 is 32880
> REGNO(out) is 2
> REG in 1 is 32880
> REGNO(out) is 2
> REG in 1 is 112
> REGNO(out) is 3
> REG in 1 is 32880
> REGNO(out) is 4
> REG in 1 is 112
> REGNO(out) is 2
> REG in 1 is 112
>
> which looks to me like it is not seeing a register, only a constant,
> so cannot perform a swap.

Oops, there's clearly a bug here. "in" at this point is the original expression that has not yet been reloaded, so its second operand will indeed be a constant, not a register. However, reload has already decided that this constant will end up being replaced by a register, and that is what the "find_replacement" call is checking.

So at this point in the program, XEXP (in, 1) will be the constant, but op1 will be the register it is going to be replaced with. Unfortunately the test whether to swap looks at XEXP (in, 1) -- it really needs to look at op1 instead.

Can you try changing the lines

      if (GET_CODE (XEXP (in, 1)) == REG
          && REGNO (out) == REGNO (XEXP (in, 1)))

to

      if (GET_CODE (op1) == REG
          && REGNO (out) == REGNO (op1))

instead?

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
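The difference between testing the original operand and testing its replacement can be sketched with a toy model. This is only an illustration: `toy_rtx`, `toy_code` and `toy_should_swap` are hypothetical stand-ins invented here, not GCC's real `rtx`, `GET_CODE` or `REGNO`.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for RTL expression codes (hypothetical, not GCC's). */
enum toy_code { TOY_REG, TOY_CONST_INT };

struct toy_rtx {
  enum toy_code code;
  int value; /* register number for TOY_REG, constant for TOY_CONST_INT */
};

/* Swap test: true only if the operand is a register with the same
   number as the output register.  The result depends entirely on
   which operand is passed in: the original one or its replacement. */
static bool toy_should_swap(const struct toy_rtx *out,
                            const struct toy_rtx *operand)
{
  assert(out->code == TOY_REG);
  return operand->code == TOY_REG && out->value == operand->value;
}
```

In this model, with an output register 3, an original second operand that is the constant 32880, and a replacement that is register 3, the check on the original operand never requests a swap, while the check on the replacement does.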
[GSOC] code contribution + documentation
Hello,

As GSOC is approaching its end, I would like some clarification on how a project's results should be made available. This is a fairly general question, although I guess the answer depends heavily on each project. In my case, although I made some contributions to GCC and MELT, the main part of my project is a plugin. For now it is hosted in a Gitorious repository (https://gitorious.org/talpo). As it uses MELT, it will be integrated into the MELT branch as an example tool (I am going to discuss this with Basile Starynkevitch next week).

However, I would like to know whether it makes sense to have documentation for, and/or a link to, Talpo on the GCC wiki. I guess there should at least be a link on the page listing existing plugins; I will add it. At the beginning of GSOC, I created a wiki page about the project because I wanted visibility: http://gcc.gnu.org/wiki/CustomizableWarningPlugin. It is rather outdated now; I should either turn it into complete documentation or delete it. Which makes sense? I know plugins are often considered external; on the other hand, this page could carry useful information about what a plugin can do, how I used the GIMPLE representation, and so on (it might be useful to GCC newcomers and to people interested in the plugin).

Thanks!

Pierre Vittet

PS: If you want to know more about my plugin:
http://gcc.gnu.org/ml/gcc/2011-08/msg00251.html
https://gitorious.org/talpo/talpo/blobs/master/README
Re: An unusual x86_64 code model
Hi,

On Wed, 17 Aug 2011, Jed Davis wrote:

> One thing I'm not so sure about is accepting any SYMBOLIC_CONST as a
> legitimate address. That allows, for example, a symbol address cast
> to uintptr_t and added to (6ULL << 32), which will never fit. On the
> other hand, -fPIC allows offsets of up to +/- 16MiB for some unexplained
> reason,

The x86-64 ABI specifies this. All symbols have to be located between 0x0 and 2^31-2^24-1, so that everything in memory objects of length less than 2^24 can be addressed directly. Otherwise only the base address of symbols would be addressable directly and any offsetted variant would have to be calculated explicitly. If it weren't for this provision, given this code:

  char arr[4096];   /* global */
  char f () { return arr[2]; }

the load couldn't use arr+2 directly, as that might not fit into 32 bits anymore.

Similar things are true for the small PIC models, including your new one. That is, as long as symbols are always at most 2^31-2^24-1 away from all ends of referring instructions, you can happily accept offsets between +-2^24.

Ciao,
Michael.
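The arithmetic behind these bounds can be checked with a short sketch. It assumes the absolute-addressing case, where symbol plus offset must fit a sign-extended 32-bit field; the names `SYM_MAX`, `OFF_LIMIT`, `fits_simm32` and `symbol_plus_offset_ok` are made up for this illustration and are not taken from GCC or the ABI document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Highest symbol address allowed by the rule quoted above. */
#define SYM_MAX   ((INT64_C(1) << 31) - (INT64_C(1) << 24) - 1)
/* Largest offset magnitude accepted next to a symbol. */
#define OFF_LIMIT (INT64_C(1) << 24)

/* True if v is representable as a signed 32-bit value. */
static bool fits_simm32(int64_t v)
{
  return v >= INT32_MIN && v <= INT32_MAX;
}

/* Accept symbol+offset only when the offset is within +-2^24; under
   the symbol placement rule the sum then always fits in 32 bits. */
static bool symbol_plus_offset_ok(int64_t sym, int64_t off)
{
  assert(sym >= 0 && sym <= SYM_MAX);
  if (off < -OFF_LIMIT || off > OFF_LIMIT)
    return false;
  return fits_simm32(sym + off);
}
```

The worst case is a symbol at 2^31-2^24-1 with an offset of 2^24, which sums to 2^31-1, the largest signed 32-bit value. An offset such as 6ULL << 32 from the quoted question is rejected by the range check.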
Re: i370 port
Well done! That generated sensible code:

         L     15,=V(PRINTF)
         BALR  14,15
         L     3,=F'32880'
         AR    3,13
         MVC   0(10,3),0(2)

I still have the other knock-on effects from when I did this though:

C:\devel\gcc\gcc\config\i370>cvs diff i370.h
Index: i370.h
===
RCS file: c:\cvsroot/gcc/gcc/config/i370/i370.h,v
retrieving revision 1.17
diff -r1.17 i370.h
599a600,602
#define EXTRA_MEMORY_CONSTRAINT(C, STR) \
 ((C) == 'S')

(like the 8 byte move from F'0'). I'll do my own investigation of that and report that later.

BFN. Paul.

-----Original Message-----
From: Ulrich Weigand
Sent: Thursday, August 18, 2011 11:14 PM
To: Paul Edwards
Cc: gcc@gcc.gnu.org
Subject: Re: i370 port

Paul Edwards wrote:

Hi Ulrich. I put in the following debug:

      op0 = find_replacement (&XEXP (in, 0));
      op1 = find_replacement (&XEXP (in, 1));

      /* Since constraint checking is strict, commutativity won't be
         checked, so we need to do that here to avoid spurious failure
         if the add instruction is two-address and the second operand
         of the add is the same as the reload reg, which is frequently
         the case. If the insn would be A = B + A, rearrange it so
         it will be A = A + B as constrain_operands expects. */

      fprintf(stderr, "REGNO(out) is %d\n", REGNO(out));
      fprintf(stderr, " REG in 1 is %d\n", REGNO(XEXP(in,1)));
      if (GET_CODE (XEXP (in, 1)) == REG
          && REGNO (out) == REGNO (XEXP (in, 1)))
        tem = op0, op0 = op1, op1 = tem;

And it produced this output (for exactly the same code I showed you previously):

C:\devel\pdos\s370>\devel\gcc\gcc\gccmvs -da -DUSE_MEMMGR -Os -DS390 -S -I . -I ../pdpclib pdos.c
REGNO(out) is 3
REG in 1 is 32880
REGNO(out) is 2
REG in 1 is 32880
REGNO(out) is 2
REG in 1 is 32880
REGNO(out) is 2
REG in 1 is 112
REGNO(out) is 3
REG in 1 is 32880
REGNO(out) is 4
REG in 1 is 112
REGNO(out) is 2
REG in 1 is 112

which looks to me like it is not seeing a register, only a constant, so cannot perform a swap.

Oops, there's clearly a bug here. "in" at this point is the original expression that has not yet been reloaded, so its second operand will indeed be a constant, not a register. However, reload has already decided that this constant will end up being replaced by a register, and that is what the "find_replacement" call is checking.

So at this point in the program, XEXP (in, 1) will be the constant, but op1 will be the register it is going to be replaced with. Unfortunately the test whether to swap looks at XEXP (in, 1) -- it really needs to look at op1 instead.

Can you try changing the lines

      if (GET_CODE (XEXP (in, 1)) == REG
          && REGNO (out) == REGNO (XEXP (in, 1)))

to

      if (GET_CODE (op1) == REG
          && REGNO (out) == REGNO (op1))

instead?

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
Re: [named address] ice-on-valid: in postreload.c:reload_cse_simplify_operands
Georg-Johann Lay wrote:

> http://gcc.gnu.org/ml/gcc/2011-08/msg00131.html
>
> Are you going to install that patch? Or maybe you already installed it?

No, it isn't approved yet (in fact, it isn't even posted for approval). Usually, patches that add new target macros, or new arguments to target macros, but do not actually add any *exploiter* of the new features, are frowned upon ...

Thus, I'd prefer to wait until you have a patch ready that exploits this in the AVR backend, and then submit the whole series.

> Then, I wonder how the following named AS code translates:
>
> int a = *((__as int*) 1000);
>
> As const_int don't have a machmode (yet), I guess the above line just
> reads from generic AS and reading from a specific address from named AS
> cannot work.

This should work fine. Address space processing doesn't really depend on the machine mode; the address space is annotated to the MEM RTX directly. Code like the above ought to generate a MEM with either an immediate CONST_INT operand or one with the immediate loaded into a register, depending on what the target supports, but in either case the MEM_ADDR_SPACE will be set correctly. It's then up to the target to implement the access as appropriate.

> Moreover, INDEX_REG_CLASS, REGNO_OK_FOR_INDEX_P, HAVE_PRE_INCREMENT et
> al. and maybe others are AS-dependent.

I agree for INDEX_REG_CLASS and REGNO_OK_FOR_INDEX_P. In fact, I'd suggest they should first be converted to look similar to base registers (i.e. add MODE_CODE_INDEX_REG_CLASS and REGNO_MODE_CODE_OK_FOR_INDEX_P) and then add address space support to those extended macros, so as to avoid having to change lots of back-ends.

Do you need this for AVR? I could add that to the patch I posted previously ...

Now for HAVE_PRE_INCREMENT, I don't think we need anything special. This is used just as a short-cut to bypass pre-increment handling completely on targets that don't support it at all. On targets that *do*, there will always be additional requirements on just which memory accesses support pre-increment. Therefore, the middle-end will still always check the target's legitimate_address callback to ensure any particular pre-incremented memory access is actually valid. This mechanism can already look at the address space to make its decision ...

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
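The two-level check described for HAVE_PRE_INCREMENT might be pictured roughly as follows. Everything in this sketch is hypothetical: the `toy_` names are invented, and the rule that only the generic address space supports pre-increment is an assumption chosen purely for illustration, not a statement about AVR or any real target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical address spaces and address forms for a toy target. */
enum toy_addr_space { TOY_AS_GENERIC, TOY_AS_NAMED };
enum toy_addr_form { TOY_ADDR_REG, TOY_ADDR_PRE_INC };

/* Coarse, target-wide switch, analogous in spirit to HAVE_PRE_INCREMENT. */
#define TOY_HAVE_PRE_INCREMENT 1

/* Toy legitimate-address check that also sees the address space.
   Assumption for illustration only: pre-increment is valid in the
   generic address space but not in the named one. */
static bool toy_legitimate_address_p(enum toy_addr_form form,
                                     enum toy_addr_space as)
{
  assert(as == TOY_AS_GENERIC || as == TOY_AS_NAMED);
  if (form == TOY_ADDR_PRE_INC)
    return as == TOY_AS_GENERIC;
  return true;
}

/* The coarse switch only short-cuts the "never supported" case; the
   per-access decision is still made by the address check. */
static bool toy_pre_inc_usable(enum toy_addr_space as)
{
  if (!TOY_HAVE_PRE_INCREMENT)
    return false;
  return toy_legitimate_address_p(TOY_ADDR_PRE_INC, as);
}
```

The point of the sketch is that the global switch does not need to know about address spaces, because the per-access check already receives the address space and can refuse individual accesses.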
Re: Fwd: C6X fails to build in FSF mainline
On 08/17/2011 06:45 PM, Andrew Pinski wrote:
> gcc/libgcc2.c: In function ‘__gnu_mulsc3’:
> gcc/libgcc2.c:1928:1: internal compiler error: in scan_trace, at
> dwarf2cfi.c:2433
> Please submit a full bug report,
>
> I assume that it is because the C6X has more than one delay slot ?

Ug. I knew c6x has more than 1 delay slot (indeed, 5).
But I'd hoped that they were not annulled.

r~
Re: Build report gcc 4.6.1 on Sparc Solaris 10
> I picked up gcc-4.6.1 and started a build process on a sparc-solaris10
> box with /opt/sfw and SolStudio 12.2 installed. I am using mpc-0.9,
> mpfr-3.0.1 and gmp-5.0.2 which I extracted in the gcc source directory
> and created a link as stated in the installation
> instruction/prerequisites. I also made sure that /usr/ucb is NOT in my path!

The installation instructions for SPARC also recommend specific versions of the GMP, MPFR and MPC libraries. And building with the Sun compiler is probably not very well tested indeed.

The compiler is known to build flawlessly on this platform with the right libraries and GCC as base compiler.

--
Eric Botcazou
Re: Fwd: C6X fails to build in FSF mainline
On 08/18/2011 08:16 AM, Richard Henderson wrote:
> On 08/17/2011 06:45 PM, Andrew Pinski wrote:
>> gcc/libgcc2.c: In function ‘__gnu_mulsc3’:
>> gcc/libgcc2.c:1928:1: internal compiler error: in scan_trace, at
>> dwarf2cfi.c:2433
>> Please submit a full bug report,
>>
>> I assume that it is because the C6X has more than one delay slot ?
>
> Ug. I knew c6x has more than 1 delay slot (indeed, 5).
> But I'd hoped that they were not annulled.

... and it doesn't. The problem is that RTL_CONST_CALL_P and INSN_ANNULLED_BRANCH_P use the same bit. Sigh.

r~
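How two accessors sharing one flag bit can produce a false "annulled" reading is easy to reproduce in a toy model. The struct and macros below are hypothetical stand-ins, not the definitions from GCC's rtl.h, and the guarded variant is just one conceivable way to disambiguate, not necessarily what was done in GCC.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_insn_kind { TOY_CALL_INSN, TOY_JUMP_INSN };

/* Toy instruction with a single flag bit reused for two meanings. */
struct toy_insn {
  enum toy_insn_kind kind;
  unsigned shared_flag : 1;
};

/* Both accessors read the same bit (hypothetical macros). */
#define TOY_CONST_CALL_P(i)      ((i)->shared_flag)
#define TOY_ANNULLED_BRANCH_P(i) ((i)->shared_flag)

/* Unguarded query: a const call is misread as an annulled branch. */
static bool toy_annulled_unguarded(const struct toy_insn *i)
{
  assert(i);
  return TOY_ANNULLED_BRANCH_P(i);
}

/* Guarded query: only trust the bit on jump instructions. */
static bool toy_annulled_guarded(const struct toy_insn *i)
{
  assert(i);
  return i->kind == TOY_JUMP_INSN && TOY_ANNULLED_BRANCH_P(i);
}
```

Setting the "const call" flag on a call instruction makes the unguarded query report an annulled branch, even though nothing in the model ever marked a branch as annulled.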
Re: Build report gcc 4.6.1 on Sparc Solaris 10
On Thu, 18 Aug 2011, Wolfgang S. Kechel wrote:

> I ran into huge problems, so I installed binutils 2.21.1, make-3.82 and
> some other stuff like flex, bison, less all in the latest versions and
> compiled with native cc. Finally the build of the multilibs failed so I
> decided to give GNU as/ld from binutils a chance, so I configured with
> the following command:
>
> ../configure --prefix=/usr2/gnu --enable-languages=c,c++ CC=/usr/bin/cc --with-gnu-as --with-as=/usr2/gnu/bin/gas --with-gnu-ld --with-ld=/usr2/gnu/bin/gld

You know that the documentation recommends using the Sun linker (and on sparc the Sun assembler is fine), so if there are errors, you should report them.

> To fix this, I edited lto-plugin/Makefile - yes this is generated. I
> changed
> AM_CFLAGS = -Wall -Werror
> into
> AM_CFLAGS =

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49907
Already fixed on trunk, not backported (yet).

> Undefined                       first referenced
>  symbol                             in file
> cimag                               /archive/sparc-solaris/gcc-4.6.1/ssol/./mpc/src/.libs/libmpc.a(set_x.o)

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49908
Not fixed yet. There is also a RFE at Oracle to inline those functions.

> ../../../mpc/src/get.c: In function ‘mpc_get_dc’:
> ../../../mpc/src/get.c:33:11: error: ‘I’ undeclared (first use in this function)
> ../../../mpc/src/get.c:33:11: note: each undeclared identifier is reported only once for each function it appears in
> ../../../mpc/src/get.c: In function ‘mpc_get_ldc’:
> ../../../mpc/src/get.c:39:11: error: ‘I’ undeclared (first use in this function)

Strange, this is expected with gcc-3.4 (although there is a workaround in the development version of mpc) but not with gcc-4.6.

--
Marc Glisse
gcc-4.5-20110818 is now available
Snapshot gcc-4.5-20110818 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110818/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 177884

You'll find:

 gcc-4.5-20110818.tar.bz2             Complete GCC

  MD5=66fda963e1b131c040e323a2ddf5c081
  SHA1=11dff30995b6d56d156ccd6a0f414b7f7f120a8d

Diffs from 4.5-20110811 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.