New C++ IPA fails
Hi,

is somebody already working on the regressions which appeared yesterday, see:

https://gcc.gnu.org/ml/gcc-testresults/2014-05/msg01920.html

ie:

FAIL: g++.dg/ipa/devirt-15.C -std=gnu++98 scan-ipa-dump devirt "Speculatively devirtualizing call"
FAIL: g++.dg/ipa/devirt-15.C -std=gnu++11 scan-ipa-dump devirt "Speculatively devirtualizing call"
FAIL: g++.dg/ipa/devirt-15.C -std=gnu++1y scan-ipa-dump devirt "Speculatively devirtualizing call"
FAIL: g++.dg/ipa/devirt-16.C -std=gnu++98 scan-ipa-dump whole-program "Devirtualizing"
FAIL: g++.dg/ipa/devirt-16.C -std=gnu++11 scan-ipa-dump whole-program "Devirtualizing"
FAIL: g++.dg/ipa/devirt-16.C -std=gnu++1y scan-ipa-dump whole-program "Devirtualizing"
FAIL: g++.dg/ipa/devirt-17.C -std=gnu++98 scan-ipa-dump whole-program "Devirtualizing"
FAIL: g++.dg/ipa/devirt-17.C -std=gnu++11 scan-ipa-dump whole-program "Devirtualizing"
FAIL: g++.dg/ipa/devirt-17.C -std=gnu++1y scan-ipa-dump whole-program "Devirtualizing"
FAIL: g++.dg/ipa/devirt-26.C -std=gnu++98 scan-ipa-dump devirt "Speculatively devirtualizing"
FAIL: g++.dg/ipa/devirt-26.C -std=gnu++11 scan-ipa-dump devirt "Speculatively devirtualizing"
FAIL: g++.dg/ipa/devirt-26.C -std=gnu++1y scan-ipa-dump devirt "Speculatively devirtualizing"
FAIL: g++.dg/ipa/imm-devirt-1.C -std=gnu++98 scan-tree-dump fre1 "Replacing call target with foo"
FAIL: g++.dg/ipa/imm-devirt-1.C -std=gnu++11 scan-tree-dump fre1 "Replacing call target with foo"
FAIL: g++.dg/ipa/imm-devirt-1.C -std=gnu++1y scan-tree-dump fre1 "Replacing call target with foo"
FAIL: g++.dg/ipa/imm-devirt-2.C -std=gnu++98 scan-tree-dump fre1 "Replacing call target"
FAIL: g++.dg/ipa/imm-devirt-2.C -std=gnu++11 scan-tree-dump fre1 "Replacing call target"
FAIL: g++.dg/ipa/imm-devirt-2.C -std=gnu++1y scan-tree-dump fre1 "Replacing call target"
FAIL: g++.dg/tree-ssa/pr8781.C -std=gnu++98 scan-tree-dump fre1 "Replacing call target with f"
FAIL: g++.dg/tree-ssa/pr8781.C -std=gnu++11 scan-tree-dump fre1 "Replacing call target with f"
FAIL: g++.dg/tree-ssa/pr8781.C -std=gnu++1y scan-tree-dump fre1 "Replacing call target with f"

Thanks!
Paolo.

PS: by the way, I have been also seeing the same guality fails that H.J. sees:

FAIL: g++.dg/guality/pr55665.C -O2 line 23 p == 40
FAIL: g++.dg/guality/pr55665.C -O3 -fomit-frame-pointer line 23 p == 40
FAIL: g++.dg/guality/pr55665.C -O3 -g line 23 p == 40

but these are very old, XFAIL maybe?!?
Re: soft-fp functions support without using libgcc
On 21 May 2014 14:13, Sheheryar Zahoor Qazi wrote:
>>> Building libgcc is not optional. It is required for all targets.
>
> So, irrespective of whether I provide floating point implementation by
> soft-fp, fpu-bit or ieeelib, an error free libgcc build is a MUST?
>
> What if I don't want to generate calls to libgcc.a but want gcc to
> generate inline code?

While this is not possible for all calls, a lot of library calls can be avoided or emitted with a custom ABI by having a suitable expander in the .md file that emits whatever you want. E.g., several of the ARC subtargets/multilibs emit inline code for the simpler soft-fp functions, and custom calls to optimized assembler for medium complexity operations. You should really read md.texi and look at optabs.def to get a glimpse of the code generation customization potential of GCC.
Re: New C++ IPA fails
On Thu, May 22, 2014 at 10:49 AM, Paolo Carlini wrote:
> Hi,
>
> is somebody already working on the regressions which appeared yesterday,
> see:

David, did you forget to run the testsuite?

Richard.
Re: Zero/Sign extension elimination using value ranges
On 21/05/14 17:05, Jakub Jelinek wrote: > On Wed, May 21, 2014 at 12:53:47PM +1000, Kugan wrote: >> On 20/05/14 16:52, Jakub Jelinek wrote: >>> On Tue, May 20, 2014 at 12:27:31PM +1000, Kugan wrote: 1. Handling NOP_EXPR or CONVERT_EXPR that are in the IL because they are required for type correctness. We have two cases here: A) Mode is smaller than word_mode. This is usually from where the zero/sign extensions are showing up in final assembly. For example : int = (int) short which usually expands to (set (reg:SI ) (sext:SI (subreg:HI (reg:SI We can expand this (set (reg:SI ) (((reg:SI If following is true: 1. Value stored in RHS and LHS are of the same signedness 2. Type can hold the value. i.e., In cases like char = (char) short, we check that the value in short is representable char type. (i.e. look at the value range in RHS SSA_NAME and see if that can be represented in types of LHS without overflowing) Subreg here is not a paradoxical subreg. We are removing the subreg and zero/sign extend here. I am assuming here that QI/HI registers are represented in SImode (basically word_mode) with zero/sign extend is used as in (zero_extend:SI (subreg:HI (reg:SI 117)). >>> >>> Wouldn't it be better to just set proper flags on the SUBREG based on value >>> range info (SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_P)? >>> Then not only the optimizers could eliminate in zext/sext when possible, but >>> all other optimizations could benefit from that. >> >> Thanks for the comments. Here is an attempt (attached) that sets >> SUBREG_PROMOTED_VAR_P based on value range into. Is this the good place >> to do this ? > > But you aren't setting it in your patch in any way, you are just resetting > it instead. The thing is, start with a testcase where you get that > (subreg:HI (reg:SI)) as the RTL of some SSA_NAME (is that the case on ARM?, > I believe on e.g. 
i?86/x86_64 you'd just get (reg:HI) instead and thus you > can't take advantage of that), and at that point where it is created check > the range info and if it is properly sign or zero extended, set > SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_SET. Here is another attempt (a quick hack patch is attached). Is this a reasonable direction? I think I will have to look for other places where SUBREG_PROMOTED_UNSIGNED_P are used for possible optimisations. Before that I want to make sure I am on the right track. > Note that right now we use 2 bits for the latter, which encode values > -1 (weirdo pointer extension), 0 (sign extension), 1 (zero extension). > Perhaps it would be nice to allow encoding value 2 (zero and sign extension) > for cases where the range info tells you that the value is both zero and > sign extended (i.e. minimum of range is >= 0 and maximum is <= signed type > maximum). Do you suggest changing rtx_def to achieve this like the following to be able to store 2 in SUBREG_PROMOTED_UNSIGNED_SET? probably not. 
-  unsigned int unchanging : 1;
+  unsigned int unchanging : 2;

Thanks,
Kugan

diff --git a/gcc/expr.c b/gcc/expr.c
index 2868d9d..15183fa 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -328,7 +328,8 @@ convert_move (rtx to, rtx from, int unsignedp)
   if (GET_CODE (from) == SUBREG && SUBREG_PROMOTED_VAR_P (from)
       && (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (from)))
           >= GET_MODE_PRECISION (to_mode))
-      && SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp)
+      && (SUBREG_PROMOTED_UNSIGNED_P (from) == 2
+          || SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp))
     from = gen_lowpart (to_mode, from), from_mode = to_mode;
 
   gcc_assert (GET_CODE (to) != SUBREG || !SUBREG_PROMOTED_VAR_P (to));
@@ -9195,6 +9196,51 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
+static bool
+is_value_extended (tree lhs, enum machine_mode rhs_mode, bool rhs_uns)
+{
+  wide_int type_min, type_max;
+  wide_int min, max;
+  unsigned int prec;
+  tree lhs_type;
+  bool lhs_uns;
+
+  if (TREE_CODE (lhs) != SSA_NAME)
+    return false;
+
+  lhs_type = lang_hooks.types.type_for_mode (rhs_mode, rhs_uns);
+  lhs_uns = TYPE_UNSIGNED (TREE_TYPE (lhs));
+
+  /* We remove extension for integrals.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  /* Get the value range.  */
+  if (POINTER_TYPE_P (TREE_TYPE (lhs))
+      || get_range_info (lhs, &min, &max) != VR_RANGE)
+    return false;
+
+  prec = min.get_precision ();
+  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, TYPE_SIGN (lhs_type));
+  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, TYPE_SIGN (lhs_type));
+
+  /* Signedness of LHS and RHS should match.  */
+  if ((rhs_uns != lhs_uns)
+      && ((rhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
+          || (!rhs_uns && wi::neg_p (max, TYPE_SIGN (lhs_type)
+    lhs_uns = !lhs_uns
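The range test discussed in this message (a value is both zero and sign extended when the minimum of its range is >= 0 and the maximum is <= the signed maximum of the narrow type) can be sketched outside of GCC with plain 64-bit integers. This is only an illustration under stated assumptions: `classify_range`, `enum ext_kind` and `EXT_NONE` are hypothetical names, not GCC APIs, and the precision is assumed to be between 1 and 32 bits. The values 0, 1 and 2 follow the sign, zero, and "both" encoding proposed in the thread.

```c
#include <stdint.h>

/* Hypothetical result codes. 0 (sign), 1 (zero) and 2 (both) mirror the
   values discussed in the thread; EXT_NONE is made up for this sketch. */
enum ext_kind { EXT_NONE = -2, EXT_SIGN = 0, EXT_ZERO = 1, EXT_BOTH = 2 };

/* Classify the value range [min, max] against a narrow integer type of
   PREC bits (assumed 1..32). A value already looks sign extended if it
   fits the signed narrow type, and zero extended if it fits the unsigned
   narrow type. */
static enum ext_kind classify_range(int64_t min, int64_t max, unsigned prec)
{
    int64_t smax = ((int64_t)1 << (prec - 1)) - 1; /* signed maximum   */
    int64_t smin = -smax - 1;                      /* signed minimum   */
    int64_t umax = ((int64_t)1 << prec) - 1;       /* unsigned maximum */

    int sign_ok = (min >= smin && max <= smax);
    int zero_ok = (min >= 0 && max <= umax);

    if (sign_ok && zero_ok)
        return EXT_BOTH;   /* equivalent to min >= 0 && max <= smax */
    if (sign_ok)
        return EXT_SIGN;
    if (zero_ok)
        return EXT_ZERO;
    return EXT_NONE;
}
```

For a 16-bit type, a range of [0, 100] classifies as both, [-5, 100] as sign extended only, and [0, 40000] as zero extended only.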
Re: Zero/Sign extension elimination using value ranges
On Thu, May 22, 2014 at 08:01:45PM +1000, Kugan wrote:
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -328,7 +328,8 @@ convert_move (rtx to, rtx from, int unsignedp)
>    if (GET_CODE (from) == SUBREG && SUBREG_PROMOTED_VAR_P (from)
>        && (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (from)))
>            >= GET_MODE_PRECISION (to_mode))
> -      && SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp)
> +      && (SUBREG_PROMOTED_UNSIGNED_P (from) == 2
> +          || SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp))
>      from = gen_lowpart (to_mode, from), from_mode = to_mode;
>
>    gcc_assert (GET_CODE (to) != SUBREG || !SUBREG_PROMOTED_VAR_P (to));

Yeah, something like that, though you'd need to tweak all places where SUBREG_PROMOTED_UNSIGNED_P is used, not just this one.

> @@ -9498,7 +9544,11 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
>
>        temp = gen_lowpart_SUBREG (mode, decl_rtl);
>        SUBREG_PROMOTED_VAR_P (temp) = 1;
> -      SUBREG_PROMOTED_UNSIGNED_SET (temp, unsignedp);
> +      if (is_value_extended (ssa_name, mode, !unsignedp))
> +        SUBREG_PROMOTED_UNSIGNED_SET (temp, 2);
> +      else
> +        SUBREG_PROMOTED_UNSIGNED_SET (temp, unsignedp);
> +
>        return temp;
>      }

This will not do what you want, most of the SUBREGs right now are not SUBREG_PROMOTED_VAR_P, the point was to
1) come up with a testcase where you believe this optimization would be useful and you get that you were talking about SUBREG created (don't remember seeing it)
2) find out where that subreg is created, and do set the SUBREG_PROMOTED_VAR_P there if the range info tells you it is ok (all 3 possibilities, either that it is known to be already zero, or sign, or both zero and sign extended).

> diff --git a/gcc/rtl.h b/gcc/rtl.h
> index 10ae1e9..f34a7b5 100644
> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -299,7 +299,7 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"),
>       1 in a CONCAT is VAL_EXPR_IS_CLOBBERED in var-tracking.c.
>       1 in a preserved VALUE is PRESERVED_VALUE_P in cselib.c.
>       1 in a clobber temporarily created for LRA.  */
> -  unsigned int unchanging : 1;
> +  unsigned int unchanging : 2;
>    /* 1 in a MEM or ASM_OPERANDS expression if the memory reference is volatile.
>       1 in an INSN, CALL_INSN, JUMP_INSN, CODE_LABEL, BARRIER, or NOTE
>       if it has been deleted.

No way. SUBREG_PROMOTED_UNSIGNED_P right now resides in two separate bits, volatil and unchanging. Right now volatile != 0, unchanging ignored is -1, volatile == 0, then the value is unchanging. What I meant is change this representation, e.g. to x->volatil * 2 + x->unchanging - 1 so you can represent the values -1, 0, 1, 2 in there. Of course, adjust SUBREG_PROMOTED_UNSIGNED_SET correspondingly too. As SUBREG_PROMOTED_UNSIGNED_P is only valid if SUBREG_PROMOTED_VAR_P, I'd hope that you don't need to care about what 0, 0 in those bits means, because everything should actually SUBREG_PROMOTED_UNSIGNED_SET around setting SUBREG_PROMOTED_VAR_P to non-zero.

Jakub
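The proposed representation, x->volatil * 2 + x->unchanging - 1, can be tried out on a stand-in structure. This is a minimal sketch and not the actual rtl.h change: `struct flag_bits`, `promoted_unsigned_set` and `promoted_unsigned_get` are hypothetical names used only for illustration, and the real rtx_def carries many more fields.

```c
/* Stand-in for the two one-bit flags of rtx_def used by the encoding. */
struct flag_bits {
    unsigned int volatil : 1;
    unsigned int unchanging : 1;
};

/* Store VAL, which must be one of -1, 0, 1, 2, by biasing it to 0..3
   and splitting the result across the two bits. */
static void promoted_unsigned_set(struct flag_bits *x, int val)
{
    unsigned int biased = (unsigned int)(val + 1);

    x->volatil = (biased >> 1) & 1;
    x->unchanging = biased & 1;
}

/* Read the value back using the formula from the message above. */
static int promoted_unsigned_get(const struct flag_bits *x)
{
    return (int)x->volatil * 2 + (int)x->unchanging - 1;
}
```

With this layout the bit pair (0, 0) reads back as -1, which fits the remark that the meaning of 0, 0 should not matter as long as the value is always set when SUBREG_PROMOTED_VAR_P is set.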
GCC 4.8.4 Status Report (2014-05-22)
Status
======

GCC 4.8.3 has been released, the branch is now open again under the usual release branch rules (regression fixes and documentation fixes only).

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0
P2               92   -   8
P3               43   +   1
--------        ---   -----------------------
Total           135   -   7

Previous Report
===============

https://gcc.gnu.org/ml/gcc/2014-05/msg00146.html
GCC 4.8.3 Released
The GNU Compiler Collection version 4.8.3 has been released. GCC 4.8.3 is the third bug-fix release containing important fixes for regressions and serious bugs in GCC 4.8.2 with over 141 bugs fixed since the previous release. This release is available from the FTP servers listed at: http://www.gnu.org/order/ftp.html Please do not contact me directly regarding questions or comments about this release. Instead, use the resources available from http://gcc.gnu.org. As always, a vast number of people contributed to this GCC release -- far too many to thank them individually!
gcc 4.8.3 / PR60901
PR60901 is listed in the bug fixes for gcc 4.8.3, but I don't see the patch ever applied to the 4_8 branch in the bug report, and 4.8.3 is listed as known to fail. -Kenny
Debugging LTO.
Hi, Are there any tricks I can use to debug an LTO ICE? Lto1 --help does not seem to give me an option to output trace dumps etc. What I suspect is happening is that cc1 builds erroneous LTO IR info in the objects that causes the ICEs. Is there a reader that will dump the IR from these LTO objects? AFAICS, this page https://gcc.gnu.org/wiki/LinkTimeOptimization says such a reader is still a TODO. Thanks, Tejas.
Re: New C++ IPA fails
I did -- but very likely there was a process error in my side. Will fix them soon.

David

On Thu, May 22, 2014 at 2:12 AM, Richard Biener wrote:
> David, did you forget to run the testsuite?
>
> Richard.
Re: negative latencies
On 05/21/2014 05:30 PM, Vladimir Makarov wrote: On 2014-05-20, 5:18 PM, shmeel gutl wrote: The problem that I see is that the haifa scheduler schedules one cycle at a time, in a forward order, by picking from a list of instructions that can be scheduled without delays. So, in the above example, if instruction one is scheduled during cycle 3, it can't schedule instruction two during cycle 0, 1, or 2 because its producer dependency (instruction one) hasn't been scheduled yet. It won't be able to schedule it until cycle 3. So I am asking if there is an existing mechanism to back schedule instruction two once instruction one is issued. I see, thanks. There is no such mechanism in the current insn scheduler. Well, the scheduler has support for an exposed pipeline that is used by the C6X port. Insns are split into multiple pieces which are forced to be scheduled at a fixed distance in time from each other, each piece describing the effects that occur at that point in time. This could probably be made to work for this target's requirements, but it might run quite slowly. Bernd
Re: Debugging LTO.
Tejas Belagod wrote: Are there any tricks I can use to debug an LTO ICE? See LTO section on https://gcc.gnu.org/wiki/A_guide_to_testcase_reduction Tobias
Re: New C++ IPA fails
The fix is attached. Ok to commit?

David

On Thu, May 22, 2014 at 9:11 AM, Xinliang David Li wrote:
> I did -- but very likely there was a process error in my side. Will
> fix them soon.
>
> David

Index: testsuite/g++.dg/ipa/devirt-17.C
===
--- testsuite/g++.dg/ipa/devirt-17.C	(revision 210819)
+++ testsuite/g++.dg/ipa/devirt-17.C	(working copy)
@@ -1,7 +1,7 @@
 /* We shall devirtualize to B::foo since it is the only live candidate
    of an anonymous type. */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-ipa-whole-program" } */
+/* { dg-options "-O2 -fdump-ipa-whole-program-details" } */
 namespace {
   class B {
   public:
@@ -37,7 +37,7 @@ main()
   return b->foo();
 }
 
-/* { dg-final { scan-ipa-dump "Devirtualizing" "whole-program"} } */
+/* { dg-final { scan-ipa-dump "devirtualizing" "whole-program"} } */
 /* { dg-final { scan-ipa-dump-not "builtin_unreachable" "whole-program"} } */
 /* { dg-final { scan-ipa-dump "B::foo" "whole-program"} } */
 /* { dg-final { scan-ipa-dump-not "A::foo" "whole-program"} } */
Index: testsuite/g++.dg/ipa/devirt-26.C
===
--- testsuite/g++.dg/ipa/devirt-26.C	(revision 210819)
+++ testsuite/g++.dg/ipa/devirt-26.C	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-ipa-devirt" } */
+/* { dg-options "-O3 -fdump-ipa-devirt-details" } */
 struct A
 {
   int a;
@@ -25,5 +25,5 @@ int test(void)
 /* The call to b->foo() is perfectly devirtualizable because C can not be in construction
    when &c was used, but we can not analyze that so far. Test that we at least speculate
    that type is in the construction. */
-/* { dg-final { scan-ipa-dump "Speculatively devirtualizing" "devirt" } } */
+/* { dg-final { scan-ipa-dump "speculatively devirtualizing" "devirt" } } */
 /* { dg-final { cleanup-ipa-dump "devirt" } } */
Index: testsuite/g++.dg/ipa/imm-devirt-1.C
===
--- testsuite/g++.dg/ipa/imm-devirt-1.
Re: New C++ IPA fails
> The fix is attached. Ok to commit? OK, thanks! Honza
help understanding behaviour of unsuffixed float constants
I wonder if someone could help shed light on this for me.

Based on the statement in ISO C spec (ISO 9899:1990) section '6.1.3.1 Floating Constants' statement "An unsuffixed floating constant has type double." it appears to me that the gcc compiler may not be in strict compliance.

--sample code start
/* would have expected conversion from float constant double to variable single */
static float a = 100.0;

int test(void)
{
  /* force a double and get implicit conversion as expected */
  if (a < 200.0E40) {
    a = 0.0;
  }

  /* per ANSI 6.1.3.1 this constant should also be double and implicit conversion is expected */
  if (a < 200.0) {
    a = 0.0;
  }

  return 0;
}

__extendsfdf2(){
  ; /* dummy */
}

__ltdf2(){
  ; /* dummy */
}
--sample code end

When compiled with '-std=iso9899:199409 -pedantic' I nevertheless see the implicit conversion of 'a' to double only in the case of the first 'if' and not in the second as expected (i.e. 200.0 compiles to float not double as expected by the standard). Also, the binary value of the constant 100.0 is compiled into 32 bits only.

The GCC version is 4.2.3 and the build was configured as '-target=powerpc-eabispe'. Does this have anything to do with it?

Thanks in advance for any feedback provided.

Brian
Re: help understanding behaviour of unsuffixed float constants
On Thu, May 22, 2014 at 07:16:43PM +, Regan, Brian (EPC COE) wrote: > I wonder if someone could help shed light on this for me. Why do you think this is a problem? Conversion from float to double can't raise any exceptions, no bits are lost, and (double) a < 200.0 and a < 200.0f have the exact same set of floating point values for which the condition is true resp. false. So, the compiler chooses to optimize and narrow the comparison back to float comparison instead of double. > static float a = 100.0; > Also, the binary value of the constant 100.0 is compiled into 32 bits only. Sure, because you initialize a float with 100.0, therefore it needs to be converted. In any case, this should have been posted to gcc-help, it has nothing to do with development of gcc. Jakub
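The equivalence described above can be checked with a small standalone program. This is only a sketch: `cmp_wide` and `cmp_narrow` are hypothetical helper names, and the argument relies on 200.0 being exactly representable as a float, which is why the double comparison can be narrowed. The first comparison in the original sample uses 200.0E40, which is far larger than any finite float, so the same narrowing is not available there.

```c
#include <stdbool.h>

/* What the source says: convert the float to double, then compare
   against the double constant 200.0. */
static bool cmp_wide(float a)
{
    return (double)a < 200.0;
}

/* What the optimizer may emit instead: a float comparison against
   200.0f, which has exactly the same value as the double 200.0. */
static bool cmp_narrow(float a)
{
    return a < 200.0f;
}
```

Since widening a float to double is exact, both functions should agree for every float input, including the values immediately below and above 200.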
RE: help understanding behaviour of unsuffixed float constants
Don't misunderstand - I like the behaviour (I don't want the unnecessary implicit conversions).

My concern stems only from the compliance to the standard. Some of our internal software standards require ISO99 compliance, as do standards imposed by our customers (e.g. Boeing via their D6).

If what I think is happening is actually happening, I can document a deviation. I just want to make sure that I am not missing something before going that route.

Regards
Brian
Re: help understanding behaviour of unsuffixed float constants
On Thu, May 22, 2014 at 12:36 PM, Regan, Brian (EPC COE) wrote: > Don't misunderstand - I like the behaviour. I don't want the unnecessary > implicit conversions). > > My concern stems only from the compliance to the standard. Some of our > internal software standards require ISO99 compliance, as do standards imposed > by our customers (e.g. Boeing via their D6). The behavior is compliant. The results are as-if the implicit conversion happened, in all respects that ISO 9899 specifies. > If what I think is happening is actually happening, I can document a > deviation. What do you think is non-compliant? This appears to be trivially allowed under the as-if rule. > I just want to make sure that I am not missing something before going that > route. You appear to be looking at things that are not observable behavior. The compiler is permitted to do anything that doesn't change observable behavior, such as optimizing to avoid a redundant conversion. The standard doesn't specify _how_ the generated code gets the right results, so long as it does. -- James
Re: Reducing Register Pressure through Live range Shrinking through Loops!!
On 05/21/2014 12:25 AM, Ajit Kumar Agarwal wrote: > Hello All: > > Simpson does the Live range shrinking and reduction of register pressure by > using the computation that are not load and store but the arithmetic > computation. The computation > where the operands and registers are live at the entry and exit of the basic > block but not touched inside the block then the computation is moved at the > end of the block the reducing the register pressure inside the block by one. > Extension of the Simpson work by extending the computation not being touched > inside the basic block to the spanning of the Loops. If the Live ranges spans > the Loops and live at the entry and exit of the Loop but the computation is > not being touched inside the Loops then the computation is moved after the > exit of the Loop. > > REDUCTION OF REGISTER PRESSURE THROUGH LIVE RANGE SHRINKING INSIDE THE LOOPS > > for each Loop starting from inner to outer do the following > begin > RELIEFIN(i) = null if i is the entry of the cfg. > Else > For all predecessors j RELIEFOUT(j) > RELIEFOUT(i) = RELIEFIN(i) exposed union relief > INSERT(I,j) = RELIEFOUT(i) RELIEFIN(i) Intersection > Live(i) > end > > The Simpson approach does takes the nesting depth into consideration of > placing the computation and the relieve of the register pressure. Simpson > approach doesn't takes into > consideration the computation which spans throughout the loop and the > operands and results are live at the entry of the Loop and exit of the Loop > but not touched inside the Loops can be useful in reduction of register > pressure inside the Loops. This approach will be useful in Region Based > Register Allocator for Live Range Splitting at the Region Boundaries. > > Extension of the Simpson approach is to consider the data flow analysis with > respect to the given Loop rather than having it for entire control flow > graph. This data flow analysis starts from the inner loop and extends it to > the outer loop. 
If the reference is not through the nested depth or with some > depth then the computation can be placed accordingly. For register allocator > by Graph coloring the live ranges that are with respect to operands and > results of the computation are taken into consideration and for the above > approach put into the stack during simplification phase of Graph Coloring so > that there is a chance of getting such Live ranges colorable and thus reduces > the register pressure. This is extended to splitting > approach based on containment of Live ranges > > OPTIMAL PLACEMENT OF THE COMPUTATION FOR SINGLE ENTRY AND MULTIPLE EXIT LOOPS > > The placement of the computation to reduce the register pressure for Single > Entry and Multiple exit by Simpson approach lead to unoptimal solution. The > unoptimal Solution > is because of the exit node of the loop does not post dominates all the basic > block inside the Loops. Due to this the placement of the computation just > after the tail block of the Loop will lead to incorrect results. In order to > perform the Optimal Solution of the placement of the computation, the > computation needs to be placed the block just after all the exit points of > the Loop reconverge and which will post dominates all the blocks of the > Loops. This will take care of reducing the register pressure for the Loops > that are single Entry and Multiple Exit. For irreducible Loops the > optimization to convert to reducible is done before the register allocation > that reduces the register pressure and will be applicable to structured > control flow and thus reduces the register pressure. > > The Live range shrinkage reducing register pressure takes load and store into > consideration but not computation as proposed by Simpson. I am proposing to > extend in GCC for the computation to reduce register pressure and for the > Loop as given above for both Single Entry and Single Exit and Single Entry > and Multiple Exit Loops. > > Thanks for sharing with this. 
I have plans to work on a new register pressure relief pass too this summer (more accurately, this work is in my company's plans -- so I should work on this anyway). It will also use Simpson's approach. I prefer to do it as a separate pass, not as a part of IRA, because IRA is already complicated. It also permits to rematerialize not only on loop borders (although it is the most important points). It is hard for me to say which approach will be better as there are too many transformations even after IRA (e.g. in IRA you can think that pseudos got hard registers and rematerialize from that, but LRA may spill these pseudos, and the risk of this is higher for x86/x86-64 than for other architectures as x86/x86-64 uses an irregular register file and has a smaller number of regs). So we could implement both approaches and choose the best if you want. As for optimal placement of the computation for single entry/multiple exit loops, I don't think it is really important. IRA alrea
Re: Reducing Register Pressure through Live range Shrinking through Loops!!
On 05/22/2014 10:16 PM, Vladimir Makarov wrote: It also permits to rematerialize not only on loop borders (although it is the most important points). That would certainly be interesting for the following hot subroutine in our weather forecasting model (attached). Note the loop from (line 157): +IF (KINT.EQ.3) THEN C CUBIC INTERPOLATION to (line 242): + + PALFA(JX,JY,4)*PARG(IDX+1,IDY+1,ILEV+1) ) ) ENDDO ENDDO Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news # 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F" # 1 "" # 1 "" # 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F" c Library:grdy $RCSfile$, $Revision: 7536 $ c checked in by $Author: ovignes $ at $Date: 2009-12-18 14:23:36 +0100 (Fri, 18 Dec 2009) $ c $State$, $Locker$ c $Log$ c Revision 1.3 1999/04/22 09:30:45 DagBjoerge c MPP code c c Revision 1.2 1999/03/09 10:23:13 GerardCats c Add SGI paralllellisation directives DOACROSS c c Revision 1.1 1996/09/06 13:12:18 GCats c Created from grdy.apl, 1 version 2.6.1, by Gerard Cats c SUBROUTINE VERINT ( I KLON , KLAT , KLEV , KINT , KHALO I , KLON1 , KLON2 , KLAT1 , KLAT2 I , KP , KQ , KR R , PARG , PRES R , PALFH , PBETH R , PALFA , PBETA , PGAMA ) C C*** C C VERINT - THREE DIMENSIONAL INTERPOLATION C C PURPOSE: C C THREE DIMENSIONAL INTERPOLATION C C INPUT PARAMETERS: C C KLON NUMBER OF GRIDPOINTS IN X-DIRECTION C KLAT NUMBER OF GRIDPOINTS IN Y-DIRECTION C KLEV NUMBER OF VERTICAL LEVELS C KINT TYPE OF INTERPOLATION C= 1 - LINEAR C= 2 - QUADRATIC C= 3 - CUBIC C= 4 - MIXED CUBIC/LINEAR C KLON1 FIRST GRIDPOINT IN X-DIRECTION C KLON2 LAST GRIDPOINT IN X-DIRECTION C KLAT1 FIRST GRIDPOINT IN Y-DIRECTION C KLAT2 LAST GRIDPOINT IN Y-DIRECTION C KPARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KQARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KRARRAY OF 
INDEXES FOR VERTICAL DISPLACEMENTS C PARG ARRAY OF ARGUMENTS C PALFH ALFA HAT C PBETH BETA HAT C PALFA ARRAY OF WEIGHTS IN X-DIRECTION C PBETA ARRAY OF WEIGHTS IN Y-DIRECTION C PGAMA ARRAY OF WEIGHTS IN VERTICAL DIRECTION C C OUTPUT PARAMETERS: C C PRES INTERPOLATED FIELD C C HISTORY: C C J.E. HAUGEN 1 1992 C C*** C IMPLICIT NONE C INTEGER KLON , KLAT , KLEV , KINT , KHALO, IKLON1 , KLON2 , KLAT1 , KLAT2 C INTEGER KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT) REALPARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV) , RPRES(KLON,KLAT) , R PALFH(KLON,KLAT) , PBETH(KLON,KLAT) , R PALFA(KLON,KLAT,4) , PBETA(KLON,KLAT,4), R PGAMA(KLON,KLAT,4) C INTEGER JX, JY, IDX, IDY, ILEV REAL Z1MAH, Z1MBH C IF (KINT.EQ.1) THEN C LINEAR INTERPOLATION C DO JY = KLAT1,KLAT2 DO JX = KLON1,KLON2 IDX = KP(JX,JY) IDY = KQ(JX,JY) ILEV = KR(JX,JY) C PRES(JX,JY) = PGAMA(JX,JY,1)*( C + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV-1) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV-1) ) ) C+ + + PGAMA(JX,JY,2)*( C+ + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV ) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV ) ) ) ENDDO ENDDO C ELSE +IF (KINT.EQ.2) THEN C QUADRATIC INTERPOLATION C DO JY = KLAT1,KLAT2 DO JX = KLON1,KLON2 IDX = KP(JX,JY) IDY = KQ(JX,JY) ILEV = KR(JX,JY) C PRES(JX,JY) = PGAMA(JX,JY,1)*( C + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV-1) + + PALFA(JX,JY,3)*PARG(IDX+1,IDY-1,ILEV-1) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV-1) + + PALFA(JX,JY,3)*PARG(IDX+1,IDY ,ILEV-1) ) + + PBETA(JX,JY,3)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY+1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY+1,ILEV-1) + + PALFA(JX,JY,3)*PARG(IDX+1,IDY+1,ILEV-1) ) ) C+ + + PGAMA(JX,JY,2)*( C+
Re: Zero/Sign extension elimination using value ranges
On 05/22/2014 03:12 AM, Jakub Jelinek wrote:
> No way. SUBREG_PROMOTED_UNSIGNED_P right now resides in two separate bits,
> volatil and unchanging. Right now volatile != 0, unchanging ignored
> is -1, volatile == 0, then the value is unchanging.
> What I meant is change this representation, e.g. to
> x->volatil * 2 + x->unchanging - 1
> so you can represent the values -1, 0, 1, 2 in there.
> Of course, adjust SUBREG_PROMOTED_UNSIGNED_SET correspondingly too.
> As SUBREG_PROMOTED_UNSIGNED_P is only valid if SUBREG_PROMOTED_VAR_P,
> I'd hope that you don't need to care about what 0, 0 in those bits
> means, because everything should actually SUBREG_PROMOTED_UNSIGNED_SET
> around setting SUBREG_PROMOTED_VAR_P to non-zero.

It would be helpful to redo these, now that we don't simply have a tri-state value.

const unsigned int SRP_POINTER = 0;
const unsigned int SRP_SIGNED = 1;
const unsigned int SRP_UNSIGNED = 2;

#define SUBREG_PROMOTED_SET(RTX, VAL)                           \
do {                                                            \
  rtx const _rtx = RTL_FLAG_CHECK1 ("SUBREG_PROMOTED_SET",      \
                                    (RTX), SUBREG);             \
  unsigned int _val = (VAL);                                    \
  _rtx->volatil = _val;                                         \
  _rtx->unchanging = _val >> 1;                                 \
} while (0)

#define SUBREG_PROMOTED_GET(RTX)                                \
({ const rtx _rtx = RTL_FLAG_CHECK1 ("SUBREG_PROMOTED_GET",     \
                                     (RTX), SUBREG);            \
   _rtx->volatil + _rtx->unchanging * 2;                        \
})

The bits are arranged such that e.g.

  SUBREG_PROMOTED_GET (x) & SRP_UNSIGNED

is meaningful. For conciseness, we'd probably want

  SUBREG_PROMOTED_POINTER_P
  SUBREG_PROMOTED_UNSIGNED_P
  SUBREG_PROMOTED_SIGNED_P

as boolean macros. I dunno if "both" (whatever you want to call that) is used enough to warrant its own macro. I can more often see this being used when examining a given ZERO_/SIGN_EXTEND rtx, so "both" probably won't come up.

r~
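
For anyone wanting to experiment with the encoding outside of GCC, here is a minimal standalone sketch in C. It is only a model: `struct fake_subreg` and the `srp_*` helpers are hypothetical names standing in for the real rtx flag bits and macros, and the three predicate definitions are one possible reading of the boolean macros suggested above, not something taken from a patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the two rtx flag bits discussed above.
   This is NOT GCC's rtx_def, just two one-bit fields with the same names.  */
struct fake_subreg
{
  unsigned int volatil : 1;
  unsigned int unchanging : 1;
};

/* Same values as in the mail: bit 0 = signed, bit 1 = unsigned.  */
enum { SRP_POINTER = 0, SRP_SIGNED = 1, SRP_UNSIGNED = 2 };

/* Store a two-bit value: low bit in volatil, high bit in unchanging.  */
static inline void
srp_set (struct fake_subreg *x, unsigned int val)
{
  assert (val <= 3);
  x->volatil = val & 1;
  x->unchanging = (val >> 1) & 1;
}

/* Reassemble the two-bit value.  */
static inline unsigned int
srp_get (const struct fake_subreg *x)
{
  return x->volatil + x->unchanging * 2u;
}

/* Assumed predicate definitions (not from the original mail).  */
static inline int
srp_pointer_p (const struct fake_subreg *x)
{
  return srp_get (x) == SRP_POINTER;
}

static inline int
srp_signed_p (const struct fake_subreg *x)
{
  return (srp_get (x) & SRP_SIGNED) != 0;
}

static inline int
srp_unsigned_p (const struct fake_subreg *x)
{
  return (srp_get (x) & SRP_UNSIGNED) != 0;
}
```

With this layout the "both" state is simply `SRP_SIGNED | SRP_UNSIGNED`, and both `srp_signed_p` and `srp_unsigned_p` return nonzero for it, which is the property the bit arrangement is meant to give.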
Re: GCC 4.8.3 Released
On Thu, May 22, 2014 at 01:36:47PM +0200, Richard Biener wrote:
> The GNU Compiler Collection version 4.8.3 has been released.
>
> GCC 4.8.3 is the third bug-fix release containing important fixes for
> regressions and serious bugs in GCC 4.8.2 with over 141 bugs fixed since
> the previous release.

Would it be possible to get the 4.8.3 tag into the git mirror any time soon (BTW, 4.9.0 is also missing)? I would like to try to bisect a regression (PR 60925).

A.
gcc-4.8-20140522 is now available
Snapshot gcc-4.8-20140522 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20140522/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.8 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 210829 You'll find: gcc-4.8-20140522.tar.bz2 Complete GCC MD5=c2fb95cabfd470c8dd991aca05cb359b SHA1=4dfad9e1f1450af3c185be9429fbfce74e97488f Diffs from 4.8-20140515 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.8 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.