[PATCH] Fix recent avx512vl-vpsh*dvd-2.c FAILs (PR target/91124)
Hi! A recent SCCVN change results in more constants (in these testcases constant vectors) being propagated into arguments of builtins, and because of a bug in the expansion of some of them we ended up with:

UNRESOLVED: gcc.target/i386/avx512vl-vpshldvd-2.c compilation failed to produce executable
FAIL: gcc.target/i386/avx512vl-vpshldvq-2.c (test for excess errors)
UNRESOLVED: gcc.target/i386/avx512vl-vpshldvq-2.c compilation failed to produce executable
FAIL: gcc.target/i386/avx512vl-vpshrdvd-2.c (test for excess errors)
UNRESOLVED: gcc.target/i386/avx512vl-vpshrdvd-2.c compilation failed to produce executable
FAIL: gcc.target/i386/avx512vl-vpshrdvq-2.c (test for excess errors)
UNRESOLVED: gcc.target/i386/avx512vl-vpshrdvq-2.c compilation failed to produce executable

The following patch fixes that. The bug was using wrong (in fact, totally unneeded) function types for the builtins. All of them return an integral vector and have 4 arguments, where the first 3 are same-kind vectors as the return value and the last one is the mask (in one case __mmask32, in two cases __mmask16, in the rest __mmask8). Kirill added types like V32HI_FTYPE_V32HI_V32HI_V32HI for the non-masked ones (I haven't really checked whether those builtins without _mask or _maskz suffixes couldn't be dropped and -1 masks used in the headers instead for _mask), but for the masked ones added types like V32HI_FTYPE_V32HI_V32HI_V32HI_INT, where we already had V32HI_FTYPE_V32HI_V32HI_V32HI_USI to represent such builtins with a __mmask32 last argument. What is worse, the handling of those new function types was added to the nargs = 4; mask_pos = 1; nargs_constant = 1; cases, which is what is done for builtins that require the third argument of four to be an immediate and the last argument to be a mask integer, instead of the nargs = 4; case we want (we don't want to ask for any immediates).
This bug didn't trigger before because if that third argument satisfies the predicate, which means it is either a register_operand or a memory_operand in these cases, all is fine; it is just when it doesn't satisfy it that we mishandle it (and CONST_VECTOR doesn't satisfy nonimmediate_operand). Fixed by removing all those unnecessary function types and using the right ones. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-07-11  Jakub Jelinek

	PR target/91124
	* config/i386/i386-builtin-types.def
	(V32HI_FTYPE_V32HI_V32HI_V32HI_INT, V16HI_FTYPE_V16HI_V16HI_V16HI_INT,
	V8HI_FTYPE_V8HI_V8HI_V8HI_INT, V8SI_FTYPE_V8SI_V8SI_V8SI_INT,
	V4DI_FTYPE_V4DI_V4DI_V4DI_INT, V8DI_FTYPE_V8DI_V8DI_V8DI_INT,
	V16SI_FTYPE_V16SI_V16SI_V16SI_INT, V2DI_FTYPE_V2DI_V2DI_V2DI_INT,
	V4SI_FTYPE_V4SI_V4SI_V4SI_INT): Remove.
	* config/i386/i386-builtin.def (__builtin_ia32_vpshrdv_v32hi_mask,
	__builtin_ia32_vpshrdv_v32hi_maskz, __builtin_ia32_vpshrdv_v16hi_mask,
	__builtin_ia32_vpshrdv_v16hi_maskz, __builtin_ia32_vpshrdv_v8hi_mask,
	__builtin_ia32_vpshrdv_v8hi_maskz, __builtin_ia32_vpshrdv_v16si_mask,
	__builtin_ia32_vpshrdv_v16si_maskz, __builtin_ia32_vpshrdv_v8si_mask,
	__builtin_ia32_vpshrdv_v8si_maskz, __builtin_ia32_vpshrdv_v4si_mask,
	__builtin_ia32_vpshrdv_v4si_maskz, __builtin_ia32_vpshrdv_v8di_mask,
	__builtin_ia32_vpshrdv_v8di_maskz, __builtin_ia32_vpshrdv_v4di_mask,
	__builtin_ia32_vpshrdv_v4di_maskz, __builtin_ia32_vpshrdv_v2di_mask,
	__builtin_ia32_vpshrdv_v2di_maskz, __builtin_ia32_vpshldv_v32hi_mask,
	__builtin_ia32_vpshldv_v32hi_maskz, __builtin_ia32_vpshldv_v16hi_mask,
	__builtin_ia32_vpshldv_v16hi_maskz, __builtin_ia32_vpshldv_v8hi_mask,
	__builtin_ia32_vpshldv_v8hi_maskz, __builtin_ia32_vpshldv_v16si_mask,
	__builtin_ia32_vpshldv_v16si_maskz, __builtin_ia32_vpshldv_v8si_mask,
	__builtin_ia32_vpshldv_v8si_maskz, __builtin_ia32_vpshldv_v4si_mask,
	__builtin_ia32_vpshldv_v4si_maskz, __builtin_ia32_vpshldv_v8di_mask,
	__builtin_ia32_vpshldv_v8di_maskz,
	__builtin_ia32_vpshldv_v4di_mask, __builtin_ia32_vpshldv_v4di_maskz,
	__builtin_ia32_vpshldv_v2di_mask, __builtin_ia32_vpshldv_v2di_maskz,
	__builtin_ia32_vpdpbusd_v16si_mask, __builtin_ia32_vpdpbusd_v16si_maskz,
	__builtin_ia32_vpdpbusd_v8si_mask, __builtin_ia32_vpdpbusd_v8si_maskz,
	__builtin_ia32_vpdpbusd_v4si_mask, __builtin_ia32_vpdpbusd_v4si_maskz,
	__builtin_ia32_vpdpbusds_v16si_mask, __builtin_ia32_vpdpbusds_v16si_maskz,
	__builtin_ia32_vpdpbusds_v8si_mask, __builtin_ia32_vpdpbusds_v8si_maskz,
	__builtin_ia32_vpdpbusds_v4si_mask, __builtin_ia32_vpdpbusds_v4si_maskz,
	__builtin_ia32_vpdpwssd_v16si_mask, __builtin_ia32_vpdpwssd_v16si_maskz,
	__builtin_ia32_vpdpwssd_v8si_mask, __builtin_ia32_vpdpwssd_v8si_maskz,
	__builtin_ia32_vpdpwssd_v4si_mask, __builtin_ia32_vpdpwssd_v4si_maskz,
	__builtin_ia32_vpdpws
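The difference between the two expansion paths can be sketched as a toy model in C++. All names here are invented for illustration; this is not the i386 back-end code, just the shape of the bug: the nargs_constant path demands a literal immediate for the matched argument and errors out otherwise, while the plain path simply copies any non-matching operand (such as a CONST_VECTOR) into a register.

```cpp
#include <string>

// Illustrative operand classification, not GCC's predicate machinery.
enum OperandKind { Register, Memory, ConstVector, Immediate };

// Plain nargs = 4 path: an operand that fails the register/memory
// predicate is simply forced into a register, so a constant vector
// argument is harmless.
std::string
expand_plain_arg (OperandKind k)
{
  if (k == Register || k == Memory)
    return "use as-is";
  return "copy_to_reg";   // CONST_VECTOR lands here: fine
}

// nargs_constant = 1 path: the matched argument must be a literal
// immediate, so a constant vector produces a hard error, which is
// what caused the excess-errors FAILs above.
std::string
expand_constant_arg (OperandKind k)
{
  if (k == Immediate)
    return "use as-is";
  return "error: argument must be an immediate";
}
```

Under this model, routing the vpshldv/vpshrdv builtins through the second helper instead of the first is exactly the misclassification the patch removes.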
Re: [PATCH] Deprecate -frepo on gcc-9 branch (PR c++/91125).
On Thu, Jul 11, 2019 at 08:48:52AM +0200, Martin Liška wrote:
> --- a/gcc/c-family/c-opts.c
> +++ b/gcc/c-family/c-opts.c
> @@ -501,6 +501,8 @@ c_common_handle_option (size_t scode, const char *arg,
>			 HOST_WIDE_INT value,
>       flag_use_repository = value;
>       if (value)
> 	flag_implicit_templates = 0;
> +      warning (0, "%<-frepo%> is deprecated and will be removed "
> +	       "in a future release");
>       break;

Maybe just warn when value is true, or do we want to warn for -fno-repo too?

	Jakub
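The gating Jakub suggests can be sketched minimally. The globals and helper below are hypothetical stand-ins for GCC's option-handling state, not the real c-opts.c code; the point is only that the deprecation warning fires for -frepo (value != 0) and stays silent for -fno-repo.

```cpp
#include <cstdio>

// Hypothetical stand-ins for the GCC option-handling globals.
static int flag_use_repository;
static int flag_implicit_templates = 1;
static int warning_count;

static void
warn_frepo_deprecated ()
{
  // Models: warning (0, "%<-frepo%> is deprecated and will be removed ...")
  std::printf ("warning: -frepo is deprecated and will be removed "
	       "in a future release\n");
  ++warning_count;
}

// Sketch of the OPT_frepo case, warning only when the option is
// enabled, as the review suggests.
void
handle_frepo (int value)
{
  flag_use_repository = value;
  if (value)
    {
      flag_implicit_templates = 0;
      warn_frepo_deprecated ();
    }
}
```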
[PATCH] Fix recent avx512vl-vcvttpd2*dq-2.c FAILs (PR target/91124)
Hi! The same SCCVN change also introduced:

FAIL: gcc.target/i386/avx512vl-vcvttpd2dq-2.c execution test
FAIL: gcc.target/i386/avx512vl-vcvttpd2udq-2.c execution test

failures. The reason in the generic code is similar (more constants propagated to builtins), but the actual bug is very different. The 128-bit instructions like vcvtpd2dqx when using non-{z} masking don't behave like their RTL pattern describes. The instructions only use the low 2 bits of the mask register, and the upper 2 elements of the 128-bit destination register are always cleared regardless of what bits 2 and 3 in the mask register say and what is in the destination before, so it is incorrect to use:

(set (match_operand:V4SI 0 "register_operand" "=v")
     (vec_merge:V4SI
       (vec_concat:V4SI
	 (fix:V2SI (match_operand:V2DF 1 "vector_operand" "vBm"))
	 (const_vector:V2SI [ (const_int 0 [0]) (const_int 0 [0]) ]))
       (match_operand:V4SI 2 "nonimm_or_0_operand" "0C")
       (match_operand:QI 3 "register_operand" "Yk")))

which is correct for the low 2 elements, but for the upper 2 elements says that if mask bit 2 (or 3) is set, then the element will be zero (correct), but if it is clear, then it will be the operands[2] element (correct if it is const0_operand, but incorrect if it is some nonimmediate and doesn't have zeros there). The RTL has been correct already for similar instructions, e.g. *floatv2div2sf2{,_mask{,_1}} or some narrowing movs; this patch just uses similar RTL for those. The _mask_1 patterns are added as PR69671 will trigger on these new patterns as well: for {z} masking, RTL simplification can fold the second vec_merge operand into CONST0_RTX. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-07-11  Jakub Jelinek

	PR target/91124
	* config/i386/sse.md (sse2_cvtpd2dq): Change into ...
	(sse2_cvtpd2dq): ... this.  Remove mask substitution macros.
	(sse2_cvtpd2dq_mask, sse2_cvtpd2dq_mask_1): New define_insns.
	(ufix_notruncv2dfv2si2): Change into ...
	(ufix_notruncv2dfv2si2): ... this.
	Remove mask substitution macros.
	(ufix_notruncv2dfv2si2_mask, ufix_notruncv2dfv2si2_mask_1): New
	define_insns.
	(ufix_truncv2dfv2si2): Change into ...
	(ufix_truncv2dfv2si2): ... this.  Remove mask substitution macros.
	(ufix_truncv2dfv2si2_mask, ufix_truncv2dfv2si2_mask_1): New
	define_insns.
	(sse2_cvttpd2dq): Change into ...
	(sse2_cvttpd2dq): ... this.  Remove mask substitution macros.
	(sse2_cvttpd2dq_mask, sse2_cvttpd2dq_mask_1): New define_insns.
	(*sse2_cvtpd2dq): Change into ...
	(*sse2_cvtpd2dq): ... this.  Remove mask substitution macros.
	Add "C" constraint to const0_operand.
	(*sse2_cvtpd2dq_mask, *sse2_cvtpd2dq_mask_1): New define_insns.
	(sse2_cvtpd2ps_mask): Adjust expand to match *sse2_cvtpd2ps_mask
	changes.

--- gcc/config/i386/sse.md.jj	2019-07-08 23:52:40.239757801 +0200
+++ gcc/config/i386/sse.md	2019-07-10 22:50:15.021381492 +0200
@@ -5927,16 +5927,16 @@ (define_insn "*avx_cvtpd2dq256_2"
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "OI")])

-(define_insn "sse2_cvtpd2dq"
+(define_insn "sse2_cvtpd2dq"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
	(vec_concat:V4SI
	  (unspec:V2SI [(match_operand:V2DF 1 "vector_operand" "vBm")]
		       UNSPEC_FIX_NOTRUNC)
	  (const_vector:V2SI [(const_int 0) (const_int 0)])))]
-  "TARGET_SSE2 && "
+  "TARGET_SSE2"
 {
   if (TARGET_AVX)
-    return "vcvtpd2dq{x}\t{%1, %0|%0, %1}";
+    return "vcvtpd2dq{x}\t{%1, %0|%0, %1}";
   else
     return "cvtpd2dq\t{%1, %0|%0, %1}";
 }
@@ -5949,6 +5949,38 @@ (define_insn "sse2_cvtpd2dq"
    (set_attr "athlon_decode" "vector")
    (set_attr "bdver1_decode" "double")])

+(define_insn "sse2_cvtpd2dq_mask"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+	(vec_concat:V4SI
+	  (vec_merge:V2SI
+	    (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")]
+			 UNSPEC_FIX_NOTRUNC)
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "nonimm_or_0_operand" "0C")
+	      (parallel [(const_int 0) (const_int 1)]))
+	    (match_operand:QI 3 "register_operand" "Yk"))
+	  (const_vector:V2SI [(const_int 0) (const_int 0)])))]
+  "TARGET_AVX512VL"
+  "vcvtpd2dq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}"
+  [(set_attr "type" "ssecvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "TI")])
+
+(define_insn "*sse2_cvtpd2dq_mask_1"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+	(vec_concat:V4SI
+	  (vec_merge:V2SI
+	    (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")]
+			 UNSPEC_FIX_NOTRUNC)
+	    (const_vector:V2SI [(const_int 0) (const_
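The intended merge-masking behavior described above can be sketched as a behavioral model. This is an illustration of the instruction semantics as stated in the message, not compiler code; the element conversion is simplified to a plain C cast, and the function name is invented.

```cpp
#include <array>
#include <cstdint>

// Behavioral sketch of 128-bit vcvtpd2dqx with merge masking: only
// mask bits 0-1 choose between the converted value and the old
// destination lane, while lanes 2-3 of the result are cleared
// unconditionally.  The buggy RTL instead let clear mask bits 2-3
// keep the old destination lanes.
std::array<int32_t, 4>
cvtpd2dqx_mask (const std::array<double, 2> &src,
		const std::array<int32_t, 4> &old_dest, uint8_t mask)
{
  std::array<int32_t, 4> res = {0, 0, 0, 0};
  for (int i = 0; i < 2; ++i)
    res[i] = (mask & (1 << i)) ? (int32_t) src[i] : old_dest[i];
  // res[2] and res[3] stay zero regardless of mask bits 2 and 3 and
  // of old_dest: this is the part the original RTL got wrong.
  return res;
}
```

With a constant (and thus potentially nonzero) merge operand propagated in by SCCVN, the difference between "always zero" and "old destination lane" becomes observable, which is why the tests started failing at execution time.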
Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)
On 08.07.19 23:19, Matthias Klose wrote:
> On 14.06.19 15:09, Gaius Mulley wrote:
>>
>> Hello,
>>
>> here is version two of the patches which introduce Modula-2 into the
>> GCC trunk.  The patches include:
>>
>> (*)  a patch to allow all front ends to register a lang spec function.
>>      (included are patches for all front ends to provide an empty
>>      callback function).
>> (*)  patch diffs to allow the Modula-2 front end driver to be
>>      built using GCC Makefile and friends.
>>
>> The compressed tarball includes:
>>
>> (*)  gcc/m2  (compiler driver and lang-spec stuff for Modula-2).
>>      Including the need for registering lang spec functions.
>> (*)  gcc/testsuite/gm2  (a Modula-2 dejagnu test to ensure that
>>      the gm2 driver is built and can understand --version).
>>
>> These patches have been re-written after taking on board the comments
>> found in this thread:
>>
>>    https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02620.html
>>
>> it is a revised patch set from:
>>
>>    https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00220.html
>>
>> I've run make bootstrap and run the regression tests on trunk and no
>> extra failures occur for all languages touched in the ChangeLog.
>>
>> I'm currently tracking gcc trunk and gcc-9 with gm2 (which works well
>> with amd64/arm64/i386) - these patches are currently simply for the
>> driver to minimise the patch size.  There are also > 1800 tests in a
>> dejagnu testsuite for gm2 which can be included at some future time.
>
> I had a look at the GCC 9 version of the patches, with a build
> including a make install.
Some comments:

Had a test build based on the gcc-9 branch,
https://launchpad.net/~doko/+archive/ubuntu/toolchain/+sourcepub/10331180/+listing-archive-extra

powerpc64le-linux-gnu fails to build (search for "unfinished" in the build log):

during RTL pass: final
../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def: In function '_M2_SYSTEM_init':
../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def:20: internal compiler error: in rs6000_output_function_epilogue, at config/rs6000/rs6000.c:29169
   20 | DEFINITION MODULE SYSTEM ;
      |
0x10b6b7c7 rs6000_output_function_epilogue
	../../src/gcc/config/rs6000/rs6000.c:29169
0x1043f80f final_end_function()
	../../src/gcc/final.c:1887
0x10445313 rest_of_handle_final
	../../src/gcc/final.c:4667
0x10445313 execute
	../../src/gcc/final.c:4737
Please submit a full bug report, with preprocessed source if appropriate.

this is using GCC 8 as the bootstrap compiler.

search the build logs for "test_summary" to see the test results. The binary packages gcc-9-test-results contain the log/sum files for the tests.

all the link tests fail with:

xgm2: fatal error: cannot execute 'gm2l': execvp: No such file or directory
compilation terminated.
compiler exited with status 1

  Matthias
Re: PR90723
On Thu, 11 Jul 2019 at 01:48, Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > Hi, > > For following test-case: > > > > typedef double v4df __attribute__ ((vector_size (32))); > > void foo(v4df); > > > > int > > main () > > { > > volatile v4df x1; > > x1 = (v4df) { 10.0, 20.0, 30.0, 40.0 }; > > foo (x1); > > return 0; > > } > > > > Compiling with -msve-vector-bits=256, the compiler goes into infinite > > recursion and eventually segfaults due to stack overflow. > > > > This happens during expansion of: > > x1.0_1 ={v} x1; > > > > aarch64_expand_sve_mem_move calls aarch64_emit_sve_pred_move with > > dest = (reg:VNx2DF 93) and > > src = (mem/u/c:VNx2DF > >(plus:DI (reg/f:DI 94) (const_int 32 [0x20])) [0 S32 A128]) > > > > aarch64_emit_sve_pred_move calls expand_insn with above ops. > > Eventually we hit EXPAND_INPUT case in maybe_legitimize_operand for > > src (opno == 2) > > > > Since the operand is marked with volatile, and volatile_ok is set to false, > > insn_operand_matches return false and we call: > > op->value = copy_to_mode_reg (mode, op->value); > > break; > > > > copy_to_mode_reg however, creates a fresh register and calls emit_move_insn: > > rtx temp = gen_reg_rtx (mode); > > if (x != temp) > > emit_move_insn (temp, x); > > > > and we again end up in aarch64_emit_sve_pred_move, with dest assigned > > the new register and src remaining unchanged, and thus the cycle continues. > > > > As suggested by Richard, the attached patch temporarily sets volatile_ok to > > true > > using RAII class volatile_ok_temp in aarch64_emit_sve_pred_move which > > avoids the recursion. > > > > Bootstrap + tested on x86_64-unknown-linux-gnu, aarch64-linux-gnu. > > Cross-testing with SVE in progress. > > OK to commit ? > > > > Thanks, > > Prathamesh > > > > 2019-07-09 Prathamesh Kulkarni > > > > PR target/90723 > > * recog.h (volatile_ok_temp): New class. 
> > * config/aarch64/aarch64.c (aarch64_emit_sve_pred_move): Set > > volatile_ok temporarily to true using volatile_ok_temp. > > * expr.c (emit_block_move_via_cpymem): Likewise. > > * optabs.c (maybe_legitimize_operand): Likewise. > > I'm the last person who should be suggesting names, but how about > temporary_volatile_ok? Putting "temp" after the name IMO makes it > sound too much like it's describing the RAII object rather than the > value of volatile_ok itself. > > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > > index 5a923ca006b..50afba1a2b6 100644 > > --- a/gcc/config/aarch64/aarch64.c > > +++ b/gcc/config/aarch64/aarch64.c > > @@ -3457,6 +3457,7 @@ aarch64_emit_sve_pred_move (rtx dest, rtx pred, rtx > > src) > >create_output_operand (&ops[0], dest, mode); > >create_input_operand (&ops[1], pred, GET_MODE(pred)); > >create_input_operand (&ops[2], src, mode); > > + volatile_ok_temp v(true); > > Missing space before "(". Same for later uses. > > > [...] > > diff --git a/gcc/recog.h b/gcc/recog.h > > index 75cbbdc10ad..8a8eaf7e0c3 100644 > > --- a/gcc/recog.h > > +++ b/gcc/recog.h > > @@ -186,6 +186,22 @@ skip_alternative (const char *p) > > /* Nonzero means volatile operands are recognized. */ > > extern int volatile_ok; > > > > +/* RAII class for temporarily setting volatile_ok. */ > > + > > +class volatile_ok_temp > > +{ > > +public: > > + volatile_ok_temp(int value): save_volatile_ok (volatile_ok) > > Missing space before "(" and before ":". > > > + { > > +volatile_ok = value; > > + } > > + > > + ~volatile_ok_temp() { volatile_ok = save_volatile_ok; } > > Missing space before "(". > > > + > > +private: > > + int save_volatile_ok; > > For extra safety we should probably have a private copy constructor > (declaration only, no implementation). We're still C++03 and so can't > use "= delete". Thanks for the suggestions. Does the attached version look OK ? 
Thanks, Prathamesh > > Thanks, > Richard > > > > +}; > > + > > /* Set by constrain_operands to the number of the alternative that > > matched. */ > > extern int which_alternative; 2019-07-11 Prathamesh Kulkarni PR target/90723 * recog.h (temporary_volatile_ok): New class. * config/aarch64/aarch64.c (aarch64_emit_sve_pred_move): Set volatile_ok temporarily to true using temporary_volatile_ok. * expr.c (emit_block_move_via_cpymem): Likewise. * optabs.c (maybe_legitimize_operand): Likewise. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5a923ca006b..abdf4b1c87a 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -3457,6 +3457,7 @@ aarch64_emit_sve_pred_move (rtx dest, rtx pred, rtx src) create_output_operand (&ops[0], dest, mode); create_input_operand (&ops[1], pred, GET_MODE(pred)); create_input_operand (&ops[2], src, mode); + temporary_volatile_ok v (true); expand_insn (code_for_aarch64_pred_mov (mode), 3, ops); } diff --git a/gcc/
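The RAII idiom being reviewed above can be shown as a self-contained sketch. The global below is a stand-in for GCC's volatile_ok, not the real one; the class mirrors the pattern from the patch, including the C++03-style deleted copy constructor (declared private, never defined).

```cpp
// Stand-in for the global flag being temporarily overridden.
int volatile_ok = 0;

// RAII saver: record the old value on construction, install the new
// one, and restore the old value on destruction, so every early exit
// from the enclosing scope restores the flag.
class temporary_volatile_ok
{
public:
  temporary_volatile_ok (int value) : save_volatile_ok (volatile_ok)
  {
    volatile_ok = value;
  }

  ~temporary_volatile_ok () { volatile_ok = save_volatile_ok; }

private:
  // Copying would restore the flag twice; declaring the copy
  // constructor private without a definition is the C++03 way of
  // deleting it, as suggested in the review.
  temporary_volatile_ok (const temporary_volatile_ok &);

  int save_volatile_ok;
};
```

Placing `temporary_volatile_ok v (true);` just before expand_insn, as the patch does, guarantees the flag is restored even if expansion returns early, which a manual save/restore pair would not.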
[Ada] No warning for guaranteed accessibility check failures
This patch corrects the generation of dynamic accessibility checks which are guaranteed to trigger errors at run time, so as to give the user proper warning during unit compilation. Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Justin Squirek

gcc/ada/

	* checks.adb (Apply_Accessibility_Check): Add check for constant
	folded conditions on accessibility checks.

gcc/testsuite/

	* gnat.dg/access7.adb: New testcase.

--- gcc/ada/checks.adb
+++ gcc/ada/checks.adb
@@ -577,6 +577,7 @@ package body Checks is
       Typ : Entity_Id;
       Insert_Node : Node_Id)
    is
+      Check_Cond  : Node_Id;
       Loc : constant Source_Ptr := Sloc (N);
       Param_Ent : Entity_Id := Param_Entity (N);
       Param_Level : Node_Id;
@@ -638,15 +639,29 @@ package body Checks is
          --  Raise Program_Error if the accessibility level of the access
          --  parameter is deeper than the level of the target access type.

+         Check_Cond := Make_Op_Gt (Loc,
+           Left_Opnd  => Param_Level,
+           Right_Opnd => Type_Level);
+
          Insert_Action (Insert_Node,
            Make_Raise_Program_Error (Loc,
-             Condition =>
-               Make_Op_Gt (Loc,
-                 Left_Opnd  => Param_Level,
-                 Right_Opnd => Type_Level),
-             Reason => PE_Accessibility_Check_Failed));
+             Condition => Check_Cond,
+             Reason    => PE_Accessibility_Check_Failed));

          Analyze_And_Resolve (N);
+
+         --  If constant folding has happened on the condition for the
+         --  generated error, then warn about it being unconditional.
+ + if Nkind (Check_Cond) = N_Identifier + and then Entity (Check_Cond) = Standard_True + then +Error_Msg_Warn := SPARK_Mode /= On; +Error_Msg_N + ("accessibility check fails<<", N); +Error_Msg_N + ("\Program_Error [<<", N); + end if; end if; end Apply_Accessibility_Check; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/access7.adb @@ -0,0 +1,79 @@ +-- { dg-do run } + +with Interfaces; use Interfaces; + +procedure Access7 is + type t_p_string is access constant String; + subtype t_hash is Unsigned_32; + + -- Return a hash value for a given string + function hash(s: String) return t_hash is + h: t_hash := 0; + g: t_hash; + begin + for i in s'Range loop + h := Shift_Left(h, 4) + t_hash'(Character'Pos(s(i))); + g := h and 16#F000_#; + if (h and g) /= 0 then +h := h xor ((Shift_Right(g, 24) and 16#FF#) or g); + end if; + end loop; + return h; + end hash; + + type hash_entry is record + v: t_p_string; + hash: t_hash; + next: access hash_entry; + end record; + + type hashtable is array(t_hash range <>) of access hash_entry; + + protected pool is + procedure allocate (sp: out t_p_string; s: String; h: t_hash); + private + tab: hashtable(0..19-1) := (others => null); + end pool; + + protected body pool is + procedure allocate(sp: out t_p_string; s: String; h: t_hash) is + p: access hash_entry; + slot: t_hash; + begin + slot := h mod tab'Length; + p := tab(slot); + while p /= null loop +-- quickly check hash, then length, only then slow comparison +if p.hash = h and then p.v.all'Length = s'Length + and then p.v.all = s +then + sp := p.v; -- shared string + return; +end if; +p := p.next; + end loop; + -- add to table + p := new hash_entry'(v=> new String'(s), + hash => h, + next => tab(slot)); + tab(slot) := p; -- { dg-warning "accessibility check fails|Program_Error will be raised at run time" } + sp := p.v; -- shared string + end allocate; + end pool; + + -- Return the pooled string equal to a given String + function new_p_string(s: String) return t_p_string 
is + sp: t_p_string; + begin + pool.allocate(sp, s, hash(s)); + return sp; + end new_p_string; + + foo_string : t_p_string; +begin + foo_string := new_p_string("foo"); + raise Constraint_Error; +exception + when Program_Error => + null; +end Access7;
Rework constant subreg folds and handle more variable-length cases
This patch rewrites the way simplify_subreg handles constants. It uses similar native_encode/native_decode routines to the tree-level handling of VIEW_CONVERT_EXPR, meaning that we can move between rtx constants and the target memory image of them. The main point of this patch is to support subregs of constant-length vectors for VLA vectors, beyond the very simple cases that were already handled. Many of the new tests failed before the patch for variable-length vectors. The boolean side is tested more by the upcoming SVE ACLE work. Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu. OK to install? Richard

2019-07-11  Richard Sandiford

gcc/
	* defaults.h (TARGET_UNIT): New macro.
	(target_unit): New type.
	* rtl.h (native_encode_rtx, native_decode_rtx)
	(native_decode_vector_rtx, subreg_size_lsb): Declare.
	(subreg_lsb_1): Turn into an inline wrapper around subreg_size_lsb.
	* rtlanal.c (subreg_lsb_1): Delete.
	(subreg_size_lsb): New function.
	* simplify-rtx.c: Include rtx-vector-builder.h
	(simplify_immed_subreg): Delete.
	(native_encode_rtx, native_decode_vector_rtx, native_decode_rtx)
	(simplify_const_vector_byte_offset, simplify_const_vector_subreg):
	New functions.
	(simplify_subreg): Use them.
	(test_vector_subregs_modes, test_vector_subregs_repeating)
	(test_vector_subregs_fore_back, test_vector_subregs_stepped)
	(test_vector_subregs): New functions.
	(test_vector_ops): Call test_vector_subregs for integer vector
	modes with at least 2 elements.

Index: gcc/defaults.h
===
*** gcc/defaults.h	2019-07-11 08:33:57.0 +0100
--- gcc/defaults.h	2019-07-11 08:33:58.069250175 +0100
*** #define TARGET_VTABLE_USES_DESCRIPTORS 0
*** 1459,1462
--- 1459,1474
  #define DWARF_GNAT_ENCODINGS_DEFAULT DWARF_GNAT_ENCODINGS_GDB
  #endif
+ /* Done this way to keep gengtype happy.
*/ + #if BITS_PER_UNIT == 8 + #define TARGET_UNIT uint8_t + #elif BITS_PER_UNIT == 16 + #define TARGET_UNIT uint16_t + #elif BITS_PER_UNIT == 32 + #define TARGET_UNIT uint32_t + #else + #error Unknown BITS_PER_UNIT + #endif + typedef TARGET_UNIT target_unit; + #endif /* ! GCC_DEFAULTS_H */ Index: gcc/rtl.h === *** gcc/rtl.h 2019-07-11 08:33:57.0 +0100 --- gcc/rtl.h 2019-07-11 08:33:58.069250175 +0100 *** extern int rtx_cost (rtx, machine_mode, *** 2400,2411 extern int address_cost (rtx, machine_mode, addr_space_t, bool); extern void get_full_rtx_cost (rtx, machine_mode, enum rtx_code, int, struct full_rtx_costs *); extern poly_uint64 subreg_lsb (const_rtx); ! extern poly_uint64 subreg_lsb_1 (machine_mode, machine_mode, poly_uint64); extern poly_uint64 subreg_size_offset_from_lsb (poly_uint64, poly_uint64, poly_uint64); extern bool read_modify_subreg_p (const_rtx); /* Return the subreg byte offset for a subreg whose outer mode is OUTER_MODE, whose inner mode is INNER_MODE, and where there are LSB_SHIFT *bits* between the lsb of the outer value and the lsb of --- 2400,2429 extern int address_cost (rtx, machine_mode, addr_space_t, bool); extern void get_full_rtx_cost (rtx, machine_mode, enum rtx_code, int, struct full_rtx_costs *); + extern bool native_encode_rtx (machine_mode, rtx, vec &, + unsigned int, unsigned int); + extern rtx native_decode_rtx (machine_mode, vec, + unsigned int); + extern rtx native_decode_vector_rtx (machine_mode, vec, +unsigned int, unsigned int, unsigned int); extern poly_uint64 subreg_lsb (const_rtx); ! extern poly_uint64 subreg_size_lsb (poly_uint64, poly_uint64, poly_uint64); extern poly_uint64 subreg_size_offset_from_lsb (poly_uint64, poly_uint64, poly_uint64); extern bool read_modify_subreg_p (const_rtx); + /* Given a subreg's OUTER_MODE, INNER_MODE, and SUBREG_BYTE, return the +bit offset at which the subreg begins (counting from the least significant +bit of the operand). 
*/ + + inline poly_uint64 + subreg_lsb_1 (machine_mode outer_mode, machine_mode inner_mode, + poly_uint64 subreg_byte) + { + return subreg_size_lsb (GET_MODE_SIZE (outer_mode), + GET_MODE_SIZE (inner_mode), subreg_byte); + } + /* Return the subreg byte offset for a subreg whose outer mode is OUTER_MODE, whose inner mode is INNER_MODE, and where there are LSB_SHIFT *bits* between the lsb of the outer value and the lsb of Index: gcc/rtlanal.c === *** gcc/rtlanal.c 2019-07-11 08:33:57.
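The encode/decode idea behind the patch can be illustrated with a toy model. This is not the GCC implementation, just the approach it describes: lower a constant vector to its byte-level memory image for a little-endian target with 8-bit units, then read any subreg straight back out of the byte buffer.

```cpp
#include <cstdint>
#include <vector>

// Toy counterpart of target_unit for a BITS_PER_UNIT == 8 target.
typedef uint8_t target_unit;

// Toy native_encode: lower a constant vector of uint16_t lanes to its
// little-endian memory image.
std::vector<target_unit>
encode_v_u16 (const std::vector<uint16_t> &lanes)
{
  std::vector<target_unit> bytes;
  for (uint16_t lane : lanes)
    {
      bytes.push_back (lane & 0xff);	    // low byte first (LE)
      bytes.push_back ((lane >> 8) & 0xff); // then high byte
    }
  return bytes;
}

// Toy native_decode for a scalar subreg: reassemble NBYTES bytes
// starting at byte offset FIRST, little-endian.
uint32_t
decode_subreg (const std::vector<target_unit> &bytes,
	       unsigned first, unsigned nbytes)
{
  uint32_t value = 0;
  for (unsigned i = 0; i < nbytes; ++i)
    value |= (uint32_t) bytes[first + i] << (8 * i);
  return value;
}
```

Routing all constant subreg folds through one memory image like this is what lets the real patch handle cases, such as subregs of variable-length vector constants, that the old ad hoc simplify_immed_subreg code could not.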
[Ada] Avoid spurious warning on wrong order of operator call arguments
GNAT issues a warning under -gnatwa when actuals for a call are named like the formals, but in a different order. This is inappropriate for calls to operators in infix form, when e.g. Right <= Left is in general the intended order. Special-case calls to operators to avoid that spurious warning. Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Yannick Moy

gcc/ada/

	* sem_res.adb (Check_Argument_Order): Special case calls to
	operators.

gcc/testsuite/

	* gnat.dg/warn21.adb, gnat.dg/warn21.ads: New testcase.

--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -3458,12 +3458,17 @@ package body Sem_Res is
       begin
          --  Nothing to do if no parameters, or original node is neither a
          --  function call nor a procedure call statement (happens in the
-         --  operator-transformed-to-function call case), or the call does
+         --  operator-transformed-to-function call case), or the call is to an
+         --  operator symbol (which is usually in infix form), or the call does
          --  not come from source, or this warning is off.
          if not Warn_On_Parameter_Order
            or else No (Parameter_Associations (N))
            or else Nkind (Original_Node (N)) not in N_Subprogram_Call
+           or else (Nkind (Name (N)) = N_Identifier
+                     and then Present (Entity (Name (N)))
+                     and then Nkind (Entity (Name (N)))
+                       = N_Defining_Operator_Symbol)
            or else not Comes_From_Source (N)
          then
             return;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/warn21.adb
@@ -0,0 +1,6 @@
+-- { dg-do compile }
+-- { dg-options "-gnata -gnatwa" }
+
+package body Warn21 is
+   procedure Foo is null;
+end Warn21;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/warn21.ads
@@ -0,0 +1,18 @@
+package Warn21 is
+
+   type Set is new Integer;
+
+   function "<=" (Left : Set; Right : Set) return Boolean;
+
+   function "=" (Left : Set; Right : Set) return Boolean with
+     Post => "="'Result = (Left <= Right and Right <= Left);
+
+   procedure Foo;
+
+private
+
+   function "<=" (Left : Set; Right : Set) return Boolean is (True);
+   function "=" (Left : Set; Right : Set) return Boolean is
+     (Left <= Right and Right <= Left);
+
+end Warn21;
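The warning logic, with its new operator exception, can be sketched as a toy model. This is a hypothetical helper in C++, not the GNAT code; it models Ada's convention that operator names are quoted strings like "<=", and warns only when every actual's name matches some formal but the overall order differs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy Check_Argument_Order: return true if a "wrong parameter order"
// warning should be issued for a call with the given formal names and
// named actuals.  Operator calls (quoted names such as "<=") get a
// pass, since Right <= Left is often the intended infix order.
bool
warn_parameter_order (const std::string &callee,
		      const std::vector<std::string> &formals,
		      const std::vector<std::string> &actual_names)
{
  // Special case calls to operators, as in the patch.
  if (!callee.empty () && callee[0] == '"')
    return false;
  if (formals.size () != actual_names.size ())
    return false;

  bool same_names = true, same_order = true;
  for (std::size_t i = 0; i < formals.size (); ++i)
    {
      bool found = false;
      for (std::size_t j = 0; j < formals.size (); ++j)
	if (actual_names[i] == formals[j])
	  {
	    found = true;
	    if (i != j)
	      same_order = false;
	  }
      same_names &= found;
    }
  // Warn only when the names all match formals but the order differs.
  return same_names && !same_order;
}
```

In the warn21 testcase above, the postcondition `Left <= Right and Right <= Left` is exactly the shape that used to trip the warning: both actuals of the second `"<="` call are named like formals but appear in swapped positions.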
[Ada] Compile-time evaluation of predicate checks
This patch recognizes cases of dynamic predicates on integer subtypes that are simple enough to be evaluated statically when the argument is itself a literal. Even though in many cases such predicate checks will be removed by the back-end with any level of optimization, it is preferable to perform this constant folding in the front-end, which also cleans up the output of CCG, as well as producing explicit warnings when the test will fail. Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Ed Schonberg

gcc/ada/

	* exp_ch6.adb (Can_Fold_Predicate_Call): New function, subsidiary
	of Expand_Call_Helper, to compute statically a predicate check
	when the argument is a static integer.

gcc/testsuite/

	* gnat.dg/predicate11.adb: New testcase.

--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -2319,6 +2319,13 @@ package body Exp_Ch6 is
    --  Adds invariant checks for every intermediate type between the range
    --  of a view converted argument to its ancestor (from parent to child).

+   function Can_Fold_Predicate_Call (P : Entity_Id) return Boolean;
+   --  Try to constant-fold a predicate check, which often enough is a
+   --  simple arithmetic expression that can be computed statically if
+   --  its argument is static.  This cleans up the output of CCG, even
+   --  though useless predicate checks will be generally removed by
+   --  back-end optimizations.
+
    function Inherited_From_Formal (S : Entity_Id) return Entity_Id;
    --  Within an instance, a type derived from an untagged formal derived
    --  type inherits from the original parent, not from the actual.
The @@ -2467,6 +2474,89 @@ package body Exp_Ch6 is end if; end Add_View_Conversion_Invariants; + - + -- Can_Fold_Predicate_Call -- + - + + function Can_Fold_Predicate_Call (P : Entity_Id) return Boolean is + Actual : constant Node_Id := +First (Parameter_Associations (Call_Node)); + Subt : constant Entity_Id := Etype (First_Entity (P)); + Pred : Node_Id; + + function May_Fold (N : Node_Id) return Traverse_Result; + -- The predicate expression is foldable if it only contains operators + -- and literals. During this check, we also replace occurrences of + -- the formal of the constructed predicate function with the static + -- value of the actual. This is done on a copy of the analyzed + -- expression for the predicate. + + function May_Fold (N : Node_Id) return Traverse_Result is + begin +case Nkind (N) is + when N_Binary_Op | N_Unary_Op => + return OK; + + when N_Identifier | N_Expanded_Name => + if Ekind (Entity (N)) = E_In_Parameter +and then Entity (N) = First_Entity (P) + then + Rewrite (N, New_Copy (Actual)); + Set_Is_Static_Expression (N); + return OK; + + elsif Ekind (Entity (N)) = E_Enumeration_Literal then + return OK; + + else + return Abandon; + end if; + + when N_If_Expression | N_Case_Expression => + return OK; + + when N_Integer_Literal => + return OK; + + when others => + return Abandon; +end case; + end May_Fold; + + function Try_Fold is new Traverse_Func (May_Fold); + + -- Start of processing for Can_Fold_Predicate_Call + + begin + -- Folding is only interesting if the actual is static and its type + -- has a Dynamic_Predicate aspect. For CodePeer we preserve the + -- function call. 
+ + if Nkind (Actual) /= N_Integer_Literal + or else not Has_Dynamic_Predicate_Aspect (Subt) + or else CodePeer_Mode + then +return False; + end if; + + -- Retrieve the analyzed expression for the predicate + + Pred := +New_Copy_Tree + (Expression (Find_Aspect (Subt, Aspect_Dynamic_Predicate))); + + if Try_Fold (Pred) = OK then +Rewrite (Call_Node, Pred); +Analyze_And_Resolve (Call_Node, Standard_Boolean); +return True; + + else +-- Continue expansion of function call + +return False; + end if; + end Can_Fold_Predicate_Call; + --- -- Inherited_From_Formal -- --- @@ -2815,6 +2905,17 @@ package body Exp_Ch6 is end; end if; + -- if this is a call to a predicate function, try to constant + -- fold it. + + if Nkind (Call_Node) = N_Function_Call +and then Is_Entity_
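The May_Fold walk above can be mirrored by a toy expression-tree fold. This C++ sketch is illustrative only, with invented types: a predicate expression is foldable when it contains nothing but operators, literals, and the predicate formal; folding substitutes the static actual for the formal and evaluates, and anything else abandons the attempt, just as May_Fold returns Abandon.

```cpp
// Minimal expression tree covering the shape of a predicate like
// "X mod 4 = 0" from the dynamic-predicate examples above.
struct Expr
{
  enum Kind { Literal, Formal, Mod, Eq } kind;
  long value;		   // payload for Literal nodes
  const Expr *lhs, *rhs;   // operands for Mod/Eq nodes
};

// Toy Can_Fold_Predicate_Call: returns true and sets RESULT when E
// can be evaluated with the predicate formal replaced by ACTUAL;
// returns false (abandons) on any node kind it does not understand.
bool
try_fold (const Expr *e, long actual, long &result)
{
  switch (e->kind)
    {
    case Expr::Literal:
      result = e->value;
      return true;
    case Expr::Formal:
      result = actual;	 // substitute the static actual for the formal
      return true;
    case Expr::Mod:
    case Expr::Eq:
      {
	long l, r;
	if (!try_fold (e->lhs, actual, l) || !try_fold (e->rhs, actual, r))
	  return false;	 // a non-foldable node below: abandon
	result = e->kind == Expr::Mod ? l % r : (l == r);
	return true;
      }
    }
  return false;
}
```

For a predicate such as `TYPE5_SCALAR mod 4 = 0`, folding with a literal actual yields a compile-time True or False, which is what lets the front-end rewrite the whole predicate call and warn when the check is bound to fail.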
[Ada] New Repinfo.Input unit to read back JSON representation info.
For some time the Repinfo unit has been able to output the representation information in the JSON data interchange format, in addition to the usual text and binary formats. The new Repinfo.Input unit makes it possible to read back this information in this format and make it available to clients, the main one being ASIS. The big advantage of this approach over manipulating a binary blob is that the writer and the reader of the JSON representation need not be binary compatible, i.e. in practice need not be the same version of the compiler or ASIS for the same target. The patch also adds a -gnatd_j switch to read back the information in the compiler itself, which makes it easy to keep the writer and the reader in sync using only one tool, namely the compiler. The typical usage is:

gcc -c p.ads -gnatR4js
gcc -c p.ads -gnatd_j

to exercise respectively the writer and the reader from the compiler. Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Eric Botcazou

gcc/ada/

	* alloc.ads (Rep_JSON_Table_Initial): New constant.
	(Rep_JSON_Table_Increment): Likewise.
	* debug.adb: Document -gnatd_j switch.
	* gcc-interface/Make-lang.in (GNAT_ADA_OBJS): Add repinfo-input.o.
	* gnat1drv.adb: Add with clause for Repinfo.Input.  Add with and
	use clauses for Sinput.
	(Read_JSON_Files_For_Repinfo): New procedure.
	(Gnat1drv1): Deal with -gnatd_j switch.
	* repinfo-input.ad[sb]: New unit.
	* snames.ads-tmpl (Name_Discriminant): New constant.
	(Name_Operands): Likewise.

Attachment: patch.diff.gz (application/gzip)
[Ada] Fix crash on dynamic predicate when generating SCOs
A pragma Check for Dynamic_Predicate does not correspond to any source
construct that has a provisionally-disabled SCO.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Thomas Quinot

gcc/ada/

	* sem_prag.adb (Analyze_Pragma, case pragma Check): Do not call
	Set_SCO_Pragma_Enabled for the dynamic predicate case.

gcc/testsuite/

	* gnat.dg/scos1.adb: New testcase.

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -14113,9 +14113,14 @@ package body Sem_Prag is
 
          Expr := Get_Pragma_Arg (Arg2);
 
-         --  Deal with SCO generation
+         --  Mark the pragma (or, if rewritten from an aspect, the original
+         --  aspect) as enabled. Nothing to do for an internally generated
+         --  check for a dynamic predicate.
 
-         if Is_Checked (N) and then not Split_PPC (N) then
+         if Is_Checked (N)
+           and then not Split_PPC (N)
+           and then Cname /= Name_Dynamic_Predicate
+         then
             Set_SCO_Pragma_Enabled (Loc);
          end if;
 
--- /dev/null
+++ gcc/testsuite/gnat.dg/scos1.adb
@@ -0,0 +1,26 @@
+-- { dg-do compile }
+-- { dg-options "-gnata -gnateS" }
+
+procedure SCOs1 with SPARK_Mode => On is
+
+   LEN_IN_BITS : constant := 20;
+
+   M_SIZE_BYTES : constant := 2 ** LEN_IN_BITS;
+   ET_BYTES     : constant := (M_SIZE_BYTES - 4);
+
+   type T_BYTES is new Integer range 0 .. ET_BYTES with Size => 32;
+   subtype TYPE5_SCALAR is T_BYTES
+     with Dynamic_Predicate => TYPE5_SCALAR mod 4 = 0;
+
+   type E_16_BYTES is new Integer;
+   subtype RD_BYTES is E_16_BYTES
+     with Dynamic_Predicate => RD_BYTES mod 4 = 0;
+
+   function "-" (left : TYPE5_SCALAR; right : RD_BYTES) return TYPE5_SCALAR
+   is ( left - TYPE5_SCALAR(right) )
+   with Pre => TYPE5_SCALAR(right) <= left and then
+               left - TYPE5_SCALAR(right) <= T_BYTES'Last, Inline_Always;
+
+begin
+   null;
+end SCOs1;
[Ada] Avoid spurious errors on dimensionality checking in GNATprove
In the special GNATprove mode of the frontend, automatic inlining is
performed, which may lead to spurious errors on dimensionality checking.
Avoid performing this checking on inlined code, which has already been
checked for dimensionality errors. There is no impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Yannick Moy

gcc/ada/

	* sem_res.adb (Resolve_Call): Do not perform dimensionality
	checking on inlined bodies.

--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -6949,7 +6949,9 @@ package body Sem_Res is
       --  Check the dimensions of the actuals in the call. For function calls,
       --  propagate the dimensions from the returned type to N.
 
-      Analyze_Dimension_Call (N, Nam);
+      if not In_Inlined_Body then
+         Analyze_Dimension_Call (N, Nam);
+      end if;
 
       --  All done, evaluate call and deal with elaboration issues
[Ada] Elaboration order v4.0 and infinite loops
This patch introduces binder switch -d_S which prompts the various phases
of the elaboration order mechanism to output a short message concerning
their commencement and completion. The output is useful when trying to
determine whether a phase is stuck in an infinite loop.

-- Source --

--  main.adb

procedure Main is
begin
   null;
end Main;

-- Compilation and output --

$ gnatmake -q main.adb -bargs -d_S -d_V
elaborating units...
collecting units...
units collected.
constructing library graph...
validating library graph...
library graph validated.
library graph constructed.
constructing invocation graph...
validating invocation graph...
invocation graph validated.
invocation graph constructed.
augmenting library graph...
library graph augmented.
discovering components...
components discovered.
validating elaboration order...
elaboration order validated.
units elaborated.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Hristian Kirtchev

gcc/ada/

	* bindo.adb: Update the section of switches and debugging
	elaboration issues.
	* bindo.ads: Add type Elaboration_Phase.
	* bindo-augmentors.adb: Add use clause for
	Bindo.Writers.Phase_Writers.
	(Augment_Library_Graph): Signal the start and end of the
	augmentation phase.
	* bindo-builders.adb: Add with and use clause for Bindo.Writers.
	Add use clause for Bindo.Writers.Phase_Writers.
	(Build_Invocation_Graph): Signal the start and end of the
	invocation graph construction phase.
	(Build_Library_Graph): Signal the start and end of the library
	graph construction phase.
	* bindo-diagnostics.adb: Add use clause for
	Bindo.Writers.Phase_Writers.
	(Diagnose_Cycle): Signal the start and end of the cycle
	diagnostic phase.
	* bindo-elaborators.adb: Add use clause for
	Bindo.Writers.Phase_Writers.
	(Elaborate_Units): Signal the start and end of the unit
	elaboration phase.
	* bindo-graphs.adb: Add use clause for
	Bindo.Writers.Phase_Writers.
	(Find_Components): Signal the start and end of the component
	discovery phase.
	(Find_Cycles): Signal the start and end of the cycle discovery
	phase.
	* bindo-units.adb: Add with and use clause for Bindo.Writers.
	Add use clause for Bindo.Writers.Phase_Writers.
	(Collect_Elaborable_Units): Signal the start and end of the unit
	collection phase.
	* bindo-validators.adb: Add with and use clause for
	Bindo.Writers.  Add use clause for Bindo.Writers.Phase_Writers.
	(Validate_Cycles, Validate_Elaboration_Order,
	Validate_Invocation_Graph, Validate_Library_Graph): Signal the
	start and end of the library graph validation phase.
	* bindo-writers.ads, bindo-writers.adb: Add new nested package
	Phase_Writers.
	* debug.adb: Update the documentation of switch d_S.

--- gcc/ada/bindo-augmentors.adb
+++ gcc/ada/bindo-augmentors.adb
@@ -27,7 +27,9 @@ with Debug;  use Debug;
 with Output; use Output;
 with Types;  use Types;
 
-with Bindo.Writers; use Bindo.Writers;
+with Bindo.Writers;
+use Bindo.Writers;
+use Bindo.Writers.Phase_Writers;
 
 package body Bindo.Augmentors is
 
@@ -124,6 +126,8 @@ package body Bindo.Augmentors is
          return;
       end if;
 
+      Start_Phase (Library_Graph_Augmentation);
+
      --  Prepare the statistics data
 
      Longest_Path := 0;
@@ -131,6 +135,8 @@ package body Bindo.Augmentors is
 
      Visit_Elaboration_Roots (Inv_Graph, Lib_Graph);
      Write_Statistics;
+
+      End_Phase (Library_Graph_Augmentation);
   end Augment_Library_Graph;
 
--- gcc/ada/bindo-builders.adb
+++ gcc/ada/bindo-builders.adb
@@ -37,6 +37,10 @@ use Bindo.Validators;
 use Bindo.Validators.Invocation_Graph_Validators;
 use Bindo.Validators.Library_Graph_Validators;
 
+with Bindo.Writers;
+use Bindo.Writers;
+use Bindo.Writers.Phase_Writers;
+
 with GNAT;                 use GNAT;
 with GNAT.Dynamic_HTables; use GNAT.Dynamic_HTables;
 
@@ -99,6 +103,8 @@ package body Bindo.Builders is
      begin
         pragma Assert (Present (Lib_G));
 
+         Start_Phase (Invocation_Graph_Construction);
+
        --  Prepare the global data
 
        Inv_Graph :=
@@ -111,6 +117,7 @@ package body Bindo.Builders is
        For_Each_Elaborable_Unit (Create_Edges'Access);
 
        Validate_Invocation_Graph (Inv_Graph);
+
+      End_Phase (Invocation_Graph_Construction);
 
       return Inv_Graph;
    end Build_Invocation_Graph;
@@ -375,6 +382,8 @@ package body Bindo.Builders is
 
    function Build_Library_Graph return Library_Graph is
    begin
+      Start_Phase (Library_Graph_Construction);
+
       --  Prepare the global data
 
       Lib_Graph :=
@@ -388,6 +397,
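The -d_S output above is produced by paired start/end writers wrapped around each phase. A minimal, language-neutral sketch of that pattern (Python; the `phase` helper is invented for illustration — the real unit is the Ada nested package Bindo.Writers.Phase_Writers):

```python
from contextlib import contextmanager

@contextmanager
def phase(start_msg, end_msg, enabled=True):
    """Emit a message when a phase commences and one when it completes,
    mirroring the paired Start_Phase/End_Phase calls added by the patch.
    'enabled' plays the role of the -d_S debug switch."""
    if enabled:
        print(start_msg)
    yield
    if enabled:
        print(end_msg)

# Nested phases reproduce the indent-free trace shown in the patch:
with phase("elaborating units...", "units elaborated."):
    with phase("collecting units...", "units collected."):
        pass  # the real binder collects elaborable units here
```

Because each completion message is only printed after the wrapped work returns, a phase stuck in an infinite loop is immediately identifiable as the last "..." line with no matching completion line.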
[Ada] Infinite loop on illegal declaration
This patch updates predicate Null_Status to prevent an infinite recursion
when the argument is an illegal object declaration of an access type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Hristian Kirtchev

gcc/ada/

	* sem_util.adb (Null_Status): Assume that an erroneous construct
	has an undefined null status.

gcc/testsuite/

	* gnat.dg/self_ref1.adb: New testcase.

--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -22367,9 +22367,15 @@ package body Sem_Util is
    --  Start of processing for Null_Status
 
    begin
+      --  Prevent cascaded errors or infinite loops when trying to determine
+      --  the null status of an erroneous construct.
+
+      if Error_Posted (N) then
+         return Unknown;
+
       --  An allocator always creates a non-null value
 
-      if Nkind (N) = N_Allocator then
+      elsif Nkind (N) = N_Allocator then
          return Is_Non_Null;
 
       --  Taking the 'Access of something yields a non-null value
 
--- /dev/null
+++ gcc/testsuite/gnat.dg/self_ref1.adb
@@ -0,0 +1,11 @@
+-- { dg-do compile }
+
+procedure Self_Ref1 is
+   type Integer_Ptr is access all Integer;
+   Ptr : constant Integer_Ptr := Integer_Ptr (Ptr); -- { dg-error "object \"Ptr\" cannot be used before end of its declaration" }
+
+begin
+   if Ptr /= null then
+      null;
+   end if;
+end Self_Ref1;
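The fix is the classic "bail out early on an erroneous node" guard, which also happens to break the self-reference cycle. A sketch of the idea in Python (the node representation, `error_posted` flag, and `Unknown` result are invented for illustration; only the guard pattern comes from the patch):

```python
from enum import Enum

class NullStatus(Enum):
    IS_NULL = 0
    IS_NON_NULL = 1
    UNKNOWN = 2

def null_status(node):
    """Decide whether an expression is known to be null or non-null.
    Erroneous nodes get an Unknown status up front; this also breaks
    recursion on self-referential declarations such as
    'Ptr : constant Integer_Ptr := Integer_Ptr (Ptr);'."""
    if node.get("error_posted"):
        return NullStatus.UNKNOWN
    if node["kind"] == "allocator":
        # An allocator always creates a non-null value
        return NullStatus.IS_NON_NULL
    if node["kind"] == "conversion":
        # A conversion has the null status of its operand; without the
        # error_posted guard a self-referential operand loops forever.
        return null_status(node["operand"])
    return NullStatus.UNKNOWN

# Self-referential declaration already flagged by earlier analysis:
bad = {"kind": "conversion", "error_posted": True}
bad["operand"] = bad          # the cycle that used to cause infinite recursion
print(null_status(bad))       # → NullStatus.UNKNOWN
```

Removing the `error_posted` check makes `null_status(bad)` recurse without bound, which is exactly the failure mode the patch eliminates.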
[Ada] Pragma Unreferenced triggers undefined reference
This patch corrects the generation of protected body declarations so that
instances of pragma Unreferenced applied to formals don't falsely trigger
undefined references.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Justin Squirek

gcc/ada/

	* exp_ch9.adb (Build_Private_Protected_Declaration): Add
	exception for the moving of pragmas to internally generated
	specs for pragma Unreferenced.

gcc/testsuite/

	* gnat.dg/unreferenced2.adb: New testcase.

--- gcc/ada/exp_ch9.adb
+++ gcc/ada/exp_ch9.adb
@@ -3493,6 +3493,8 @@ package body Exp_Ch9 is
       procedure Move_Pragmas (From : Node_Id; To : Node_Id);
       --  Find all suitable source pragmas at the top of subprogram body From's
       --  declarations and insert them after arbitrary node To.
+      --
+      --  Very similar to Move_Pragmas in sem_ch6 ???
 
       ---------------------
       -- Analyze_Pragmas --
       ---------------------
@@ -3544,7 +3546,14 @@ package body Exp_Ch9 is
             Next_Decl := Next (Decl);
 
-            if Nkind (Decl) = N_Pragma then
+            --  We add an exception here for Unreferenced pragmas since the
+            --  internally generated spec gets analyzed within
+            --  Build_Private_Protected_Declaration and will lead to spurious
+            --  warnings due to the way references are checked.
+
+            if Nkind (Decl) = N_Pragma
+              and then Pragma_Name_Unmapped (Decl) /= Name_Unreferenced
+            then
                Remove (Decl);
                Insert_After (Insert_Nod, Decl);
                Insert_Nod := Decl;
 
--- /dev/null
+++ gcc/testsuite/gnat.dg/unreferenced2.adb
@@ -0,0 +1,34 @@
+-- { dg-do compile }
+-- { dg-options "-gnatf" }
+
+procedure Unreferenced2 is
+
+   protected Example is
+      procedure Callme;
+   end Example;
+
+   procedure Other (X : Boolean) is
+   begin
+      null;
+   end;
+
+   protected body Example is
+
+      procedure Internal (X : Boolean) is
+         pragma Unreferenced (X);
+         Y : Integer;
+      begin
+         Y := 3;
+      end Internal;
+
+      procedure Callme is
+      begin
+         Internal (X => True);
+      end Callme;
+
+   end Example;
+
+begin
+   Example.Callme;
+   Other (True);
+end Unreferenced2;
[Ada] Internal crash on illegal renaming
This patch updates the retrieval of the renamed object name of an object
renaming declaration to handle various name forms. No need for a test
because one already exists, and reproducing requires a compiler built with
assertions.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Hristian Kirtchev

gcc/ada/

	* sem_ch8.adb (Analyze_Object_Renaming): Obtain the object being
	renamed using routine Get_Object_Name which takes care of
	various name forms.
	(Get_Object_Name): New routine.

--- gcc/ada/sem_ch8.adb
+++ gcc/ada/sem_ch8.adb
@@ -774,6 +774,10 @@ package body Sem_Ch8 is
    --  has already established its actual subtype. This is only relevant
    --  if the renamed object is an explicit dereference.
 
+   function Get_Object_Name (Nod : Node_Id) return Node_Id;
+   --  Obtain the name of the object from node Nod which is being renamed by
+   --  the object renaming declaration N.
+
    ------------------------------
    -- Check_Constrained_Object --
    ------------------------------
@@ -840,6 +844,33 @@ package body Sem_Ch8 is
       end if;
    end Check_Constrained_Object;
 
+   ---------------------
+   -- Get_Object_Name --
+   ---------------------
+
+   function Get_Object_Name (Nod : Node_Id) return Node_Id is
+      Obj_Nam : Node_Id;
+
+   begin
+      Obj_Nam := Nod;
+      while Present (Obj_Nam) loop
+         if Nkind_In (Obj_Nam, N_Attribute_Reference,
+                               N_Explicit_Dereference,
+                               N_Indexed_Component,
+                               N_Slice)
+         then
+            Obj_Nam := Prefix (Obj_Nam);
+
+         elsif Nkind (Obj_Nam) = N_Selected_Component then
+            Obj_Nam := Selector_Name (Obj_Nam);
+         else
+            exit;
+         end if;
+      end loop;
+
+      return Obj_Nam;
+   end Get_Object_Name;
+
    --  Start of processing for Analyze_Object_Renaming
 
    begin
@@ -1156,18 +1187,10 @@ package body Sem_Ch8 is
       elsif Ada_Version >= Ada_2005 and then Nkind (Nam) in N_Has_Entity then
          declare
-            Nam_Decl : Node_Id;
-            Nam_Ent  : Entity_Id;
+            Nam_Ent  : constant Entity_Id := Entity (Get_Object_Name (Nam));
+            Nam_Decl : constant Node_Id   := Declaration_Node (Nam_Ent);
 
          begin
-            if Nkind (Nam) = N_Attribute_Reference then
-               Nam_Ent := Entity (Prefix (Nam));
-            else
-               Nam_Ent := Entity (Nam);
-            end if;
-
-            Nam_Decl := Parent (Nam_Ent);
-
             if Has_Null_Exclusion (N)
               and then not Has_Null_Exclusion (Nam_Decl)
             then
[Ada] Fix inconsistent documentation for gnatmetric
One part said all metrics are enabled by default (which is now true), and another part said the coupling metrics are disabled by default (which is no longer true). Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Bob Duff gcc/ada/ * doc/gnat_ugn/gnat_utility_programs.rst: Fix inconsistent documentation for gnatmetric. * gnat_ugn.texi: Regenerate.--- gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst +++ gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst @@ -1214,7 +1214,7 @@ The following switches are available: :samp:`f` By default, gnathtml will generate html links only for global entities - ('with'ed units, global variables and types,...). If you specify + ('with'ed units, global variables and types,...). If you specify :switch:`-f` on the command line, then links will be generated for local entities too. @@ -1310,7 +1310,7 @@ Alternatively, you may run the script using the following command line: ``gnat2xml`` is a project-aware tool (see :ref:`Using_Project_Files_with_GNAT_Tools` for a description of - the project-related switches). The project file package that can specify + the project-related switches). The project file package that can specify ``gnat2xml`` switches is named ``gnat2xml``. .. _Switches_for_``gnat2xml``: @@ -1780,7 +1780,7 @@ Alternatively, you may run the script using the following command line: ``gnatcheck`` is a project-aware tool (see :ref:`Using_Project_Files_with_GNAT_Tools` for a description of - the project-related switches). The project file package that can specify + the project-related switches). The project file package that can specify ``gnatcheck`` switches is named ``Check``. For full details, plese refer to :title:`GNATcheck Reference Manual`. @@ -1804,11 +1804,11 @@ Alternatively, you may run the script using the following command line: for computing various program metrics. It takes an Ada source file as input and generates a file containing the metrics data as output. 
Various switches control which - metrics are computed and output. + metrics are reported. ``gnatmetric`` is a project-aware tool (see :ref:`Using_Project_Files_with_GNAT_Tools` for a description of - the project-related switches). The project file package that can specify + the project-related switches). The project file package that can specify ``gnatmetric`` switches is named ``Metrics``. The ``gnatmetric`` command has the form @@ -1921,9 +1921,9 @@ Alternatively, you may run the script using the following command line: .. index:: --short-file-names (gnatmetric) :switch:`--short-file-names` -Use 'short' source file names in the output. (The ``gnatmetric`` +Use 'short' source file names in the output. (The ``gnatmetric`` output includes the name(s) of the Ada source file(s) from which the -metrics are computed. By default each name includes the absolute +metrics are computed. By default each name includes the absolute path. The :switch:`--short-file-names` switch causes ``gnatmetric`` to exclude all directory information from the file names that are output.) @@ -1980,12 +1980,11 @@ Alternatively, you may run the script using the following command line: Specifying a set of metrics to compute -- - By default all the metrics are computed and reported. The switches - described in this subsection allow you to control, on an individual - basis, whether metrics are computed and reported. If at least one - positive metric switch is specified (that is, a switch that defines - that a given metric or set of metrics is to be computed), then only - explicitly specified metrics are reported. + By default all the metrics are reported. The switches described in this + subsection allow you to control, on an individual basis, whether metrics are + reported. If at least one positive metric switch is specified (that is, a + switch that defines that a given metric or set of metrics is to be computed), + then only explicitly specified metrics are reported. .. 
_Line_Metrics_Control:

@@ -2023,7 +2022,7 @@ Alternatively, you may run the script using the following command line:
   code lines in bodies.
 
   You can use the following switches to select the specific line metrics
-  to be computed and reported.
+  to be reported.
 
   .. index:: --lines (gnatmetric)
 
@@ -2089,10 +2088,9 @@ Alternatively, you may run the script using the following command line:
 
   :switch:`--lines-average`
-    Report the average number of code lines in subprogram bodies, task
-    bodies, entry bodies and statement sequences in package bodies. The
-    metric is computed and reported for the whole set of processed Ada
-    sources only.
+    Report the average number of code lines in subprogram bodies, task bodies,
+    entry bodies and statement sequences in package bodies. The metric is
+    reported for the whole set of processed Ada sources only.
[Ada] Memory corruption when using formal hashed sets or maps
Add a check to avoid causing a buffer overflow when the map is empty.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Claire Dross

gcc/ada/

	* libgnat/a-cfhama.adb, libgnat/a-cfhase.adb (Free): Do not
	reset the Has_Element flag if no element is freed.

--- gcc/ada/libgnat/a-cfhama.adb
+++ gcc/ada/libgnat/a-cfhama.adb
@@ -509,8 +509,11 @@ is
 
    procedure Free (HT : in out Map; X : Count_Type) is
    begin
-      HT.Nodes (X).Has_Element := False;
-      HT_Ops.Free (HT, X);
+      if X /= 0 then
+         pragma Assert (X <= HT.Capacity);
+         HT.Nodes (X).Has_Element := False;
+         HT_Ops.Free (HT, X);
+      end if;
    end Free;
 
--- gcc/ada/libgnat/a-cfhase.adb
+++ gcc/ada/libgnat/a-cfhase.adb
@@ -760,8 +760,11 @@ is
 
    procedure Free (HT : in out Set; X : Count_Type) is
    begin
-      HT.Nodes (X).Has_Element := False;
-      HT_Ops.Free (HT, X);
+      if X /= 0 then
+         pragma Assert (X <= HT.Capacity);
+         HT.Nodes (X).Has_Element := False;
+         HT_Ops.Free (HT, X);
+      end if;
    end Free;
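In the bounded containers, node index 0 is the "no node" value, so indexing the node array with X = 0 (as happens on an empty container) writes outside valid storage. The fix is the classic "free of the null handle is a no-op" pattern; a Python sketch of the idea, where the pool structure is invented for illustration:

```python
class NodePool:
    """Fixed-capacity node pool in which index 0 means 'no node',
    mirroring the bounded containers' use of Count_Type index 0."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.has_element = [False] * (capacity + 1)   # slot 0 is unused
        self.free_list = list(range(capacity, 0, -1))  # indices capacity..1

    def allocate(self):
        x = self.free_list.pop()
        self.has_element[x] = True
        return x

    def free(self, x):
        # Freeing the 'no node' value must be a no-op: without this guard
        # an empty container would touch storage it does not own (here
        # slot 0; in the Ada array case, a buffer overflow).
        if x != 0:
            assert x <= self.capacity
            self.has_element[x] = False
            self.free_list.append(x)

pool = NodePool(4)
n = pool.allocate()
pool.free(n)   # normal path: element released
pool.free(0)   # empty-container path: safely ignored
```

The assert plays the role of the added `pragma Assert (X <= HT.Capacity)`: it documents the precondition that any nonzero index handed to Free is a real node.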
[Ada] Minimal binder
The new minimal binder option ("-minimal") suppresses the generation of binder objects that are not strictly required for program operation. This option is suitable for space constrained applications and comes with the restriction that programs can no longer be debugged using GDB. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Patrick Bernardi gcc/ada/ * bindgen.adb (Gen_Main): Do not generate a reference to Ada_Main_Program_Name when the Minimal_Binder flag is set. (Gen_Output_File_Ada): Do not output GNAT_Version and Ada_Main_Program_Name info if Minimal_Binder flag is set. * bindusg.adb: Add documentation for new -minimal switch. * gnatbind.adb (Scan_Bind_Arg): Scan -minimal switch. * opt.ads: Add new global flag Minimal_Binder to indicate if the binder should not produce global variables. * doc/gnat_ugn/building_executable_programs_with_gnat.rst: Update documentation with new binder -minimal switch. * gnat_ugn.texi: Regenerate.--- gcc/ada/bindgen.adb +++ gcc/ada/bindgen.adb @@ -1810,9 +1810,11 @@ package body Bindgen is -- with a pragma Volatile in order to tell the compiler to preserve -- this variable at any level of optimization. - -- CodePeer and CCG do not need this extra code on the other hand + -- CodePeer and CCG do not need this extra code. The code is also not + -- needed if the binder is in "Minimal Binder" mode. if Bind_Main_Program +and then not Minimal_Binder and then not CodePeer_Mode and then not Generate_C_Code then @@ -2354,25 +2356,27 @@ package body Bindgen is -- program uses two Ada libraries). Also zero terminate the string -- so that its end can be found reliably at run time. 
- WBI (""); - WBI (" GNAT_Version : constant String :="); - WBI ("""" & Ver_Prefix & - Gnat_Version_String & - """ & ASCII.NUL;"); - WBI (" pragma Export (C, GNAT_Version, ""__gnat_version"");"); + if not Minimal_Binder then +WBI (""); +WBI (" GNAT_Version : constant String :="); +WBI ("""" & Ver_Prefix & + Gnat_Version_String & + """ & ASCII.NUL;"); +WBI (" pragma Export (C, GNAT_Version, ""__gnat_version"");"); - WBI (""); - Set_String (" Ada_Main_Program_Name : constant String := """); - Get_Name_String (Units.Table (First_Unit_Entry).Uname); +WBI (""); +Set_String (" Ada_Main_Program_Name : constant String := """); +Get_Name_String (Units.Table (First_Unit_Entry).Uname); - Set_Main_Program_Name; - Set_String (""" & ASCII.NUL;"); +Set_Main_Program_Name; +Set_String (""" & ASCII.NUL;"); - Write_Statement_Buffer; +Write_Statement_Buffer; - WBI - (" pragma Export (C, Ada_Main_Program_Name, " & -"""__gnat_ada_main_program_name"");"); +WBI + (" pragma Export (C, Ada_Main_Program_Name, " & + """__gnat_ada_main_program_name"");"); + end if; end if; WBI (""); --- gcc/ada/bindusg.adb +++ gcc/ada/bindusg.adb @@ -178,6 +178,12 @@ package body Bindusg is (" -mnnn Limit number of detected errors/warnings to nnn " & "(1-99)"); + -- Line for -minimal switch + + Write_Line +(" -minimal Generate binder file suitable for space-constrained " + & "applications"); + -- Line for -M switch Write_Line --- gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst +++ gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst @@ -6475,7 +6475,6 @@ be presented in subsequent sections. Rename generated main program from main to xyz. This option is supported on cross environments only. - .. index:: -m (gnatbind) :switch:`-m{n}` @@ -6488,6 +6487,16 @@ be presented in subsequent sections. A value of zero means that no limit is enforced. The equal sign is optional. + .. 
index:: -minimal (gnatbind) + +:switch:`-minimal` + Generate a binder file suitable for space-constrained applications. When + active, binder-generated objects not required for program operation are no + longer generated. **Warning:** this option comes with the following + limitations: + + * Debugging will not work; + * Programs using GNAT.Compiler_Version will not link. .. index:: -n (gnatbind) --- gcc/ada/gnat_ugn.texi +++ gcc/ada/gnat_ugn.texi @@ -15892,6 +15892,25 @@ limit, then a message is output and the bind is abandoned. A value of zero means that no limit is enforced. The equal sign is optional. +@geindex -minimal (gnatbind) +
[Ada] Link error due to negated intrinsic comparison
This patch corrects the resolution of operator "not" when the expression
being negated is an equality operator, to prevent the transformation of an
intrinsic equality operator into a function call.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-11  Hristian Kirtchev

gcc/ada/

	* sem_res.adb (Resolve_Op_Not): Do not rewrite an equality
	operator into a function call when the operator is intrinsic.

gcc/testsuite/

	* gnat.dg/equal9.adb: New testcase.

--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -10091,15 +10091,20 @@ package body Sem_Res is
          declare
             Opnd : constant Node_Id := Right_Opnd (N);
+            Op_Id : Entity_Id;
+
          begin
             if B_Typ = Standard_Boolean
               and then Nkind_In (Opnd, N_Op_Eq, N_Op_Ne)
               and then Is_Overloaded (Opnd)
             then
                Resolve_Equality_Op (Opnd, B_Typ);
+               Op_Id := Entity (Opnd);
 
-               if Ekind (Entity (Opnd)) = E_Function then
-                  Rewrite_Operator_As_Call (Opnd, Entity (Opnd));
+               if Ekind (Op_Id) = E_Function
+                 and then not Is_Intrinsic_Subprogram (Op_Id)
+               then
+                  Rewrite_Operator_As_Call (Opnd, Op_Id);
                end if;
 
                if not Inside_A_Generic or else Is_Entity_Name (Opnd) then
 
--- /dev/null
+++ gcc/testsuite/gnat.dg/equal9.adb
@@ -0,0 +1,26 @@
+-- { dg-do run }
+
+with Ada.Text_IO; use Ada.Text_IO;
+with System;      use System;
+
+procedure Equal9 is
+   Val : Address := Null_Address;
+begin
+   if Val = Null_Address then
+      Put_Line ("= OK");
+   else
+      raise Program_Error;
+   end if;
+
+   if Val /= Null_Address then
+      raise Program_Error;
+   else
+      Put_Line ("/= OK");
+   end if;
+
+   if not (Val = Null_Address) then
+      raise Program_Error;
+   else
+      Put_Line ("not = OK");
+   end if;
+end Equal9;
[Ada] Avoid spurious warning on assertions with Loop_Entry
When the Loop_Entry attribute is used inside a loop invariant or another assertion where it is allowed, it may lead to spurious warnings on conditions that are detected to be always valid. Now fixed. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Yannick Moy gcc/ada/ * sem_eval.adb (Is_Same_Value): Add special case for rewritten Loop_Entry attribute. gcc/testsuite/ * gnat.dg/loop_entry1.adb: New testcase.--- gcc/ada/sem_eval.adb +++ gcc/ada/sem_eval.adb @@ -986,6 +986,13 @@ package body Sem_Eval is Lf : constant Node_Id := Compare_Fixup (L); Rf : constant Node_Id := Compare_Fixup (R); + function Is_Rewritten_Loop_Entry (N : Node_Id) return Boolean; + -- An attribute reference to Loop_Entry may have been rewritten into + -- its prefix as a way to avoid generating a constant for that + -- attribute when the corresponding pragma is ignored. These nodes + -- should be ignored when deciding if they can be equal to one + -- another. + function Is_Same_Subscript (L, R : List_Id) return Boolean; -- L, R are the Expressions values from two attribute nodes for First -- or Last attributes. Either may be set to No_List if no expressions @@ -993,6 +1000,19 @@ package body Sem_Eval is -- expressions represent the same subscript (note one case is where -- one subscript is missing and the other is explicitly set to 1). + - + -- Is_Rewritten_Loop_Entry -- + - + + function Is_Rewritten_Loop_Entry (N : Node_Id) return Boolean is +Orig_N : constant Node_Id := Original_Node (N); + begin +return Orig_N /= N + and then Nkind (Orig_N) = N_Attribute_Reference + and then Get_Attribute_Id (Attribute_Name (Orig_N)) = +Attribute_Loop_Entry; + end Is_Rewritten_Loop_Entry; + --- -- Is_Same_Subscript -- --- @@ -1018,23 +1038,32 @@ package body Sem_Eval is -- Start of processing for Is_Same_Value begin - -- Values are the same if they refer to the same entity and the - -- entity is non-volatile. 
This does not however apply to Float - -- types, since we may have two NaN values and they should never - -- compare equal. + -- Loop_Entry nodes rewritten into their prefix inside ignored + -- pragmas should never lead to a decision of equality. - -- If the entity is a discriminant, the two expressions may be bounds - -- of components of objects of the same discriminated type. The - -- values of the discriminants are not static, and therefore the - -- result is unknown. + if Is_Rewritten_Loop_Entry (Lf) + or else Is_Rewritten_Loop_Entry (Rf) + then +return False; - -- It would be better to comment individual branches of this test ??? + -- Values are the same if they refer to the same entity and the + -- entity is nonvolatile. - if Nkind_In (Lf, N_Identifier, N_Expanded_Name) + elsif Nkind_In (Lf, N_Identifier, N_Expanded_Name) and then Nkind_In (Rf, N_Identifier, N_Expanded_Name) and then Entity (Lf) = Entity (Rf) + + -- If the entity is a discriminant, the two expressions may be + -- bounds of components of objects of the same discriminated type. + -- The values of the discriminants are not static, and therefore + -- the result is unknown. + and then Ekind (Entity (Lf)) /= E_Discriminant and then Present (Entity (Lf)) + + -- This does not however apply to Float types, since we may have + -- two NaN values and they should never compare equal. + and then not Is_Floating_Point_Type (Etype (L)) and then not Is_Volatile_Reference (L) and then not Is_Volatile_Reference (R) --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/loop_entry1.adb @@ -0,0 +1,13 @@ +-- { dg-do compile } +-- { dg-options "-gnatwc" } + +procedure Loop_Entry1 (X, Y : in out Integer) is +begin + while X < Y loop + pragma Loop_Invariant +(X >= X'Loop_Entry and then Y <= Y'Loop_Entry); + + X := X + 1; + Y := Y - 1; + end loop; +end Loop_Entry1;
[Ada] Missing finalization of private protected type
This patch updates the analysis of protected types to properly mark the type as having controlled components when it contains at least one such component. This in turn marks a potential partial view as requiring finalization actions. -- Source -- -- types.ads with Ada.Finalization; use Ada.Finalization; package Types is type Ctrl_Typ is new Controlled with null record; procedure Finalize (Obj : in out Ctrl_Typ); type Prot_Typ is limited private; private protected type Prot_Typ is private Comp : Ctrl_Typ; end Prot_Typ; end Types; -- types.adb with Ada.Text_IO; use Ada.Text_IO; package body Types is procedure Finalize (Obj : in out Ctrl_Typ) is begin Put_Line ("finalize"); end Finalize; protected body Prot_Typ is end Prot_Typ; end Types; -- main.adb with Types; use Types; procedure Main is Obj : Prot_Typ; begin null; end Main; Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Hristian Kirtchev gcc/ada/ * exp_util.ads, exp_util.adb (Needs_Finalization): Move to Sem_Util. * sem_ch9.adb (Analyze_Protected_Definition): Code cleanup. Mark the protected type as having controlled components when it contains at least one such component. * sem_util.ads, sem_util.adb (Needs_Finalization): New function.--- gcc/ada/exp_util.adb +++ gcc/ada/exp_util.adb @@ -10554,94 +10554,6 @@ package body Exp_Util is end if; end Needs_Constant_Address; - - -- Needs_Finalization -- - - - function Needs_Finalization (Typ : Entity_Id) return Boolean is - function Has_Some_Controlled_Component -(Input_Typ : Entity_Id) return Boolean; - -- Determine whether type Input_Typ has at least one controlled - -- component. - - --- - -- Has_Some_Controlled_Component -- - --- - - function Has_Some_Controlled_Component -(Input_Typ : Entity_Id) return Boolean - is - Comp : Entity_Id; - - begin - -- When a type is already frozen and has at least one controlled - -- component, or is manually decorated, it is sufficient to inspect - -- flag Has_Controlled_Component. 
- - if Has_Controlled_Component (Input_Typ) then -return True; - - -- Otherwise inspect the internals of the type - - elsif not Is_Frozen (Input_Typ) then -if Is_Array_Type (Input_Typ) then - return Needs_Finalization (Component_Type (Input_Typ)); - -elsif Is_Record_Type (Input_Typ) then - Comp := First_Component (Input_Typ); - while Present (Comp) loop - if Needs_Finalization (Etype (Comp)) then - return True; - end if; - - Next_Component (Comp); - end loop; -end if; - end if; - - return False; - end Has_Some_Controlled_Component; - - -- Start of processing for Needs_Finalization - - begin - -- Certain run-time configurations and targets do not provide support - -- for controlled types. - - if Restriction_Active (No_Finalization) then - return False; - - -- C++ types are not considered controlled. It is assumed that the non- - -- Ada side will handle their clean up. - - elsif Convention (Typ) = Convention_CPP then - return False; - - -- Class-wide types are treated as controlled because derivations from - -- the root type may introduce controlled components. - - elsif Is_Class_Wide_Type (Typ) then - return True; - - -- Concurrent types are controlled as long as their corresponding record - -- is controlled. - - elsif Is_Concurrent_Type (Typ) -and then Present (Corresponding_Record_Type (Typ)) -and then Needs_Finalization (Corresponding_Record_Type (Typ)) - then - return True; - - -- Otherwise the type is controlled when it is either derived from type - -- [Limited_]Controlled and not subject to aspect Disable_Controlled, or - -- contains at least one controlled component. 
- - else - return - Is_Controlled (Typ) or else Has_Some_Controlled_Component (Typ); - end if; - end Needs_Finalization; - -- New_Class_Wide_Subtype -- @@ -12170,9 +12082,7 @@ package body Exp_Util is Typ : Entity_Id; begin - if No (L) -or else Is_Empty_List (L) - then + if No (L) or else Is_Empty_List (L) then return False; end if; --- gcc/ada/exp_util.ads +++ gcc/ada/exp_util.ads @@ -944,10 +944,6 @@ package Exp_Util is -- consist of constants, when the object has a nontrivial initialization -- or is contr
[Ada] Remove redundant predicate checks
This patch removes redundant dynamic predicate checks generated by type conversions in various contexts. The patch also recognizes additional such checks that can be evaluated statically when applied to constant values. No simple test available. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Ed Schonberg gcc/ada/ * exp_ch4.adb (Expand_N_Type_Conversion): If a predicate check is generated, analyze it with range check suppressed, because that check has been previously applied. * exp_ch5.adb (Expand_N_Assignment_Statement): If the RHS is a type conversion to the type of the LHS, do not apply a predicate check to the RHS because it will have been generated already during its expansion. * exp_ch6.adb (Can_Fold_Predicate_Call): Extend processing to handle a predicate check on a constant entity whose value is static.--- gcc/ada/exp_ch4.adb +++ gcc/ada/exp_ch4.adb @@ -12050,10 +12050,13 @@ package body Exp_Ch4 is begin -- Avoid infinite recursion on the subsequent expansion of --- of the copy of the original type conversion. +-- of the copy of the original type conversion. When needed, +-- a range check has already been applied to the expression. Set_Comes_From_Source (New_Expr, False); -Insert_Action (N, Make_Predicate_Check (Target_Type, New_Expr)); +Insert_Action (N, + Make_Predicate_Check (Target_Type, New_Expr), + Suppress => Range_Check); end; end if; end Expand_N_Type_Conversion; --- gcc/ada/exp_ch5.adb +++ gcc/ada/exp_ch5.adb @@ -2021,15 +2021,21 @@ package body Exp_Ch5 is if not Suppress_Assignment_Checks (N) then - -- First deal with generation of range check if required + -- First deal with generation of range check if required, + -- and then predicate checks if the type carries a predicate. + -- If the Rhs is an expression these tests may have been applied + -- already. This is the case if the RHS is a type conversion. + -- Other such redundant checks could be removed ??? 
+ + if Nkind (Rhs) /= N_Type_Conversion + or else Entity (Subtype_Mark (Rhs)) /= Typ + then +if Do_Range_Check (Rhs) then + Generate_Range_Check (Rhs, Typ, CE_Range_Check_Failed); +end if; - if Do_Range_Check (Rhs) then -Generate_Range_Check (Rhs, Typ, CE_Range_Check_Failed); +Apply_Predicate_Check (Rhs, Typ); end if; - - -- Then generate predicate check if required - - Apply_Predicate_Check (Rhs, Typ); end if; -- Check for a special case where a high level transformation is --- gcc/ada/exp_ch6.adb +++ gcc/ada/exp_ch6.adb @@ -2479,8 +2479,7 @@ package body Exp_Ch6 is - function Can_Fold_Predicate_Call (P : Entity_Id) return Boolean is - Actual : constant Node_Id := -First (Parameter_Associations (Call_Node)); + Actual : Node_Id; function May_Fold (N : Node_Id) return Traverse_Result; -- The predicate expression is foldable if it only contains operators @@ -2533,10 +2532,11 @@ package body Exp_Ch6 is function Try_Fold is new Traverse_Func (May_Fold); - -- Local variables + -- Other lLocal variables - Subt : constant Entity_Id := Etype (First_Entity (P)); - Pred : Node_Id; + Subt : constant Entity_Id := Etype (First_Entity (P)); + Aspect : Node_Id; + Pred : Node_Id; -- Start of processing for Can_Fold_Predicate_Call @@ -2545,8 +2545,21 @@ package body Exp_Ch6 is -- has a Dynamic_Predicate aspect. For CodePeer we preserve the -- function call. 
- if Nkind (Actual) /= N_Integer_Literal + Actual := First (Parameter_Associations (Call_Node)); + Aspect := Find_Aspect (Subt, Aspect_Dynamic_Predicate); + + -- If actual is a declared constant, retrieve its value + + if Is_Entity_Name (Actual) + and then Ekind (Entity (Actual)) = E_Constant + then +Actual := Constant_Value (Entity (Actual)); + end if; + + if No (Actual) + or else Nkind (Actual) /= N_Integer_Literal or else not Has_Dynamic_Predicate_Aspect (Subt) + or else No (Aspect) or else CodePeer_Mode then return False; @@ -2554,9 +2567,7 @@ package body Exp_Ch6 is -- Retrieve the analyzed expression for the predicate - Pred := - New_Copy_Tree - (Expression (Find_Aspect (Subt, Aspect_Dynamic_Predicate))); + Pred := New_Copy_Tree (Expression (Aspect)); if Try_Fold (Pred) = OK then Rewrite (Call_Nod
[Ada] Elaboration order v4.0 and output of dependencies
This patch adds a missing case to the mechanism that outputs the elaboration order dependencies of units. -- Source -- -- pack.ads package Pack is procedure Force_Body; end Pack; -- pack.adb package body Pack is procedure Force_Body is null; end Pack; -- main.adb with Pack; procedure Main is begin null; end Main; -- Compilation and output -- $ gnatmake -q main.adb -bargs -e ELABORATION ORDER DEPENDENCIES unit "pack (spec)" must be elaborated before unit "main (body)" reason: unit "main (body)" has with clause for unit "pack (spec)" unit "pack (spec)" must be elaborated before unit "pack (body)" reason: spec must be elaborated before body Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Hristian Kirtchev gcc/ada/ * bindo.adb: Remove the documentation of switch -d_N because it is no longer in use. * bindo-graphs.ads, bindo-graphs.adb (Is_Spec_Before_Body_Edge): New routine. * bindo-writers.adb (Write_Dependency_Edge): Add the missing case of a spec-before-body edge.--- gcc/ada/bindo-graphs.adb +++ gcc/ada/bindo-graphs.adb @@ -4183,6 +4183,21 @@ package body Bindo.Graphs is return U_Rec.Utype = Is_Spec or else U_Rec.Utype = Is_Spec_Only; end Is_Spec; + -- + -- Is_Spec_Before_Body_Edge -- + -- + + function Is_Spec_Before_Body_Edge +(G: Library_Graph; + Edge : Library_Graph_Edge_Id) return Boolean + is + begin + pragma Assert (Present (G)); + pragma Assert (Present (Edge)); + + return Kind (G, Edge) = Spec_Before_Body_Edge; + end Is_Spec_Before_Body_Edge; + --- -- Is_Spec_With_Body -- --- --- gcc/ada/bindo-graphs.ads +++ gcc/ada/bindo-graphs.ads @@ -1166,6 +1166,13 @@ package Bindo.Graphs is pragma Inline (Is_Spec); -- Determine whether vertex Vertex of library graph G denotes a spec + function Is_Spec_Before_Body_Edge +(G: Library_Graph; + Edge : Library_Graph_Edge_Id) return Boolean; + pragma Inline (Is_Spec_Before_Body_Edge); + -- Determine whether edge Edge of library graph G links a predecessor + -- spec and a successor body belonging to the same unit. 
+ function Is_Spec_With_Body (G : Library_Graph; Vertex : Library_Graph_Vertex_Id) return Boolean; --- gcc/ada/bindo-writers.adb +++ gcc/ada/bindo-writers.adb @@ -610,6 +610,11 @@ package body Bindo.Writers is & "elaboration time", Info => True); + elsif Is_Spec_Before_Body_Edge (G, Edge) then +Error_Msg_Output + (Msg => " reason: spec must be elaborated before body", + Info => True); + else pragma Assert (Is_With_Edge (G, Edge)); --- gcc/ada/bindo.adb +++ gcc/ada/bindo.adb @@ -347,13 +347,9 @@ package body Bindo is --GNATbind outputs the library graph in textual format to standard --output. -- - -- -d_N New bindo order - -- - --GNATbind utilizes the new bindo elaboration order - -- -- -d_P Output cycle paths -- - --GNATbind output the cycle paths in text format to standard output + --GNATbind outputs the cycle paths in text format to standard output -- -- -d_S Output elaboration-order status information --
[Ada] Refactor ownership pointer checking in SPARK as a generic
Ownership checking as done in SPARK should be applied only to SPARK code, which requires GNATprove's knowledge of the SPARK_Mode boundary. Transform the checking unit into a generic to allow passing that knowledge from GNATprove to the unit in the GNAT sources. Keeping the code in the GNAT sources makes it possible in the future to adapt it further (or simply instantiate it differently) to be used on Ada code, independently of GNATprove. There is no impact on compilation. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Claire Dross gcc/ada/ * gnat1drv.adb: SPARK checking rules for pointer aliasing are moved to the GNATprove backend. * sem_spark.ads, sem_spark.adb (Sem_SPARK): Is now a generic unit. Takes as parameters: - Retysp, which is used to query the most underlying type visible in SPARK. We do not introduce aliasing checks for types which are not visibly deep. - Component_Is_Visible_In_SPARK, which is used to avoid doing pointer aliasing checks on components which are not visible in SPARK. - Emit_Messages, which returns True in the second phase of SPARK analysis. Error messages for failed aliasing checks are only output in this case. Additionally, errors on constructs not supported in SPARK are removed as duplicates of marking errors. Components are stored in the permission map using their original component to avoid inconsistencies between components of different views of the same type. (Check_Expression): Handle delta constraints. (Is_Deep): Exported so that we can check for SPARK restrictions on deep types inside SPARK semantic checking. (Is_Traversal_Function): Exported so that we can check for SPARK restrictions on traversal functions inside SPARK semantic checking. (Check_Call_Statement, Read_Indexes): Check whether we are dealing with a subprogram pointer type before querying the called entity. (Is_Subpath_Expression): Image attribute can appear inside a path. (Check_Loop_Statement): Correct order of statements in the loop. (Check_Node): Ignore raise nodes. 
(Check_Statement): Use Last_Non_Pragma to get the object declaration in an extended return statement.
Implement more rtx vector folds on variable-length vectors
This patch extends the tree-level folding of variable-length vectors so that it can also be used on rtxes. The first step is to move the tree_vector_builder new_unary/binary_operator routines to the parent vector_builder class (which in turn means adding a new template parameter). The second step is to make simplify-rtx.c use a direct rtx analogue of the VECTOR_CST handling in fold-const.c. Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu. OK to install? Richard 2019-07-11 Richard Sandiford gcc/ * vector-builder.h (vector_builder): Add a shape template parameter. (vector_builder::new_unary_operation): New function, generalizing the old tree_vector_builder function. (vector_builder::new_binary_operation): Likewise. (vector_builder::binary_encoded_nelts): Likewise. * int-vector-builder.h (int_vector_builder): Update template parameters to vector_builder. (int_vector_builder::shape_nelts): New function. * rtx-vector-builder.h (rtx_vector_builder): Update template parameters to vector_builder. (rtx_vector_builder::shape_nelts): New function. (rtx_vector_builder::nelts_of): Likewise. (rtx_vector_builder::npatterns_of): Likewise. (rtx_vector_builder::nelts_per_pattern_of): Likewise. * tree-vector-builder.h (tree_vector_builder): Update template parameters to vector_builder. (tree_vector_builder::shape_nelts): New function. (tree_vector_builder::nelts_of): Likewise. (tree_vector_builder::npatterns_of): Likewise. (tree_vector_builder::nelts_per_pattern_of): Likewise. * tree-vector-builder.c (tree_vector_builder::new_unary_operation) (tree_vector_builder::new_binary_operation): Delete. (tree_vector_builder::binary_encoded_nelts): Likewise. * simplify-rtx.c (distributes_over_addition_p): New function. (simplify_const_unary_operation) (simplify_const_binary_operation): Generalize handling of vector constants to include variable-length vectors. (test_vector_ops_series): Add more tests. 
Index: gcc/vector-builder.h === --- gcc/vector-builder.h2019-06-07 08:39:43.126344672 +0100 +++ gcc/vector-builder.h2019-07-11 08:55:03.187049079 +0100 @@ -45,8 +45,11 @@ #define GCC_VECTOR_BUILDER_H variable-length vectors. finalize () then canonicalizes the encoding to a simpler form if possible. - The derived class Derived provides this functionality for specific Ts. - Derived needs to provide the following interface: + Shape is the type that specifies the number of elements in the vector + and (where relevant) the type of each element. + + The derived class Derived provides the functionality of this class + for specific Ts. Derived needs to provide the following interface: bool equal_p (T elt1, T elt2) const; @@ -82,9 +85,30 @@ #define GCC_VECTOR_BUILDER_H Record that ELT2 is being elided, given that ELT1_PTR points to the last encoded element for the containing pattern. This is - again provided for TREE_OVERFLOW handling. */ + again provided for TREE_OVERFLOW handling. + + static poly_uint64 shape_nelts (Shape shape); + + Return the number of elements in SHAPE. + +The class provides additional functionality for the case in which +T can describe a vector constant as well as an individual element. +This functionality requires: + + static poly_uint64 nelts_of (T x); + + Return the number of elements in vector constant X. + + static unsigned int npatterns_of (T x); + + Return the number of patterns used to encode vector constant X. + + static unsigned int nelts_per_pattern_of (T x); -template + Return the number of elements used to encode each pattern + in vector constant X. 
*/ + +template class vector_builder : public auto_vec { public: @@ -101,8 +125,13 @@ #define GCC_VECTOR_BUILDER_H bool operator == (const Derived &) const; bool operator != (const Derived &x) const { return !operator == (x); } + bool new_unary_operation (Shape, T, bool); + bool new_binary_operation (Shape, T, T, bool); + void finalize (); + static unsigned int binary_encoded_nelts (T, T); + protected: void new_vector (poly_uint64, unsigned int, unsigned int); void reshape (unsigned int, unsigned int); @@ -121,16 +150,16 @@ #define GCC_VECTOR_BUILDER_H unsigned int m_nelts_per_pattern; }; -template +template inline const Derived * -vector_builder::derived () const +vector_builder::derived () const { return static_cast (this); } -template +template inline -vector_builder::vector_builder () +vector_builder::vector_builder () : m_full_nelts (0), m_npatterns (0), m_nelts_per_pattern (0) @@ -140,18 +169,18 @@ vector_builder::vector_build
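For readers unfamiliar with the encoding these builders rely on, here is a standalone C++ toy model of the npatterns/nelts_per_pattern scheme that vector-builder.h describes (the struct and function names are invented for illustration and are not GCC's real API). A unary fold over just the encoded elements, as new_unary_operation enables, is valid when the operation maps an arithmetic series to an arithmetic series, e.g. negation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: a (possibly variable-length) vector constant is described
// by NPATTERNS interleaved patterns, each encoded by up to three
// elements; elements beyond the encoded ones continue the arithmetic
// series defined by the last two encoded elements of their pattern.
struct encoded_vector
{
  unsigned npatterns;
  unsigned nelts_per_pattern;     // 1, 2 or 3, as in GCC's scheme
  std::vector<long> elts;         // npatterns * nelts_per_pattern values

  // Decode element I of the (conceptually unbounded) vector.
  long elt (std::size_t i) const
  {
    std::size_t pat = i % npatterns;
    std::size_t k = i / npatterns;          // index within this pattern
    if (k < nelts_per_pattern)
      return elts[k * npatterns + pat];
    long last = elts[(nelts_per_pattern - 1) * npatterns + pat];
    long step = nelts_per_pattern == 3
                ? last - elts[(nelts_per_pattern - 2) * npatterns + pat]
                : 0;                        // 1- and 2-element patterns repeat
    return last + (long) (k - (nelts_per_pattern - 1)) * step;
  }
};

// Apply F to the encoded elements only, keeping the same shape --
// the point of new_unary_operation: no need to expand all lanes.
encoded_vector
unary_fold (const encoded_vector &v, long (*f) (long))
{
  encoded_vector r = v;
  for (std::size_t i = 0; i < r.elts.size (); ++i)
    r.elts[i] = f (v.elts[i]);
  return r;
}
```

For example, the series 0, 1, 2, ... is encoded by three elements, and negating those three elements yields the encoding of 0, -1, -2, ...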
[Ada] Crash on protected type with self-referential component
This patch fixes a compiler abort on a declaration for a protected type PT when one of its private components is of type access PT. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-11 Ed Schonberg gcc/ada/ * exp_ch9.adb (Expand_N_Protected_Type_Declaration): New subsidiary routine Replace_Access_Definition, to handle properly a protected type PT one of whose private components is of type access PT. gcc/testsuite/ * gnat.dg/prot8.adb, gnat.dg/prot8.ads: New testcase.--- gcc/ada/exp_ch9.adb +++ gcc/ada/exp_ch9.adb @@ -8928,6 +8928,8 @@ package body Exp_Ch9 is Current_Node : Node_Id := N; E_Count : Int; Entries_Aggr : Node_Id; + Rec_Decl : Node_Id; + Rec_Id : Entity_Id; procedure Check_Inlining (Subp : Entity_Id); -- If the original operation has a pragma Inline, propagate the flag @@ -8949,6 +8951,21 @@ package body Exp_Ch9 is -- For a protected operation that is an interrupt handler, add the -- freeze action that will register it as such. + procedure Replace_Access_Definition (Comp : Node_Id); + -- If a private component of the type is an access to itself, this + -- is not a reference to the current instance, but an access type out + -- of which one might construct a list. If such a component exists, we + -- create an incomplete type for the equivalent record type, and + -- a named access type for it, that replaces the access definition + -- of the original component. This is similar to what is done for + -- records in Check_Anonymous_Access_Components, but simpler, because + -- the corresponding record type has no previous declaration. + -- This needs to be done only once, even if there are several such + -- access components. The following entity stores the constructed + -- access type. 
+ + Acc_T : Entity_Id := Empty; + -- Check_Inlining -- @@ -9096,6 +9113,41 @@ package body Exp_Ch9 is Append_Freeze_Action (Prot_Proc, RTS_Call); end Register_Handler; + --- + -- Replace_Access_Definition -- + --- + + procedure Replace_Access_Definition (Comp : Node_Id) is + Loc : constant Source_Ptr := Sloc (Comp); + Inc_T : Node_Id; + Inc_D : Node_Id; + Acc_Def : Node_Id; + Acc_D : Node_Id; + + begin + if No (Acc_T) then +Inc_T := Make_Defining_Identifier (Loc, Chars (Rec_Id)); +Inc_D := Make_Incomplete_Type_Declaration (Loc, Inc_T); +Acc_T := Make_Temporary (Loc, 'S'); +Acc_Def := + Make_Access_To_Object_Definition (Loc, +Subtype_Indication => New_Occurrence_Of (Inc_T, Loc)); +Acc_D := + Make_Full_Type_Declaration (Loc, +Defining_Identifier => Acc_T, +Type_Definition => Acc_Def); + +Insert_Before (Rec_Decl, Inc_D); +Analyze (Inc_D); + +Insert_Before (Rec_Decl, Acc_D); +Analyze (Acc_D); + end if; + + Set_Access_Definition (Comp, Empty); + Set_Subtype_Indication (Comp, New_Occurrence_Of (Acc_T, Loc)); + end Replace_Access_Definition; + -- Local variables Body_Arr: Node_Id; @@ -9107,7 +9159,6 @@ package body Exp_Ch9 is Obj_Def : Node_Id; Object_Comp : Node_Id; Priv: Node_Id; - Rec_Decl: Node_Id; Sub : Node_Id; -- Start of processing for Expand_N_Protected_Type_Declaration @@ -9117,6 +9168,7 @@ package body Exp_Ch9 is return; else Rec_Decl := Build_Corresponding_Record (N, Prot_Typ, Loc); + Rec_Id := Defining_Identifier (Rec_Decl); end if; Cdecls := Component_Items (Component_List (Type_Definition (Rec_Decl))); @@ -9262,6 +9314,15 @@ package body Exp_Ch9 is Access_Definition => New_Copy_Tree (Access_Definition (Old_Comp), Discr_Map)); + + -- A self-reference in the private part becomes a + -- self-reference to the corresponding record. 
+ + if Entity (Subtype_Mark (Access_Definition (New_Comp))) + = Prot_Typ + then +Replace_Access_Definition (New_Comp); + end if; end if; New_Priv := --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/prot8.adb @@ -0,0 +1,8 @@ +-- { dg-do compile } + +package body Prot8 is + + protected body Prot is + end Prot; + +end Prot8; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/prot8.ads @@ -0,0 +1,10 @@ +package Prot8 is + +
Generalise VEC_DUPLICATE folding for variable-length vectors
This patch uses the constant vector encoding scheme to handle more cases of a VEC_DUPLICATE of another vector. Duplicating any fixed-length vector is fine, and duplicating a variable-length vector is OK as long as that vector is also a duplicate of a fixed-length sequence. Other cases fell through to: if (VECTOR_MODE_P (mode) && GET_CODE (op) == CONST_VECTOR) which was only expecting to deal with elementwise operations. Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu. OK to install? Richard 2019-07-11 Richard Sandiford gcc/ * simplify-rtx.c (simplify_const_unary_operation): Fold a VEC_DUPLICATE of a fixed-length vector even if the result is variable-length. Likewise fold a duplicate of a variable-length vector if the variable-length vector is itself a duplicate of a fixed-length sequence. (test_vector_ops_duplicate): Test more cases. Index: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c 2019-07-11 08:55:03.187049079 +0100 +++ gcc/simplify-rtx.c 2019-07-11 08:56:41.354257873 +0100 @@ -1736,23 +1736,24 @@ simplify_const_unary_operation (enum rtx } if (CONST_SCALAR_INT_P (op) || CONST_DOUBLE_AS_FLOAT_P (op)) return gen_const_vec_duplicate (mode, op); - unsigned int n_elts; if (GET_CODE (op) == CONST_VECTOR - && GET_MODE_NUNITS (mode).is_constant (&n_elts)) + && (CONST_VECTOR_DUPLICATE_P (op) + || CONST_VECTOR_NUNITS (op).is_constant ())) { - /* This must be constant if we're duplicating it to a constant -number of elements. */ - unsigned int in_n_elts = CONST_VECTOR_NUNITS (op).to_constant (); - gcc_assert (in_n_elts < n_elts); - gcc_assert ((n_elts % in_n_elts) == 0); - rtvec v = rtvec_alloc (n_elts); - for (unsigned i = 0; i < n_elts; i++) - RTVEC_ELT (v, i) = CONST_VECTOR_ELT (op, i % in_n_elts); - return gen_rtx_CONST_VECTOR (mode, v); + unsigned int npatterns = (CONST_VECTOR_DUPLICATE_P (op) + ? 
CONST_VECTOR_NPATTERNS (op) + : CONST_VECTOR_NUNITS (op).to_constant ()); + gcc_assert (multiple_p (GET_MODE_NUNITS (mode), npatterns)); + rtx_vector_builder builder (mode, npatterns, 1); + for (unsigned i = 0; i < npatterns; i++) + builder.quick_push (CONST_VECTOR_ELT (op, i)); + return builder.build (); } } - if (VECTOR_MODE_P (mode) && GET_CODE (op) == CONST_VECTOR) + if (VECTOR_MODE_P (mode) + && GET_CODE (op) == CONST_VECTOR + && known_eq (GET_MODE_NUNITS (mode), CONST_VECTOR_NUNITS (op))) { gcc_assert (GET_MODE (op) == op_mode); @@ -7071,6 +7072,18 @@ test_vector_ops_duplicate (machine_mode && mode_for_vector (inner_mode, 2).exists (&narrower_mode) && VECTOR_MODE_P (narrower_mode)) { + /* Test VEC_DUPLICATE of a vector. */ + rtx_vector_builder nbuilder (narrower_mode, 2, 1); + nbuilder.quick_push (const0_rtx); + nbuilder.quick_push (const1_rtx); + rtx_vector_builder builder (mode, 2, 1); + builder.quick_push (const0_rtx); + builder.quick_push (const1_rtx); + ASSERT_RTX_EQ (builder.build (), +simplify_unary_operation (VEC_DUPLICATE, mode, + nbuilder.build (), + narrower_mode)); + /* Test VEC_SELECT of a vector. */ rtx vec_par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, const1_rtx, const0_rtx));
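The fold relies on a simple fact: repeating an N-element constant to fill a (possibly unknown) multiple of N lanes is fully determined by the N original elements. A minimal standalone C++ model of that semantics (the function name is invented for illustration; the real code re-emits the elements through rtx_vector_builder rather than expanding every lane):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of folding (vec_duplicate:M (const_vector:N [a b ...]))
// at compile time: lane I of the result is element I mod N of the source.
std::vector<long>
fold_vec_duplicate (const std::vector<long> &src, std::size_t result_nelts)
{
  // The result's lane count must be a multiple of the source's,
  // mirroring the multiple_p assertion in the patch.
  assert (result_nelts % src.size () == 0);
  std::vector<long> result (result_nelts);
  for (std::size_t i = 0; i < result_nelts; ++i)
    result[i] = src[i % src.size ()];       // lane i repeats the pattern
  return result;
}
```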
Re: PR90723
Prathamesh Kulkarni writes: > @@ -186,6 +186,23 @@ skip_alternative (const char *p) > /* Nonzero means volatile operands are recognized. */ > extern int volatile_ok; > > +/* RAII class for temporarily setting volatile_ok. */ > + > +class temporary_volatile_ok > +{ > +public: > + temporary_volatile_ok (int value): save_volatile_ok (volatile_ok) Missing space before the ":". > + { > +volatile_ok = value; > + } > + > + ~temporary_volatile_ok () { volatile_ok = save_volatile_ok; } > + > +private: > + temporary_volatile_ok (const temporary_volatile_ok&); Missing space before the "&". OK with those changes, thanks. Richard > + int save_volatile_ok; > +}; > + > /* Set by constrain_operands to the number of the alternative that > matched. */ > extern int which_alternative;
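With both review comments applied, the class would read as below (a standalone sketch: volatile_ok is declared locally here so the snippet compiles on its own, whereas in GCC it is the extern global; the private, unimplemented copy constructor is the pre-C++11 way to forbid copying):

```cpp
#include <cassert>

int volatile_ok;   // stand-in for GCC's global flag

// RAII class for temporarily setting volatile_ok: the constructor
// saves the old value and installs the new one; the destructor
// restores the saved value when the scope ends.
class temporary_volatile_ok
{
public:
  temporary_volatile_ok (int value) : save_volatile_ok (volatile_ok)
  {
    volatile_ok = value;
  }

  ~temporary_volatile_ok () { volatile_ok = save_volatile_ok; }

private:
  temporary_volatile_ok (const temporary_volatile_ok &);
  int save_volatile_ok;
};
```

Typical use is to construct a guard in the scope that needs volatile operands recognized; the flag reverts automatically on every exit path.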
Re: [PATCH] Add hints for slim dumping if fallthrough bb of jump isn't next bb
Hi Richard, on 2019/7/11 3:30 AM, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi all, >> Is it a reasonable patch? If yes, is it ok for trunk? > > It looks really useful, but IMO we should either emit the hint for all > end-of-block insns with a surprising fallthrough edge, or -- if we > really do want to keep it specific to conditional jumps -- we should > replace rhs "pc" in the jump insn with something like "bb ". > Personally the former seems better to me (and should also be simpler). > Thanks a lot for the comments. It's a good idea not to check only conditional jump insns. I've updated the patch accordingly. I had intended to change the pc dumping by using the actual BB label, similar to what we see for a non-fallthrough edge. But as you mentioned it's not simple, since we'd need to pass some information down; it looks like a hint is enough for this case. :) >> + basic_block dest = e->dest; >> + if (BB_HEAD (dest) != ninsn) >> +fprintf (outf, ";; pc falls through to BB %d\n", dest->index); > > Very minor, but I think the output would be more readable if the comment > were indented by the same amount as the register notes (like REG_DEAD above). > In lisp tradition I guess the comment marker would then be ";" rather > than ";;". > Good idea, also updated it accordingly, thanks! With the attached patch, the dumping is as below: 12: r135:CC=cmp(r122:DI,0) 13: pc={(r135:CC!=0)?L52:pc} REG_DEAD r135:CC REG_BR_PROB 1041558836 ; pc falls through to BB 10 31: L31: 17: NOTE_INSN_BASIC_BLOCK 3 Thanks, Kewen -- gcc/ChangeLog 2019-07-11 Kewen Lin * gcc/cfgrtl.c (print_rtl_with_bb): Emit a hint if the fallthrough target of current basic block isn't the one placed right next. 
diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c index a1ca5992c41..5f2accf1f4f 100644 --- a/gcc/cfgrtl.c +++ b/gcc/cfgrtl.c @@ -2193,14 +2193,14 @@ print_rtl_with_bb (FILE *outf, const rtx_insn *rtx_first, dump_flags_t flags) if (df) df_dump_start (outf); - if (flags & TDF_BLOCKS) + FOR_EACH_BB_REVERSE_FN (bb, cfun) { - FOR_EACH_BB_REVERSE_FN (bb, cfun) - { - rtx_insn *x; + rtx_insn *x; - start[INSN_UID (BB_HEAD (bb))] = bb; - end[INSN_UID (BB_END (bb))] = bb; + start[INSN_UID (BB_HEAD (bb))] = bb; + end[INSN_UID (BB_END (bb))] = bb; + if (flags & TDF_BLOCKS) + { for (x = BB_HEAD (bb); x != NULL_RTX; x = NEXT_INSN (x)) { enum bb_state state = IN_MULTIPLE_BB; @@ -2244,16 +2244,31 @@ print_rtl_with_bb (FILE *outf, const rtx_insn *rtx_first, dump_flags_t flags) if (flags & TDF_DETAILS) df_dump_insn_bottom (tmp_rtx, outf); - if (flags & TDF_BLOCKS) + bb = end[INSN_UID (tmp_rtx)]; + if (bb != NULL) { - bb = end[INSN_UID (tmp_rtx)]; - if (bb != NULL) + if (flags & TDF_BLOCKS) { dump_bb_info (outf, bb, 0, dump_flags, false, true); if (df && (flags & TDF_DETAILS)) df_dump_bottom (bb, outf); putc ('\n', outf); } + /* Emit a hint if the fallthrough target of current basic block +isn't the one placed right next. */ + else if (EDGE_COUNT (bb->succs) > 0) + { + gcc_assert (BB_END (bb) == tmp_rtx); + const rtx_insn *ninsn = NEXT_INSN (tmp_rtx); + edge e = find_fallthru_edge (bb->succs); + if (e && ninsn) + { + basic_block dest = e->dest; + if (start[INSN_UID (ninsn)] != dest) + fprintf (outf, "%s ; pc falls through to BB %d\n", +print_rtx_head, dest->index); + } + } } }
Re: [PATCH] Fix recent avx512vl-vpsh*dvd-2.c FAILs (PR target/91124)
On Thu, Jul 11, 2019 at 9:01 AM Jakub Jelinek wrote: > > Hi! > > A recent SCCVN change results in more constants (in these testcases constant > vectors) into arguments of builtins and because of a bug in the expansion of > some of them we ended up with: > UNRESOLVED: gcc.target/i386/avx512vl-vpshldvd-2.c compilation failed to > produce executable > FAIL: gcc.target/i386/avx512vl-vpshldvq-2.c (test for excess errors) > UNRESOLVED: gcc.target/i386/avx512vl-vpshldvq-2.c compilation failed to > produce executable > FAIL: gcc.target/i386/avx512vl-vpshrdvd-2.c (test for excess errors) > UNRESOLVED: gcc.target/i386/avx512vl-vpshrdvd-2.c compilation failed to > produce executable > FAIL: gcc.target/i386/avx512vl-vpshrdvq-2.c (test for excess errors) > UNRESOLVED: gcc.target/i386/avx512vl-vpshrdvq-2.c compilation failed to > produce executable > > The following patch fixes that. The bug was using wrong (in fact, totally > unneeded) function types for the builtins. All of them return an integral > vector and have 4 arguments, where the first 3 are same kind vectors as the > return value and the last one is the mask (in one case __mmask32, in two > cases __mmask16, in the rest __mmask8). Kirill added types like > V32HI_FTYPE_V32HI_V32HI_V32HI for the non-masked ones (haven't really > checked if those builtins without _mask or _maskz suffixes couldn't be > dropped and use -1 masks in the headers instead for _mask), but for the > masked ones added types like V32HI_FTYPE_V32HI_V32HI_V32HI_INT where we > already had V32HI_FTYPE_V32HI_V32HI_V32HI_USI to represent such builtins > with __mmask32 last argument. What is worse, the handling of those new > function types has been added to the > nargs = 4; > mask_pos = 1; > nargs_constant = 1; > cases, which is what is done for builtins which require some the third > argument of four to be immediate and the last argument is a mask integer, > instead of the > nargs = 4; > case we want (we don't want to ask for any immediates). 
This bug didn't > trigger before because if that third argument satisfies the predicate, which > means it is either a register_operand or memory_operand in these cases, > all is fine, it is just when it doesn't satisfy that we mishandle it > (and CONST_VECTOR doesn't satisfy nonimmediate_operand). > > Fixed by removing all those unnecessary function types and using the right > ones. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2019-07-11 Jakub Jelinek > > PR target/91124 > * config/i386/i386-builtin-types.def > (V32HI_FTYPE_V32HI_V32HI_V32HI_INT, > V16HI_FTYPE_V16HI_V16HI_V16HI_INT, V8HI_FTYPE_V8HI_V8HI_V8HI_INT, > V8SI_FTYPE_V8SI_V8SI_V8SI_INT, V4DI_FTYPE_V4DI_V4DI_V4DI_INT, > V8DI_FTYPE_V8DI_V8DI_V8DI_INT, V16SI_FTYPE_V16SI_V16SI_V16SI_INT, > V2DI_FTYPE_V2DI_V2DI_V2DI_INT, V4SI_FTYPE_V4SI_V4SI_V4SI_INT): Remove. > * config/i386/i386-builtin.def (__builtin_ia32_vpshrdv_v32hi_mask, > __builtin_ia32_vpshrdv_v32hi_maskz, __builtin_ia32_vpshrdv_v16hi_mask, > __builtin_ia32_vpshrdv_v16hi_maskz, __builtin_ia32_vpshrdv_v8hi_mask, > __builtin_ia32_vpshrdv_v8hi_maskz, __builtin_ia32_vpshrdv_v16si_mask, > __builtin_ia32_vpshrdv_v16si_maskz, __builtin_ia32_vpshrdv_v8si_mask, > __builtin_ia32_vpshrdv_v8si_maskz, __builtin_ia32_vpshrdv_v4si_mask, > __builtin_ia32_vpshrdv_v4si_maskz, __builtin_ia32_vpshrdv_v8di_mask, > __builtin_ia32_vpshrdv_v8di_maskz, __builtin_ia32_vpshrdv_v4di_mask, > __builtin_ia32_vpshrdv_v4di_maskz, __builtin_ia32_vpshrdv_v2di_mask, > __builtin_ia32_vpshrdv_v2di_maskz, __builtin_ia32_vpshldv_v32hi_mask, > __builtin_ia32_vpshldv_v32hi_maskz, __builtin_ia32_vpshldv_v16hi_mask, > __builtin_ia32_vpshldv_v16hi_maskz, __builtin_ia32_vpshldv_v8hi_mask, > __builtin_ia32_vpshldv_v8hi_maskz, __builtin_ia32_vpshldv_v16si_mask, > __builtin_ia32_vpshldv_v16si_maskz, __builtin_ia32_vpshldv_v8si_mask, > __builtin_ia32_vpshldv_v8si_maskz, __builtin_ia32_vpshldv_v4si_mask, > __builtin_ia32_vpshldv_v4si_maskz, __builtin_ia32_vpshldv_v8di_mask, 
> __builtin_ia32_vpshldv_v8di_maskz, __builtin_ia32_vpshldv_v4di_mask, > __builtin_ia32_vpshldv_v4di_maskz, __builtin_ia32_vpshldv_v2di_mask, > __builtin_ia32_vpshldv_v2di_maskz, __builtin_ia32_vpdpbusd_v16si_mask, > __builtin_ia32_vpdpbusd_v16si_maskz, > __builtin_ia32_vpdpbusd_v8si_mask, > __builtin_ia32_vpdpbusd_v8si_maskz, __builtin_ia32_vpdpbusd_v4si_mask, > __builtin_ia32_vpdpbusd_v4si_maskz, > __builtin_ia32_vpdpbusds_v16si_mask, > __builtin_ia32_vpdpbusds_v16si_maskz, > __builtin_ia32_vpdpbusds_v8si_mask, > __builtin_ia32_vpdpbusds_v8si_maskz, > __builtin_ia32_vpdpbusds_v4si_mask, > __builtin_ia32_vpdpbusds_v4si_maskz, > __builtin_ia32_vpdpwssd_v16si_mask, > __
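For context on what the affected builtins compute: each lane of VPSHLDV concatenates two elements, shifts the pair left by the third operand, and keeps the high half, with the final mask argument selecting merge- or zero-masking — which is why all four arguments but the mask are same-kind vectors and no immediate is involved. A scalar C++ model of one 32-bit lane, based on my reading of the instruction description rather than on GCC's expansion code (the helper name is hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// One 32-bit lane of VPSHLDVD: form the 64-bit value a:b, shift it
// left by (c mod 32), and keep bits [63:32].  A masked-off lane keeps
// the old destination value (merge-masking) or becomes zero
// (zero-masking); no operand is an immediate.
static uint32_t
vpshldvd_lane (uint32_t a, uint32_t b, uint32_t c,
               bool mask_bit, bool zero_masking, uint32_t old_dst)
{
  unsigned shift = c & 31;
  uint64_t concat = ((uint64_t) a << 32) | b;
  uint32_t shifted = (uint32_t) ((concat << shift) >> 32);
  if (mask_bit)
    return shifted;
  return zero_masking ? 0 : old_dst;
}
```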
Re: Make nonoverlapping_component_refs work with duplicated main variants
Hi Jan, > * g++.dg/lto/alias-3_0.C: New file. > * g++.dg/lto/alias-3_1.c: New file. the new test has a couple of problems: DejaGnu warns everywhere: WARNING: lto.exp does not support dg-lto-do in secondary source files WARNING: lto.exp does not support dg-lto-options in secondary source files This would have been prominent in either mail-report.log or the runtest output. Besides, the test FAILs in the same way as its companions lto/alias-[12] on Solaris (PRs ipa/90720 and lto/91028). Your fix for the latter two didn't change anything, btw., neither on Solaris nor on Linux/x86_64 with -fno-use-linker-plugin. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Fix recent avx512vl-vcvttpd2*dq-2.c FAILs (PR target/91124)
On Thu, Jul 11, 2019 at 9:17 AM Jakub Jelinek wrote: > > Hi! > > The same SCCVN change also introduced: > FAIL: gcc.target/i386/avx512vl-vcvttpd2dq-2.c execution test > FAIL: gcc.target/i386/avx512vl-vcvttpd2udq-2.c execution test > failures, the reason in the generic code is similar, more constants > propagated to builtins, but the actual bug is very different. > The 128-bit instructions like vcvtpd2dqx when using non-{z} masking > don't behave like their RTL pattern describes. > The instructions only use low 2 bits of the mask register and the > upper 2 elements of the 128-bit destination register are always cleared > regardless of what the bit 2 and 3 in the mask register say and what is in > the destination before, so it is incorrect to use: > (set (match_operand:V4SI 0 "register_operand" "=v") > (vec_merge:V4SI (vec_concat:V4SI (fix:V2SI (match_operand:V2DF 1 > "vector_operand" "vBm")) > (const_vector:V2SI [ > (const_int 0 [0]) > (const_int 0 [0]) > ])) > (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") > (match_operand:QI 3 "register_operand" "Yk"))) > which is correct for the low 2 elements, but for the upper 2 elements > says that if the mask bit 2 (or 3) is set, then the element will be zero > (correct), but if it is clear, then it will be operands[2] element (correct > if it is const0_operand, but if it is some nonimmediate and doesn't have > zeros there, it is incorrect). > The RTL has been correct already for similar instructions, e.g. > *floatv2div2sf2{,_mask{,_1}} or some narrowing movs, this > patch just uses similar RTL for those. The _mask_1 patterns are added > as PR69671 will trigger on these new patterns as well, for {z} masking > RTL simplification can fold that second vec_merge operand into CONST0_RTX. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2019-07-11 Jakub Jelinek > > PR target/91124 > * config/i386/sse.md (sse2_cvtpd2dq): Change into ... > (sse2_cvtpd2dq): ... this. Remove mask substitution macros. 
> (sse2_cvtpd2dq_mask, sse2_cvtpd2dq_mask_1): New define_insns. > (ufix_notruncv2dfv2si2): Change into ... > (ufix_notruncv2dfv2si2): ... this. Remove mask substitution macros. > (ufix_notruncv2dfv2si2_mask, ufix_notruncv2dfv2si2_mask_1): New > define_insns. > (ufix_truncv2dfv2si2): Change into ... > (ufix_truncv2dfv2si2): ... this. Remove mask substitution macros. > (ufix_truncv2dfv2si2_mask, ufix_truncv2dfv2si2_mask_1): New > define_insns. > (sse2_cvttpd2dq): Change into ... > (sse2_cvttpd2dq): ... this. Remove mask substitution macros. > (sse2_cvttpd2dq_mask, sse2_cvttpd2dq_mask_1): New define_insns. > (*sse2_cvtpd2dq): Change into ... > (*sse2_cvtpd2dq): ... this. Remove mask substitution macros. > Add "C" constraint to const0_operand. > (*sse2_cvtpd2dq_mask, *sse2_cvtpd2dq_mask_1): New define_insns. > (sse2_cvtpd2ps_mask): Adjust expand to match *sse2_cvtpd2ps_mask > changes. OK. Thanks, Uros. > --- gcc/config/i386/sse.md.jj 2019-07-08 23:52:40.239757801 +0200 > +++ gcc/config/i386/sse.md 2019-07-10 22:50:15.021381492 +0200 > @@ -5927,16 +5927,16 @@ (define_insn "*avx_cvtpd2dq256_2" > (set_attr "btver2_decode" "vector") > (set_attr "mode" "OI")]) > > -(define_insn "sse2_cvtpd2dq" > +(define_insn "sse2_cvtpd2dq" >[(set (match_operand:V4SI 0 "register_operand" "=v") > (vec_concat:V4SI > (unspec:V2SI [(match_operand:V2DF 1 "vector_operand" "vBm")] >UNSPEC_FIX_NOTRUNC) > (const_vector:V2SI [(const_int 0) (const_int 0)])))] > - "TARGET_SSE2 && " > + "TARGET_SSE2" > { >if (TARGET_AVX) > -return "vcvtpd2dq{x}\t{%1, %0|%0, %1}"; > +return "vcvtpd2dq{x}\t{%1, %0|%0, %1}"; >else > return "cvtpd2dq\t{%1, %0|%0, %1}"; > } > @@ -5949,6 +5949,38 @@ (define_insn "sse2_cvtpd2dq" > (set_attr "athlon_decode" "vector") > (set_attr "bdver1_decode" "double")]) > > +(define_insn "sse2_cvtpd2dq_mask" > + [(set (match_operand:V4SI 0 "register_operand" "=v") > + (vec_concat:V4SI > + (vec_merge:V2SI > + (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vm")] > + 
UNSPEC_FIX_NOTRUNC) > + (vec_select:V2SI > + (match_operand:V4SI 2 "nonimm_or_0_operand" "0C") > + (parallel [(const_int 0) (const_int 1)])) > + (match_operand:QI 3 "register_operand" "Yk")) > + (const_vector:V2SI [(const_int 0) (const_int 0)])))] > + "TARGET_AVX512VL" > + "vcvtpd2dq{x}\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "TI")]) > + > +(define_insn "*sse2_cvtpd2dq_mask_1" > + [(set (
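The masking semantics described in this message can be modeled in plain C. The following is an illustrative scalar sketch (not GCC code; the function name is invented, and a truncating cast stands in for the instruction's actual rounding conversion): only mask bits 0 and 1 take part in the merge, while elements 2 and 3 of the destination are cleared unconditionally.

```c
#include <stdint.h>

/* Scalar model of the 128-bit vcvtpd2dq{x} with merge masking, as
   described above: mask bits 0 and 1 select between the converted
   element and the old destination element, but elements 2 and 3 are
   always cleared, no matter what mask bits 2 and 3 say.  Illustrative
   only; the real instruction rounds rather than truncates.  */
static void
cvtpd2dqx_merge_model (int32_t dst[4], const double src[2], uint8_t mask)
{
  for (int i = 0; i < 2; i++)
    if (mask & (1u << i))
      dst[i] = (int32_t) src[i];
  dst[2] = 0;  /* cleared regardless of mask bits 2 and 3 */
  dst[3] = 0;
}
```

A clear mask bit 2 or 3 never lets an old destination element survive, which is exactly why describing lanes 2-3 with a vec_merge from operands[2] was incorrect.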
Re: [PATCH] Add hints for slim dumping if fallthrough bb of jump isn't next bb
Kewen.Lin writes: > Hi Richard, > > on 2019/7/11 3:30 AM, Richard Sandiford wrote: >> "Kewen.Lin" writes: >>> Hi all, >>> Is it a reasonable patch? If yes, is it ok for trunk? >> >> It looks really useful, but IMO we should either emit the hint for all >> end-of-block insns with a surprising fallthrough edge, or -- if we >> really do want to keep it specific to conditional jumps -- we should >> replace rhs "pc" in the jump insn with something like "bb ". >> Personally the former seems better to me (and should also be simpler). >> > > Thanks a lot for the comments. It's a good idea not to check only conditional > jump insns. I've updated the patch accordingly. I had intended to change the > pc dumping by using the actual BB label, similar to what we see for a > non-fallthrough edge. But as you mentioned it's not simple, since it needs to > pass some information down; a hint looks like enough for this case. :) > >>> + basic_block dest = e->dest; >>> + if (BB_HEAD (dest) != ninsn) >>> +fprintf (outf, ";; pc falls through to BB %d\n", dest->index); >> >> Very minor, but I think the output would be more readable if the comment >> were indented by the same amount as the register notes (like REG_DEAD above). >> In lisp tradition I guess the comment marker would then be ";" rather >> than ";;". >> > > Good idea, also updated it accordingly, thanks! With the attached patch, the > dumping is as below:
>
>    12: r135:CC=cmp(r122:DI,0)
>    13: pc={(r135:CC!=0)?L52:pc}
>       REG_DEAD r135:CC
>       REG_BR_PROB 1041558836
>       ; pc falls through to BB 10
>    31: L31:
>    17: NOTE_INSN_BASIC_BLOCK 3
>
> Thanks, > Kewen > > > -- > > gcc/ChangeLog > > 2019-07-11 Kewen Lin > > * cfgrtl.c (print_rtl_with_bb): Emit a hint if the fallthrough > target of the current basic block isn't the one placed right next to it.
> > > diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c > index a1ca5992c41..5f2accf1f4f 100644 > --- a/gcc/cfgrtl.c > +++ b/gcc/cfgrtl.c > @@ -2193,14 +2193,14 @@ print_rtl_with_bb (FILE *outf, const rtx_insn > *rtx_first, dump_flags_t flags) >if (df) > df_dump_start (outf); > > - if (flags & TDF_BLOCKS) I think we still need an if here, but with the condition instead being: cfun->curr_properties & PROP_cfg > + FOR_EACH_BB_REVERSE_FN (bb, cfun) > { > - FOR_EACH_BB_REVERSE_FN (bb, cfun) > - { > - rtx_insn *x; > + rtx_insn *x; > > - start[INSN_UID (BB_HEAD (bb))] = bb; > - end[INSN_UID (BB_END (bb))] = bb; > + start[INSN_UID (BB_HEAD (bb))] = bb; > + end[INSN_UID (BB_END (bb))] = bb; > + if (flags & TDF_BLOCKS) > + { > for (x = BB_HEAD (bb); x != NULL_RTX; x = NEXT_INSN (x)) > { > enum bb_state state = IN_MULTIPLE_BB; > @@ -2244,16 +2244,31 @@ print_rtl_with_bb (FILE *outf, const rtx_insn > *rtx_first, dump_flags_t flags) > if (flags & TDF_DETAILS) > df_dump_insn_bottom (tmp_rtx, outf); > > - if (flags & TDF_BLOCKS) > + bb = end[INSN_UID (tmp_rtx)]; > + if (bb != NULL) > { > - bb = end[INSN_UID (tmp_rtx)]; > - if (bb != NULL) > + if (flags & TDF_BLOCKS) > { > dump_bb_info (outf, bb, 0, dump_flags, false, true); > if (df && (flags & TDF_DETAILS)) > df_dump_bottom (bb, outf); > putc ('\n', outf); > } > + /* Emit a hint if the fallthrough target of current basic block > +isn't the one placed right next. */ > + else if (EDGE_COUNT (bb->succs) > 0) > + { > + gcc_assert (BB_END (bb) == tmp_rtx); > + const rtx_insn *ninsn = NEXT_INSN (tmp_rtx); I think it'd be better to loop until we hit a real insn or a nonnull start[], e.g. to cope with intervening deleted-insn notes and debug insns. Something like: while (ninsn && !INSN_P (ninsn) && !start[INSN_UID (ninsn)]) ninsn = NEXT_INSN (ninsn); Looks good otherwise, thanks.
Richard > + edge e = find_fallthru_edge (bb->succs); > + if (e && ninsn) > + { > + basic_block dest = e->dest; > + if (start[INSN_UID (ninsn)] != dest) > + fprintf (outf, "%s ; pc falls through to BB %d\n", > +print_rtx_head, dest->index); > + } > + } > } }
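The loop suggested in the review above can be illustrated with a tiny self-contained model (the names and structures here are invented for illustration, not GCC's): advance past notes and debug insns until reaching a real insn or an insn recorded as a basic-block head.

```c
#include <stddef.h>

/* Toy model of the reviewer's suggestion: starting from the insn after
   a block end, skip notes and debug insns until we reach a "real" insn
   or one recorded as the head of a basic block.  */
enum toy_kind { TOY_NOTE, TOY_DEBUG, TOY_INSN };

struct toy_insn
{
  enum toy_kind kind;
  int starts_bb;              /* models start[INSN_UID (insn)] != NULL */
  struct toy_insn *next;      /* models NEXT_INSN */
};

static struct toy_insn *
skip_to_real_or_bb_head (struct toy_insn *ninsn)
{
  while (ninsn && ninsn->kind != TOY_INSN && !ninsn->starts_bb)
    ninsn = ninsn->next;
  return ninsn;
}
```

The point of stopping at a nonnull start[] as well is that a block head reached through intervening notes still counts as "placed right next", so no hint should be printed for it.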
Re: [patch] Small improvements to coverage info (4/n)
On Wed, Jul 10, 2019 at 11:16 PM Eric Botcazou wrote: > > Hi, > > the returns are a little problematic for coverage info because they are > commonalized in GIMPLE, i.e. turned into gotos to a representative return. > This means that you need to clear the location of the representative return, > see lower_gimple_return: > > /* Remove the line number from the representative return statement. > It now fills in for many such returns. Failure to remove this > will result in incorrect results for coverage analysis. */ > gimple_set_location (tmp_rs.stmt, UNKNOWN_LOCATION); > > otherwise the coverage info is blatantly wrong. But this in turn means that > such representative returns are vulnerable to inheriting random source location > information in the debug info generated by the compiler. > > I have attached an Ada testcase that demonstrates the problem at -O0:
>
>         .loc 1 14 10 discriminator 2
>         movl    $0, %r13d
> .L12:
>         movl    %r13d, %eax
>         jmp     .L23
>
> .L12 is what's left at the RTL level of a representative return in GIMPLE and > it inherits the location directive, which gives wrong coverage info (that's > not visible with gcov because the instrumentation done by -fprofile-arcs > papers over the issue, but it is for example with callgrind). > > The proposed fix is attached: it sets the current location to the function's > end locus when expanding such a representative return to RTL. That's a bit on > the kludgy side, but doing something in GIMPLE, e.g. in lower_gimple_return, > leads to various regressions in terms of quality of diagnostics. > > Tested on x86_64-suse-linux, both GCC and GDB, OK for mainline?

Hmm. So for

int baz;
int foo()
{
  int i;
  goto skip;
done:
  return i;
skip:
  i = 1;
  if (baz)
    return baz;
  /* ... */
  goto done;
} /* (*) */

we'd assign (*) to the return? It might be better to record (in struct function?) the location of an actually existing return and use that?
In fact, a similarly kludgy approach would be to simply not clear the GIMPLE_RETURN location but kludge it up right away? Can you explain how diagnostics regress when doing that? Does coverage somehow treat the function end locus specially so you chose that? Will it show the alternate returns as covered at all? I presume stuff like cross-jumping or GIMPLE tail-merging also wrecks coverage info in similar ways (well, not by injecting random locations but by not covering taken paths). Thanks, Richard. > > > 2019-07-10 Eric Botcazou > > * cfgexpand.c (expand_gimple_stmt_1) : If the statement > doesn't have a location, set the current location to the function's > end. > > -- > Eric Botcazou
Re: [PATCH] integrate sprintf pass into strlen (PR 83431)
On Thu, Jul 11, 2019 at 1:54 AM Martin Sebor wrote: > > On 7/2/19 4:38 PM, Jeff Law wrote: > > On 7/1/19 7:47 PM, Martin Sebor wrote: > >> Attached is a more complete solution that fully resolves the bug > >> report by avoiding a warning in cases like:
> >>
> >>   char a[32], b[8];
> >>
> >>   void f (void)
> >>   {
> >>     if (strlen (a) < sizeof b - 2)
> >>       snprintf (b, sizeof b, "b=%s", a);   // no -Wformat-truncation
> >>   }
> >>
> >> It does that by having get_range_strlen_dynamic use the EVRP > >> information for non-constant strlen calls: EVRP has recorded > >> that the result is less than sizeof b - 2 and so the function > >> returns this limited range of lengths to snprintf which can > >> then avoid warning. It also improves the checking and can > >> find latent bugs it missed before (like the off-by-one in > >> print-rtl.c). > >> > >> Besides improving the accuracy of the -Wformat-overflow and > >> truncation warnings this can also result in better code. > >> So far this only benefits snprintf but there may be other > >> opportunities with other string functions as well (e.g., strcmp or > >> memcmp). > >> > >> Jeff, I looked into your question/suggestion for me last week > >> when we spoke, to introduce some sort of a recursion limit for > >> get_range_strlen_dynamic. It's easily doable but before we go > >> down that path I did some testing to see how bad it can get and > >> to compare it with get_range_strlen. Here are the results for > >> a few packages.
The depth is the maximum recursion depth, success > >> and fail are the numbers of successful and failed calls to > >> the function:
> >>
> >>    binutils-gdb:
> >>                              depth  success fail
> >>    get_range_strlen:           319  830221621
> >>    get_range_strlen_dynamic:    41  1503 161
> >>    gcc:
> >>    get_range_strlen:            46  721111365
> >>    get_range_strlen_dynamic:    23  10272 12
> >>    glibc:
> >>    get_range_strlen:            76  284011422
> >>    get_range_strlen_dynamic:    51  1186 46
> >>    elfutils:
> >>    get_range_strlen:            33  1198 2516
> >>    get_range_strlen_dynamic:    31  685 36
> >>    kernel:
> >>    get_range_strlen:            25  529911698
> >>    get_range_strlen_dynamic:    31  9911 122
> >>
> >> Except the first case of get_range_strlen (I haven't looked into > >> it yet), it doesn't seem too bad, and with just one exception it's > >> better than get_range_strlen. Let me know if you think it's worth > >> adding a parameter (I assume we'd use it for both functions) and > >> what to set it to. > >> > >> On 6/11/19 5:26 PM, Martin Sebor wrote: > >>> The sprintf and strlen passes both work with strings but > >>> run independently of one another and don't share state. As > >>> a result, lengths of strings dynamically created by functions > >>> that are available to the strlen pass are not available to > >>> sprintf. Conversely, lengths of strings formatted by > >>> the sprintf functions are not made available to the strlen > >>> pass. The result is less efficient code, poor diagnostics, > >>> and ultimately less than optimal user experience. > >>> > >>> The attached patch is the first step toward rectifying this > >>> design problem. It integrates the two passes into one and > >>> exposes the string length data managed by the strlen pass to > >>> the sprintf "module." (It does not expose any sprintf data > >>> to the strlen pass yet.) > >>> > >>> The sprintf pass invocations in passes.def have been replaced > >>> with those of strlen. The first "early" invocation is only > >>> effective for the sprintf module to enable warnings without > >>> optimization.
The second invocation is "late" and enables > >>> both warnings and the sprintf and strlen optimizations unless > >>> explicitly disabled via -fno-optimize-strlen. > >>> > >>> Since the strlen optimization is the second invocation of > >>> the pass tests that scan the strlen dump have been adjusted > >>> to look for the "strlen2" dump file. > >>> > >>> The changes in the patch are mostly mechanical. The one new > >>> "feature" worth calling out is the get_range_strlen_dynamic > >>> function. It's analogous to get_range_strlen in gimple-fold.c > >>> except that it makes use of the strlen "dynamically" obtained > >>> string length info. In cases when the info is not available > >>> the former calls the latter. > >>> > >>> The other new functions in tree-ssa-strlen.c are > >>> check_and_optimize_call and handle_integral_assign: I added > >>> them only to keep the check_and_optimize_stmt function from > >>> getting excessively long and hard to follow. Otherwise, > >>> the code in the functions is unchanged. > >>> > >>> There are a number of further enhancements to consider as > >>> the next steps: > >>>* enhance the strlen module to determine string length range > >>> information f
Re: [PATCH] integrate sprintf pass into strlen (PR 83431)
On Thu, Jul 11, 2019 at 11:11:58AM +0200, Richard Biener wrote: > The others all smell like they might be yours ;) (besides object-size > maybe but that one is limited by having a cache - hopefully also > used when not used in the compute-everything mode). objsz doesn't really have a compute-everything mode, it has a mode where the caches are available and a mode where they aren't. If they are available, it will compute whatever it needs to and anything needed to compute that, and cache it. If they aren't available, for optimize > 0 it will just punt on SSA_NAMEs (defers until it is done with the caches), for -O0 it looks through SSA_NAME_DEF_STMTs of POINTER_PLUS_EXPR with a constant as second argument, and that one indeed is unbounded, so if one has a million POINTER_PLUS_EXPR stmts, each adding 1 to the previous SSA_NAME, yes, it will be a pain if every one of those SSA_NAMEs is then passed to some __builtin_object_size too. Guess we could add a limit there too, no reason to look through more than a dozen or two at most. Jakub
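The suggested cap can be sketched with invented structures rather than GCC's GIMPLE representation: walk the chain of constant pointer adjustments but stop after a fixed depth, so a pathological million-long chain stays cheap.

```c
#include <stddef.h>

/* "a dozen or two at most", as suggested above; the exact value is an
   assumption here.  */
#define MAX_PTR_PLUS_DEPTH 32

/* Toy stand-in for a chain of POINTER_PLUS_EXPR definitions.  */
struct ptr_def
{
  long add;               /* constant added by this POINTER_PLUS_EXPR */
  struct ptr_def *prev;   /* definition of the pointer operand */
};

/* Return the total constant offset, or -1 if the chain is too deep to
   be worth following.  */
static long
fold_offset_chain (const struct ptr_def *d)
{
  long off = 0;
  int depth = 0;
  for (; d; d = d->prev)
    {
      if (++depth > MAX_PTR_PLUS_DEPTH)
        return -1;
      off += d->add;
    }
  return off;
}
```

With the limit in place the -O0 lookup does a bounded amount of work per __builtin_object_size call instead of rescanning the whole chain each time.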
Re: Rework constant subreg folds and handle more variable-length cases
On Thu, Jul 11, 2019 at 10:03 AM Richard Sandiford wrote: > > This patch rewrites the way simplify_subreg handles constants. > It uses similar native_encode/native_decode routines to the > tree-level handling of VIEW_CONVERT_EXPR, meaning that we can > move between rtx constants and the target memory image of them. > > The main point of this patch is to support subregs of constant-length > vectors for VLA vectors, beyond the very simple cases that were already > handled. Many of the new tests failed before the patch for > variable-length vectors. > > The boolean side is tested more by the upcoming SVE ACLE work. > > Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu. > OK to install? Hmm. So is subreg [offset] defined in terms of memory order or in terms of register order? I wonder if you need to handle FLOAT_WORDS_BIG_ENDIAN, REG_WORDS_BIG_ENDIAN and whether BYTES/WORDS_BIG_ENDIAN have any meaning here at all? I'm always struggling with this when working on BIT_FIELD_REFs on GIMPLE [registers]... Richard. > Richard > > > 2019-07-11 Richard Sandiford > > gcc/ > * defaults.h (TARGET_UNIT): New macro. > (target_unit): New type. > * rtl.h (native_encode_rtx, native_decode_rtx) > (native_decode_vector_rtx, subreg_size_lsb): Declare. > (subreg_lsb_1): Turn into an inline wrapper around subreg_size_lsb. > * rtlanal.c (subreg_lsb_1): Delete. > (subreg_size_lsb): New function. > * simplify-rtx.c: Include rtx-vector-builder.h > (simplify_immed_subreg): Delete. > (native_encode_rtx, native_decode_vector_rtx, native_decode_rtx) > (simplify_const_vector_byte_offset, simplify_const_vector_subreg): New > functions. > (simplify_subreg): Use them. > (test_vector_subregs_modes, test_vector_subregs_repeating) > (test_vector_subregs_fore_back, test_vector_subregs_stepped) > (test_vector_subregs): New functions. > (test_vector_ops): Call test_vector_subregs for integer vector > modes with at least 2 elements.
> > Index: gcc/defaults.h > === > *** gcc/defaults.h 2019-07-11 08:33:57.0 +0100 > --- gcc/defaults.h 2019-07-11 08:33:58.069250175 +0100 > *** #define TARGET_VTABLE_USES_DESCRIPTORS 0 > *** 1459,1462 > --- 1459,1474 > #define DWARF_GNAT_ENCODINGS_DEFAULT DWARF_GNAT_ENCODINGS_GDB > #endif > > + /* Done this way to keep gengtype happy. */ > + #if BITS_PER_UNIT == 8 > + #define TARGET_UNIT uint8_t > + #elif BITS_PER_UNIT == 16 > + #define TARGET_UNIT uint16_t > + #elif BITS_PER_UNIT == 32 > + #define TARGET_UNIT uint32_t > + #else > + #error Unknown BITS_PER_UNIT > + #endif > + typedef TARGET_UNIT target_unit; > + > #endif /* ! GCC_DEFAULTS_H */ > Index: gcc/rtl.h > === > *** gcc/rtl.h 2019-07-11 08:33:57.0 +0100 > --- gcc/rtl.h 2019-07-11 08:33:58.069250175 +0100 > *** extern int rtx_cost (rtx, machine_mode, > *** 2400,2411 > extern int address_cost (rtx, machine_mode, addr_space_t, bool); > extern void get_full_rtx_cost (rtx, machine_mode, enum rtx_code, int, >struct full_rtx_costs *); > extern poly_uint64 subreg_lsb (const_rtx); > ! extern poly_uint64 subreg_lsb_1 (machine_mode, machine_mode, poly_uint64); > extern poly_uint64 subreg_size_offset_from_lsb (poly_uint64, poly_uint64, > poly_uint64); > extern bool read_modify_subreg_p (const_rtx); > > /* Return the subreg byte offset for a subreg whose outer mode is > OUTER_MODE, whose inner mode is INNER_MODE, and where there are > LSB_SHIFT *bits* between the lsb of the outer value and the lsb of > --- 2400,2429 > extern int address_cost (rtx, machine_mode, addr_space_t, bool); > extern void get_full_rtx_cost (rtx, machine_mode, enum rtx_code, int, >struct full_rtx_costs *); > + extern bool native_encode_rtx (machine_mode, rtx, vec &, > + unsigned int, unsigned int); > + extern rtx native_decode_rtx (machine_mode, vec, > + unsigned int); > + extern rtx native_decode_vector_rtx (machine_mode, vec, > +unsigned int, unsigned int, unsigned > int); > extern poly_uint64 subreg_lsb (const_rtx); > ! 
extern poly_uint64 subreg_size_lsb (poly_uint64, poly_uint64, poly_uint64); > extern poly_uint64 subreg_size_offset_from_lsb (poly_uint64, poly_uint64, > poly_uint64); > extern bool read_modify_subreg_p (const_rtx); > > + /* Given a subreg's OUTER_MODE, INNER_MODE, and SUBREG_BYTE, return the > +bit offset at which the subreg begins (counting from the least > significant > +bit of the operand). */ > + > + inline poly_uint
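The native_encode/native_decode idea can be sketched in a few lines of C for the simple little-endian, BITS_PER_UNIT == 8 case (an illustrative model, not the patch's code, which also handles vectors, big-endian, and non-8-bit units): a constant is lowered to its target memory image, and a subreg of a constant then amounts to reading a narrower value back at a byte offset.

```c
#include <stdint.h>

typedef uint8_t target_unit;   /* the BITS_PER_UNIT == 8 case */

/* Lower an integer constant to its little-endian target memory image. */
static void
encode_le (uint64_t val, target_unit *bytes, int nbytes)
{
  for (int i = 0; i < nbytes; i++)
    bytes[i] = (val >> (8 * i)) & 0xff;
}

/* Read a (possibly narrower) value back at a byte offset, which is
   what simplifying a subreg of the constant amounts to. */
static uint64_t
decode_le (const target_unit *bytes, int offset, int nbytes)
{
  uint64_t val = 0;
  for (int i = 0; i < nbytes; i++)
    val |= (uint64_t) bytes[offset + i] << (8 * i);
  return val;
}
```

A round trip through encode_le/decode_le at offset 0 is the identity; decoding at a nonzero offset models taking a subreg of a wider constant.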
Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.
On Wed, Jul 10, 2019 at 5:52 PM Richard Biener wrote: > > On July 10, 2019 2:11:17 PM GMT+02:00, Michael Matz wrote: > >Hi, > > > >On Tue, 9 Jul 2019, Richard Biener wrote: > > > >> >The basic block index is not a DFS index, so no, that's not a test > >> >for backedge. > >> > >> I think in CFG RTL mode the BB index designates the order of the BBs > >> in the object file? So this is a way to identify backwards jumps? > > > >Even if it means a backwards jump (and that's not always the case, the > >insns are emitted by following the NEXT_INSN links, without a CFG, and > >that all happens after machine-dependent reorg, and going out of cfg > >layout might link insns together even from high index BBs to low index > >BBs (e.g. because of fall-through)), that's still not a backedge in the > >general case. If a heuristic is enough here it might be okay, though. > > > >OTOH, as here a CFG still exists, why not simply rely on a proper DFS > >marking backedges? > > Because proper backedges are not what we want here, see Honza's example. > > So I'm second-guessing why we have different LOOP_ALIGN and when it makes > sense to apply. So the docs have

  @defmac JUMP_ALIGN (@var{label})
  The alignment (log base 2) to put in front of @var{label}, which is
  a common destination of jumps and has no fallthru incoming edge.
  ...
  @defmac LOOP_ALIGN (@var{label})
  The alignment (log base 2) to put in front of @var{label} that heads
  a frequently executed basic block (usually the header of a loop).

so I would expect the alignment pass to have

  if ((branch_count > count_threshold
       || (bb->count > bb->prev_bb->count.apply_scale (10, 1)
           && (bb->prev_bb->count
               <= ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.apply_scale (1, 2)))))
    {
      align_flags alignment
        = has_fallthru ? JUMP_ALIGN (label) : LOOP_ALIGN (label);
      if (dump_file)
        fprintf (dump_file, "  jump alignment added.\n");
      max_alignment = align_flags::max (max_alignment, alignment);
    }

instead of the two different conditions?
Or the JUMP_ALIGN case computing "common destination" instead of "frequently executed" somehow but I think it makes sense that those are actually the same here (frequently executed). It oddly looks at prev_bb and is not guarded with optimize_bb_for_speed_p at all so this all is full of heuristics and changing anything here just based on x86 measurements is surely going to cause issues for targets more sensitive to (mis-)alignment. Richard. > Richard. > > > > >Ciao, > >Michael. >
Re: Rework constant subreg folds and handle more variable-length cases
Richard Biener writes: > On Thu, Jul 11, 2019 at 10:03 AM Richard Sandiford > wrote: >> >> This patch rewrites the way simplify_subreg handles constants. >> It uses similar native_encode/native_decode routines to the >> tree-level handling of VIEW_CONVERT_EXPR, meaning that we can >> move between rtx constants and the target memory image of them. >> >> The main point of this patch is to support subregs of constant-length >> vectors for VLA vectors, beyond the very simple cases that were already >> handled. Many of the new tests failed before the patch for >> variable-length vectors. >> >> The boolean side is tested more by the upcoming SVE ACLE work. >> >> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu. >> OK to install? > > Hmm. So is subreg [offset] defined in terms of memory order or in > terms of register order? Memory order, except for the special case that paradoxical subregs always have an offset of zero (rather than the natural negative value for big-endian). > I wonder if you need to handle FLOAT_WORDS_BIG_ENDIAN, > REG_WORDS_BIG_ENDIAN and whether BYTES/WORDS_BIG_ENDIAN have any > meaning here at all? > > I'm always struggling with this when working on BIT_FIELD_REFs > on GIMPLE [registers]... Yeah. The subreg_size_lsb stuff copes with BYTES/WORDS_BIG_ENDIAN. I used that in the patch exactly because I didn't want to have to redo the logic here :-) Because the offset is (mostly) the memory offset, REG_WORDS_BIG_ENDIAN doesn't affect this code. It instead affects how we divide up a multi-register hard register. (See subreg_get_info.) FLOAT_WORDS_BIG_ENDIAN is explicitly documented as not affecting subregs: @cindex @code{FLOAT_WORDS_BIG_ENDIAN}, (lack of) effect on @code{subreg} On a few targets, @code{FLOAT_WORDS_BIG_ENDIAN} disagrees with @code{WORDS_BIG_ENDIAN}. However, most parts of the compiler treat floating point values as if they had the same endianness as integer values.
This works because they handle them solely as a collection of integer values, with no particular numerical value. Only real.c and the runtime libraries care about @code{FLOAT_WORDS_BIG_ENDIAN}. Thanks, Richard
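For the common case where BYTES_BIG_ENDIAN and WORDS_BIG_ENDIAN agree, the lsb computation being discussed reduces to a one-liner. This is a sketch with an invented name, assuming 8-bit units (the mixed-endian case needs the word-splitting logic in rtlanal.c): the subreg byte offset is a memory offset, so little-endian maps it directly to a bit position while big-endian counts from the most significant end.

```c
/* Model of the lsb of a subreg whose outer mode occupies OUTER_BYTES
   of an INNER_BYTES-wide inner value, at memory offset SUBREG_BYTE,
   when byte and word endianness agree.  */
static unsigned int
subreg_lsb_model (unsigned int outer_bytes, unsigned int inner_bytes,
                  unsigned int subreg_byte, int big_endian)
{
  if (big_endian)
    /* Offset 0 is the most significant part on big-endian.  */
    return (inner_bytes - subreg_byte - outer_bytes) * 8;
  return subreg_byte * 8;
}
```

For example, an SImode subreg of a DImode value gets the low 32 bits at byte offset 4 on big-endian but at byte offset 0 on little-endian, matching the memory layout of the two halves.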
Re: allow EH to escape from GIMPLE_EH_ELSE ELSE block
On Jul 4, 2019, Richard Biener wrote: > Yeah. For other stuff we're simply looking at CPP_NAME and > string-matching, see c_parser_gimple_compound_statement > where you'd probably hook this into. Here's a working patch that introduces try/finally[/else] in gimplefe. Regstrapped on x86_64-linux-gnu. Ok to install? introduce try/finally/else in gimplefe for gcc/c/ChangeLog * gimple-parser.c (c_parser_gimple_try_stmt): New. (c_parser_compound_statement): Call it. for gcc/testsuite/ChangeLog * gcc.dg/gimplefe-43.c: New. --- gcc/c/gimple-parser.c | 61 gcc/testsuite/gcc.dg/gimplefe-43.c | 25 +++ 2 files changed, 86 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/gimplefe-43.c diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c index a0ea7215984a..4970ae1e9e08 100644 --- a/gcc/c/gimple-parser.c +++ b/gcc/c/gimple-parser.c @@ -117,6 +117,7 @@ static struct c_expr c_parser_gimple_postfix_expression_after_primary static void c_parser_gimple_declaration (gimple_parser &); static void c_parser_gimple_goto_stmt (gimple_parser &, location_t, tree, gimple_seq *); +static void c_parser_gimple_try_stmt (gimple_parser &, gimple_seq *); static void c_parser_gimple_if_stmt (gimple_parser &, gimple_seq *); static void c_parser_gimple_switch_stmt (gimple_parser &, gimple_seq *); static void c_parser_gimple_return_stmt (gimple_parser &, gimple_seq *); @@ -407,6 +408,9 @@ c_parser_gimple_compound_statement (gimple_parser &parser, gimple_seq *seq) case CPP_KEYWORD: switch (c_parser_peek_token (parser)->keyword) { + case RID_AT_TRY: + c_parser_gimple_try_stmt (parser, seq); + break; case RID_IF: c_parser_gimple_if_stmt (parser, seq); break; @@ -448,6 +452,14 @@ c_parser_gimple_compound_statement (gimple_parser &parser, gimple_seq *seq) c_parser_gimple_label (parser, seq); break; } + if (c_parser_next_token_is (parser, CPP_NAME) + && c_parser_peek_token (parser)->id_kind == C_ID_ID + && strcmp (IDENTIFIER_POINTER (c_parser_peek_token (parser)->value), +"try") == 0) + { + 
c_parser_gimple_try_stmt (parser, seq); + break; + } /* Basic block specification. __BB (index, ...) */ if ((cfun->curr_properties & PROP_cfg) @@ -2092,6 +2104,55 @@ c_parser_gimple_paren_condition (gimple_parser &parser) return cond; } +/* Parse gimple try statement. + + try-statement: + try { ... } finally { ... } + try { ... } finally { ... } else { ... } + + This could support try/catch as well, but it's not implemented yet. + */ + +static void +c_parser_gimple_try_stmt (gimple_parser &parser, gimple_seq *seq) +{ + gimple_seq tryseq = NULL; + c_parser_consume_token (parser); + c_parser_gimple_compound_statement (parser, &tryseq); + + if ((c_parser_next_token_is (parser, CPP_KEYWORD) + && c_parser_peek_token (parser)->keyword == RID_AT_FINALLY) + || (c_parser_next_token_is (parser, CPP_NAME) + && c_parser_peek_token (parser)->id_kind == C_ID_ID + && strcmp (IDENTIFIER_POINTER (c_parser_peek_token (parser)->value), +"finally") == 0)) +{ + gimple_seq finseq = NULL; + c_parser_consume_token (parser); + c_parser_gimple_compound_statement (parser, &finseq); + + if (c_parser_next_token_is (parser, CPP_KEYWORD) + && c_parser_peek_token (parser)->keyword == RID_ELSE) + { + gimple_seq elsseq = NULL; + c_parser_consume_token (parser); + c_parser_gimple_compound_statement (parser, &elsseq); + + geh_else *stmt = gimple_build_eh_else (finseq, elsseq); + finseq = NULL; + gimple_seq_add_stmt_without_update (&finseq, stmt); + } + + gtry *stmt = gimple_build_try (tryseq, finseq, GIMPLE_TRY_FINALLY); + gimple_seq_add_stmt_without_update (seq, stmt); +} + else if (c_parser_next_token_is (parser, CPP_KEYWORD) + && c_parser_peek_token (parser)->keyword == RID_AT_CATCH) +c_parser_error (parser, "% is not supported"); + else +c_parser_error (parser, "expected % or %"); +} + /* Parse gimple if-else statement. 
if-statement: diff --git a/gcc/testsuite/gcc.dg/gimplefe-43.c b/gcc/testsuite/gcc.dg/gimplefe-43.c new file mode 100644 index ..5fd66e6dfa5c --- /dev/null +++ b/gcc/testsuite/gcc.dg/gimplefe-43.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-fgimple" } */ + +void __GIMPLE foo() +{ + try +{ + try + { + ; + } + finally + { + ; + } + else + { + ; + } +} + finally +{ + ; +} +} -- Alexandre Oliva, freedom fighter he/him http
Re: allow EH to escape from GIMPLE_EH_ELSE ELSE block
... and here's a patch that uses a try/finally/else gimplefe test to demonstrate the GIMPLE_EH_ELSE lowering problem (might_throw3 is tagged as [LP 1] rather than [LP 2]), and fixes it. Regstrapped on x86_64-linux-gnu. Ok to install? allow EH to escape from GIMPLE_EH_ELSE ELSE block The only preexisting use of GIMPLE_EH_ELSE, for transactional memory commits, did not allow exceptions to escape from the ELSE path. The trick it uses to allow the ELSE path to see the propagating exception does not work very well if the exception cleanup raises further exceptions: the ELSE block is configured to handle exceptions in itself. This confuses the heck out of CFG and EH cleanups. Basing the lowering context for the ELSE block on outer_state, rather than this_state, gets us the expected enclosing handler. for gcc/ChangeLog * tree-eh.c (honor_protect_cleanup_actions): Use outer_ rather than this_state as the lowering context for the ELSE seq in a GIMPLE_EH_ELSE. for gcc/testsuite/ChangeLog * gcc.dg/gimplefe-44.c: New. --- gcc/testsuite/gcc.dg/gimplefe-44.c | 33 + gcc/tree-eh.c | 13 - 2 files changed, 41 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/gimplefe-44.c diff --git a/gcc/testsuite/gcc.dg/gimplefe-44.c b/gcc/testsuite/gcc.dg/gimplefe-44.c new file mode 100644 index ..a9a92b1701ec --- /dev/null +++ b/gcc/testsuite/gcc.dg/gimplefe-44.c @@ -0,0 +1,33 @@ +/* { dg-do compile } */ +/* { dg-options "-fexceptions -fgimple -fdump-tree-eh-eh" } */ + +void __GIMPLE foo() +{ + try +{ + try + { + extern void might_throw1 (); + might_throw1 (); + } + finally + { + extern void might_throw2 (); + might_throw2 (); + } + else + { + extern void might_throw3 (); + might_throw3 (); + } +} + finally +{ + extern void might_throw4 (); + might_throw4 (); +} +} + +/* { dg-final { scan-tree-dump ".LP 1. might_throw1" "eh" } } */ +/* { dg-final { scan-tree-dump ".LP 2. might_throw2" "eh" } } */ +/* { dg-final { scan-tree-dump ".LP 2. 
might_throw3" "eh" } } */ diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c index fb7d202fc6f9..5bb07e49d285 100644 --- a/gcc/tree-eh.c +++ b/gcc/tree-eh.c @@ -996,11 +996,14 @@ honor_protect_cleanup_actions (struct leh_state *outer_state, gimple_try_set_cleanup (tf->top_p, gimple_eh_else_n_body (eh_else)); finally = gimple_eh_else_e_body (eh_else); - /* Let the ELSE see the exception that's being processed. */ - eh_region save_ehp = this_state->ehp_region; - this_state->ehp_region = this_state->cur_region; - lower_eh_constructs_1 (this_state, &finally); - this_state->ehp_region = save_ehp; + /* Let the ELSE see the exception that's being processed, but +since the cleanup is outside the try block, process it with +outer_state, otherwise it may be used as a cleanup for +itself, and Bad Things (TM) ensue. */ + eh_region save_ehp = outer_state->ehp_region; + outer_state->ehp_region = this_state->cur_region; + lower_eh_constructs_1 (outer_state, &finally); + outer_state->ehp_region = save_ehp; } else { -- Alexandre Oliva, freedom fighter he/him https://FSFLA.org/blogs/lxo Be the change, be Free! FSF Latin America board member GNU Toolchain EngineerFree Software Evangelist Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara
Re: [patch] Small improvements to coverage info (4/n)
> Hmm. So for
>
> int baz;
> int foo()
> {
>   int i;
>   goto skip;
> done:
>   return i;
> skip:
>   i = 1;
>   if (baz)
>     return baz;
>   /* ... */
>   goto done;
> } /* (*) */
>
> we'd assign (*) to the return? It might be better to record > (in struct function?) the location of an actually existing > return and use that? In fact, a similarly kludgy approach would > be to simply not clear the GIMPLE_RETURN location > but kludge it up right away?

As I mentioned, this leads to various diagnostics regressions.

> Can you explain how diagnostics regress when doing that?

For example gcc.dg/uninit-17.c:

/* { dg-do compile } */
/* { dg-options "-O -Wuninitialized" } */

typedef _Complex float C;
C foo(int cond)
{
  C f;
  __imag__ f = 0;
  if (cond)
    {
      __real__ f = 1;
      return f;
    }
  return f; /* { dg-warning "may be used" "unconditional" } */
}

The warning is not emitted on the expected line, but on the final } line. There are a couple of other similar cases in the C and C++ testsuites.

> Does coverage somehow treat the function end locus specially > so you chose that? Will it show the alternate returns as > covered at all? I presume stuff like cross-jumping or > GIMPLE tail-merging also wrecks coverage info in similar > ways (well, not by injecting random locations but by > not covering taken paths).

Simple things first please. :-) The function's end locus is sort of a kitchen sink: you cannot have wrong coverage info when you use it, only possibly incomplete info.

-- Eric Botcazou
Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)
Matthias Klose writes: >> I had a look at the GCC 9 version of the patches, with a build including a >> make >> install. Some comments: > > Had a test build based on the gcc-9 branch, > https://launchpad.net/~doko/+archive/ubuntu/toolchain/+sourcepub/10331180/+listing-archive-extra > > powerpc64le-linux-gnu fails to build (search for "unfinished" in the build > log) > > during RTL pass: final > ../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def: In function > '_M2_SYSTEM_init': > ../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def:20: internal compiler error: > in > rs6000_output_function_epilogue, at conf > ig/rs6000/rs6000.c:29169 >20 | DEFINITION MODULE SYSTEM ; > | > 0x10b6b7c7 rs6000_output_function_epilogue > ../../src/gcc/config/rs6000/rs6000.c:29169 > 0x1043f80f final_end_function() > ../../src/gcc/final.c:1887 > 0x10445313 rest_of_handle_final > ../../src/gcc/final.c:4667 > 0x10445313 execute > ../../src/gcc/final.c:4737 > Please submit a full bug report, > with preprocessed source if appropriate. > > this is using GCC 8 as the bootstrap compiler. > > search the build logs for "test_summary" to see the test results. The binary > packages gcc-9-test-results contain the log/sum files for the tests. > > all the link tests fail with: > > xgm2: fatal error: cannot execute 'gm2l': execvp: No such file or directory > compilation terminated. > compiler exited with status 1 Hi Matthias, many thanks for the build results. There is a newer version of gm2/Make-lang.in on the 9.1.0 branch which (quietens the output for mc) and fixes the gm2l execvp (on other platforms). The ICE is very interesting! regards, Gaius
[PATCH], PowerPC, Patch #8, rename rs6000_prefixed_address
In a future patch, I plan to introduce a new function that says whether an address is prefixed based on the address format (D-form, DS-form, DQ-form) used by the instruction. I plan on naming the new function: rs6000_prefixed_address_format This means the existing function that takes a mode argument will be renamed to: rs6000_prefixed_address_mode Rs6000_prefixed_address_mode will have a table mapping the mode into the expected address format, and then call rs6000_prefixed_address_format. This is due to the number of irregularities in the PowerPC architecture: LWA is DS-form, but LWZ is D-form; LFD is D-form, LXSD is DS-form. In the tests that will occur after register allocation, we can check the register type, and more accurately determine if a prefixed instruction must be used if the offset is not a multiple of 4 (for DS-form) or the offset is not aligned on a 16-byte boundary (for DQ-form). This patch prepares the way by renaming the current function, and changing its one use. I have bootstrapped a compiler on a little endian power8 and there were no regressions in the test suite. Can I check this patch into the trunk? 2019-07-10 Michael Meissner * config/rs6000/predicates.md (prefixed_mem_operand): Call rs6000_prefixed_address_mode instead of rs6000_prefixed_address. * config/rs6000/rs6000-protos.h (rs6000_prefixed_address_mode): Rename function from rs6000_prefixed_address. * config/rs6000/rs6000.c (rs6000_prefixed_address_mode): Rename function from rs6000_prefixed_address. 
Index: gcc/config/rs6000/predicates.md === --- gcc/config/rs6000/predicates.md (revision 273368) +++ gcc/config/rs6000/predicates.md (working copy) @@ -1686,7 +1686,7 @@ (define_predicate "pcrel_external_addres (define_predicate "prefixed_mem_operand" (match_code "mem") { - return rs6000_prefixed_address (XEXP (op, 0), GET_MODE (op)); + return rs6000_prefixed_address_mode (XEXP (op, 0), GET_MODE (op)); }) ;; Return 1 if op is a memory operand to an external variable when we Index: gcc/config/rs6000/rs6000-protos.h === --- gcc/config/rs6000/rs6000-protos.h (revision 273367) +++ gcc/config/rs6000/rs6000-protos.h (working copy) @@ -154,7 +154,7 @@ extern align_flags rs6000_loop_align (rt extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool); extern bool rs6000_pcrel_p (struct function *); extern bool rs6000_fndecl_pcrel_p (const_tree); -extern bool rs6000_prefixed_address (rtx, machine_mode); +extern bool rs6000_prefixed_address_mode (rtx, machine_mode); #endif /* RTX_CODE */ #ifdef TREE_CODE Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 273368) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -21514,7 +21514,7 @@ mode_supports_prefixed_address_p (machin mode MODE. */ bool -rs6000_prefixed_address (rtx addr, machine_mode mode) +rs6000_prefixed_address_mode (rtx addr, machine_mode mode) { if (!TARGET_PREFIXED_ADDR || !mode_supports_prefixed_address_p (mode)) return false; -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
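The per-format displacement rules described in the rs6000 patch above can be sketched in a few lines of illustrative C. This is not GCC code and the names are hypothetical; the real predicate also depends on TARGET_PREFIXED_ADDR and, after register allocation, on the register class. The sketch only encodes the alignment facts the mail relies on: all three forms carry a 16-bit signed displacement, DS-form additionally requires the low 2 bits to be zero, and DQ-form requires 16-byte alignment.

```c
#include <assert.h>

/* Hypothetical sketch: does OFFSET force a prefixed (34-bit
   displacement) instruction for the given address format?  */
enum insn_form { FORM_D, FORM_DS, FORM_DQ };

int
offset_needs_prefixed_insn (enum insn_form form, long long offset)
{
  /* All three forms encode a 16-bit signed displacement.  */
  if (offset < -32768 || offset > 32767)
    return 1;
  switch (form)
    {
    case FORM_D:
      return 0;                    /* any 16-bit offset is fine (e.g. LWZ, LFD) */
    case FORM_DS:
      return (offset & 3) != 0;    /* low 2 bits must be zero (e.g. LWA, LXSD) */
    case FORM_DQ:
      return (offset & 15) != 0;   /* 16-byte aligned (e.g. LXV) */
    }
  return 1;
}
```

With a table mapping each machine mode to one of these formats, rs6000_prefixed_address_mode reduces to a lookup followed by this kind of check.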
Re: [PATCH PR Fortran/89286] Intrinsic sign and GNU Extension
On 10/07/2019 17:20, Bernhard Reutner-Fischer wrote: On 10 July 2019 17:52:40 CEST, Steve Kargl wrote: On Wed, Jul 10, 2019 at 02:50:47PM +0100, Mark Eggleston wrote: The attached patch treats the intrinsic SIGN in the same way as MOD and DIM as it has the same arguments. Tested using make -j 8 check-fortran on x86_64 Conditional compilation using #ifdef __GFC_REAL_16__ has been employed where appropriate in the test cases so should be OK on platforms without REAL(16). Change logs: gcc/fortran Mark Eggleston PR fortran/89286 * check.c (gfc_check_sign): Deleted. ChangeLog has to be in present tense per convention. * intrinsic.c (add_functions): Call add_sym_2 with gfc_check_a_p instead of gfc_check_sign for "sign". * intrinsic.h: Remove declaration of gfc_check_sign. * iresolve.c (gfc_resolve_sign): Check for largest kind of the actual arguments and convert the smaller. Set return kind to be the largest. * simplify.c (gfc_simplify_sign): Use the largest kind of the actual arguments for return * intrinsic.texi: Add GNU extension notes for return value to SIGN. gcc/testsuite Mark Eggleston PR fortran/89240 * gfortran.dg/sign_gnu_extension_1.f90: New test. * gfortran.dg/sign_gnu_extension_2.f90: New test. * gfortran.dg/pr78619.f90: Check for "must have" instead of "must be". If OK please can someone commit as I do not have the privileges. We really need to get you commit access to the tree. I also am not a fan of this type of change. Having spent the last few days working on fixing BOZ to conform to F2018, I'm finding all sorts of undocumented "extensions". Personally, I think gfortran should be moving towards the standard by deprecating of these types of extensions. I agree. I think that -std=gnu should not be the default, if gnu extensions are required you have to ask for them. At least make them explicit under explicit extension or at least -legacy or whatever its called. 
thanks, I agree, at the moment the GNU extension is silently supported by DIM and MOD -- https://www.codethink.co.uk/privacy.html
Re: [PATCH,RFC,V3 0/5] Support for CTF in GCC
[Sorry for delay: head down in linker plus having nice food poisoning bouts] On 9 Jul 2019, Mike Stump verbalised: > On Jul 5, 2019, at 11:28 AM, Nix wrote: >> ICTF for the entire Linux kernel is about 6MiB > > Any reason why not add CTF to the next dwarf standard? Then, we just > support the next dwarf standard. If not, have you started talks with > them to add it? A mixture of impostor syndrome, the fact that CTF is really very non-DWARFish in all sorts of ways, and the fact that CTF-the-format is changing quite fast right now means that... well, if it is to be added, now is not the time. I haven't even documented it in texi yet :) (Just suggestions for improvement I've had on the binutils list will lead to a good few changes :) ). Right now, the rule for compatibility is that libctf will always be able to read all earlier versions written by any released binutils or libdtrace-ctf, and rewrite them as the latest version -- and one improvement I have planned is that it will eventually be able to *write* older versions as well, as long as doing so doesn't lose information or run into limitations of the older format (like trying to write >2^16 types to a format v1 container, or add an enum bitfield to a v2 container). I'm doing this in the obvious fashion: every time the format written by binutils libctf changes, it keeps the ability to upgrade all older CTF formats any release of binutils ever accepted to the latest format. Every binutils release after such a change constitutes a boundary: the next format change after that will bump the CTF format version, and the just-released format will be upgraded to be compatible with any new stuff that gets added. If CTF generation support lands in GCC, I'll treat compiler releases the same way, nailing the format any released GCC emits into binutils libctf at release time and ensuring binutils libctf can always accept it (and thus binutils ld can always link it and gdb can always use it). 
(I do not plan to ever drop support for any older CTF formats: indeed I plan to extend it so that the FreeBSD/Solaris CTF can be read as well, and hopefully eventually written too.) This should suffice to ensure that the CTF emitted by any released compiler and any released binutils can always be accepted by newer releases, and is probably the right approach until format evolution slows and we can start to actually standardize this. > Long term, this is a better solution, as we then get more > interoperability, more support, more tools and more goodness. Agreed! I do hope libctf remains flexible and useful enough that everyone can use it as a "format oracle", but I would welcome other implementations: the more the merrier! (It's just that now might be too early and too annoying for the other implementors, since the format is evolving faster than it ever has, thanks to all the lovely suggestions on the binutils list). If libctf *does* gain the ability to downgrade as well as upgrade formats, we can keep evolving the format even after standardization, with libctf translating the standardized version to newer versions and back down again as needed, restandardizing at intervals so the other tools can catch up: this seems like a fairly strong reason to gain the ability to write out old versions as well as new ones. (But I'm getting way ahead of myself here: the internal intermediate representation inside libctf that will make this sort of format ubiquity possible only exists inside my head right now, after all.)
Re: [PATCH,RFC,V3 0/5] Support for CTF in GCC
On 10 Jul 2019, Segher Boessenkool spake thusly: > On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: >> On 5 Jul 2019, Richard Biener said: >> >> > On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat wrote: >> >> CTF, at this time, is type information for entities at global or file >> >> scope. >> >> This can be used by online debuggers, program tracers (dynamic tracing); >> >> More >> >> generally, it provides type introspection for C programs, with an optional >> >> library API to allow them to get at their own types quite more easily than >> >> DWARF. So, the umbrella usecases are - all C programs that want to >> >> introspect >> >> their own types quickly; and applications that want to introspect other >> >> programs's types quickly. >> > >> > What makes it superior to DWARF stripped down to the above feature set? >> >> Increased compactness. > > Does CTF support something like -fasynchronous-unwind-tables? You need > that to have any sane debugging on many platforms. Without it, you > even have only partial backtraces, on most architectures/ABIs anyway. The backtrace section is still being designed, so it could! There is certainly nothing intrinsically preventing it. Am I right that this stuff works by ensuring that the arg->location picture is consistent at all times, between every instruction, rather than just at function calls, i.e. tracking all register moves done by the compiler, even transiently? Because that sounds doable, given that the compiler is doing the hard work of identifying such locations anyway (it has to for DWARF -fasynchronous-unwind-tables, right?). It seems essential to do this in any case if you want to get correct args for the function the user is actually stopped at: there's no requirement that the user is stopped at a function call!
[PATCH] Support folding from array ctor spanning multiple elements
The following exends fold_array_ctor_reference to constant-fold reads that cross multiple array elements which is needed to fix folding for targets like aarch64 in PR83518 where we end up with arr = *.LC0; arr[0] = 5; vect__56.15_75 = MEM [(int *)&arr]; now it would be nice to have an iterator-style interface for accessing ctor elements of an array ctor, the code I wrote is somewhat ugly. It also seems we have to support unordered ctors in get_array_ctor_element_at_index when called from (some) frontend(s) context. When in GIMPLE form we could use a binary search which conveniently the C++ frontend already implements for constexpr folding. Test coverage is also weak (writing more testcases now...), esp. for the cases with missing initializers. Anywhow, boostrap / regtest running on x86_64-unknown-linux-gnu. The folding code is unfortunately structured in a way that doesn't easily allow to deal with sub-ctors here. Any comments? Thanks, Richard. 2019-07-11 Richard Biener * fold-const.h (get_array_ctor_element_at_index): Adjust. * fold-const.c (get_array_ctor_element_at_index): Add ctor_idx output parameter informing the caller where in the constructor the element was (not) found. Add early exit for when the ctor is sorted. * gimple-fold.c (fold_array_ctor_reference): Support constant folding across multiple array elements. * gcc.dg/tree-ssa/vector-7.c: New testcase. Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 273355) +++ gcc/fold-const.c(working copy) @@ -11839,10 +11839,15 @@ fold_ternary_loc (location_t loc, enum t } /* Gets the element ACCESS_INDEX from CTOR, which must be a CONSTRUCTOR - of an array (or vector). */ + of an array (or vector). *CTOR_IDX if non-NULL is updated with the + constructor element index of the value returned. If the element is + not found NULL_TREE is returned and *CTOR_IDX is updated to + the index of the element after the ACCESS_INDEX position (which + may be outside of the CTOR array). 
*/ tree -get_array_ctor_element_at_index (tree ctor, offset_int access_index) +get_array_ctor_element_at_index (tree ctor, offset_int access_index, +unsigned *ctor_idx) { tree index_type = NULL_TREE; offset_int low_bound = 0; @@ -11869,7 +11874,7 @@ get_array_ctor_element_at_index (tree ct TYPE_SIGN (index_type)); offset_int max_index; - unsigned HOST_WIDE_INT cnt; + unsigned cnt; tree cfield, cval; FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) @@ -11897,11 +11902,26 @@ get_array_ctor_element_at_index (tree ct max_index = index; } -/* Do we have match? */ -if (wi::cmpu (access_index, index) >= 0 - && wi::cmpu (access_index, max_index) <= 0) - return cval; - } + /* Do we have match? */ + if (wi::cmpu (access_index, index) >= 0) + { + if (wi::cmpu (access_index, max_index) <= 0) + { + if (ctor_idx) + *ctor_idx = cnt; + return cval; + } + } + else if (in_gimple_form) + /* We're past the element we search for. Note during parsing + the elements might not be sorted. + ??? We should use a binary search and a flag on the + CONSTRUCTOR as to whether elements are sorted in declaration + order. 
*/ + break; +} + if (ctor_idx) +*ctor_idx = cnt; return NULL_TREE; } Index: gcc/fold-const.h === --- gcc/fold-const.h(revision 273355) +++ gcc/fold-const.h(working copy) @@ -67,7 +67,8 @@ extern tree fold_build_call_array_loc (l #define fold_build_call_array_initializer(T1,T2,N,T4)\ fold_build_call_array_initializer_loc (UNKNOWN_LOCATION, T1, T2, N, T4) extern tree fold_build_call_array_initializer_loc (location_t, tree, tree, int, tree *); -extern tree get_array_ctor_element_at_index (tree, offset_int); +extern tree get_array_ctor_element_at_index (tree, offset_int, +unsigned * = NULL); extern bool fold_convertible_p (const_tree, const_tree); #define fold_convert(T1,T2)\ fold_convert_loc (UNKNOWN_LOCATION, T1, T2) Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 273355) +++ gcc/gimple-fold.c (working copy) @@ -6716,14 +6716,13 @@ fold_array_ctor_reference (tree type, tr elt_size = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (ctor; /* When TYPE is non-null, verify that it specifies a constant-sized - accessed not larger than size of array element. Avoid division + access of a multiple of the array element size. Avoid division by zero below when ELT_SIZE is zero, such as with the result
Re: [patch] Small improvements to coverage info (4/n)
On Thu, Jul 11, 2019 at 12:52 PM Eric Botcazou wrote: > > > Hmm. So for > > > > int baz; > > int foo() > > { > > int i; > > goto skip; > > done: > > return i; > > skip: > > i = 1; > > if (baz) > > return baz; > > /* ... */ > > goto done; > > } /* (*) */ > > > > we'd assign (*) to the return? It might be better to record > > (in struct function?) the location of any actually existing > > return location and use that? In fact, similar kludgy would > > be to simply not clear the GIMPLE_RETURN location > > but kludge it up right away? > > As I mentioned, this leads to various diagnostics regressions. > > > Can you explain how diagnostics regress when doing that? > > For example gcc.dg/uninit-17.c: > > /* { dg-do compile } */ > /* { dg-options "-O -Wuninitialized" } */ > > typedef _Complex float C; > C foo(int cond) > { > C f; > __imag__ f = 0; > if (cond) > { > __real__ f = 1; > return f; > } > return f; /* { dg-warning "may be used" "unconditional" } */ > } > > The warning is not emitted on the expected line, but on the final } line. It's odd that we pick up this location w/o a location on the return stmt ... and probably luck on which one we warn. > There are a couple of other similar cases in the C and C++ testsuites. > > > Does coverage somehow treat the function end locus specially > > so you chose that? Will it show the alternate returns as > > covered at all? I presume stuff like cross-jumping or > > GIMPLE tail-merging also wrecks coverage info in similar > > ways (well, not by injecting random locations but by > > not covering taken paths). > > Simple things first please. :-) The function's end locus is sort of a kitchen > sink, you cannot have wrong coverage info when you use it, but only a possibly > incomplete info. After your patch does behavior change when trying to break on a line with a return stmt inside a debugger? Thanks, Richard. > -- > Eric Botcazou
[PATCH V5] PR88497 - Extend reassoc for vector bit_field_ref
Hi Richard, on 2019/7/10 at 7:50 PM, Richard Biener wrote: > On Mon, 8 Jul 2019, Kewen.Lin wrote: > > > + tree rhs = gimple_assign_rhs1 (oe1def); > + tree vec = TREE_OPERAND (rhs, 0); > + tree vec_type = TREE_TYPE (vec); > + > + if (TREE_CODE (vec) != SSA_NAME || !VECTOR_TYPE_P (vec_type)) > + continue; > ... > + /* Ignore it if target machine can't support this VECTOR type. */ > + if (!VECTOR_MODE_P (TYPE_MODE (vec_type))) > + continue; > > put the check below next to the one above, it's cheaper than the > poly-int stuff that currently preceeds it. > > + v_info_ptr *info_ptr = v_info_map.get (vec); > + if (info_ptr) > + { > > there is get_or_insert which enables you to mostly merge the two cases > with a preceeding > > if (!existed) > v_info_ptr = new v_info; > > also eliding the extra hash-lookup the put operation otherwise requires. > > + for (hash_map::iterator it = v_info_map.begin (); > + it != v_info_map.end (); ++it) > +{ > + tree cand_vec = (*it).first; > + v_info_ptr cand_info = (*it).second; > + unsigned int num_elems = VECTOR_CST_NELTS (cand_vec).to_constant > (); > > please add a quick out like > > if (cand_info->length () != num_elems) > continue; > > since to have no holes and no overlap you need exactly that many. > > I think you can avoid the somewhat ugly mode_to_total counter array. > Since you have sorted valid_vecs after modes simply do > > for (unsigned i = 0; i < valid_vecs.length () - 1; ++i) > { > tree tvec = valid_vecs[i]; > enum machine_mode mode = TYPE_MODE (TREE_TYPE (tvec)); > > (please don't use unsigned int for mode!) > > /* Skip modes with only a single candidate. */ > if (TYPE_MODE (TREE_TYPE (valid_vecs[i+1])) != mode) > continue; > > do > { > ... > } > while (...) > > Richard. > Thanks a lot for your comments, I've updated the patch accordingly (also including Segher's comments on test cases). Bootstrapped and regression test passed on powerpc64le-unknown-linux-gnu and x86_64-linux-gnu. Is it ok for trunk? 
Thanks, Kewen --- gcc/ChangeLog 2019-07-11 Kewen Lin PR tree-optimization/88497 * tree-ssa-reassoc.c (reassociate_bb): Swap the positions of GIMPLE_BINARY_RHS check and gimple_visited_p check, call new function undistribute_bitref_for_vector. (undistribute_bitref_for_vector): New function. (cleanup_vinfo_map): Likewise. (sort_by_mach_mode): Likewise. gcc/testsuite/ChangeLog 2019-07-11 Kewen Lin * gcc.dg/tree-ssa/pr88497-1.c: New test. * gcc.dg/tree-ssa/pr88497-2.c: Likewise. * gcc.dg/tree-ssa/pr88497-3.c: Likewise. * gcc.dg/tree-ssa/pr88497-4.c: Likewise. * gcc.dg/tree-ssa/pr88497-5.c: Likewise. * gcc.dg/tree-ssa/pr88497-6.c: Likewise. * gcc.dg/tree-ssa/pr88497-7.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c new file mode 100644 index 000..b6dd7ba --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c @@ -0,0 +1,60 @@ +/* { dg-do run } */ +/* { dg-require-effective-target vect_double } */ +/* { dg-require-effective-target vsx_hw { target { powerpc*-*-* } } } */ +/* { dg-require-effective-target sse2_runtime { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options "-O2 -ffast-math -fdump-tree-reassoc1" } */ +/* { dg-additional-options "-mvsx" { target { powerpc*-*-* } } } */ +/* { dg-additional-options "-msse2" { target { i?86-*-* x86_64-*-* } } } */ + +/* To test reassoc can undistribute vector bit_field_ref summation. + + arg1 and arg2 are two arrays whose elements of type vector double. + Assuming: + A0 = arg1[0], A1 = arg1[1], A2 = arg1[2], A3 = arg1[3], + B0 = arg2[0], B1 = arg2[1], B2 = arg2[2], B3 = arg2[3], + + Then: + V0 = A0 * B0, V1 = A1 * B1, V2 = A2 * B2, V3 = A3 * B3, + + reassoc transforms + + accumulator += V0[0] + V0[1] + V1[0] + V1[1] + V2[0] + V2[1] + + V3[0] + V3[1]; + + into: + + T = V0 + V1 + V2 + V3 + accumulator += T[0] + T[1]; + + Fewer bit_field_refs, only two for 128 or more bits vector. 
*/ + +typedef double v2df __attribute__ ((vector_size (16))); +__attribute__ ((noinline)) double +test (double accumulator, v2df arg1[], v2df arg2[]) +{ + v2df temp; + temp = arg1[0] * arg2[0]; + accumulator += temp[0] + temp[1]; + temp = arg1[1] * arg2[1]; + accumulator += temp[0] + temp[1]; + temp = arg1[2] * arg2[2]; + accumulator += temp[0] + temp[1]; + temp = arg1[3] * arg2[3]; + accumulator += temp[0] + temp[1]; + return accumulator; +} + +extern void abort (void); + +int +main () +{ + v2df v2[4] = {{1.0, 2.0}, {4.0, 8.0}, {1.0, 3.0}, {9.0, 27.0}}; + v2df v3[4] = {{1.0, 4.0}, {16.0, 64.0}, {1.0, 2.0}, {3.0, 4.0}}; + double acc = 100.0; + double res = test (acc, v2, v3); + if (res != 827.0) +
Re: [PATCH] Add hints for slim dumping if fallthrough bb of jump isn't next bb
Hi Richard, on 2019/7/11 at 4:51 PM, Richard Sandiford wrote: > Kewen.Lin writes: >> >> - if (flags & TDF_BLOCKS) > > I think we still need an if here, but with the condition instead being: > >cfun->curr_properties & PROP_cfg > >> + else if (EDGE_COUNT (bb->succs) > 0) >> + { >> + gcc_assert (BB_END (bb) == tmp_rtx); >> + const rtx_insn *ninsn = NEXT_INSN (tmp_rtx); > > I think it'd be better to loop until we hit a real insn or a nonnull > start[], e.g. to cope with intervening deleted-insn notes and debug insns. > Something like: > > while (ninsn && !INSN_P (nisn) && !start[INSN_UID (ninsn)]) > ninsn = NEXT_INSN (insn); > Thanks a lot for the comments, I've merged your suggested codes as below. Regression testing is ongoing on powerpc64le-unknown-linux-gnu. If regtest and bootstrap is ok, is it ok for trunk? Thanks, Kewen - gcc/ChangeLog 2019-07-11 Kewen Lin * gcc/cfgrtl.c (print_rtl_with_bb): Emit a hint if the fallthrough target of current basic block isn't the one placed right next. diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c index a1ca5992c41..19b118a8ece 100644 --- a/gcc/cfgrtl.c +++ b/gcc/cfgrtl.c @@ -2193,7 +2193,7 @@ print_rtl_with_bb (FILE *outf, const rtx_insn *rtx_first, dump_flags_t flags) if (df) df_dump_start (outf); - if (flags & TDF_BLOCKS) + if (cfun->curr_properties & PROP_cfg) { FOR_EACH_BB_REVERSE_FN (bb, cfun) { @@ -2201,16 +2201,19 @@ print_rtl_with_bb (FILE *outf, const rtx_insn *rtx_first, dump_flags_t flags) start[INSN_UID (BB_HEAD (bb))] = bb; end[INSN_UID (BB_END (bb))] = bb; - for (x = BB_HEAD (bb); x != NULL_RTX; x = NEXT_INSN (x)) + if (flags & TDF_BLOCKS) { - enum bb_state state = IN_MULTIPLE_BB; + for (x = BB_HEAD (bb); x != NULL_RTX; x = NEXT_INSN (x)) + { + enum bb_state state = IN_MULTIPLE_BB; - if (in_bb_p[INSN_UID (x)] == NOT_IN_BB) - state = IN_ONE_BB; - in_bb_p[INSN_UID (x)] = state; + if (in_bb_p[INSN_UID (x)] == NOT_IN_BB) + state = IN_ONE_BB; + in_bb_p[INSN_UID (x)] = state; - if (x == BB_END (bb)) - break; + if (x == BB_END (bb)) + 
break; + } } } } @@ -2244,16 +2247,35 @@ print_rtl_with_bb (FILE *outf, const rtx_insn *rtx_first, dump_flags_t flags) if (flags & TDF_DETAILS) df_dump_insn_bottom (tmp_rtx, outf); - if (flags & TDF_BLOCKS) + bb = end[INSN_UID (tmp_rtx)]; + if (bb != NULL) { - bb = end[INSN_UID (tmp_rtx)]; - if (bb != NULL) + if (flags & TDF_BLOCKS) { dump_bb_info (outf, bb, 0, dump_flags, false, true); if (df && (flags & TDF_DETAILS)) df_dump_bottom (bb, outf); putc ('\n', outf); } + /* Emit a hint if the fallthrough target of current basic block +isn't the one placed right next. */ + else if (EDGE_COUNT (bb->succs) > 0) + { + gcc_assert (BB_END (bb) == tmp_rtx); + const rtx_insn *ninsn = NEXT_INSN (tmp_rtx); + /* Bypass intervening deleted-insn notes and debug insns. */ + while (ninsn && !NONDEBUG_INSN_P (ninsn) +&& !start[INSN_UID (ninsn)]) + ninsn = NEXT_INSN (ninsn); + edge e = find_fallthru_edge (bb->succs); + if (e && ninsn) + { + basic_block dest = e->dest; + if (start[INSN_UID (ninsn)] != dest) + fprintf (outf, "%s ; pc falls through to BB %d\n", +print_rtx_head, dest->index); + } + } } }
Make nonoverlapping_component_refs_since_match_p work with non-trivial MEM_REFs and TMRs
Hi, this patch makes nonoverlapping_component_refs_since_match_p to accept paths with non-trivial MEM_REFs and TMRs assuming that they have same semantics. Bootstrapped/regtested x86_64-linux, OK? Honza * tree-ssa-alias.c (same_tmr_indexing_p): Break out from ... (indirect_refs_may_alias_p): ... here. (nonoverlapping_component_refs_since_match_p): Support also non-trivial mem refs in the access paths. Index: testsuite/gcc.dg/tree-ssa/alias-access-path-9.c === --- testsuite/gcc.dg/tree-ssa/alias-access-path-9.c (nonexistent) +++ testsuite/gcc.dg/tree-ssa/alias-access-path-9.c (working copy) @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-fre1" } */ + +/* This testcase tests nonoverlapping_component_refs_since_match_p in presence + of non-trivial mem-refs. */ +struct a {int a,b;}; +struct b {struct a a[10];}; +struct c {int c; struct b b;} c, *cptr; + +void +set_a(struct a *a, int p) +{ + a->a=p; +} +void +set_b(struct a *a, int p) +{ + a->b=p; +} +int +get_a(struct a *a) +{ + return a->a; +} + +int +test(int i, int j) +{ + struct b *bptr = &c.b; + set_a (&bptr->a[i], 123); + set_b (&bptr->a[j], 124); + return get_a (&bptr->a[i]); +} + +int +test2(int i, int j) +{ + struct b *bptr = &cptr->b; + set_a (&bptr->a[i], 125); + set_b (&bptr->a[j], 126); + return get_a (&bptr->a[i]); +} +/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */ +/* { dg-final { scan-tree-dump-times "return 125" 1 "fre1"} } */ Index: tree-ssa-alias.c === --- tree-ssa-alias.c(revision 273322) +++ tree-ssa-alias.c(working copy) @@ -1216,6 +1216,25 @@ nonoverlapping_component_refs_p_1 (const return -1; } +/* Return if TARGET_MEM_REFS base1 and base2 have same offsets. 
*/ + +static bool +same_tmr_indexing_p (tree base1, tree base2) +{ + return ((TMR_STEP (base1) == TMR_STEP (base2) + || (TMR_STEP (base1) && TMR_STEP (base2) + && operand_equal_p (TMR_STEP (base1), + TMR_STEP (base2), 0))) + && (TMR_INDEX (base1) == TMR_INDEX (base2) + || (TMR_INDEX (base1) && TMR_INDEX (base2) + && operand_equal_p (TMR_INDEX (base1), + TMR_INDEX (base2), 0))) + && (TMR_INDEX2 (base1) == TMR_INDEX2 (base2) + || (TMR_INDEX2 (base1) && TMR_INDEX2 (base2) + && operand_equal_p (TMR_INDEX2 (base1), + TMR_INDEX2 (base2), 0; +} + /* Try to disambiguate REF1 and REF2 under the assumption that MATCH1 and MATCH2 either point to the same address or are disjoint. MATCH1 and MATCH2 are assumed to be ref in the access path of REF1 and REF2 @@ -1265,20 +1284,6 @@ nonoverlapping_component_refs_since_matc component_refs1.safe_push (ref1); ref1 = TREE_OPERAND (ref1, 0); } - if (TREE_CODE (ref1) == MEM_REF && ref1 != match1) -{ - if (!integer_zerop (TREE_OPERAND (ref1, 1))) - { - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; - } -} - /* TODO: Handle TARGET_MEM_REF later. */ - if (TREE_CODE (ref1) == TARGET_MEM_REF && ref1 != match1) -{ - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; -} /* Create the stack of handled components for REF2. */ while (handled_component_p (ref2) && ref2 != match2) @@ -1290,15 +1295,39 @@ nonoverlapping_component_refs_since_matc component_refs2.safe_push (ref2); ref2 = TREE_OPERAND (ref2, 0); } - if (TREE_CODE (ref2) == MEM_REF && ref2 != match2) + + bool mem_ref1 = TREE_CODE (ref1) == MEM_REF && ref1 != match1; + bool mem_ref2 = TREE_CODE (ref2) == MEM_REF && ref2 != match2; + + /* If only one of access path starts with MEM_REF check that offset is 0 + so the addresses stays the same after stripping it. + TODO: In this case we may walk the other access path until we get same + offset. + + If both starts with MEM_REF, offset has to be same. 
*/ + if ((mem_ref1 && !mem_ref2 && !integer_zerop (TREE_OPERAND (ref1, 1))) + || (mem_ref2 && !mem_ref1 && !integer_zerop (TREE_OPERAND (ref2, 1))) + || (mem_ref1 && mem_ref2 + && !tree_int_cst_equal (TREE_OPERAND (ref1, 1), + TREE_OPERAND (ref2, 1 { - if (!integer_zerop (TREE_OPERAND (ref2, 1))) - { - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; - } + ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; + return -1; } - if (TREE_CODE (ref2) == TARGET_MEM_REF && ref2 != match2) + + bool target_mem_ref1 = TREE_CODE (ref1) == TARGET_MEM_REF && ref1 != match1; + bool target_mem_ref2 = TREE_CODE (ref2) == TARGET_ME
Re: [PATCH] Add hints for slim dumping if fallthrough bb of jump isn't next bb
Kewen.Lin writes: > + /* Emit a hint if the fallthrough target of current basic block > +isn't the one placed right next. */ > + else if (EDGE_COUNT (bb->succs) > 0) > + { > + gcc_assert (BB_END (bb) == tmp_rtx); > + const rtx_insn *ninsn = NEXT_INSN (tmp_rtx); > + /* Bypass intervening deleted-insn notes and debug insns. > */ > + while (ninsn && !NONDEBUG_INSN_P (ninsn) > +&& !start[INSN_UID (ninsn)]) Just a cosmetic thing, but when the full expression needs to be split over several lines, there should be one condition per line: while (ninsn && !NONDEBUG_INSN_P (ninsn) && !start[INSN_UID (ninsn)]) OK with that change, thanks. Richard > + ninsn = NEXT_INSN (ninsn); > + edge e = find_fallthru_edge (bb->succs); > + if (e && ninsn) > + { > + basic_block dest = e->dest; > + if (start[INSN_UID (ninsn)] != dest) > + fprintf (outf, "%s ; pc falls through to BB > %d\n", > +print_rtx_head, dest->index); > + } > + } > } > }
Re: Make nonoverlapping_component_refs_since_match_p work with non-trivial MEM_REFs and TMRs
On Thu, 11 Jul 2019, Jan Hubicka wrote: > Hi, > this patch makes nonoverlapping_component_refs_since_match_p to accept > paths with non-trivial MEM_REFs and TMRs assuming that they have same > semantics. Hmm. We'll never get any TARGET_MEM_REFs wrapped with handled-components so I wonder if it makes sense to handle it in nonoverlapping_component_refs_since_match_p at all. > Bootstrapped/regtested x86_64-linux, OK? > > Honza > > * tree-ssa-alias.c (same_tmr_indexing_p): Break out from ... > (indirect_refs_may_alias_p): ... here. > (nonoverlapping_component_refs_since_match_p): Support also non-trivial > mem refs in the access paths. > Index: testsuite/gcc.dg/tree-ssa/alias-access-path-9.c > === > --- testsuite/gcc.dg/tree-ssa/alias-access-path-9.c (nonexistent) > +++ testsuite/gcc.dg/tree-ssa/alias-access-path-9.c (working copy) > @@ -0,0 +1,44 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-fre1" } */ > + > +/* This testcase tests nonoverlapping_component_refs_since_match_p in > presence > + of non-trivial mem-refs. 
*/ > +struct a {int a,b;}; > +struct b {struct a a[10];}; > +struct c {int c; struct b b;} c, *cptr; > + > +void > +set_a(struct a *a, int p) > +{ > + a->a=p; > +} > +void > +set_b(struct a *a, int p) > +{ > + a->b=p; > +} > +int > +get_a(struct a *a) > +{ > + return a->a; > +} > + > +int > +test(int i, int j) > +{ > + struct b *bptr = &c.b; > + set_a (&bptr->a[i], 123); > + set_b (&bptr->a[j], 124); > + return get_a (&bptr->a[i]); > +} > + > +int > +test2(int i, int j) > +{ > + struct b *bptr = &cptr->b; > + set_a (&bptr->a[i], 125); > + set_b (&bptr->a[j], 126); > + return get_a (&bptr->a[i]); > +} > +/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */ > +/* { dg-final { scan-tree-dump-times "return 125" 1 "fre1"} } */ > Index: tree-ssa-alias.c > === > --- tree-ssa-alias.c (revision 273322) > +++ tree-ssa-alias.c (working copy) > @@ -1216,6 +1216,25 @@ nonoverlapping_component_refs_p_1 (const >return -1; > } > > +/* Return if TARGET_MEM_REFS base1 and base2 have same offsets. */ > + > +static bool > +same_tmr_indexing_p (tree base1, tree base2) > +{ > + return ((TMR_STEP (base1) == TMR_STEP (base2) > + || (TMR_STEP (base1) && TMR_STEP (base2) > + && operand_equal_p (TMR_STEP (base1), > + TMR_STEP (base2), 0))) > + && (TMR_INDEX (base1) == TMR_INDEX (base2) > + || (TMR_INDEX (base1) && TMR_INDEX (base2) > + && operand_equal_p (TMR_INDEX (base1), > + TMR_INDEX (base2), 0))) > + && (TMR_INDEX2 (base1) == TMR_INDEX2 (base2) > + || (TMR_INDEX2 (base1) && TMR_INDEX2 (base2) > + && operand_equal_p (TMR_INDEX2 (base1), > + TMR_INDEX2 (base2), 0)))); > +} > + > /* Try to disambiguate REF1 and REF2 under the assumption that MATCH1 and > MATCH2 either point to the same address or are disjoint. 
> MATCH1 and MATCH2 are assumed to be ref in the access path of REF1 and > REF2 > @@ -1265,20 +1284,6 @@ nonoverlapping_component_refs_since_matc > component_refs1.safe_push (ref1); >ref1 = TREE_OPERAND (ref1, 0); > } > - if (TREE_CODE (ref1) == MEM_REF && ref1 != match1) > -{ > - if (!integer_zerop (TREE_OPERAND (ref1, 1))) > - { > - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; > - return -1; > - } > -} > - /* TODO: Handle TARGET_MEM_REF later. */ > - if (TREE_CODE (ref1) == TARGET_MEM_REF && ref1 != match1) > -{ > - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; > - return -1; > -} > >/* Create the stack of handled components for REF2. */ >while (handled_component_p (ref2) && ref2 != match2) > @@ -1290,15 +1295,39 @@ nonoverlapping_component_refs_since_matc > component_refs2.safe_push (ref2); >ref2 = TREE_OPERAND (ref2, 0); > } > - if (TREE_CODE (ref2) == MEM_REF && ref2 != match2) > + > + bool mem_ref1 = TREE_CODE (ref1) == MEM_REF && ref1 != match1; > + bool mem_ref2 = TREE_CODE (ref2) == MEM_REF && ref2 != match2; > + > + /* If only one of access path starts with MEM_REF check that offset is 0 > + so the addresses stays the same after stripping it. > + TODO: In this case we may walk the other access path until we get same > + offset. > + > + If both starts with MEM_REF, offset has to be same. */ > + if ((mem_ref1 && !mem_ref2 && !integer_zerop (TREE_OPERAND (ref1, 1))) > + || (mem_ref2 && !mem_ref1 && !integer_zerop (TREE_OPERAND (ref2, 1))) > + || (mem_ref1 && mem_ref2 > + && !tree_int_cst_equal (TREE_OPERAND (ref1, 1), > + TREE_OPERAND (ref2, 1)))) > { > - if (!integer_zerop (TREE_O
Re: Make nonoverlapping_component_refs_since_match_p work with non-trivial MEM_REFs and TMRs
> On Thu, 11 Jul 2019, Jan Hubicka wrote: > > > Hi, > > this patch makes nonoverlapping_component_refs_since_match_p to accept > > paths with non-trivial MEM_REFs and TMRs assuming that they have same > > semantics. > > Hmm. We'll never get any TARGET_MEM_REFs wrapped with > handled-components so I wonder if it makes sense to handle it in > nonoverlapping_component_refs_since_match_p at all. OK, that makes my life easier. Here is updated patch. Index: tree-ssa-alias.c === --- tree-ssa-alias.c(revision 273322) +++ tree-ssa-alias.c(working copy) @@ -1265,20 +1265,6 @@ nonoverlapping_component_refs_since_matc component_refs1.safe_push (ref1); ref1 = TREE_OPERAND (ref1, 0); } - if (TREE_CODE (ref1) == MEM_REF && ref1 != match1) -{ - if (!integer_zerop (TREE_OPERAND (ref1, 1))) - { - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; - } -} - /* TODO: Handle TARGET_MEM_REF later. */ - if (TREE_CODE (ref1) == TARGET_MEM_REF && ref1 != match1) -{ - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; -} /* Create the stack of handled components for REF2. */ while (handled_component_p (ref2) && ref2 != match2) @@ -1290,20 +1276,31 @@ nonoverlapping_component_refs_since_matc component_refs2.safe_push (ref2); ref2 = TREE_OPERAND (ref2, 0); } - if (TREE_CODE (ref2) == MEM_REF && ref2 != match2) -{ - if (!integer_zerop (TREE_OPERAND (ref2, 1))) - { - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; - } -} - if (TREE_CODE (ref2) == TARGET_MEM_REF && ref2 != match2) + + bool mem_ref1 = TREE_CODE (ref1) == MEM_REF && ref1 != match1; + bool mem_ref2 = TREE_CODE (ref2) == MEM_REF && ref2 != match2; + + /* If only one of access paths starts with MEM_REF check that offset is 0 + so the addresses stays the same after stripping it. + TODO: In this case we may walk the other access path until we get same + offset. + + If both starts with MEM_REF, offset has to be same. 
*/ + if ((mem_ref1 && !mem_ref2 && !integer_zerop (TREE_OPERAND (ref1, 1))) + || (mem_ref2 && !mem_ref1 && !integer_zerop (TREE_OPERAND (ref2, 1))) + || (mem_ref1 && mem_ref2 + && !tree_int_cst_equal (TREE_OPERAND (ref1, 1), + TREE_OPERAND (ref2, 1)))) { ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; return -1; } + /* TARGET_MEM_REF are never wrapped in handled components, so we do not need + to handle them here at all. */ + gcc_checking_assert (TREE_CODE (ref1) != TARGET_MEM_REF + && TREE_CODE (ref2) != TARGET_MEM_REF); + /* Pop the stacks in parallel and examine the COMPONENT_REFs of the same rank. This is sufficient because we start from the same DECL and you cannot reference several fields at a time with COMPONENT_REFs (unlike
Re: Make nonoverlapping_component_refs_since_match_p work with non-trivial MEM_REFs and TMRs
> > On Thu, 11 Jul 2019, Jan Hubicka wrote: > > > > > Hi, > > > this patch makes nonoverlapping_component_refs_since_match_p to accept > > > paths with non-trivial MEM_REFs and TMRs assuming that they have same > > > semantics. > > > > Hmm. We'll never get any TARGET_MEM_REFs wrapped with > > handled-components so I wonder if it makes sense to handle it in > > nonoverlapping_component_refs_since_match_p at all. > > OK, that makes my life easier. Here is updated patch. Hi, the patch finished testing on x86_64-linux so here is with Changelog and testcase. OK? * tree-ssa-alias.c (same_tmr_indexing_p): Break out from ... (indirect_refs_may_alias_p): ... here. (nonoverlapping_component_refs_since_match_p): Support also non-trivial mem refs in the access paths. Index: testsuite/gcc.dg/tree-ssa/alias-access-path-9.c === --- testsuite/gcc.dg/tree-ssa/alias-access-path-9.c (nonexistent) +++ testsuite/gcc.dg/tree-ssa/alias-access-path-9.c (working copy) @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-fre1" } */ + +/* This testcase tests nonoverlapping_component_refs_since_match_p in presence + of non-trivial mem-refs. 
*/ +struct a {int a,b;}; +struct b {struct a a[10];}; +struct c {int c; struct b b;} c, *cptr; + +void +set_a(struct a *a, int p) +{ + a->a=p; +} +void +set_b(struct a *a, int p) +{ + a->b=p; +} +int +get_a(struct a *a) +{ + return a->a; +} + +int +test(int i, int j) +{ + struct b *bptr = &c.b; + set_a (&bptr->a[i], 123); + set_b (&bptr->a[j], 124); + return get_a (&bptr->a[i]); +} + +int +test2(int i, int j) +{ + struct b *bptr = &cptr->b; + set_a (&bptr->a[i], 125); + set_b (&bptr->a[j], 126); + return get_a (&bptr->a[i]); +} +/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */ +/* { dg-final { scan-tree-dump-times "return 125" 1 "fre1"} } */ Index: tree-ssa-alias.c === --- tree-ssa-alias.c(revision 273322) +++ tree-ssa-alias.c(working copy) @@ -1265,20 +1265,6 @@ nonoverlapping_component_refs_since_matc component_refs1.safe_push (ref1); ref1 = TREE_OPERAND (ref1, 0); } - if (TREE_CODE (ref1) == MEM_REF && ref1 != match1) -{ - if (!integer_zerop (TREE_OPERAND (ref1, 1))) - { - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; - } -} - /* TODO: Handle TARGET_MEM_REF later. */ - if (TREE_CODE (ref1) == TARGET_MEM_REF && ref1 != match1) -{ - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; -} /* Create the stack of handled components for REF2. 
*/ while (handled_component_p (ref2) && ref2 != match2) @@ -1290,20 +1276,31 @@ nonoverlapping_component_refs_since_matc component_refs2.safe_push (ref2); ref2 = TREE_OPERAND (ref2, 0); } - if (TREE_CODE (ref2) == MEM_REF && ref2 != match2) -{ - if (!integer_zerop (TREE_OPERAND (ref2, 1))) - { - ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; - return -1; - } -} - if (TREE_CODE (ref2) == TARGET_MEM_REF && ref2 != match2) + + bool mem_ref1 = TREE_CODE (ref1) == MEM_REF && ref1 != match1; + bool mem_ref2 = TREE_CODE (ref2) == MEM_REF && ref2 != match2; + + /* If only one of access paths starts with MEM_REF check that offset is 0 + so the addresses stays the same after stripping it. + TODO: In this case we may walk the other access path until we get same + offset. + + If both starts with MEM_REF, offset has to be same. */ + if ((mem_ref1 && !mem_ref2 && !integer_zerop (TREE_OPERAND (ref1, 1))) + || (mem_ref2 && !mem_ref1 && !integer_zerop (TREE_OPERAND (ref2, 1))) + || (mem_ref1 && mem_ref2 + && !tree_int_cst_equal (TREE_OPERAND (ref1, 1), + TREE_OPERAND (ref2, 1)))) { ++alias_stats.nonoverlapping_component_refs_since_match_p_may_alias; return -1; } + /* TARGET_MEM_REF are never wrapped in handled components, so we do not need + to handle them here at all. */ + gcc_checking_assert (TREE_CODE (ref1) != TARGET_MEM_REF + && TREE_CODE (ref2) != TARGET_MEM_REF); + /* Pop the stacks in parallel and examine the COMPONENT_REFs of the same rank. This is sufficient because we start from the same DECL and you cannot reference several fields at a time with COMPONENT_REFs (unlike
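To make the disambiguation concrete, here is a standalone, runnable C variant of the global-object half of the testcase above (test2 with the NULL cptr is omitted): set_b only writes the b field, so get_a must observe 123 whether or not i == j, which is exactly what the patch lets FRE prove at compile time.

```c
#include <assert.h>

/* Standalone version of the dg testcase above.  Writes through set_b
   touch only the 'b' member, so they can never clobber what set_a
   stored in 'a', even when i == j and both calls hit the same array
   element.  */
struct a { int a, b; };
struct b { struct a a[10]; };
static struct c { int c; struct b b; } c;

static void set_a (struct a *a, int p) { a->a = p; }
static void set_b (struct a *a, int p) { a->b = p; }
static int get_a (struct a *a) { return a->a; }

int
test (int i, int j)
{
  struct b *bptr = &c.b;
  set_a (&bptr->a[i], 123);
  set_b (&bptr->a[j], 124);
  return get_a (&bptr->a[i]);  /* always 123 */
}
```

With the patch applied, fre1 folds the return to the constant 123, matching the scan-tree-dump directive in the testcase.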
[RFC][tree-vect]PR 88915: Further vectorize second loop when versioning
Hi Richard(s), I am trying to tackle PR88915 and get GCC to further vectorize the "fallback" loop when doing loop versioning as it does when loop versioning is not required. I have a prototype patch that needs further testing, but at first glance it seems to be achieving the desired outcome. I was wondering whether you had any specific concerns with my current approach. On top of this change I am looking at the iterations and alias checks generated for every "vectorized-version". I.e. with the above patch I see: if (iterations_check_VF_0 () && alias_check_VF_0 ()) vectorized_for_VF_0 (); else if (iterations_check_VF_1 () && alias_check_VF_1 ()) vectorized_for_VF_1 (); ... else scalar_loop (); The alias checks are not always short and may cause unnecessary performance hits. Instead I am now trying to change the checks to produce the following form: if (iterations_check_VF_0 ()) { if (alias_check_VF_0 ()) { vectorized_for_VF_0 (); } else goto VF_1_check; // or scalar_loop } else if (iterations_check_VF_1 ()) { VF_1_check: if (alias_check_VF_1 ()) vectorized_for_VF_1 (); else goto goto_VF_2_check; // or scalar_loop } ... else scalar_loop (); I am not yet sure whether to try the next VF after an alias check fail or to just fall back to scalar instead. Cheers, Andre diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h index 2f8ab106d03a8927087ee8038e08a825f6e1e237..04d874b70ddfb8a3f5175dcddf00fef6d33f3219 100644 --- a/gcc/cfgloop.h +++ b/gcc/cfgloop.h @@ -266,6 +266,8 @@ struct GTY ((chain_next ("%h.next"))) loop { the basic-block from being collected but its index can still be reused. */ basic_block former_header; + + unsigned long max_vf_limit; }; /* Set if the loop is known to be infinite. 
*/ diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c index f64326b944e630075ced7035937f4601a1cb6c66..07d633b678b52943d3ab82e8d61b80cd712431ac 100644 --- a/gcc/cfgloop.c +++ b/gcc/cfgloop.c @@ -355,6 +355,7 @@ alloc_loop (void) loop->nb_iterations_upper_bound = 0; loop->nb_iterations_likely_upper_bound = 0; loop->nb_iterations_estimate = 0; + loop->max_vf_limit = MAX_VECTORIZATION_FACTOR; return loop; } diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c index bd8fffb1704787d0a611fc02ee29054422596cbb..89529138b9cefb7f822bca72da06df519eff1a28 100644 --- a/gcc/tree-vect-loop-manip.c +++ b/gcc/tree-vect-loop-manip.c @@ -2968,7 +2968,8 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr) struct loop * vect_loop_versioning (loop_vec_info loop_vinfo, unsigned int th, bool check_profitability, - poly_uint64 versioning_threshold) + poly_uint64 versioning_threshold, + vec &more_loops) { struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo), *nloop; struct loop *scalar_loop = LOOP_VINFO_SCALAR_LOOP (loop_vinfo); @@ -3143,6 +3144,19 @@ vect_loop_versioning (loop_vec_info loop_vinfo, nloop = loop_version (loop_to_version, cond_expr, &condition_bb, prob, prob.invert (), prob, prob.invert (), true); gcc_assert (nloop); + + if (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + { + + /* Add the scalar fallback loop to the MORE_LOOPS vector to be looked + at later. Also make sure it is never vectorized for the original + vf by setting the limit of the maximum vf to the original vf minus + one. 
*/ + nloop->max_vf_limit + = constant_lower_bound (LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 1; + more_loops.safe_push (nloop); + } + nloop = get_loop_copy (loop); /* Kill off IFN_LOOP_VECTORIZED_CALL in the copy, nobody will diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index b49ab152012a5c7fe9cc0564e58d296447f9ffb1..081885c378200661237ef22d2b011fc775e21218 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -1862,7 +1862,7 @@ vect_analyze_loop_2 (loop_vec_info loop_vinfo, bool &fatal, unsigned *n_stmts) { opt_result ok = opt_result::success (); int res; - unsigned int max_vf = MAX_VECTORIZATION_FACTOR; + unsigned int max_vf = LOOP_VINFO_LOOP (loop_vinfo)->max_vf_limit; poly_uint64 min_vf = 2; /* The first group of checks is independent of the vector size. */ @@ -8468,7 +8468,7 @@ vect_transform_loop_stmt (loop_vec_info loop_vinfo, stmt_vec_info stmt_info, Returns scalar epilogue loop if any. */ struct loop * -vect_transform_loop (loop_vec_info loop_vinfo) +vect_transform_loop (loop_vec_info loop_vinfo, vec &more_loops) { struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *epilogue = NULL; @@ -8530,7 +8530,8 @@ vect_transform_loop (loop_vec_info loop_vinfo) } struct loop *sloop = vect_loop_versioning (loop_vinfo, th, check_profitability, -versioning_threshold); +versioning_threshold, more_loops); + sloop->force_vectorize = false; check_profitability = false; } diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer
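The cascaded runtime dispatch described in this RFC can be illustrated with plain C; the check functions below are stand-ins for the iteration-count and alias guards the vectorizer would emit, and all names are made up for illustration.

```c
#include <assert.h>

/* Stand-in runtime guards: enough iterations for the given VF, and a
   (here always-passing) alias check.  */
static int iterations_check (int vf, int n) { return n >= vf; }
static int alias_check (int vf) { (void) vf; return 1; }

/* Try the widest vectorization factor first, fall back to narrower
   VFs, then to the scalar loop -- the first dispatch shape from the
   RFC.  Returns the VF of the version that would run.  */
int
choose_version (int n)
{
  if (iterations_check (8, n) && alias_check (8))
    return 8;   /* vectorized_for_VF_0 */
  else if (iterations_check (4, n) && alias_check (4))
    return 4;   /* vectorized_for_VF_1 */
  else
    return 1;   /* scalar_loop */
}
```

The second shape proposed in the mail only changes what happens when an alias check fails after its iteration check succeeded: instead of falling straight to the scalar loop, control jumps to the alias check of the next-narrower VF.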
Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)
On Thu, Jul 11, 2019 at 12:49:44PM +0100, Gaius Mulley wrote: > Matthias Klose writes: > > powerpc64le-linux-gnu fails to build (search for "unfinished" in the build > > log) > > > > during RTL pass: final > > ../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def: In function > > '_M2_SYSTEM_init': > > ../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def:20: internal compiler > > error: in > > rs6000_output_function_epilogue, at conf > > ig/rs6000/rs6000.c:29169 We don't yet support Modula2 in the traceback tables. I'll add it. What is the exact spelling in lang_hooks.name? "GNU Modula 2"? Or "GNU Modula2"? Or what else :-) > Hi Matthias, > > many thanks for the build results. There is a newer version of > gm2/Make-lang.in on the 9.1.0 branch which (quietens the output for mc) > and fixes the gm2l execvp (on other platforms). > > The ICE is very interesting! Should I add this to the GCC 9 branch as well? Does that help testing, I mean? Segher
Re: [PATCH 1/3] C++20 constexpr lib part 1/3
On 7/6/19 3:55 AM, Ville Voutilainen wrote: Blargh! But that's fine; the result of copy is not stored in a constexpr variable, but the function return is static_asserted so we have sufficiently tested that std::copy is indeed constexpr-compatible since it appears in a function that is evaluated at compile-time. Ping?
Re: [PATCH,RFC,V3 0/5] Support for CTF in GCC
On Thu, Jul 11, 2019 at 01:25:18PM +0100, Nix wrote: > On 10 Jul 2019, Segher Boessenkool spake thusly: > > > On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: > >> On 5 Jul 2019, Richard Biener said: > >> > >> > On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat > >> > wrote: > >> >> CTF, at this time, is type information for entities at global or file > >> >> scope. > >> >> This can be used by online debuggers, program tracers (dynamic > >> >> tracing); More > >> >> generally, it provides type introspection for C programs, with an > >> >> optional > >> >> library API to allow them to get at their own types quite more easily > >> >> than > >> >> DWARF. So, the umbrella usecases are - all C programs that want to > >> >> introspect > >> >> their own types quickly; and applications that want to introspect other > >> >> programs's types quickly. > >> > > >> > What makes it superior to DWARF stripped down to the above feature set? > >> > >> Increased compactness. > > > > Does CTF support something like -fasynchronous-unwind-tables? You need > > that to have any sane debugging on many platforms. Without it, you > > even have only partial backtraces, on most architectures/ABIs anyway. > > The backtrace section is still being designed, so it could! There is > certainly nothing intrinsically preventing it. Am I right that this > stuff works by ensuring that the arg->location picture is consistent at > all times, between every instruction, rather than just at function > calls, i.e. tracking all register moves done by the compiler, even > transiently? Yes, something like that. You get unwind tables that are valid at each instruction boundary. This is esp. important for the return address, without that backtraces are broken. > Because that sounds doable, given that the compiler is > doing the hard work of identifying such locations anyway (it has to for > DWARF -fasynchronous-unwind-tables, right?). Yes, every backend outputs DWARF info semi-manually for this. 
You have some work to do if you want to use this for CTF. > It seems essential to do this in any case if you want to get correct > args for the function the user is actually stopped at: there's no > requirement that the user is stopped at a function call! Yes. You need the asynchronous option only if you need this info at any possible point in a program -- but quite a few things do need it everywhere ;-) Segher
Re: [PATCH] i386: Add AVX512 unaligned intrinsics
Fixed. --Sunil Pandey i386: Add AVX512 unaligned intrinsics __m512i _mm512_loadu_epi64( void * sa); void _mm512_storeu_epi64(void * d, __m512i a); __m512i _mm512_loadu_epi32( void * sa); void _mm512_storeu_epi32(void * d, __m512i a); void _mm256_storeu_epi64(void * d, __m256i a); void _mm_storeu_epi64(void * d, __m128i a); void _mm256_storeu_epi32(void * d, __m256i a); void _mm_storeu_epi32(void * d, __m128i a); Tested on x86-64. gcc/ PR target/90980 * config/i386/avx512fintrin.h (_mm512_loadu_epi64): New. (_mm512_storeu_epi64): Likewise. (_mm512_loadu_epi32): Likewise. (_mm512_storeu_epi32): Likewise. * config/i386/avx512vlintrin.h (_mm256_storeu_epi64): New. (_mm_storeu_epi64): Likewise. (_mm256_storeu_epi32): Likewise. (_mm_storeu_epi32): Likewise. gcc/testsuite/ PR target/90980 * gcc.target/i386/pr90980-1.c: New test. * gcc.target/i386/pr90980-2.c: Likewise. * gcc.target/i386/pr90980-3.c: Likewise. On Wed, Jul 10, 2019 at 12:20 PM Uros Bizjak wrote: > > On Wed, Jul 10, 2019 at 9:11 PM Sunil Pandey wrote: > > > > Thanks Uros. I incorporated suggested changes in attached patch. > > > > --Sunil Pandey > > > > i386: Add AVX512 unaligned intrinsics > > > > __m512i _mm512_loadu_epi32( void * sa); > > __m512i _mm512_loadu_epi64( void * sa); > > void _mm512_storeu_epi32(void * d, __m512i a); > > void _mm256_storeu_epi32(void * d, __m256i a); > > void _mm_storeu_epi32(void * d, __m128i a); > > void _mm512_storeu_epi64(void * d, __m512i a); > > void _mm256_storeu_epi64(void * d, __m256i a); > > void _mm_storeu_epi64(void * d, __m128i a); > > > > Tested on x86-64. > > > > gcc/ > > > > PR target/90980 > > * config/i386/avx512fintrin.h (_mm512_loadu_epi32): New. > > (_mm512_loadu_epi64): Likewise. > > (_mm512_storeu_epi32): Likewise. > > (_mm512_storeu_epi64): Likewise. > > * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New. > > (_mm256_storeu_epi32): Likewise. > > (_mm_storeu_epi64): Likewise. > > (_mm256_storeu_epi64): Likewise. 
> > > > gcc/testsuite/ > > > > PR target/90980 > > * gcc.target/i386/pr90980-1.c: New test. > > * gcc.target/i386/pr90980-2.c: Likewise. > > * gcc.target/i386/pr90980-3.c: Likewise. > > Looks good, but please put new intrinsics nearby existing intrinsics, > so we will have e.g.: > > _mm512_loadu_epi32 > _mm512_mask_loadu_epi32 > _mm512_maskz_loadu_epi32 > > and in similar way for other loads and stores. > > Uros. > > > > > On Tue, Jul 9, 2019 at 11:39 PM Uros Bizjak wrote: > > > > > > On Tue, Jul 9, 2019 at 11:44 PM Sunil Pandey wrote: > > > > > > > > __m512i _mm512_loadu_epi32( void * sa); > > > > __m512i _mm512_loadu_epi64( void * sa); > > > > void _mm512_storeu_epi32(void * d, __m512i a); > > > > void _mm256_storeu_epi32(void * d, __m256i a); > > > > void _mm_storeu_epi32(void * d, __m128i a); > > > > void _mm512_storeu_epi64(void * d, __m512i a); > > > > void _mm256_storeu_epi64(void * d, __m256i a); > > > > void _mm_storeu_epi64(void * d, __m128i a); > > > > > > > > Tested on x86-64. > > > > > > > > OK for trunk? > > > > > > > > --Sunil Pandey > > > > > > > > > > > > gcc/ > > > > > > > > PR target/90980 > > > > * config/i386/avx512fintrin.h (__v16si_u): New data type > > > > (__v8di_u): Likewise > > > > (_mm512_loadu_epi32): New. > > > > (_mm512_loadu_epi64): Likewise. > > > > (_mm512_storeu_epi32): Likewise. > > > > (_mm512_storeu_epi64): Likewise. > > > > * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New. > > > > (_mm256_storeu_epi32): Likewise. > > > > (_mm_storeu_epi64): Likewise. > > > > (_mm256_storeu_epi64): Likewise. > > > > > > > > gcc/testsuite/ > > > > > > > > PR target/90980 > > > > * gcc.target/i386/avx512f-vmovdqu32-3.c: New test. > > > > * gcc.target/i386/avx512f-vmovdqu64-3.c: Likewise. > > > > * gcc.target/i386/pr90980-1.c: Likewise. > > > > * gcc.target/i386/pr90980-2.c: Likewise. > > > > > > +/* Internal data types for implementing unaligned version of intrinsics. 
> > > */ > > > +typedef int __v16si_u __attribute__ ((__vector_size__ (64), > > > + __aligned__ (1))); > > > +typedef long long __v8di_u __attribute__ ((__vector_size__ (64), > > > + __aligned__ (1))); > > > > > > You should define only one generic __m512i_u type, something like: > > > > > > typedef long long __m512i_u __attribute__ ((__vector_size__ (64), > > > __may_alias__, __aligned__ (1))); > > > > > > Please see avxintrin.h how __m256i_u is defined and used. > > > > > > Uros. 0001-i386-Add-AVX512-unaligned-intrinsics.patch Description: Binary data
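As a usage sketch (not part of the patch), the new unaligned intrinsics can be exercised as follows; the target attribute is an assumption added here so the file compiles without -mavx512f, while running it still requires AVX-512F hardware.

```c
#include <immintrin.h>
#include <assert.h>

/* Copy eight 64-bit lanes through possibly unaligned buffers using the
   new unaligned load/store intrinsics; neither pointer needs 64-byte
   alignment.  */
__attribute__ ((target ("avx512f")))
void
copy8_u64 (void *dst, const void *src)
{
  __m512i v = _mm512_loadu_epi64 (src);
  _mm512_storeu_epi64 (dst, v);
}
```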
Re: [patch] Small improvements to coverage info (4/n)
> After your patch does behavior change when trying to break on a line > with a return stmt inside a debugger? Note that every patch in the series was tested against GDB too, so hopefully this would have been caught... But the answer is no, see lower_gimple_return: /* Generate a goto statement and remove the return statement. */ found: /* When not optimizing, make sure user returns are preserved. */ if (!optimize && gimple_has_location (stmt)) DECL_ARTIFICIAL (tmp_rs.label) = 0; t = gimple_build_goto (tmp_rs.label); /* location includes block. */ gimple_set_location (t, gimple_location (stmt)); gsi_insert_before (gsi, t, GSI_SAME_STMT); gsi_remove (gsi, false); So the label, the goto and its location are all preserved: (gdb) b ops.adb:11 Breakpoint 1 at 0x40308e: file ops.adb, line 11. (gdb) run Starting program: /home/eric/gnat/test_ops_okovb Breakpoint 1, ops.both_ok (a=..., b=...) at ops.adb:11 11 return True; -- # ok (gdb) b ops.adb:14 Breakpoint 1 at 0x4030c1: file ops.adb, line 14. (gdb) run Starting program: /home/eric/gnat/test_ops_okovb Breakpoint 1, ops.both_ok (a=..., b=...) at ops.adb:14 14 return False; -- # ko (gdb) b ops.adb:19 Breakpoint 1 at 0x403115: file ops.adb, line 19. (gdb) run Starting program: /home/eric/gnat/test_ops_okovb Breakpoint 1, ops.both_ok (a=..., b=...) at ops.adb:19 19 return False; -- # ov -- Eric Botcazou
Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)
Segher Boessenkool writes: > On Thu, Jul 11, 2019 at 12:49:44PM +0100, Gaius Mulley wrote: >> Matthias Klose writes: >> > powerpc64le-linux-gnu fails to build (search for "unfinished" in the build >> > log) >> > >> > during RTL pass: final >> > ../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def: In function >> > '_M2_SYSTEM_init': >> > ../../src/gcc/gm2/gm2-libs-coroutines/SYSTEM.def:20: internal compiler >> > error: in >> > rs6000_output_function_epilogue, at conf >> > ig/rs6000/rs6000.c:29169 > > We don't yet support Modula2 in the traceback tables. I'll add it. > What is the exact spelling in lang_hooks.name? "GNU Modula 2"? Or > "GNU Modula2"? Or what else :-) Hi Segher, very close ! #define LANG_HOOKS_NAME "GNU Modula-2" >> Hi Matthias, >> >> many thanks for the build results. There is a newer version of >> gm2/Make-lang.in on the 9.1.0 branch which (quietens the output for mc) >> and fixes the gm2l execvp (on other platforms). >> >> The ICE is very interesting! > > Should I add this to the GCC 9 branch as well? Does that help testing, > I mean? yes please - all that helps is good, regards, Gaius
Re: [PATCH] Deprecate -frepo on gcc-9 branch (PR c++/91125).
On 7/11/19 3:02 AM, Jakub Jelinek wrote: On Thu, Jul 11, 2019 at 08:48:52AM +0200, Martin Liška wrote: --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -501,6 +501,8 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value, flag_use_repository = value; if (value) flag_implicit_templates = 0; + warning (0, "%<-frepo%> is deprecated and will be removed " + "in a future release"); break; Is there a reason not to condition this on OPT_Wdeprecated (which defaults to on)? Jason
[PATCH] add --param ssa-name-def-chain-limit
Attached is a patch that adds a new parameter to limit the number of SSA_NAME assignments for GCC to follow in iterative or recursive algorithms. Purely as a proof of concept the patch introduces the parameter into -Warray-bounds where the warning follows POINTER_PLUS (and ASSERT_EXPR) assignments to get at the DECL the final pointer points to. With this "infrastructure" in place the parameter can start to be introduced wherever else it might be necessary. I don't know of any pathological cases where it actually is necessary (i.e., one the 512 default keeps from going off the rails) so the test I have put together for it is artificial. A better test case involving one of the known recursive algorithms would be helpful. Martin gcc/ChangeLog: * doc/invoke.texi (ssa-name-def-chain-limit): Document new --param. * params.def (PARAM_SSA_NAME_DEF_CHAIN_LIMIT): Add new --param. * tree-vrp.c (vrp_prop::check_mem_ref): Use PARAM_SSA_NAME_DEF_CHAIN_LIMIT. gcc/testsuite/ChangeLog: * gcc.dg/Warray-bounds-43.c: New test. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 273357) +++ gcc/doc/invoke.texi (working copy) @@ -12225,6 +12225,13 @@ before the loop versioning pass considers it too b discounting any instructions in inner loops that directly benefit from versioning. +@item ssa-name-def-chain-limit +The maximum number of SSA_NAME assignments to follow in determining +a property of a variable such as its value. This limits the number +of iterations or recursive calls GCC performs when optimizing certain +statements or when determining their validity prior to issuing +diagnostics. 
+ @end table @end table Index: gcc/params.def === --- gcc/params.def (revision 273357) +++ gcc/params.def (working copy) @@ -1437,6 +1437,12 @@ DEFPARAM(PARAM_HASH_TABLE_VERIFICATION_LIMIT, "each searched element.", 10, 0, 0) +DEFPARAM(PARAM_SSA_NAME_DEF_CHAIN_LIMIT, + "ssa-name-def-chain-limit", + "The maximum number of SSA_NAME assignments to follow in determining " + "a value.", + 512, 0, 0) + /* Local variables: Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 273357) +++ gcc/tree-vrp.c (working copy) @@ -4492,7 +4492,8 @@ vrp_prop::check_mem_ref (location_t location, tree The loop computes the range of the final offset for expressions such as (A + i0 + ... + iN)[CSTOFF] where i0 through iN are SSA_NAMEs in some range. */ - while (TREE_CODE (arg) == SSA_NAME) + const unsigned limit = PARAM_VALUE (PARAM_SSA_NAME_DEF_CHAIN_LIMIT); + for (unsigned n = 0; TREE_CODE (arg) == SSA_NAME && n < limit; ++n) { gimple *def = SSA_NAME_DEF_STMT (arg); if (!is_gimple_assign (def)) Index: gcc/testsuite/gcc.dg/Warray-bounds-43.c === --- gcc/testsuite/gcc.dg/Warray-bounds-43.c (nonexistent) +++ gcc/testsuite/gcc.dg/Warray-bounds-43.c (working copy) @@ -0,0 +1,133 @@ +/* Test to verify that --param ssa_name_def_chain_limit can be used to + limit the maximum number of SSA_NAME assignments the warning follows. 
+ { dg-do compile } + { dg-options "-O2 -Wall --param ssa-name-def-chain-limit=4" } */ + +#define NOIPA __attribute__ ((noipa)) + +const char a0[] = ""; +const char a1[] = "1"; +const char a2[] = "12"; +const char a3[] = "123"; +const char a4[] = "1234"; +const char a5[] = "12345"; +const char a6[] = "123456"; +const char a7[] = "1234567"; +const char a8[] = "12345678"; +const char a9[] = "123456789"; + +void f (const char*, ...); + +int i0, i1, i2, i3, i4, i5, i6, i7, i8; + +NOIPA int g2 (int i) +{ + if (i < 1) i = 1; + + const char *p0 = a9; + const char *p1 = p0 + i; + const char *p2 = p1 + i; + + f (p0, p1, p2); + + return p2[8]; // { dg-warning "\\\[-Warray-bounds]" } +} + +NOIPA int g3 (int i) +{ + if (i < 1) i = 1; + + const char *p0 = a9; + const char *p1 = p0 + i; + const char *p2 = p1 + i; + const char *p3 = p2 + i; + + f (p0, p1, p2, p3); + + return p3[7]; // { dg-warning "\\\[-Warray-bounds]" } +} + +NOIPA int g4 (int i) +{ + if (i < 1) i = 1; + + const char *p0 = a9; + const char *p1 = p0 + i; + const char *p2 = p1 + i; + const char *p3 = p2 + i; + const char *p4 = p3 + i; + + f (p0, p1, p2, p3, p4); + + return p4[6]; // { dg-warning "\\\[-Warray-bounds]" } +} + +NOIPA int g5 (int i) +{ + if (i < 1) i = 1; + + const char *p0 = a9; + const char *p1 = p0 + i; + const char *p2 = p1 + i; + const char *p3 = p2 + i; + const char *p4 = p3 + i; + const char *p5 = p4 + i; + + f (p0, p1, p2, p3, p4, p5); + + return p5[5]; +} + +NOIPA int g6 (int i) +{ + if (i < 1) i = 1; + + const char *p0 = a9; + const char *p1 = p0 + i; + const char *p2 = p1 + i; + const char *p3 = p2 + i; + const char *p4 = p3 + i; + const char *p5 = p4 + i; + const char
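As a standalone illustration (hypothetical code, not part of the patch), the chains the parameter bounds look like this; each pN assignment is one SSA_NAME definition that the -Warray-bounds walk in check_mem_ref follows backwards from the final dereference, and --param ssa-name-def-chain-limit caps how many such steps are followed before the warning gives up.

```c
#include <assert.h>

const char a9[] = "123456789";

/* A two-step pointer definition chain.  With the default limit of 512
   the walk easily reaches a9; a limit below the chain length would
   make -Warray-bounds give up, as in g5/g6 of the testcase above.  */
int
follow_chain (int i)
{
  if (i < 1)
    i = 1;
  const char *p0 = a9;        /* step 0 */
  const char *p1 = p0 + i;    /* step 1 */
  const char *p2 = p1 + i;    /* step 2 */
  return p2[6];               /* in bounds for i == 1; p2[8] would be
                                 the diagnosed out-of-bounds access */
}
```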
[PATCH] rs6000: Handle Modula-2 in the traceback table
This patch recognises Modula-2 as a language for the traceback table, fixing the problem shown in https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00848.html . Committing to trunk, and committing a variant to the 9 branch. Segher 2019-07-11 Segher Boessenkool * config/rs6000/rs6000-logue.c (rs6000_output_function_epilogue): Handle Modula-2. --- gcc/config/rs6000/rs6000-logue.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c index 8454f96..8e8cf05 100644 --- a/gcc/config/rs6000/rs6000-logue.c +++ b/gcc/config/rs6000/rs6000-logue.c @@ -5287,6 +5287,8 @@ rs6000_output_function_epilogue (FILE *file) i = 1; else if (! strcmp (language_string, "GNU Ada")) i = 3; + else if (! strcmp (language_string, "GNU Modula-2")) + i = 8; else if (lang_GNU_CXX () || ! strcmp (language_string, "GNU Objective-C++")) i = 9; -- 1.8.3.1
Re: [PATCH] rs6000: Handle Modula-2 in the traceback table
On Thu, Jul 11, 2019 at 2:34 PM Segher Boessenkool wrote: > > This patch recognises Modula-2 as language for the traceback table, > fixing the problem shown in > https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00848.html . The value "8" is predefined in the ABI and AIX headers as the "language indicator" for Modula-2. Okay. - David
Re: [PATCH] i386: Add AVX512 unaligned intrinsics
On Thu, Jul 11, 2019 at 6:54 PM Sunil Pandey wrote: > > Fixed. > > --Sunil Pandey > > i386: Add AVX512 unaligned intrinsics > > __m512i _mm512_loadu_epi64( void * sa); > void _mm512_storeu_epi64(void * d, __m512i a); > __m512i _mm512_loadu_epi32( void * sa); > void _mm512_storeu_epi32(void * d, __m512i a); > void _mm256_storeu_epi64(void * d, __m256i a); > void _mm_storeu_epi64(void * d, __m128i a); > void _mm256_storeu_epi32(void * d, __m256i a); > void _mm_storeu_epi32(void * d, __m128i a); > > Tested on x86-64. > > gcc/ > > PR target/90980 > * config/i386/avx512fintrin.h (_mm512_loadu_epi64): New. > (_mm512_storeu_epi64): Likewise. > (_mm512_loadu_epi32): Likewise. > (_mm512_storeu_epi32): Likewise. > * config/i386/avx512vlintrin.h (_mm256_storeu_epi64): New. > (_mm_storeu_epi64): Likewise. > (_mm256_storeu_epi32): Likewise. > (_mm_storeu_epi32): Likewise. > > gcc/testsuite/ > > PR target/90980 > * gcc.target/i386/pr90980-1.c: New test. > * gcc.target/i386/pr90980-2.c: Likewise. > * gcc.target/i386/pr90980-3.c: Likewise. OK. Thanks, Uros. > On Wed, Jul 10, 2019 at 12:20 PM Uros Bizjak wrote: > > > > On Wed, Jul 10, 2019 at 9:11 PM Sunil Pandey wrote: > > > > > > Thanks Uros. I incorporated suggested changes in attached patch. > > > > > > --Sunil Pandey > > > > > > i386: Add AVX512 unaligned intrinsics > > > > > > __m512i _mm512_loadu_epi32( void * sa); > > > __m512i _mm512_loadu_epi64( void * sa); > > > void _mm512_storeu_epi32(void * d, __m512i a); > > > void _mm256_storeu_epi32(void * d, __m256i a); > > > void _mm_storeu_epi32(void * d, __m128i a); > > > void _mm512_storeu_epi64(void * d, __m512i a); > > > void _mm256_storeu_epi64(void * d, __m256i a); > > > void _mm_storeu_epi64(void * d, __m128i a); > > > > > > Tested on x86-64. > > > > > > gcc/ > > > > > > PR target/90980 > > > * config/i386/avx512fintrin.h (_mm512_loadu_epi32): New. > > > (_mm512_loadu_epi64): Likewise. > > > (_mm512_storeu_epi32): Likewise. > > > (_mm512_storeu_epi64): Likewise. 
> > > * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New. > > > (_mm256_storeu_epi32): Likewise. > > > (_mm_storeu_epi64): Likewise. > > > (_mm256_storeu_epi64): Likewise. > > > > > > gcc/testsuite/ > > > > > > PR target/90980 > > > * gcc.target/i386/pr90980-1.c: New test. > > > * gcc.target/i386/pr90980-2.c: Likewise. > > > * gcc.target/i386/pr90980-3.c: Likewise. > > > > Looks good, but please put new intrinsics nearby existing intrinsics, > > so we will have e.g.: > > > > _mm512_loadu_epi32 > > _mm512_mask_loadu_epi32 > > _mm512_maskz_loadu_epi32 > > > > and in similar way for other loads and stores. > > > > Uros. > > > > > > > > On Tue, Jul 9, 2019 at 11:39 PM Uros Bizjak wrote: > > > > > > > > On Tue, Jul 9, 2019 at 11:44 PM Sunil Pandey wrote: > > > > > > > > > > __m512i _mm512_loadu_epi32( void * sa); > > > > > __m512i _mm512_loadu_epi64( void * sa); > > > > > void _mm512_storeu_epi32(void * d, __m512i a); > > > > > void _mm256_storeu_epi32(void * d, __m256i a); > > > > > void _mm_storeu_epi32(void * d, __m128i a); > > > > > void _mm512_storeu_epi64(void * d, __m512i a); > > > > > void _mm256_storeu_epi64(void * d, __m256i a); > > > > > void _mm_storeu_epi64(void * d, __m128i a); > > > > > > > > > > Tested on x86-64. > > > > > > > > > > OK for trunk? > > > > > > > > > > --Sunil Pandey > > > > > > > > > > > > > > > gcc/ > > > > > > > > > > PR target/90980 > > > > > * config/i386/avx512fintrin.h (__v16si_u): New data type > > > > > (__v8di_u): Likewise > > > > > (_mm512_loadu_epi32): New. > > > > > (_mm512_loadu_epi64): Likewise. > > > > > (_mm512_storeu_epi32): Likewise. > > > > > (_mm512_storeu_epi64): Likewise. > > > > > * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New. > > > > > (_mm256_storeu_epi32): Likewise. > > > > > (_mm_storeu_epi64): Likewise. > > > > > (_mm256_storeu_epi64): Likewise. > > > > > > > > > > gcc/testsuite/ > > > > > > > > > > PR target/90980 > > > > > * gcc.target/i386/avx512f-vmovdqu32-3.c: New test. 
> > > > > * gcc.target/i386/avx512f-vmovdqu64-3.c: Likewise. > > > > > * gcc.target/i386/pr90980-1.c: Likewise. > > > > > * gcc.target/i386/pr90980-2.c: Likewise. > > > > > > > > +/* Internal data types for implementing unaligned version of > > > > intrinsics. */ > > > > +typedef int __v16si_u __attribute__ ((__vector_size__ (64), > > > > + __aligned__ (1))); > > > > +typedef long long __v8di_u __attribute__ ((__vector_size__ (64), > > > > + __aligned__ (1))); > > > > > > > > You should define only one generic __m512i_u type, something like: > > > > > > > > typedef long long
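For reference, the mechanism behind the generic unaligned type Uros asked for — a vector typedef carrying `__aligned__ (1)` and `__may_alias__`, in the style of `__m512i_u` — can be sketched with a small 128-bit vector. The `v4si_u` name and the helper functions here are illustrative, not from the patch:

```c
/* A 128-bit integer vector and its 1-byte-aligned variant, in the style
   of __m512i / __m512i_u.  Requires GCC/Clang vector extensions.  */
typedef int v4si   __attribute__ ((__vector_size__ (16), __may_alias__));
typedef int v4si_u __attribute__ ((__vector_size__ (16), __may_alias__,
                                   __aligned__ (1)));

/* Load four ints starting at an arbitrarily aligned address.  Going
   through the __aligned__ (1) type tells the compiler it must not
   assume 16-byte alignment, so it emits an unaligned access.  */
v4si
loadu_v4si (const void *p)
{
    return *(const v4si_u *) p;
}

int
sum_v4si (v4si v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

The real intrinsics do exactly this cast-and-dereference, which is why a single `__m512i_u` type suffices for both the epi32 and epi64 variants.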
[PATCH] rs6000: Adjust comment for the Modula-2 traceback lang
I missed the big preceding comment. It has all other values, so let's put this one there, as well. 2019-07-11 Segher Boessenkool * config/rs6000/rs6000-logue.c: Add Modula-2 to comment. --- gcc/config/rs6000/rs6000-logue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c index 8e8cf05..868a865 100644 --- a/gcc/config/rs6000/rs6000-logue.c +++ b/gcc/config/rs6000/rs6000-logue.c @@ -5272,7 +5272,7 @@ rs6000_output_function_epilogue (FILE *file) /* Language type. Unfortunately, there does not seem to be any official way to discover the language being compiled, so we use language_string. -C is 0. Fortran is 1. Ada is 3. C++ is 9. +C is 0. Fortran is 1. Ada is 3. Modula-2 is 8. C++ is 9. Java is 13. Objective-C is 14. Objective-C++ isn't assigned a number, so for now use 9. LTO, Go, D, and JIT aren't assigned numbers either, so for now use 0. */ -- 1.8.3.1
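A minimal sketch of the mapping that comment documents, as a standalone function. `traceback_lang_code` is an invented name — the real code is an if/else chain over `language_string` inside `rs6000_output_function_epilogue` — and the string spellings other than "GNU Ada", "GNU Modula-2", and "GNU Objective-C++" (which appear in the patches) are assumptions:

```c
#include <string.h>

/* AIX traceback-table language indicators, per the updated comment:
   C is 0, Fortran is 1, Ada is 3, Modula-2 is 8, C++ is 9, Java is 13,
   Objective-C is 14; Objective-C++ borrows 9; LTO, Go, D, and JIT fall
   back to 0.  */
int
traceback_lang_code (const char *language_string)
{
  if (! strcmp (language_string, "GNU Fortran"))
    return 1;
  else if (! strcmp (language_string, "GNU Ada"))
    return 3;
  else if (! strcmp (language_string, "GNU Modula-2"))
    return 8;
  else if (! strcmp (language_string, "GNU C++")
	   || ! strcmp (language_string, "GNU Objective-C++"))
    return 9;
  else if (! strcmp (language_string, "GNU Java"))
    return 13;
  else if (! strcmp (language_string, "GNU Objective-C"))
    return 14;
  return 0;			/* C, LTO, Go, D, JIT.  */
}
```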
[PATCH], PowerPC, Patch #9, Refine calculation of whether an address offset is d-form, ds-form, or dq-form
As I mentioned in patch #8, I needed to refine the calculation of what type of address offset is used on the PowerPC for particular loads and stores. The motivation is that the new prefixed instructions that future processors may support have improved instruction encodings, and the prefixed forms of the instructions do not need the bottom 2 bits clear for traditional DS-form instructions or the bottom 4 bits clear for traditional DQ-form instructions. This means if you have a load where the offset is not aligned: LD rx,3(base) currently the compiler will have to generate code like: LI tmp,3 LDX rx,tmp,base With prefixed instructions, the compiler can instead generate: PLD rx,3(base) The original implementation just using the mode does not work in a lot of cases, because you don't know the context of the use. For example, a LWZ instruction uses a d-form load (all 16 bits are available), while a LWA instruction uses a ds-form load (bottom 2 bits must be 0). Another example is loading up a scalar floating value. If we are loading to a traditional FPR, the LFD instruction is a d-form instruction, while if we are loading to a traditional Altivec register, the LXSD instruction is a ds-form instruction. I have built the compiler with these patches installed on a little endian power8 system. There were no regressions in the test suite. Can I install this patch into the FSF trunk? 2019-07-11 Michael Meissner * config/rs6000/rs6000-protos.h (enum rs6000_offset_format): New enumeration to describe the addressing offset format. (reg_to_offset_format): New declaration. (rs6000_prefixed_address_format): New declaration. * config/rs6000/rs6000.c (struct rs6000_reg_addr): Add default address offset format field. (mode_supports_prefixed_address_p): Move function higher. (initialize_offset_formats): New function to determine the default address format for each mode. (rs6000_init_hard_regno_mode_ok): Initialize the address formats for each mode. 
(quad_address_p): Add support for prefixed instructions. (mem_operand_gpr): Add support for prefixed instructions. (mem_operand_ds_form): Add support for prefixed instructions. (rs6000_prefixed_address_format): New function to validate an address given an address format. (rs6000_prefixed_address_mode): Rewrite to used saved default address mode format. (reg_to_offset_format): New function to determine what address format to use for a particular register. Index: gcc/config/rs6000/rs6000-protos.h === --- gcc/config/rs6000/rs6000-protos.h (revision 273371) +++ gcc/config/rs6000/rs6000-protos.h (working copy) @@ -154,6 +154,18 @@ extern align_flags rs6000_loop_align (rt extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool); extern bool rs6000_pcrel_p (struct function *); extern bool rs6000_fndecl_pcrel_p (const_tree); + +/* Enumeration to describe the offsettable address formats of PowerPC memory + instructions. */ +enum rs6000_offset_format { + OFFSET_FORMAT_NONE, /* No offset format is available. */ + OFFSET_FORMAT_D, /* All 16-bits are available. */ + OFFSET_FORMAT_DS,/* Bottom 2 bits must be 0. */ + OFFSET_FORMAT_DQ /* Bottom 4 bits must be 0. */ +}; + +extern enum rs6000_offset_format reg_to_offset_format (rtx, machine_mode, bool); +extern bool rs6000_prefixed_address_format (rtx, enum rs6000_offset_format); extern bool rs6000_prefixed_address_mode (rtx, machine_mode); #endif /* RTX_CODE */ Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 273371) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -459,6 +459,7 @@ struct rs6000_reg_addr { enum insn_code reload_fpr_gpr; /* INSN to move from FPR to GPR. */ enum insn_code reload_gpr_vsx; /* INSN to move from GPR to VSX. */ enum insn_code reload_vsx_gpr; /* INSN to move from VSX to GPR. */ + enum rs6000_offset_format offset_format; /* Format of offset insn. */ addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks. 
*/ bool scalar_in_vmx_p;/* Scalar value can go in VMX. */ }; @@ -498,6 +499,17 @@ mode_supports_dq_form (machine_mode mode != 0); } +/* Helper function to return whether a MODE can do prefixed loads/stores. + VOIDmode is used when we are loading the pc-relative address into a base + register, but we are not using it as part of a memory operation. As modes + add support for prefixed memory, they will be added here. */ + +static bool +mode_supports_prefixed_address_p (machine_mode mode) +{ + return mode == VOIDmode; +} + /* Given that there exists at least one variable that is
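The encoding constraints described above boil down to a signed 16-bit range check plus a low-bits check per format. A sketch, reusing the patch's `rs6000_offset_format` enumeration (`offset_fits_format` is an invented helper, not part of the patch):

```c
#include <stdbool.h>

/* Offset formats from the patch: D-form has a full signed 16-bit
   offset, DS-form needs the bottom 2 bits clear, DQ-form the bottom
   4 bits clear.  */
enum rs6000_offset_format {
  OFFSET_FORMAT_NONE,	/* No offset format is available.  */
  OFFSET_FORMAT_D,	/* All 16 bits are available.  */
  OFFSET_FORMAT_DS,	/* Bottom 2 bits must be 0.  */
  OFFSET_FORMAT_DQ	/* Bottom 4 bits must be 0.  */
};

/* Return true if OFFSET can be encoded directly in a non-prefixed
   instruction of the given format.  A false result is the case where a
   prefixed instruction (e.g. PLD instead of LD) or an indexed sequence
   (LI + LDX) is needed.  */
bool
offset_fits_format (long offset, enum rs6000_offset_format fmt)
{
  if (fmt == OFFSET_FORMAT_NONE)
    return false;
  if (offset < -32768 || offset > 32767)	/* signed 16-bit field */
    return false;
  if (fmt == OFFSET_FORMAT_DS && (offset & 3) != 0)
    return false;
  if (fmt == OFFSET_FORMAT_DQ && (offset & 15) != 0)
    return false;
  return true;
}
```

This is why the mode alone is not enough: the same offset 3 fits a LWZ (D-form) but not a LWA (DS-form), so the check has to know which instruction form will be emitted.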
[PATCH] Define std::atomic_ref and std::atomic&lt;floating-point&gt; for C++20
This adds the new atomic types from C++2a, as proposed by P0019 and P0020. To reduce duplication the calls to the compiler's atomic built-ins are wrapped in new functions in the __atomic_impl namespace. These functions are currently only used by std::atomic&lt;floating-point&gt; and std::atomic_ref but could also be used for all other specializations of std::atomic. * include/bits/atomic_base.h (__atomic_impl): New namespace for wrappers around atomic built-ins. (__atomic_float, __atomic_ref): New class templates for use as base classes. * include/std/atomic (atomic&lt;float&gt;, atomic&lt;double&gt;) (atomic&lt;long double&gt;): New explicit specializations. (atomic_ref): New class template. (__cpp_lib_atomic_ref): Define. * include/std/version (__cpp_lib_atomic_ref): Define. * testsuite/29_atomics/atomic/60695.cc: Adjust dg-error. * testsuite/29_atomics/atomic_float/1.cc: New test. * testsuite/29_atomics/atomic_float/requirements.cc: New test. * testsuite/29_atomics/atomic_ref/deduction.cc: New test. * testsuite/29_atomics/atomic_ref/float.cc: New test. * testsuite/29_atomics/atomic_ref/generic.cc: New test. * testsuite/29_atomics/atomic_ref/integral.cc: New test. * testsuite/29_atomics/atomic_ref/pointer.cc: New test. * testsuite/29_atomics/atomic_ref/requirements.cc: New test. Tested x86_64-linux, committed to trunk. commit 6d63be4697285c362642b0e112441b75db4944ff Author: redi Date: Thu Jul 11 19:43:25 2019 + Define std::atomic_ref and std::atomic&lt;floating-point&gt; for C++20 This adds the new atomic types from C++2a, as proposed by P0019 and P0020. To reduce duplication the calls to the compiler's atomic built-ins are wrapped in new functions in the __atomic_impl namespace. These functions are currently only used by std::atomic&lt;floating-point&gt; and std::atomic_ref but could also be used for all other specializations of std::atomic. * include/bits/atomic_base.h (__atomic_impl): New namespace for wrappers around atomic built-ins. (__atomic_float, __atomic_ref): New class templates for use as base classes. 
* include/std/atomic (atomic&lt;float&gt;, atomic&lt;double&gt;) (atomic&lt;long double&gt;): New explicit specializations. (atomic_ref): New class template. (__cpp_lib_atomic_ref): Define. * include/std/version (__cpp_lib_atomic_ref): Define. * testsuite/29_atomics/atomic/60695.cc: Adjust dg-error. * testsuite/29_atomics/atomic_float/1.cc: New test. * testsuite/29_atomics/atomic_float/requirements.cc: New test. * testsuite/29_atomics/atomic_ref/deduction.cc: New test. * testsuite/29_atomics/atomic_ref/float.cc: New test. * testsuite/29_atomics/atomic_ref/generic.cc: New test. * testsuite/29_atomics/atomic_ref/integral.cc: New test. * testsuite/29_atomics/atomic_ref/pointer.cc: New test. * testsuite/29_atomics/atomic_ref/requirements.cc: New test. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@273420 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h index e30caef91bf..146e70a9f2e 100644 --- a/libstdc++-v3/include/bits/atomic_base.h +++ b/libstdc++-v3/include/bits/atomic_base.h @@ -35,6 +35,7 @@ #include #include #include +#include #ifndef _GLIBCXX_ALWAYS_INLINE #define _GLIBCXX_ALWAYS_INLINE inline __attribute__((__always_inline__)) @@ -817,6 +818,876 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return __atomic_fetch_sub(&_M_p, _M_type_size(__d), int(__m)); } }; +#if __cplusplus > 201703L + // Implementation details of atomic_ref and atomic&lt;floating-point&gt;. + namespace __atomic_impl + { +// Remove volatile and create a non-deduced context for value arguments. +template&lt;typename _Tp&gt; + using _Val = remove_volatile_t&lt;_Tp&gt;; + +// As above, but for difference_type arguments. +template&lt;typename _Tp&gt; + using _Diff = conditional_t&lt;is_pointer_v&lt;_Tp&gt;, ptrdiff_t, _Val&lt;_Tp&gt;&gt;; + +template&lt;size_t _Size, size_t _Align&gt; + _GLIBCXX_ALWAYS_INLINE bool + is_lock_free() noexcept + { + // Produce a fake, minimally aligned pointer. 
+ return __atomic_is_lock_free(_Size, reinterpret_cast&lt;void *&gt;(-_Align)); + } + +template&lt;typename _Tp&gt; + _GLIBCXX_ALWAYS_INLINE void + store(_Tp* __ptr, _Val&lt;_Tp&gt; __t, memory_order __m) noexcept + { __atomic_store(__ptr, std::__addressof(__t), int(__m)); } + +template&lt;typename _Tp&gt; + _GLIBCXX_ALWAYS_INLINE _Tp + load(_Tp* __ptr, memory_order __m) noexcept + { + alignas(_Tp) unsigned char __buf[sizeof(_Tp)]; + _Tp* __dest = reinterpret_cast&lt;_Tp*&gt;(__buf); + __atomic_load(__ptr, __dest, int(__m)); + return *__dest; + } + +template&lt;typename _Tp&gt; + _GLIBCXX_ALWAYS_INLINE _Tp + exchange(_Tp* __ptr, _Val&lt;_Tp&gt; __desired, memory_order __m) noexcept + { +alignas(_Tp) unsigned char __buf[sizeof(_Tp)]; + _Tp* __dest
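The `__atomic_impl` wrappers in this patch are thin veneers over the compiler's `__atomic` built-ins. Those built-ins can be called directly from C, which shows roughly what `atomic_ref` does to the plain object it refers to — this is a sketch of the mechanism, not libstdc++ code:

```c
/* Minimal sketch of what std::atomic_ref<int> arranges for a plain int:
   every access goes through the __atomic built-ins (GCC/Clang) instead
   of ordinary loads and stores.  */
int
atomic_load_int (const int *p)
{
  return __atomic_load_n (p, __ATOMIC_SEQ_CST);
}

void
atomic_store_int (int *p, int v)
{
  __atomic_store_n (p, v, __ATOMIC_SEQ_CST);
}

/* Returns the value *p held before the addition, like fetch_add.  */
int
atomic_fetch_add_int (int *p, int v)
{
  return __atomic_fetch_add (p, v, __ATOMIC_SEQ_CST);
}
```

The library's job on top of this is the type machinery (removing volatile, handling floating-point and pointer difference types) and the alignment bookkeeping that `is_lock_free` probes for.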
[PATCH] Improve docs for --enable-libstdcxx-time=rt
* doc/xml/manual/configure.xml: Improve documentation of --enable-libstdcxx-time option. Committed to trunk. commit 6a7d6b5ea112fd5c14d5af2394817178293ac934 Author: redi Date: Thu Jul 11 19:43:32 2019 + Improve docs for --enable-libstdcxx-time=rt * doc/xml/manual/configure.xml: Improve documentation of --enable-libstdcxx-time option. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@273421 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/libstdc++-v3/doc/xml/manual/configure.xml b/libstdc++-v3/doc/xml/manual/configure.xml index d296c8d8a49..58587e858a4 100644 --- a/libstdc++-v3/doc/xml/manual/configure.xml +++ b/libstdc++-v3/doc/xml/manual/configure.xml @@ -166,18 +166,24 @@ --enable-libstdcxx-time=OPTION Enables link-type checks for the availability of the - clock_gettime clocks, used in the implementation of [time.clock], - and of the nanosleep and sched_yield functions, used in the + clock_gettime clocks, used in the implementation + of [time.clock], and of the nanosleep and + sched_yield functions, used in the implementation of [thread.thread.this] of the 2011 ISO C++ standard. The choice OPTION=yes checks for the availability of the facilities in libc and libposix4. In case it's needed the latter is also linked - to libstdc++ as part of the build process. OPTION=rt also searches - (and, if needed, links) librt. Note that the latter is not always - desirable because, in glibc, for example, in turn it triggers the - linking of libpthread too, which activates locking, a large overhead - for single-thread programs. OPTION=no skips the tests completely. + to libstdc++ as part of the build process. OPTION=rt also checks in + librt (and, if it's needed, links to it). Note that linking to librt + is not always desirable because for glibc it requires linking to + libpthread too, which causes all reference counting to use atomic + operations, resulting in a potentially large overhead for + single-threaded programs. OPTION=no skips the tests completely. 
The default is OPTION=auto, which skips the checks and enables the features only for targets known to support them. + For Linux targets, if clock_gettime is not used + then the [time.clock] implementation will use a system call to access + the realtime and monotonic clocks, which is significantly slower than + the C library's clock_gettime function.
Re: [PATCH], PowerPC, Patch #6, revision 2, Create pc-relative addressing insns
On Wed, Jul 10, 2019 at 05:17:48PM -0400, Michael Meissner wrote: > +extern alias_set_type get_pc_relative_alias_set (void); I'd just call it "pcrel", not "pc_relative", just like everywhere else? > @@ -7785,7 +7787,7 @@ bool > toc_relative_expr_p (const_rtx op, bool strict, const_rtx *tocrel_base_ret, >const_rtx *tocrel_offset_ret) > { > - if (!TARGET_TOC) > + if (!TARGET_TOC || TARGET_PCREL) Maybe (TARGET_TOC && !TARGET_PCREL) should get a name of its own? Or maybe TARGET_TOC should not be enabled when TARGET_PCREL is? It doesn't make much sense at all to have both enabled, does it? > + /* If this is a SYMBOL_REF that we refer to via pc-relative addressing, > + we don't have to do any special for it. */ > + else if (TARGET_PCREL && SYMBOL_REF_P (operands[1]) > +&& TARGET_CMODEL == CMODEL_MEDIUM > +&& SYMBOL_REF_LOCAL_P (operands[1])) else if (TARGET_PCREL && TARGET_CMODEL == CMODEL_MEDIUM && SYMBOL_REF_P (operands[1]) && SYMBOL_REF_LOCAL_P (operands[1])) > @@ -9928,6 +9938,11 @@ rs6000_emit_move (rtx dest, rtx source, > operands[1] = gen_const_mem (mode, tocref); > set_mem_alias_set (operands[1], get_TOC_alias_set ()); > } > + > + else if (TARGET_PCREL && SYMBOL_REF_P (XEXP (operands[1], 0)) > +&& TARGET_CMODEL == CMODEL_MEDIUM > +&& SYMBOL_REF_LOCAL_P (XEXP (operands[1], 0))) > + set_mem_alias_set (operands[1], get_pc_relative_alias_set ()); Similar. > alias_set_type > get_TOC_alias_set (void) > { > - if (set == -1) > -set = new_alias_set (); > - return set; > + if (TOC_alias_set == -1) > +TOC_alias_set = new_alias_set (); > + return TOC_alias_set; > +} It would be nice if you could initialise the alias sets some other way. Not new code, but you duplicate it ;-) > +alias_set_type > +get_pc_relative_alias_set (void) > +{ > + if (pc_relative_alias_set == -1) > +pc_relative_alias_set = new_alias_set (); > + return pc_relative_alias_set; > } Rest looks fine. Segher
Re: [PATCH], PowerPC, Patch #6, revision 2, Create pc-relative addressing insns
On Thu, Jul 11, 2019 at 02:52:03PM -0500, Segher Boessenkool wrote: > On Wed, Jul 10, 2019 at 05:17:48PM -0400, Michael Meissner wrote: > > +extern alias_set_type get_pc_relative_alias_set (void); > > I'd just call it "pcrel", not "pc_relative", just like everywhere else? > > > @@ -7785,7 +7787,7 @@ bool > > toc_relative_expr_p (const_rtx op, bool strict, const_rtx *tocrel_base_ret, > > const_rtx *tocrel_offset_ret) > > { > > - if (!TARGET_TOC) > > + if (!TARGET_TOC || TARGET_PCREL) > > Maybe (TARGET_TOC && !TARGET_PCREL) should get a name of its own? Or > maybe TARGET_TOC should not be enabled when TARGET_PCREL is? It doesn't > make much sense at all to have both enabled, does it? Yeah, in theory TARGET_TOC should not be set if TARGET_PCREL is set, but when I last tried it, there were some failures. I can make a macro for the two together. > > + /* If this is a SYMBOL_REF that we refer to via pc-relative > > addressing, > > +we don't have to do any special for it. */ > > + else if (TARGET_PCREL && SYMBOL_REF_P (operands[1]) > > + && TARGET_CMODEL == CMODEL_MEDIUM > > + && SYMBOL_REF_LOCAL_P (operands[1])) > > else if (TARGET_PCREL > && TARGET_CMODEL == CMODEL_MEDIUM > && SYMBOL_REF_P (operands[1]) > && SYMBOL_REF_LOCAL_P (operands[1])) > > > @@ -9928,6 +9938,11 @@ rs6000_emit_move (rtx dest, rtx source, > > operands[1] = gen_const_mem (mode, tocref); > > set_mem_alias_set (operands[1], get_TOC_alias_set ()); > > } > > + > > + else if (TARGET_PCREL && SYMBOL_REF_P (XEXP (operands[1], 0)) > > + && TARGET_CMODEL == CMODEL_MEDIUM > > + && SYMBOL_REF_LOCAL_P (XEXP (operands[1], 0))) > > + set_mem_alias_set (operands[1], get_pc_relative_alias_set ()); > > Similar. I had it in a function that did the checking, but after eliminating the constant pool check that TOC needs, it was just doing the medium code model and symbol_ref local checks, so I eliminated the function. 
> > alias_set_type > > get_TOC_alias_set (void) > > { > > - if (set == -1) > > -set = new_alias_set (); > > - return set; > > + if (TOC_alias_set == -1) > > +TOC_alias_set = new_alias_set (); > > + return TOC_alias_set; > > +} > > It would be nice if you could initialise the alias sets some other way. > Not new code, but you duplicate it ;-) For the pc-relative case, I'm not sure we need the alias set. In the TOC case, we do a force_const_mem which returns a SYMBOL_REF pointing to the data in the constant pool. Then we call create_TOC_reference which rewrites the memory, and then it calls gen_const_mem which notes that memory in unchanging. But for pc-relative, we just use output of force_const_mem. Now force_const_mem does not seem to set the alias set (but it will have the constant pool bit set). > > +alias_set_type > > +get_pc_relative_alias_set (void) > > +{ > > + if (pc_relative_alias_set == -1) > > +pc_relative_alias_set = new_alias_set (); > > + return pc_relative_alias_set; > > } > > Rest looks fine. > > > Segher > -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Re: [PATCH], PowerPC, Patch #8, rename rs6000_prefixed_address
[ Please do not send patches as replies. ] On Thu, Jul 11, 2019 at 08:11:52AM -0400, Michael Meissner wrote: > In a future patch, I plan to introduce a new function that says whether an > address is prefixed based on the address format (D-form, DS-form, DQ-form) > used > by the instruction. I plan on naming the new function: > > rs6000_prefixed_address_format > > This means the existing function that takes a mode argument will be renamed > to: > > rs6000_prefixed_address_mode D/DQ/DS are the "instruction form", not "address format". Let's not invent new similar names :-/ > Rs6000_prefixed_address_mode will have a table mapping the mode into the > expected address format, and then call rs6000_prefixed_address_format. This > is > due to the number of irregularities in the PowerPC architecture: > > LWA is DS-form, but LWZ is D-form > LFD is D-form, LXSD is DS-form Great isn't it? > In the tests that will occur after register allocation, we can check the > register type, and more accurately determine if a prefixed instruction must be > used if the offset is odd (for DS-form) or the offset is not aligned on a > 16-byte boundary (for DQ-form). Is not 0 mod 4, for DS, even. > --- gcc/config/rs6000/predicates.md (revision 273368) > +++ gcc/config/rs6000/predicates.md (working copy) > @@ -1686,7 +1686,7 @@ (define_predicate "pcrel_external_addres > (define_predicate "prefixed_mem_operand" >(match_code "mem") > { > - return rs6000_prefixed_address (XEXP (op, 0), GET_MODE (op)); > + return rs6000_prefixed_address_mode (XEXP (op, 0), GET_MODE (op)); > }) That sounds like it returns a mode. rs6000_prefixed_address_mode_p or rs6000_is_prefixed_address_mode, maybe? Well you use _p already, so please do that here as well. Okay for trunk with that change. Thanks! Segher
Re: [PATCH], PowerPC, Patch #6, revision 2, Create pc-relative addressing insns
On Thu, Jul 11, 2019 at 04:09:44PM -0400, Michael Meissner wrote: > On Thu, Jul 11, 2019 at 02:52:03PM -0500, Segher Boessenkool wrote: > > On Wed, Jul 10, 2019 at 05:17:48PM -0400, Michael Meissner wrote: > > > +extern alias_set_type get_pc_relative_alias_set (void); > > > > I'd just call it "pcrel", not "pc_relative", just like everywhere else? > > > > > @@ -7785,7 +7787,7 @@ bool > > > toc_relative_expr_p (const_rtx op, bool strict, const_rtx > > > *tocrel_base_ret, > > >const_rtx *tocrel_offset_ret) > > > { > > > - if (!TARGET_TOC) > > > + if (!TARGET_TOC || TARGET_PCREL) > > > > Maybe (TARGET_TOC && !TARGET_PCREL) should get a name of its own? Or > > maybe TARGET_TOC should not be enabled when TARGET_PCREL is? It doesn't > > make much sense at all to have both enabled, does it? > > Yeah, in theory TARGET_TOC should not be set if TARGET_PCREL is set, but when > I > last tried it, there were some failures. I can make a macro for the two > together. TARGET_TOC is set in various header files (linux64.h etc.) While TARGET_PCREL is defined in rs6000.opt . Maybe rename the original TARGET_TOC definitions, and make a generic TARGET_TOC defined as (TARGET_CAN_HAVE_TOC && !TARGET_PCREL) ? Something like that? > > > + else if (TARGET_PCREL && SYMBOL_REF_P (operands[1]) > > > +&& TARGET_CMODEL == CMODEL_MEDIUM > > > +&& SYMBOL_REF_LOCAL_P (operands[1])) > > > > else if (TARGET_PCREL > >&& TARGET_CMODEL == CMODEL_MEDIUM > >&& SYMBOL_REF_P (operands[1]) > >&& SYMBOL_REF_LOCAL_P (operands[1])) > I had it in a function that did the checking, but after eliminating the > constant pool check that TOC needs, it was just doing the medium code model > and > symbol_ref local checks, so I eliminated the function. My point is that you should put like things together, and you should not put unrelated conditions on one line. You could put the SYMBOL_REF_P and SYMBOL_REF_LOCAL_P on one line, I wouldn't mind (except that doesn't fit on one line then ;-) ). 
> > > alias_set_type > > > get_TOC_alias_set (void) > > > { > > > - if (set == -1) > > > -set = new_alias_set (); > > > - return set; > > > + if (TOC_alias_set == -1) > > > +TOC_alias_set = new_alias_set (); > > > + return TOC_alias_set; > > > +} > > > > It would be nice if you could initialise the alias sets some other way. > > Not new code, but you duplicate it ;-) > > For the pc-relative case, I'm not sure we need the alias set. In the TOC > case, > we do a force_const_mem which returns a SYMBOL_REF pointing to the data in the > constant pool. Then we call create_TOC_reference which rewrites the memory, > and then it calls gen_const_mem which notes that memory in unchanging. > > But for pc-relative, we just use output of force_const_mem. Now > force_const_mem does not seem to set the alias set (but it will have the > constant pool bit set). I mean just this: if (TOC_alias_set == -1) TOC_alias_set = new_alias_set (); in the getter function. It could be initialised elsewhere, there are enough initialisation functions, it will fit *somewhere*, and then the getter function doesn't need to do the initialisation anymore. Which then suddenly makes it very inlinable, etc. Segher
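The initialisation point Segher raises can be sketched in isolation: a lazy getter carries a branch on every call, while one initialised from an existing init hook collapses to a load and inlines trivially. All names below are stand-ins with a fake `new_alias_set`, not rs6000 code:

```c
typedef int alias_set_type;

static alias_set_type next_set = 1;

/* Stand-in for GCC's new_alias_set ().  */
alias_set_type
new_alias_set (void)
{
  return next_set++;
}

/* Current style: lazily allocate on first use.  Every caller pays for
   the test, and the function is a worse inlining candidate.  */
static alias_set_type TOC_alias_set = -1;

alias_set_type
get_TOC_alias_set_lazy (void)
{
  if (TOC_alias_set == -1)
    TOC_alias_set = new_alias_set ();
  return TOC_alias_set;
}

/* Suggested style: allocate once from an initialisation function; the
   getter is then a plain load.  */
static alias_set_type pcrel_alias_set;

void
init_alias_sets (void)
{
  pcrel_alias_set = new_alias_set ();
}

alias_set_type
get_pcrel_alias_set (void)
{
  return pcrel_alias_set;
}
```

Either way the key property holds: repeated calls return the same set, allocated exactly once.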
Go patch committed: Ensure evaluation order in type hash/eq functions
This Go frontend patch by Cherry Zhang ensures correct evaluation order in type hash/equality functions. The type hash and equality functions are generated after the order_evaluations pass. They may contain shortcut operators and Set_and_use_temporary_expressions (e.g. from lowering a Binary_expression) that need to be ordered. Run order_evaluations and remove_shortcuts on these functions. (The hash functions may be fine, but to be on the safe side we run on them anyway. We do need to run on the equality functions.) A Set_and_use_temporary_expression is effectively an assignment, so it needs to be ordered. Otherwise if we insert a temporary statement before it, we may get wrong evaluation order. A test case is https://golang.org/cl/185818. This fixes https://golang.org/issue/33062. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 273364) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -ec754ff4617d564d3dc377121ea9ac5e55f6535a +70ceba5e95716653b9f829a457a44a829175d4da The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/expressions.h === --- gcc/go/gofrontend/expressions.h (revision 273307) +++ gcc/go/gofrontend/expressions.h (working copy) @@ -1628,6 +1628,10 @@ class Set_and_use_temporary_expression : } bool + do_must_eval_in_order() const + { return true; } + + bool do_is_addressable() const { return true; } Index: gcc/go/gofrontend/gogo.cc === --- gcc/go/gofrontend/gogo.cc (revision 273364) +++ gcc/go/gofrontend/gogo.cc (working copy) @@ -4097,6 +4097,15 @@ Gogo::order_evaluations() this->traverse(&order_eval); } +// Order evaluations in a block. + +void +Gogo::order_block(Block* block) +{ + Order_eval order_eval(this); + block->traverse(&order_eval); +} + // A traversal class used to find a single shortcut operator within an // expression. 
@@ -4306,6 +4315,15 @@ Gogo::remove_shortcuts() this->traverse(&shortcuts); } +// Turn shortcut operators into explicit if statements in a block. + +void +Gogo::remove_shortcuts_in_block(Block* block) +{ + Shortcuts shortcuts(this); + block->traverse(&shortcuts); +} + // Traversal to flatten parse tree after order of evaluation rules are applied. class Flatten : public Traverse Index: gcc/go/gofrontend/gogo.h === --- gcc/go/gofrontend/gogo.h(revision 273364) +++ gcc/go/gofrontend/gogo.h(working copy) @@ -749,10 +749,18 @@ class Gogo void remove_shortcuts(); + // Turn short-cut operators into explicit if statements in a block. + void + remove_shortcuts_in_block(Block*); + // Use temporary variables to force order of evaluation. void order_evaluations(); + // Order evaluations in a block. + void + order_block(Block*); + // Add write barriers as needed. void add_write_barriers(); Index: gcc/go/gofrontend/types.cc === --- gcc/go/gofrontend/types.cc (revision 273307) +++ gcc/go/gofrontend/types.cc (working copy) @@ -2098,6 +2098,8 @@ Type::write_specific_type_functions(Gogo Block* b = gogo->finish_block(bloc); gogo->add_block(b, bloc); gogo->lower_block(hash_fn, b); + gogo->order_block(b); + gogo->remove_shortcuts_in_block(b); gogo->finish_function(bloc); Named_object *equal_fn = gogo->start_function(equal_name, equal_fntype, @@ -2119,6 +2121,8 @@ Type::write_specific_type_functions(Gogo b = gogo->finish_block(bloc); gogo->add_block(b, bloc); gogo->lower_block(equal_fn, b); + gogo->order_block(b); + gogo->remove_shortcuts_in_block(b); gogo->finish_function(bloc); // Build the function descriptors for the type descriptor to refer to.
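What remove_shortcuts does can be shown in C terms: a short-circuit operator hides control flow, and lowering it to an explicit if makes the point where an ordering temporary may be inserted unambiguous. The function names here are illustrative, not frontend code:

```c
/* Counter so the tests below can observe whether the right-hand side
   was evaluated.  */
int rhs_calls;

int
counting_rhs (void)
{
  ++rhs_calls;
  return 2;
}

/* Before lowering: && hides a control dependency -- rhs () must not
   run when lhs is false.  A temporary hoisted "before the whole
   expression" would wrongly evaluate on both paths.  */
int
and_shortcut (int lhs, int (*rhs) (void))
{
  return lhs && rhs ();
}

/* After lowering: the same computation as an explicit if statement,
   the form remove_shortcuts produces.  Any temporary now clearly
   belongs to exactly one arm.  */
int
and_lowered (int lhs, int (*rhs) (void))
{
  int result;
  if (lhs)
    result = rhs () != 0;
  else
    result = 0;
  return result;
}
```

Both forms agree on values and on when the right-hand side runs, which is exactly the property the ordering pass must preserve.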
Re: [PATCH] integrate sprintf pass into strlen (PR 83431)
... We've already touched on the need to limit the backwards walk out of
this code.  Limiting based on the product of # phi args * number of phis
visited, recursion depth, or number of SSA_NAME_DEF_STMTs visited all
seem reasonable to me.  I do think Richi's suggestion for figuring out a
suitable inflection point is reasonable.

It's easy enough to add here.  But I know I've introduced other
algorithms that recurse on SSA_NAME_DEF_STMT, and I'm quite sure others
predate those.  To make a difference I think we need to review at least
the ones most likely to be exposed to this problem and introduce the
same limit there.  I could probably fix the ones I wrote reasonably
quickly, but making the change to the others would be much more of a
project.

I looked to see how pervasive this might be, and here is just a small
sampling of things that jumped out at me in a quick grep search:

  * compute_builtin_object_size (for _FORTIFY_SOURCE)
  * compute_objsize (for -Wstringop-overflow)
  * get_range_strlen
  * maybe_fold_{and,or}_comparisons in gimple-fold.c
  * -Warray-bounds (computing an offset into an object)
  * -Wrestrict (computing an offset into an object)
  * -Wreturn-local-addr (is_addr_local)
  * -Wuninitialized (collect_phi_def_edges)

I don't see the recursion in maybe_fold_{and,or}_comparisons.

I didn't study any of these very carefully, I just looked for uses of
SSA_NAME_DEF_STMT around recursive calls.  same_bool_comparison_p looks
like it calls itself with that result.  I could be wrong.

The others all smell like they might be yours ;) (besides object-size
maybe, but that one is limited by having a cache, hopefully also used
when not in the compute-everything mode).

It doesn't matter who introduced them.  What I'm saying is that if the
recursion is a problem it should be limited everywhere, not just in one
place.
A change with global scope like that would best be made independently of
other unrelated changes, to minimize the risk of introducing bugs and
then the difficulty of debugging and fixing them.  Given how widespread
this technique seems to be, if the recursion is in fact a problem it's
just as likely (if not more so) to come up in the folder or in BOS or
some other place as it is here.  So if it needs fixing, it seems it
should be done as its own project and everywhere (or as close as we can
get), not as part of this integration.

It's never an excuse to add new cases though, and adding a limit is
_really_ simple.

An "excuse?"  I'm just explaining that I would prefer to introduce the
limit independently of the integration.

Sure, it is simple in that one place.  I said that above.  It's also
probably relatively simple in each of the other instances (at least
those I'm familiar with).  It's just cleaner and safer to add it
independently of other changes, that's all.

Anyway, I posted a simple patch to introduce the limit.  I'm also
changing the integrated pass to make use of it once it's committed.  If
it's considered necessary, I'm also willing to adjust the rest of the
algorithms I introduced to respect the limit.

Thanks
Martin
Re: [PATCH] improve ifcvt optimization (PR rtl-optimization/89430)
On Mon, Jul 8, 2019 at 12:39 AM JiangNing OS wrote:
>
> Hi Jeff and Richard B.,
>
> Following your tips, I've found a much simpler solution in
> tree-ssa-phiopt.c.  Attached is the new patch.  Review again, please!

  /* Prove that we can move the store down.  We could also check
     TREE_THIS_NOTRAP here, but in that case we also could move stores,
     whose value is not available readily, which we want to avoid.  */

I think the comment above the change needs to be changed or extended
slightly.

Thanks,
Andrew Pinski

> Thanks a lot!
> -Jiangning
>
> > -----Original Message-----
> > From: Jeff Law
> > Sent: Saturday, June 22, 2019 12:10 AM
> > To: Richard Biener
> > Cc: JiangNing OS ; gcc-patc...@gcc.gnu.org
> > Subject: Re: [PATCH] improve ifcvt optimization (PR
> > rtl-optimization/89430)
> >
> > On 6/21/19 6:23 AM, Richard Biener wrote:
> > >
> > > That looks like a pretty easy form to analyze.  I'd suggest looking
> > > through tree-ssa-phiopt.c closely.  There's several transformations
> > > in there that share similarities with yours.
> > >
> > > More specifically there is the conditional store elimination
> > > (cselim) pass inside this file.
> > That's the one I was thinking of.  It looks reasonably close to the
> > cases JiangNing is trying to capture.
> >
> > >> 2) My current solution fits into current back-end if-conversion
> > >> pass very well.  I don't want to invent a new framework to solve
> > >> this relatively small issue.  Besides, this back-end patch doesn't
> > >> only enhance store speculation detection, but also fixes a bug in
> > >> the original code.
> > > Understood, but I still wonder if we're better off addressing this
> > > in gimple.
> > >
> > > Traditionally if-conversion's profitability heavily depends on the
> > > target, and esp. if memory is involved, costing on GIMPLE is hard.
> > > That's also one reason why we turned off loop if-conversion on
> > > GIMPLE for non-vectorized code.
> > Yea.  That's always the concern for transformations that aren't
> > trivial to show are better.
> >
> > Conditional store elimination as implemented right now in phiopt
> > requires a single store in the then/else clauses.  So we're really
> > just sinking the stores through a PHI.  We're really not changing the
> > number of loads or stores on any path.
> >
> > In the cases I think JiangNing is trying to handle we are adding a
> > store on a path where it didn't previously exist because there is no
> > else clause.  So we likely need to check the edge probabilities to
> > avoid doing something dumb.  I don't have a good sense where the
> > cutoff should be.  tree-ssa-sink might have some nuggets of
> > information to help guide.
> >
> > > phiopt is really about followup optimization opportunities in
> > > passes that do not handle conditionals well.
> > >
> > > cselim is on the border...
> > ACK.  In fact, looking at it I wonder if it'd be better in
> > tree-ssa-sink.c :-)
> >
> > jeff
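[Editorial note: the transformation being debated in this thread can be shown concretely in C. Before, the store happens on only one path; after speculative store conversion it happens unconditionally, which is why writability of *p and the probability of the store path both matter. A hand-written before/after sketch with hypothetical function names, not the pass's actual output:]

```c
/* Original: the store to *p is conditional, so the !cond path performs
   no memory write at all.  */
void set_cond(int cond, int *p, int v)
{
  if (cond)
    *p = v;
}

/* After speculative store conversion: *p is read and unconditionally
   written back, removing the branch.  This is only legal when *p is
   known to be writable and non-trapping (e.g. it was already
   dereferenced on the path, the cselim-style condition), and only
   profitable when the store path is likely enough, hence the edge
   probability concern above.  */
void set_speculative(int cond, int *p, int v)
{
  int old = *p;
  *p = cond ? v : old;
}
```

Note the speculative form also turns a read-free path into a read-then-write path, which is visible to other threads; that is part of why the analysis has to be conservative.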
Re: [PATCH] Add hints for slim dumping if fallthrough bb of jump isn't next bb
On 2019/7/11 9:55 PM, Richard Sandiford wrote:
> Kewen.Lin writes:
>> +	  /* Emit a hint if the fallthrough target of current basic block
>> +	     isn't the one placed right next.  */
>> +	  else if (EDGE_COUNT (bb->succs) > 0)
>> +	    {
>> +	      gcc_assert (BB_END (bb) == tmp_rtx);
>> +	      const rtx_insn *ninsn = NEXT_INSN (tmp_rtx);
>> +	      /* Bypass intervening deleted-insn notes and debug insns.  */
>> +	      while (ninsn && !NONDEBUG_INSN_P (ninsn)
>> +		     && !start[INSN_UID (ninsn)])
>
> Just a cosmetic thing, but when the full expression needs to be split
> over several lines, there should be one condition per line:
>
>	while (ninsn
>	       && !NONDEBUG_INSN_P (ninsn)
>	       && !start[INSN_UID (ninsn)])
>
> OK with that change, thanks.

Thanks Richard!  Regression tested and bootstrapped on
powerpc64le-unknown-linux-gnu.  Committed as r273430 with above change.

Kewen
Re: [ARM/FDPIC v5 05/21] [ARM] FDPIC: Fix __do_global_dtors_aux and frame_dummy generation
Christophe Lyon writes:
> In FDPIC, we need to make sure __do_global_dtors_aux and frame_dummy
> are referenced by their address, not by pointers to the function
> descriptors.
>
> 2019-XX-XX  Christophe Lyon
>	      Mickaël Guêné
>
>	* libgcc/crtstuff.c: Add support for FDPIC.
>
> Change-Id: I0bc4b1232fbf3c69068fb23a1b9cafc895d141b1
>
> diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
> index 4927a9f..159b461 100644
> --- a/libgcc/crtstuff.c
> +++ b/libgcc/crtstuff.c
> @@ -429,9 +429,18 @@ __do_global_dtors_aux (void)
>  #ifdef FINI_SECTION_ASM_OP
>  CRT_CALL_STATIC_FUNCTION (FINI_SECTION_ASM_OP, __do_global_dtors_aux)
>  #elif defined (FINI_ARRAY_SECTION_ASM_OP)
> +#if defined(__FDPIC__)
> +__asm__(
> +"	.section .fini_array\n"
> +"	.align 2\n"
> +"	.word __do_global_dtors_aux\n"
> +);
> +asm (TEXT_SECTION_ASM_OP);
> +#else /* defined(__FDPIC__) */
>  static func_ptr __do_global_dtors_aux_fini_array_entry[]
>    __attribute__ ((__used__, section(".fini_array"),
>		    aligned(sizeof(func_ptr))))
>    = { __do_global_dtors_aux };
> +#endif /* defined(__FDPIC__) */
>  #else /* !FINI_SECTION_ASM_OP && !FINI_ARRAY_SECTION_ASM_OP */
>  static void __attribute__((used))
>  __do_global_dtors_aux_1 (void)

It'd be good to avoid hard-coding the pointer size.  Would it work to
do:

__asm__("\t.equ\t__do_global_dtors_aux_alias, __do_global_dtors_aux\n");
extern char __do_global_dtors_aux_alias;
static void *__do_global_dtors_aux_fini_array_entry[]
  __attribute__ ((__used__, section(".fini_array"),
		  aligned(sizeof(void *))))
  = { &__do_global_dtors_aux_alias };

?  Similarly for the init_array.

AFAICT this and 02/21 are the only patches that aren't Arm-specific, is
that right?

Thanks,
Richard
Re: [ARM/FDPIC v5 02/21] [ARM] FDPIC: Handle arm*-*-uclinuxfdpiceabi in configure scripts
Christophe Lyon writes:
> The new arm-uclinuxfdpiceabi target behaves pretty much like
> arm-linux-gnueabi.  In order to enable the same set of features, we
> have to update several configure scripts that generally match targets
> like *-*-linux*: in most places, we add *-uclinux* where there is
> already *-linux*, or uclinux* when there is already linux*.
>
> In gcc/config.gcc and libgcc/config.host we use *-*-uclinuxfdpiceabi
> because there is already a different behaviour for *-*uclinux*
> targets.
>
> In libtool.m4, we use uclinuxfdpiceabi in cases where ELF shared
> library support is required, as uclinux does not guarantee that.
>
> 2019-XX-XX  Christophe Lyon
>
>	config/
>	* futex.m4: Handle *-uclinux*.
>	* tls.m4 (GCC_CHECK_TLS): Likewise.
>
>	gcc/
>	* config.gcc: Handle *-*-uclinuxfdpiceabi.
>
>	libatomic/
>	* configure.tgt: Handle arm*-*-uclinux*.
>	* configure: Regenerate.
>
>	libgcc/
>	* config.host: Handle *-*-uclinuxfdpiceabi.
>
>	libitm/
>	* configure.tgt: Handle *-*-uclinux*.
>	* configure: Regenerate.
>
>	libstdc++-v3/
>	* acinclude.m4: Handle uclinux*.
>	* configure: Regenerate.
>	* configure.host: Handle uclinux*.
>
>	* libtool.m4: Handle uclinux*.

Has the libtool.m4 patch been submitted to upstream libtool?  I think
this is supposed to be handled by submitting there first and then
cherry-picking into gcc, so that the change isn't lost by a future
import.

> [...]
>
> diff --git a/config/tls.m4 b/config/tls.m4
> index 1a5fc59..a487aa4 100644
> --- a/config/tls.m4
> +++ b/config/tls.m4
> @@ -76,7 +76,7 @@ AC_DEFUN([GCC_CHECK_TLS], [
>    dnl Shared library options may depend on the host; this check
>    dnl is only known to be needed for GNU/Linux.
>    case $host in
> -    *-*-linux*)
> +    *-*-linux* | -*-uclinux*)
>        LDFLAGS="-shared -Wl,--no-undefined $LDFLAGS"
>        ;;
>    esac

Is this right for all uclinux targets?
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index 84258d8..cb0fdc5 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4

It'd probably be worth splitting out the libstdc++-v3 bits and
submitting them separately, cc:ing libstd...@gcc.gnu.org.  But...

> @@ -1404,7 +1404,7 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
>        ac_has_nanosleep=yes
>        ac_has_sched_yield=yes
>        ;;
> -    gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
> +    gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
>        AC_MSG_CHECKING([for at least GNU libc 2.17])
>        AC_TRY_COMPILE(
>          [#include ],

...is this the right thing to do?  It seems odd to be testing the glibc
version for uclibc.  Do you want to support multiple possible settings
of ac_has_clock_monotonic and ac_has_clock_realtime?  Or could you just
hard-code the values, given particular baseline assumptions about the
version of uclibc etc.?  Hard-coding would then make

> @@ -1526,7 +1526,7 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
>
>    if test x"$ac_has_clock_monotonic" != x"yes"; then
>      case ${target_os} in
> -      linux*)
> +      linux* | uclinux*)
>        AC_MSG_CHECKING([for clock_gettime syscall])
>        AC_TRY_COMPILE(
>          [#include

...this redundant.

> @@ -2415,7 +2415,7 @@ AC_DEFUN([GLIBCXX_ENABLE_CLOCALE], [
>    # Default to "generic".
>    if test $enable_clocale_flag = auto; then
>      case ${target_os} in
> -      linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
> +      linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
>          enable_clocale_flag=gnu
>          ;;
>        darwin*)

This too seems to be choosing a glibc setting for a uclibc target.

> @@ -2661,7 +2661,7 @@ AC_DEFUN([GLIBCXX_ENABLE_ALLOCATOR], [
>    # Default to "new".
>    if test $enable_libstdcxx_allocator_flag = auto; then
>      case ${target_os} in
> -      linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
> +      linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
>          enable_libstdcxx_allocator_flag=new
>          ;;
>        *)

The full case is:
# Default to "new". if test $enable_libstdcxx_allocator_flag = auto; then case ${target_os} in linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu) enable_libstdcxx_allocator_flag=new ;; *) enable_libstdcxx_allocator_flag=new ;; esac fi which looks a bit redundant :-) Thanks, Richard