date:20180116

[PATCH] Fix PR82132

2018-01-16 Thread Richard Biener


Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-01-16  Richard Biener  

PR testsuite/82132
* gcc.dg/vect/vect-tail-nomask-1.c: Copy posix_memalign boiler-plate
from gcc.dg/torture/pr60092.c.

Index: gcc/testsuite/gcc.dg/vect/vect-tail-nomask-1.c
===
--- gcc/testsuite/gcc.dg/vect/vect-tail-nomask-1.c  (revision 256722)
+++ gcc/testsuite/gcc.dg/vect/vect-tail-nomask-1.c  (working copy)
@@ -1,5 +1,9 @@
 /* { dg-do run } */
 /* { dg-require-weak "" } */
+/* { dg-skip-if "No undefined weak" { hppa*-*-hpux* && { ! lp64 } } } */
+/* { dg-skip-if "No undefined weak" { nvptx-*-* } } */
+/* { dg-additional-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } } */
+/* { dg-additional-options "-Wl,-flat_namespace" { target *-*-darwin[89]* } } */
 /* { dg-additional-options "--param vect-epilogues-nomask=1 -mavx2" { target avx2_runtime } } */
 
 #define SIZE 1023

Re: [PATCH] Preserve CROSSING_JUMP_P in peephole2 (PR rtl-optimization/83213)

2018-01-16 Thread Richard Biener

On Mon, 15 Jan 2018, Jakub Jelinek wrote:

> Hi!
> 
> On the testcase in the PR (too large and creduce not making sufficient
> progress) we ICE because i386.md:
> ;; Combining simple memory jump instruction
> 
> (define_peephole2
>   [(set (match_operand:W 0 "register_operand")
> (match_operand:W 1 "memory_operand"))
>(set (pc) (match_dup 0))]
>   "!TARGET_X32
>&& !ix86_indirect_branch_thunk_register
>&& peep2_reg_dead_p (2, operands[0])"
>   [(set (pc) (match_dup 1))])
> 
> peephole2 triggers on a CROSSING_JUMP_P jump, but nothing actually
> copies that bit over from the old to the new JUMP_INSN.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Ok.

Richard.

> 2018-01-15  Jakub Jelinek  
> 
>   PR rtl-optimization/83213
>   * recog.c (peep2_attempt): Copy over CROSSING_JUMP_P from peepinsn
>   to last if both are JUMP_INSNs.
> 
> --- gcc/recog.c.jj   2018-01-09 08:58:14.594002069 +0100
> +++ gcc/recog.c   2018-01-15 16:37:13.279196178 +0100
> @@ -3446,6 +3446,8 @@ peep2_attempt (basic_block bb, rtx_insn
>last = emit_insn_after_setloc (attempt,
>peep2_insn_data[i].insn,
>INSN_LOCATION (peepinsn));
> +  if (JUMP_P (peepinsn) && JUMP_P (last))
> +CROSSING_JUMP_P (last) = CROSSING_JUMP_P (peepinsn);
>before_try = PREV_INSN (insn);
>delete_insn_chain (insn, peep2_insn_data[i].insn, false);
>  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
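
The fix discussed in this message can be sketched outside of GCC as follows. This is only a toy model: struct toy_insn, enum insn_kind and copy_crossing_flag are hypothetical names invented for illustration and are not GCC's rtx_insn, JUMP_P or CROSSING_JUMP_P. It shows the one rule the patch adds: when a replacement insn is emitted for a matched jump, the crossing flag is carried over, but only if both insns are jumps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an RTL insn; not GCC's rtx_insn.  */
enum insn_kind { KIND_INSN, KIND_JUMP };

struct toy_insn
{
  enum insn_kind kind;
  bool crossing_jump;	/* Models CROSSING_JUMP_P; only meaningful on jumps.  */
};

/* Model of the fix: after the replacement sequence has been emitted, carry
   the crossing flag from the matched jump (OLD_INSN) over to the last
   emitted insn (NEW_INSN), but only when both of them are jumps.  */
static void
copy_crossing_flag (const struct toy_insn *old_insn, struct toy_insn *new_insn)
{
  assert (old_insn != NULL && new_insn != NULL);
  if (old_insn->kind == KIND_JUMP && new_insn->kind == KIND_JUMP)
    new_insn->crossing_jump = old_insn->crossing_jump;
}
```

Without such a copy, the new jump in this model would keep its default (false) flag even though it still crosses the same section boundary as the jump it replaced.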

Re: [PATCH] Fix PR83435

2018-01-16 Thread Richard Biener

On Mon, 15 Jan 2018, Szabolcs Nagy wrote:

> On 11/01/18 13:41, Richard Biener wrote:
> > 2018-01-11  Richard Biener  
> > 
> > PR tree-optimization/83435
> > * graphite.c (canonicalize_loop_form): Ignore fake loop exit edges.
> > * graphite-scop-detection.c (scop_detection::get_sese): Likewise.
> > * tree-vrp.c (add_assert_info): Drop TREE_OVERFLOW if they appear.
> > 
> > * gcc.dg/graphite/pr83435.c: New testcase.
> 
> this test case fails on baremetal targets for me with
> 
> xgcc: error: unrecognized command line option '-pthread'

Fixed as follows.

Richard.

2018-01-16  Richard Biener  

* gcc.dg/graphite/pr83435.c: Restrict to target pthread.

Index: gcc/testsuite/gcc.dg/graphite/pr83435.c
===
--- gcc/testsuite/gcc.dg/graphite/pr83435.c (revision 256722)
+++ gcc/testsuite/gcc.dg/graphite/pr83435.c (working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target pthread } } */
 /* { dg-options "-O -ftree-parallelize-loops=2 -floop-parallelize-all" } */
 
 int yj, ax;

> 
> > Index: gcc/testsuite/gcc.dg/graphite/pr83435.c
> > ===
> > --- gcc/testsuite/gcc.dg/graphite/pr83435.c (nonexistent)
> > +++ gcc/testsuite/gcc.dg/graphite/pr83435.c (working copy)
> > @@ -0,0 +1,25 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -ftree-parallelize-loops=2 -floop-parallelize-all" } */
> > +
> > +int yj, ax;
> > +
> > +void
> > +gf (signed char mp)
> > +{
> > +  int *dh = &yj;
> > +
> > +  for (;;)
> > +{
> > +  signed char sb;
> > +
> > +  for (sb = 0; sb < 1; sb -= 8)
> > +   {
> > +   }
> > +
> > +  mp &= mp <= sb;
> > +  if (mp == 0)
> > +   dh = &ax;
> > +  mp = 0;
> > +  *dh = 0;
> > +}
> > +}
> > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: Don't group gather loads (PR83847)

2018-01-16 Thread Richard Biener

On Mon, Jan 15, 2018 at 4:12 PM, Richard Sandiford
 wrote:
> In the testcase we were trying to group two gather loads, even though
> that isn't supported.  Fixed by explicitly disallowing grouping of
> gathers and scatters.
>
> This problem didn't show up on SVE because there we convert to
> IFN_GATHER_LOAD/IFN_SCATTER_STORE pattern statements, which fail
> the can_group_stmts_p check.
>
> Tested on x86_64-linux-gnu.  OK to install?

Ok.

Richard.

> Richard
>
>
> 2018-01-15  Richard Sandiford  
>
> gcc/
> * tree-vect-data-refs.c (vect_analyze_data_ref_accesses):
>
> gcc/testsuite/
> * gcc.dg/torture/pr83847.c: New test.
>
> Index: gcc/tree-vect-data-refs.c
> ===
> --- gcc/tree-vect-data-refs.c   2018-01-13 18:02:00.948360274 +
> +++ gcc/tree-vect-data-refs.c   2018-01-15 12:22:47.066621712 +
> @@ -2923,7 +2923,8 @@ vect_analyze_data_ref_accesses (vec_info
>data_reference_p dra = datarefs_copy[i];
>stmt_vec_info stmtinfo_a = vinfo_for_stmt (DR_STMT (dra));
>stmt_vec_info lastinfo = NULL;
> -  if (! STMT_VINFO_VECTORIZABLE (stmtinfo_a))
> +  if (!STMT_VINFO_VECTORIZABLE (stmtinfo_a)
> + || STMT_VINFO_GATHER_SCATTER_P (stmtinfo_a))
> {
>   ++i;
>   continue;
> @@ -2932,7 +2933,8 @@ vect_analyze_data_ref_accesses (vec_info
> {
>   data_reference_p drb = datarefs_copy[i];
>   stmt_vec_info stmtinfo_b = vinfo_for_stmt (DR_STMT (drb));
> - if (! STMT_VINFO_VECTORIZABLE (stmtinfo_b))
> + if (!STMT_VINFO_VECTORIZABLE (stmtinfo_b)
> + || STMT_VINFO_GATHER_SCATTER_P (stmtinfo_b))
> break;
>
>   /* ???  Imperfect sorting (non-compatible types, non-modulo
> Index: gcc/testsuite/gcc.dg/torture/pr83847.c
> ===
> --- /dev/null   2018-01-12 06:40:27.684409621 +
> +++ gcc/testsuite/gcc.dg/torture/pr83847.c  2018-01-15 12:22:47.064621805 +
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=bdver4" { target i?86-*-* x86_64-*-* } } */
> +
> +typedef struct {
> +  struct {
> +int a;
> +int b;
> +  } c;
> +} * d;
> +typedef struct {
> +  unsigned e;
> +  d f[];
> +} g;
> +g h;
> +d *k;
> +int i(int j) {
> +  if (j) {
> +*k = *h.f;
> +return 1;
> +  }
> +  return 0;
> +}
> +int l;
> +int m;
> +int n;
> +d o;
> +void p() {
> +  for (; i(l); l++) {
> +n += o->c.a;
> +m += o->c.b;
> +  }
> +}
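
The effect of the two added STMT_VINFO_GATHER_SCATTER_P checks can be sketched with a small stand-alone model. Everything here is hypothetical (struct toy_ref, groupable_p, assign_groups) and much simpler than vect_analyze_data_ref_accesses: references are assumed to be pre-sorted, and two references are grouping candidates when they share a base. The point is only that a gather/scatter reference neither starts a group nor joins one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a data reference; not GCC's data_reference.  */
struct toy_ref
{
  int base;		/* Refs with equal base are grouping candidates.  */
  bool vectorizable;	/* Models STMT_VINFO_VECTORIZABLE.  */
  bool gather_scatter;	/* Models STMT_VINFO_GATHER_SCATTER_P.  */
  int group;		/* Output: group id, or -1 if left ungrouped.  */
};

/* A ref may start or join a group only if it is vectorizable and is not
   a gather or scatter.  */
static bool
groupable_p (const struct toy_ref *r)
{
  return r->vectorizable && !r->gather_scatter;
}

/* Walk REFS (assumed sorted by base) and give a common group id to each run
   of two or more adjacent groupable refs with the same base.  Returns the
   number of groups formed.  */
static int
assign_groups (struct toy_ref *refs, size_t n)
{
  int next_group = 0;
  size_t i = 0;

  assert (refs != NULL || n == 0);
  for (size_t k = 0; k < n; k++)
    refs[k].group = -1;

  while (i < n)
    {
      if (!groupable_p (&refs[i]))
	{
	  ++i;
	  continue;
	}
      size_t j = i + 1;
      while (j < n && groupable_p (&refs[j]) && refs[j].base == refs[i].base)
	++j;
      if (j - i > 1)
	{
	  for (size_t k = i; k < j; k++)
	    refs[k].group = next_group;
	  ++next_group;
	}
      i = j;
    }
  return next_group;
}
```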

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Richard Biener

On Mon, Jan 15, 2018 at 5:53 PM, H.J. Lu  wrote:
> On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
>> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
>>  wrote:
>>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
>>>> Now my patch set has been checked into trunk.  Here is a patch set
>>>> to move struct ix86_frame to machine_function on GCC 7, which is
>>>> needed to backport the patch set to GCC 7:
>>>>
>>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
>>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
>>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
>>>>
>>>> OK for gcc-7-branch?
>>>
>>> Yes, backporting is ok - please watch for possible fallout on trunk and make
>>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
>>> Wednesday now with the final release about a week later if no issue shows
>>> up.
>>>
>>
>> Backport is blocked by
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
>>
>> There are many test failures due to lack of comdat support in linker on 
>> Solaris.
>> I can limit these tests to Linux.
>
> These are testcase issues and shouldn't block backport to GCC 7.

It makes the option using thunks unusable though, right?  Can you simply make
them hidden on systems without comdat support?  That duplicates them per TU
but at least the feature works.  Or those systems should provide the thunks via
libgcc.

I agree we can follow up with a fix for Solaris given the lack of a public
testing machine.

Richard.

>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839
>>
>> Bootstrap failed on Darwin due to lack of ".set" directive in assembler.  I
>> uploaded a patch:
>>
>> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43124
>>
>> There is no confirmation on it.  Also there may be test failures on Darwin
>> due to difference in assembly output.
>
> I posted a patch for Darwin build:
>
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01347.html
>
> This needs to be checked into trunk before I can start backport to GCC 7.
>
> --
> H.J.

Re: [PATCH, rs6000] Support for gimple folding of mergeh, mergel intrinsics

2018-01-16 Thread Richard Biener

On Mon, Jan 15, 2018 at 6:20 PM, Will Schmidt  wrote:
> On Mon, 2018-01-15 at 10:24 +, Richard Sandiford wrote:
>> >> +  for (int i = 0; i < midpoint; i++)
>> >> +{
>> >> +  tree tmp1 = build_int_cst (lhs_type_type, offset + i);
>> >> +  tree tmp2 = build_int_cst (lhs_type_type, offset + n_elts + i);
>> >> +  CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, tmp1);
>> >> +  CONSTRUCTOR_APPEND_ELT (ctor_elts, NULL_TREE, tmp2);
>> >> +}
>> >> +  tree permute = create_tmp_reg_or_ssa_name (lhs_type);
>> >> +  g = gimple_build_assign (permute, build_constructor (lhs_type, ctor_elts));
>> >
>> > I think this is no longer canonical GIMPLE (Richard?)
>>
>> FWIW, although the recent patches added the option of using wider
>> permute vectors if the permute vector is constant, it's still OK to
>> use the original style of permute vectors if that's known to be valid.
>> In this case it is because we know the indices won't wrap, given the
>> size of the input vectors.
>
> Ok.
>
>> > and given it is also a constant you shouldn't emit a CONSTRUCTOR here
>> > but directly construct the appropriate VECTOR_CST.  So it looks like
>> > the mergel/h intrinsics interleave the low or high part of two
>> > vectors?
>
> Right, it is an interleaving of the two vectors.  The size and contents
> vary depending on the type, and though I briefly considered building up
> an if/else table, this approach was far simpler (and less error prone for
> me) to code up.
> i.e. (int, mergel)   (permute) D.2885 = {0, 4, 1, 5};
> (long long, mergel)  (permute) D.2876 = {1, 3};

I meant in the loop you could have populated a

   auto_vec elts;
   elts.safe_grow (n_elts * 2);

and use

   permute = build_vector (lhs_type, elts);

to build a VECTOR_CST rather than going through a CONSTRUCTOR
and a separate assignment statement.

Richard.

> Thanks
> -Will
>
>
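
The index pattern produced by the quoted loop can be checked with a small stand-alone sketch. build_merge_selector is a hypothetical helper, not part of the rs6000 patch, and midpoint = n_elts / 2 is an assumption (the quoted hunk does not show how midpoint is computed). With those assumptions the function reproduces the two permute vectors Will lists, {0, 4, 1, 5} and {1, 3}.

```c
#include <assert.h>
#include <stddef.h>

/* Fill OUT with N_ELTS selector indices that interleave elements of two
   N_ELTS-wide input vectors, starting at element OFFSET of each input.
   Indices 0 .. N_ELTS-1 select from the first input, indices
   N_ELTS .. 2*N_ELTS-1 select from the second.  Assumes the loop bound
   MIDPOINT is N_ELTS / 2.  */
static void
build_merge_selector (int n_elts, int offset, int *out)
{
  assert (n_elts > 0 && n_elts % 2 == 0 && out != NULL);
  int midpoint = n_elts / 2;
  for (int i = 0; i < midpoint; i++)
    {
      out[2 * i] = offset + i;			/* From the first input.  */
      out[2 * i + 1] = offset + n_elts + i;	/* From the second input.  */
    }
}
```

Because every index is known at compile time, the whole selector is a constant, which is the reason a VECTOR_CST was suggested in place of a CONSTRUCTOR.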

Re: [PATCH] Bump minimum value for max-sched-ready-insns param to 1 (PR rtl-optimization/86620)

2018-01-16 Thread Richard Biener

On Mon, Jan 15, 2018 at 11:04 PM, Jakub Jelinek  wrote:
> Hi!
>
> This param allows minimum of 0, which doesn't make much sense.
> On the i386/pr83620.c test (when used with the =0 value) we ICE
> because ix86_adjust_priority which has code to prevent moving of likely
> spilled hard regs doesn't have a chance to do anything, since we don't
> consider any other insns as ready.
>
> This patch bumps the minimum to 1, so that there is at least something
> considered.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.

> 2018-01-15  Jakub Jelinek  
>
> PR rtl-optimization/86620
> * params.def (max-sched-ready-insns): Bump minimum value to 1.
>
> * gcc.dg/pr64935-2.c: Use --param=max-sched-ready-insns=1
> instead of --param=max-sched-ready-insns=0.
> * gcc.target/i386/pr83620.c: New test.
> * gcc.dg/pr83620.c: New test.
>
> --- gcc/params.def.jj   2018-01-14 17:16:57.471836055 +0100
> +++ gcc/params.def  2018-01-15 18:53:24.122124325 +0100
> @@ -744,7 +744,7 @@ DEFPARAM (PARAM_MAX_FIELDS_FOR_FIELD_SEN
>  DEFPARAM(PARAM_MAX_SCHED_READY_INSNS,
>  "max-sched-ready-insns",
>  "The maximum number of instructions ready to be issued to be considered by the scheduler during the first scheduling pass.",
> -100, 0, 0)
> +100, 1, 0)
>
>  /* This is the maximum number of active local stores RTL DSE will consider.  */
>  DEFPARAM (PARAM_MAX_DSE_ACTIVE_LOCAL_STORES,
> --- gcc/testsuite/gcc.dg/pr64935-2.c.jj 2017-06-19 08:27:46.126467108 +0200
> +++ gcc/testsuite/gcc.dg/pr64935-2.c 2018-01-15 18:52:23.987124863 +0100
> @@ -1,6 +1,6 @@
>  /* PR rtl-optimization/64935 */
>  /* { dg-do compile } */
> -/* { dg-options "-O -fschedule-insns --param=max-sched-ready-insns=0 -fcompare-debug" } */
> +/* { dg-options "-O -fschedule-insns --param=max-sched-ready-insns=1 -fcompare-debug" } */
>  /* { dg-require-effective-target scheduling } */
>  /* { dg-xfail-if "" { powerpc-ibm-aix* } } */
>
> --- gcc/testsuite/gcc.target/i386/pr83620.c.jj  2018-01-15 18:53:43.267124153 +0100
> +++ gcc/testsuite/gcc.target/i386/pr83620.c 2018-01-15 19:17:31.053208498 +0100
> @@ -0,0 +1,15 @@
> +/* PR rtl-optimization/86620 */
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-O2 -flive-range-shrinkage --param=max-sched-ready-insns=1 -Wno-psabi -mno-avx" } */
> +
> +typedef unsigned __int128 V __attribute__ ((vector_size (64)));
> +
> +V u, v;
> +
> +V
> +foo (char c, short d, int e, long f, __int128 g)
> +{
> +  f >>= c & 63;
> +  v = (V){f} == u;
> +  return e + g + v;
> +}
> --- gcc/testsuite/gcc.dg/pr83620.c.jj   2018-01-15 19:16:31.953190203 +0100
> +++ gcc/testsuite/gcc.dg/pr83620.c  2018-01-15 19:16:16.499185414 +0100
> @@ -0,0 +1,9 @@
> +/* PR rtl-optimization/86620 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -flive-range-shrinkage --param=max-sched-ready-insns=0" } */
> +/* { dg-error "minimum value of parameter 'max-sched-ready-insns' is 1" "" { target *-*-* } 0 } */
> +
> +void
> +foo (void)
> +{
> +}
>
> Jakub
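
A minimal sketch of the bounds check that the new gcc.dg/pr83620.c test exercises is given below. struct toy_param, enum param_status and check_param are hypothetical names, not the real params.c interface. One thing is an assumption rather than something stated in the patch: that the upper bound is enforced only when it is greater than the minimum, so the triple "100, 1, 0" would mean default 100, minimum 1 and no maximum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one --param entry.  */
struct toy_param
{
  const char *name;
  int default_value;
  int min_value;
  int max_value;
};

enum param_status { PARAM_OK, PARAM_BELOW_MIN, PARAM_ABOVE_MAX };

/* Check VALUE against the bounds recorded in P.  */
static enum param_status
check_param (const struct toy_param *p, int value)
{
  assert (p != NULL);
  if (value < p->min_value)
    return PARAM_BELOW_MIN;
  /* Assumption: a maximum that is not above the minimum means
     "no upper limit".  */
  if (p->max_value > p->min_value && value > p->max_value)
    return PARAM_ABOVE_MAX;
  return PARAM_OK;
}
```

Under this model, --param=max-sched-ready-insns=0 is rejected after the patch, which corresponds to the "minimum value of parameter 'max-sched-ready-insns' is 1" diagnostic expected by the new test.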

Re: [PATCH v3, rs6000] Add -mspeculate-indirect-jumps option and implement non-speculating bctr / bctrl

2018-01-16 Thread Richard Biener

On Tue, Jan 16, 2018 at 12:09 AM, Bill Schmidt
 wrote:
> Hi,
>
> This patch supersedes v2:
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01204.html,
> and fixes the problems noted in its review.  It also adds the test cases from
> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01261.html and adjusts them
> according to the results of the review.
>
> There is more functionality to be provided in a future patch:  Sibling calls
> for all ABIs, and indirect calls for non-ELFv2 ABIs.  I'm getting close on
> that, but I think it's better to keep that separate at this point.
>
> Bootstrapped and tested on powerpc64-linux-gnu and powerpc64le-linux-gnu with
> no regressions.  Is this okay for trunk?

Did you consider simply removing the tablejump/casesi support so that
expansion always expands to a balanced tree?  At least if we have any knobs
to tune we should probably tweak them away from the variants that use
indirect jumps with -mno-speculate-indirect-jumps, right?

This is a performance optimization, so it shouldn't block this patch - I
just thought I should mention it.

Richard.
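
To illustrate the difference between the two dispatch strategies mentioned above, here is a rough source-level sketch. It is only an analogy: the handler functions and both dispatch routines are hypothetical, the table variant uses indirect calls rather than a real tablejump, and a compiler is free to transform either form. The tree variant reaches every handler through compares and direct branches only.

```c
#include <assert.h>

static int handle0 (void) { return 10; }
static int handle1 (void) { return 20; }
static int handle2 (void) { return 30; }
static int handle3 (void) { return 40; }

typedef int (*handler_fn) (void);

static const handler_fn handler_table[4]
  = { handle0, handle1, handle2, handle3 };

/* Table dispatch: one indexed load followed by an indirect branch.  */
static int
dispatch_table (unsigned int sel)
{
  assert (sel < 4);
  return handler_table[sel] ();
}

/* Balanced decision tree: two compares, no indirect branch.  */
static int
dispatch_tree (unsigned int sel)
{
  assert (sel < 4);
  if (sel < 2)
    return sel == 0 ? handle0 () : handle1 ();
  else
    return sel == 2 ? handle2 () : handle3 ();
}
```

Both routines return the same value for every selector; the tree form trades one indirect branch for roughly log2(n) conditional branches.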

> Thanks,
> Bill
>
>
> [gcc]
>
> 2018-01-15  Bill Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_opt_vars): Add entry for
> -mspeculate-indirect-jumps.
> * config/rs6000/rs6000.md (*call_indirect_elfv2): Disable
> for -mno-speculate-indirect-jumps.
> (*call_indirect_elfv2_nospec): New define_insn.
> (*call_value_indirect_elfv2): Disable for
> -mno-speculate-indirect-jumps.
> (*call_value_indirect_elfv2_nospec): New define_insn.
> (indirect_jump): Emit different RTL for
> -mno-speculate-indirect-jumps.
> (*indirect_jump): Disable for
> -mno-speculate-indirect-jumps.
> (*indirect_jump_nospec): New define_insn.
> (tablejump): Emit different RTL for
> -mno-speculate-indirect-jumps.
> (tablejumpsi): Disable for -mno-speculate-indirect-jumps.
> (tablejumpsi_nospec): New define_expand.
> (tablejumpdi): Disable for -mno-speculate-indirect-jumps.
> (tablejumpdi_nospec): New define_expand.
> (*tablejump_internal1): Disable for
> -mno-speculate-indirect-jumps.
> (*tablejump_internal1_nospec): New define_insn.
> * config/rs6000/rs6000.opt (mspeculate-indirect-jumps): New
> option.
>
> [gcc/testsuite]
>
> 2018-01-15  Bill Schmidt  
>
> * gcc.target/powerpc/safe-indirect-jump-1.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-2.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-3.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-4.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-5.c: New file.
> * gcc.target/powerpc/safe-indirect-jump-6.c: New file.
>
>
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c  (revision 256364)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -36726,6 +36726,9 @@ static struct rs6000_opt_var const rs6000_opt_vars
>{ "sched-epilog",
>  offsetof (struct gcc_options, x_TARGET_SCHED_PROLOG),
>  offsetof (struct cl_target_option, x_TARGET_SCHED_PROLOG), },
> +  { "speculate-indirect-jumps",
> +offsetof (struct gcc_options, x_rs6000_speculate_indirect_jumps),
> +offsetof (struct cl_target_option, x_rs6000_speculate_indirect_jumps), },
>  };
>
>  /* Inner function to handle attribute((target("..."))) and #pragma GCC target
> Index: gcc/config/rs6000/rs6000.md
> ===
> --- gcc/config/rs6000/rs6000.md (revision 256364)
> +++ gcc/config/rs6000/rs6000.md (working copy)
> @@ -11222,11 +11222,22 @@
>  (match_operand 1 "" "g,g"))
> (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
> (clobber (reg:P LR_REGNO))]
> -  "DEFAULT_ABI == ABI_ELFv2"
> +  "DEFAULT_ABI == ABI_ELFv2 && rs6000_speculate_indirect_jumps"
>"b%T0l\; 2,%2(1)"
>[(set_attr "type" "jmpreg")
> (set_attr "length" "8")])
>
> +;; Variant with deliberate misprediction.
> +(define_insn "*call_indirect_elfv2_nospec"
> +  [(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
> +(match_operand 1 "" "g,g"))
> +   (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 2 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
> +   (clobber (reg:P LR_REGNO))]
> +  "DEFAULT_ABI == ABI_ELFv2 && !rs6000_speculate_indirect_jumps"
> +  "crset eq\;beq%T0l-\; 2,%2(1)"
> +  [(set_attr "type" "jmpreg")
> +   (set_attr "length" "12")])
> +
>  (define_insn "*call_value_indirect_elfv2"
>[(set (match_operand 0 "" "")
> (call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
> @@ -11233,11 +11244,22 @@
>   (match_operand 2 "" "g,g")))
> (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
> (clobber (

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Jan Hubicka

> On Mon, Jan 15, 2018 at 5:53 PM, H.J. Lu  wrote:
> > On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
> >> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
> >>  wrote:
> >>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
> >>>> Now my patch set has been checked into trunk.  Here is a patch set
> >>>> to move struct ix86_frame to machine_function on GCC 7, which is
> >>>> needed to backport the patch set to GCC 7:
> >>>>
> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
> >>>>
> >>>> OK for gcc-7-branch?
> >>>
> >>> Yes, backporting is ok - please watch for possible fallout on trunk and 
> >>> make
> >>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
> >>> Wednesday now with the final release about a week later if no issue shows
> >>> up.
> >>>
> >>
> >> Backport is blocked by
> >>
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
> >>
> >> There are many test failures due to lack of comdat support in linker on 
> >> Solaris.
> >> I can limit these tests to Linux.
> >
> > These are testcase issues and shouldn't block backport to GCC 7.
> 
> It makes the option using thunks unusable though, right?  Can you simply make
> them hidden on systems without comdat support?  That duplicates them per TU
> but at least the feature works.  Or those systems should provide the thunks 
> via
> libgcc.
> 
> I agree we can followup with a fix for Solaris given lack of a public
> testing machine.

My memory is a bit dim, but I am convinced I was fixing specific errors for
comdats on Solaris, so I think the toolchain supports them in some form; it is
just more restrictive/different from the GNU implementation.

Indeed, I think just producing a "sorry, unimplemented" message is what we
should do if we can't support retpoline on a given target.

Honza
> 
> Richard.
> 
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839
> >>
> >> Bootstrap failed on Darwin due to lack of ".set" directive in assembler.  I
> >> uploaded a patch:
> >>
> >> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43124
> >>
> >> There is no confirmation on it.  Also there may be test failures on Darwin
> >> due to difference in assembly output.
> >
> > I posted a patch for Darwin build:
> >
> > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01347.html
> >
> > This needs to be checked into trunk before I can start backport to GCC 7.
> >
> > --
> > H.J.

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Richard Biener

On Tue, Jan 16, 2018 at 9:34 AM, Jan Hubicka  wrote:
>> On Mon, Jan 15, 2018 at 5:53 PM, H.J. Lu  wrote:
>> > On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
>> >> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
>> >>  wrote:
>> >>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
>> >>>> Now my patch set has been checked into trunk.  Here is a patch set
>> >>>> to move struct ix86_frame to machine_function on GCC 7, which is
>> >>>> needed to backport the patch set to GCC 7:
>> >>>>
>> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
>> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
>> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
>> >>>>
>> >>>> OK for gcc-7-branch?
>> >>>
>> >>> Yes, backporting is ok - please watch for possible fallout on trunk and 
>> >>> make
>> >>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
>> >>> Wednesday now with the final release about a week later if no issue shows
>> >>> up.
>> >>>
>> >>
>> >> Backport is blocked by
>> >>
>> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
>> >>
>> >> There are many test failures due to lack of comdat support in linker on 
>> >> Solaris.
>> >> I can limit these tests to Linux.
>> >
>> > These are testcase issues and shouldn't block backport to GCC 7.
>>
>> It makes the option using thunks unusable though, right?  Can you simply make
>> them hidden on systems without comdat support?  That duplicates them per TU
>> but at least the feature works.  Or those systems should provide the thunks 
>> via
>> libgcc.
>>
>> I agree we can followup with a fix for Solaris given lack of a public
>> testing machine.
>
> My memory is bit dim, but I am convinced I was fixing specific errors for 
> comdats
> on Solaris, so I think the toolchain supports them in some sort, just is more
> restrictive/different from GNU implementation.
>
> Indeed, i think just producing sorry, unimplemented message is what we should 
> do
> if we can't support retpoline on given target.

I'm quite sure Solaris supports comdats, after all it invented ELF ;)
I've also seen comdats in debugging early LTO issues.  We might run into
Solaris as (assembler) issues though.

Richard.

> Honza
>>
>> Richard.
>>
>> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83839
>> >>
>> >> Bootstrap failed on Darwin due to lack of ".set" directive in assembler.  I
>> >> uploaded a patch:
>> >>
>> >> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43124
>> >>
>> >> There is no confirmation on it.  Also there may be test failures on Darwin
>> >> due to difference in assembly output.
>> >
>> > I posted a patch for Darwin build:
>> >
>> > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01347.html
>> >
>> > This needs to be checked into trunk before I can start backport to GCC 7.
>> >
>> > --
>> > H.J.

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Jan Hubicka

> On Tue, Jan 16, 2018 at 9:34 AM, Jan Hubicka  wrote:
> >> On Mon, Jan 15, 2018 at 5:53 PM, H.J. Lu  wrote:
> >> > On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
> >> >> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
> >> >>  wrote:
> >> >>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
> >> >>>> Now my patch set has been checked into trunk.  Here is a patch set
> >> >>>> to move struct ix86_frame to machine_function on GCC 7, which is
> >> >>>> needed to backport the patch set to GCC 7:
> >> >>>>
> >> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
> >> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
> >> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
> >> >>>>
> >> >>>> OK for gcc-7-branch?
> >> >>>
> >> >>> Yes, backporting is ok - please watch for possible fallout on trunk 
> >> >>> and make
> >> >>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
> >> >>> Wednesday now with the final release about a week later if no issue 
> >> >>> shows
> >> >>> up.
> >> >>>
> >> >>
> >> >> Backport is blocked by
> >> >>
> >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
> >> >>
> >> >> There are many test failures due to lack of comdat support in linker on 
> >> >> Solaris.
> >> >> I can limit these tests to Linux.
> >> >
> >> > These are testcase issues and shouldn't block backport to GCC 7.
> >>
> >> It makes the option using thunks unusable though, right?  Can you simply 
> >> make
> >> them hidden on systems without comdat support?  That duplicates them per TU
> >> but at least the feature works.  Or those systems should provide the 
> >> thunks via
> >> libgcc.
> >>
> >> I agree we can followup with a fix for Solaris given lack of a public
> >> testing machine.
> >
> > My memory is bit dim, but I am convinced I was fixing specific errors for 
> > comdats
> > on Solaris, so I think the toolchain supports them in some sort, just is 
> > more
> > restrictive/different from GNU implementation.
> >
> > Indeed, i think just producing sorry, unimplemented message is what we 
> > should do
> > if we can't support retpoline on given target.
> 
> I'm quite sure Solaris supports comdats, after all it invented ELF ;)
> I've also seen
> comdats in debugging early LTO issues.  We might run into Solaris as
> issues though.

:)
My recollection is that the thunks in a comdat group need to come in a specific
order after the entry symbol.  Probably after - at some point I tried to move
them before (for better code layout) and needed to retreat.

Honza

[PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread Tamar Christina

Hi All,

This patch updates the GCC 8 release notes for ARM and AArch64.

Ok for cvs?

Thanks,
Tamar

-- 
Index: htdocs/gcc-8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.26
diff -u -r1.26 changes.html
--- htdocs/gcc-8/changes.html	11 Jan 2018 09:31:53 -	1.26
+++ htdocs/gcc-8/changes.html	11 Jan 2018 15:47:15 -
@@ -147,7 +147,51 @@
 
 AArch64
 
-  
+  
+The Armv8.4-A architecture is now supported.  It can be used by
+specifying the -march=armv8.4-a option.
+  
+  
+The Dot Product instructions are now supported as an optional extension to the
+Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The extension can be used by
+specifying the +dotprod architecture extension.  E.g. -march=armv8.2-a+dotprod.
+  
+  
+The Armv8-A +crypto extension has now been split into two extensions for finer grained control:
+
+   +aes which contains the Armv8-A AES cryptographic instructions.
+   +sha2 which contains the Armv8-A SHA2 and SHA1 cryptographic instructions.
+
+Using +crypto will now enable these two extensions.
+  
+  
+New Armv8.4-A FP16 Floating Point Multiplication Variant instructions have been added.  These instructions are
+mandatory in Armv8.4-A but available as an optional extension to Armv8.2-A and Armv8.3-A.  The new extension
+can be used by specifying the +fp16fml architectural extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
+the instructions can be enabled by specifying +fp16.
+  
+  
+New cryptographic instructions have been added as optional extensions to Armv8.2-A and newer.  These instructions can
+be enabled with:
+
+  +sha3 New SHA3 and SHA2 instructions from Armv8.4-A.  This implies +sha2.
+  +sm4 New SM3 and SM4 instructions from Armv8.4-A.
+
+ 
+  
+   Support has been added for the following processors
+   (GCC identifiers in parentheses):
+   
+ Arm Cortex-A75 (cortex-a75).
+	 Arm Cortex-A55 (cortex-a55).
+	 Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE (cortex-a75.cortex-a55).
+   
+   The GCC identifiers can be used
+   as arguments to the -mcpu or -mtune options,
+   for example: -mcpu=cortex-a75 or
+   -mtune=thunderx2t99p1 or as arguments to the equivalent target
+   attributes and pragmas.
+  
 
 
 ARM
@@ -169,14 +213,58 @@
 removed in a future release.
   
   
-The default link behavior for ARMv6 and ARMv7-R targets has been
+The default link behavior for Armv6 and Armv7-R targets has been
 changed to produce BE8 format when generating big-endian images.  A new
 flag -mbe32 can be used to force the linker to produce
 legacy BE32 format images.  There is no change of behavior for
-ARMv6-m and other ARMv7 or later targets: these already defaulted
+Armv6-M and other Armv7 or later targets: these already defaulted
 to BE8 format.  This change brings GCC into alignment with other
 compilers for the ARM architecture.
   
+  
+The Armv8-R architecture is now supported.  It can be used by specifying the
+-march=armv8-r option.
+  
+  
+The Armv8.3-A architecture is now supported.  It can be used by
+specifying the -march=armv8.3-a option.
+  
+  
+The Armv8.4-A architecture is now supported.  It can be used by
+specifying the -march=armv8.4-a option.
+  
+  
+ The Dot Product instructions are now supported as an optional extension to the
+ Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The extension can be used by
+ specifying the +dotprod architecture extension.  E.g. -march=armv8.2-a+dotprod.
+  
+
+  
+Support for setting extensions and architectures using the GCC target pragma and attribute has been added.
+It can be used by specifying #pragma GCC target ("arch=..."), #pragma GCC target ("+extension"),
+__attribute__((target("arch=..."))) or __attribute__((target("+extension"))).
+  
+  
+New Armv8.4-A FP16 Floating Point Multiplication Variant instructions have been added.  These instructions are
+mandatory in Armv8.4-A but available as an optional extension to Armv8.2-A and Armv8.3-A.  The new extension
+can be used by specifying the +fp16fml architectural extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
+the instructions can be enabled by specifying +fp16.
+  
+  
+   Support has been added for the following processors
+   (GCC identifiers in parentheses):
+   
+	 Arm Cortex-A75 (cortex-a75).
+	 Arm Cortex-A55 (cortex-a55).
+	 Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE (cortex-a75.cortex-a55).
+	 Arm Cortex-R52 for Armv8-R (cortex-r52).
+   
+   The GCC identifiers can be used
+   as arguments to the -mcpu or -mtune options,
+   for example: -mcpu=cortex-a75 or
+   -mtune=xgene1 or as arguments to the equivalent target
+   attributes and pragmas.
+  
 
 
 AVR

Re: [PATCH] Fix warn_if_not_align ICE (PR c/83844)

2018-01-16 Thread Jakub Jelinek

On Tue, Jan 16, 2018 at 08:57:38AM +0100, Richard Biener wrote:
> > -  unsigned HOST_WIDE_INT off
> > -= (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> > -   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
> > -  if ((off % warn_if_not_align) != 0)
> > -warning (opt_w, "%q+D offset %wu in %qT isn't aligned to %u",
> > +  tree off = byte_position (field);
> > +  if (!multiple_of_p (TREE_TYPE (off), off, size_int (warn_if_not_align)))
> 
> multiple_of_p also returns 0 if it doesn't know (for the non-constant
> case obviously), so the warning should say "may be not aligned"?  Or
> we don't want any false positives which means multiple_of_p should get
> a worker factored out that returns a tri-state value?

Tri-state sounds like optimizing for the very uncommon case; I think it must be
very rare in practice that we could prove it must not be aligned, and
we'd need to extend it a lot to handle those cases.
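
For illustration only, a minimal sketch of what a tri-state answer could look like, assuming an offset of the form BASE + N * SCALE where N is unknown at compile time. The enum and the function name are hypothetical; nothing like this is part of the patch, and multiple_of_p itself only returns 0 or 1.

```c
#include <assert.h>

/* Hypothetical three-valued answer, not an existing GCC type.  */
enum tri_state { TRI_NO, TRI_YES, TRI_UNKNOWN };

/* Model an offset BASE + N * SCALE with N unknown at compile time and
   ask whether it is a multiple of ALIGN.  Assumes BASE >= 0, SCALE >= 0
   and ALIGN > 0, and ignores overflow.  */
enum tri_state
offset_multiple_of (long base, long scale, long align)
{
  if (scale % align == 0)
    /* The variable part never changes the remainder, so the answer
       is definite either way.  */
    return base % align == 0 ? TRI_YES : TRI_NO;

  /* The remainder may depend on N; stay conservative.  */
  return TRI_UNKNOWN;
}
```

With SCALE equal to 0 this degenerates to the constant-offset case, which is the only one where the warning can say "isn't aligned" with certainty.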

Here is an updated patch which says "may not be aligned" if off is
non-constant.  When extending the testcase, I noticed we don't handle what is,
IMHO, a quite important case in multiple_of_p, so the patch handles that too.
I've tried not to increase the asymptotic complexity of multiple_of_p, so except
for the cases where both arguments are INTEGER_CSTs it shouldn't call
multiple_of_p more times than before.
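
To make the new MULT_EXPR case concrete, here is a toy model of the check on a tiny expression tree. It is only a sketch: struct toy_expr, toy_multiple_of_p and toy_example are hypothetical stand-ins for GCC's tree and multiple_of_p, constants are assumed positive, and overflow is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node, standing in for GCC's tree.  */
enum toy_code { TOY_CST, TOY_VAR, TOY_PLUS, TOY_MULT };

struct toy_expr
{
  enum toy_code code;
  long value;                       /* Used by TOY_CST only.  */
  const struct toy_expr *op0, *op1; /* Used by TOY_PLUS and TOY_MULT.  */
};

/* Return true only if E is provably a multiple of BOTTOM (BOTTOM > 0).  */
bool
toy_multiple_of_p (const struct toy_expr *e, long bottom)
{
  switch (e->code)
    {
    case TOY_CST:
      return e->value % bottom == 0;
    case TOY_VAR:
      /* Nothing is known about a variable.  */
      return bottom == 1;
    case TOY_PLUS:
      return (toy_multiple_of_p (e->op0, bottom)
              && toy_multiple_of_p (e->op1, bottom));
    case TOY_MULT:
      {
        const struct toy_expr *op1 = e->op0, *op2 = e->op1;
        /* Put a constant operand second.  */
        if (op1->code == TOY_CST)
          {
            const struct toy_expr *tmp = op1;
            op1 = op2;
            op2 = tmp;
          }
        if (op2->code == TOY_CST)
          {
            if (op2->value % bottom == 0)
              return true;
            /* (x * 2 + 2) * 4 is a multiple of 8 if x * 2 + 2 is a
               multiple of 8 / 4.  */
            if (op2->value != 0 && bottom % op2->value == 0)
              return toy_multiple_of_p (op1, bottom / op2->value);
            return toy_multiple_of_p (op1, bottom);
          }
        return (toy_multiple_of_p (e->op0, bottom)
                || toy_multiple_of_p (e->op1, bottom));
      }
    }
  return false;
}

/* Build (x * 2 + ADDEND) * 4 and test it against 8.  */
bool
toy_example (long addend)
{
  struct toy_expr x = { TOY_VAR, 0, NULL, NULL };
  struct toy_expr two = { TOY_CST, 2, NULL, NULL };
  struct toy_expr four = { TOY_CST, 4, NULL, NULL };
  struct toy_expr add = { TOY_CST, addend, NULL, NULL };
  struct toy_expr x2 = { TOY_MULT, 0, &x, &two };
  struct toy_expr sum = { TOY_PLUS, 0, &x2, &add };
  struct toy_expr top = { TOY_MULT, 0, &sum, &four };

  return toy_multiple_of_p (&top, 8);
}
```

In this model toy_example (2) is proven a multiple of 8 only because of the "bottom divided by the constant operand" step; toy_example (1) is not.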

Ok for trunk if this passes bootstrap/regtest?

2018-01-16  Jakub Jelinek  

PR c/83844
* stor-layout.c (handle_warn_if_not_align): Use byte_position and
multiple_of_p instead of unchecked tree_to_uhwi and UHWI check.
If off is not INTEGER_CST, issue a may not be aligned warning
rather than isn't aligned.  Use isn%'t rather than isn't.
* fold-const.c (multiple_of_p) <case BIT_AND_EXPR>: Don't fall through
into MULT_EXPR.
<case MULT_EXPR>: Improve the case when bottom and one of the
MULT_EXPR operands are INTEGER_CSTs and bottom is multiple of that
operand, in that case check if the other operand is multiple of
bottom divided by the INTEGER_CST operand.

* gcc.dg/pr83844.c: New test.

--- gcc/stor-layout.c.jj2018-01-15 22:40:14.009263280 +0100
+++ gcc/stor-layout.c   2018-01-16 10:01:48.135111031 +0100
@@ -1150,12 +1150,16 @@ handle_warn_if_not_align (tree field, un
 warning (opt_w, "alignment %u of %qT is less than %u",
 record_align, context, warn_if_not_align);
 
-  unsigned HOST_WIDE_INT off
-= (tree_to_uhwi (DECL_FIELD_OFFSET (field))
-   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
-  if ((off % warn_if_not_align) != 0)
-warning (opt_w, "%q+D offset %wu in %qT isn't aligned to %u",
-field, off, context, warn_if_not_align);
+  tree off = byte_position (field);
+  if (!multiple_of_p (TREE_TYPE (off), off, size_int (warn_if_not_align)))
+{
+  if (TREE_CODE (off) == INTEGER_CST)
+   warning (opt_w, "%q+D offset %E in %qT isn%'t aligned to %u",
+field, off, context, warn_if_not_align);
+  else
+   warning (opt_w, "%q+D offset %E in %qT may not be aligned to %u",
+field, off, context, warn_if_not_align);
+}
 }
 
 /* Called from place_field to handle unions.  */
--- gcc/fold-const.c.jj 2018-01-15 10:02:04.119181355 +0100
+++ gcc/fold-const.c2018-01-16 10:48:10.444360796 +0100
@@ -12595,9 +12595,34 @@ multiple_of_p (tree type, const_tree top
 a multiple of BOTTOM then TOP is a multiple of BOTTOM.  */
   if (!integer_pow2p (bottom))
return 0;
-  /* FALLTHRU */
+  return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
+ || multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
 
 case MULT_EXPR:
+  if (TREE_CODE (bottom) == INTEGER_CST)
+   {
+ op1 = TREE_OPERAND (top, 0);
+ op2 = TREE_OPERAND (top, 1);
+ if (TREE_CODE (op1) == INTEGER_CST)
+   std::swap (op1, op2);
+ if (TREE_CODE (op2) == INTEGER_CST)
+   {
+ if (multiple_of_p (type, op2, bottom))
+   return 1;
+ /* Handle multiple_of_p ((x * 2 + 2) * 4, 8).  */
+ if (multiple_of_p (type, bottom, op2))
+   {
+ widest_int w = wi::sdiv_trunc (wi::to_widest (bottom),
+wi::to_widest (op2));
+ if (wi::fits_to_tree_p (w, TREE_TYPE (bottom)))
+   {
+ op2 = wide_int_to_tree (TREE_TYPE (bottom), w);
+ return multiple_of_p (type, op1, op2);
+   }
+   }
+ return multiple_of_p (type, op1, bottom);
+   }
+   }
   return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
  || multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
 
--- gcc/testsuite/gcc.dg/pr83844.c.jj   2018-01-16 09:56:57.459175232 +0100
+++ gcc/testsuite/gcc.dg/pr83844.c  2018-01-16 10:02:55.494096157 +0100
@@ -0,0 +1,36 @@
+/* PR c

Re: [PATCH 00/10][ARC] Critical fixes

2018-01-16 Thread Andrew Burgess

* Claudiu Zissulescu  [2018-01-08 15:18:30 
+]:

> >   [ARC][LRA] Use TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV.
> >   [ARC] Don't allow the last ZOL insn to be in a delay slot.
> >   [ARC] Add trap instruction.
> >   [ARC] Update legitimate constant hook.
> >   [ARC] Enable unaligned access.
> >   [ARC] Revamp trampoline implementation.
> >   [ARC][ZOL] Update uses for hw-loop labels.
> >   [ARC] Add ARCv2 core3 tune option.
> >   [ARC][FIX] Consider command line ffixed- option.
> >   [ARC] Update (u)maddsidi patterns.
> 
> Hi Andrew,
> 
> Thank you for reviewing this batch of fixes. Any chance you could also check
> these? They have been pending for a long time now:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00078.html
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00081.html
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00080.html
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00079.html
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00084.html
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00083.html
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00082.html

Sorry for missing these, they somehow didn't make it onto my todo
list.

I'll review these over the next couple of days.

Thanks,
Andrew

Re: [PATCH 1/6] [ARC] Add JLI support.

2018-01-16 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-11-02 13:30:30 +0100]:

> The ARCv2 ISA provides the JLI instruction, which is a two-byte instruction
> that can be used to reduce code size in an application. To make use of it,
> we provide two new function attributes 'jli_always' and 'jli_fixed' which
> will force the compiler to call the indicated function using a jli_s
> instruction. The compiler also generates the entries in the JLI table for
> the case when we use the 'jli_always' attribute. In the case of 'jli_fixed'
> the compiler assumes a fixed position of the function in the JLI
> table. Thus, the user needs to provide an assembly file with the JLI table
> for the final link. This is useful when we want to have a table in ROM
> and a second table in the RAM memory.
> 
> The jli instruction usage can also be forced without the need to annotate
> the source code via the '-mjli-always' option.
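
As a rough sketch of how the two attributes described above might appear in user code: the argument form jli_fixed (N), the index 2, the __arc__ guard and the function names are assumptions made for illustration, not taken from the patch. On a non-ARC host the macros expand to nothing so the file still builds.

```c
#include <assert.h>

/* The attributes only make sense on an ARCv2 compiler carrying this
   patch; expand to nothing elsewhere (assumed guard).  */
#if defined (__arc__)
# define JLI_ALWAYS __attribute__ ((jli_always))
# define JLI_FIXED(N) __attribute__ ((jli_fixed (N)))
#else
# define JLI_ALWAYS
# define JLI_FIXED(N)
#endif

/* The compiler would generate the JLI table entry for this one.  */
int add_one (int x) JLI_ALWAYS;

/* Index 2 is arbitrary; the user-supplied JLI table would have to
   hold this function at that position.  */
int add_two (int x) JLI_FIXED (2);

int
add_one (int x)
{
  return x + 1;
}

int
add_two (int x)
{
  return x + 2;
}

/* Both calls here would be candidates for jli_s on ARCv2.  */
int
call_both (int x)
{
  return add_two (add_one (x));
}
```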
> 
> gcc/
> 2017-02-10  Claudiu Zissulescu  
>   John Eric Martin 
> 
>   * config/arc/arc-protos.h: Add arc_is_jli_call_p proto.
>   * config/arc/arc.c (_arc_jli_section): New struct.
>   (arc_jli_section): New type.
>   (arc_jli_sections): New static variable.
>   (arc_handle_jli_attribute): New function.
>   (arc_attribute_table): Add jli_always and jli_fixed attribute.
>   (arc_file_end): New function.
>   (TARGET_ASM_FILE_END): Define.
>   (arc_print_operand): Reuse 'S' letter for JLI output instruction.
>   (arc_add_jli_section): New function.
>   (jli_call_scan): Likewise.
>   (arc_reorg): Call jli_call_scan.
>   (arc_output_addsi): Remove 'S' from printing asm operand.
>   (arc_is_jli_call_p): New function.
>   * config/arc/arc.md (movqi_insn): Remove 'S' from printing asm
>   operand.
>   (movhi_insn): Likewise.
>   (movsi_insn): Likewise.
>   (movsi_set_cc_insn): Likewise.
>   (loadqi_update): Likewise.
>   (load_zeroextendqisi_update): Likewise.
>   (load_signextendqisi_update): Likewise.
>   (loadhi_update): Likewise.
>   (load_zeroextendhisi_update): Likewise.
>   (load_signextendhisi_update): Likewise.
>   (loadsi_update): Likewise.
>   (loadsf_update): Likewise.
>   (movsicc_insn): Likewise.
>   (bset_insn): Likewise.
>   (bxor_insn): Likewise.
>   (bclr_insn): Likewise.
>   (bmsk_insn): Likewise.
>   (bicsi3_insn): Likewise.
>   (cmpsi_cc_c_insn): Likewise.
>   (movsi_ne): Likewise.
>   (movsi_cond_exec): Likewise.
>   (clrsbsi2): Likewise.
>   (norm_f): Likewise.
>   (normw): Likewise.
>   (swap): Likewise.
>   (divaw): Likewise.
>   (flag): Likewise.
>   (sr): Likewise.
>   (kflag): Likewise.
>   (ffs): Likewise.
>   (ffs_f): Likewise.
>   (fls): Likewise.
>   (call_i): Remove 'S' asm letter, add jli instruction.
>   (call_value_i): Likewise.
>   * config/arc/arc.opt (mjli-always): New option.
>   * config/arc/constraints.md (Cji): New constraint.
>   * config/arc/fpx.md (addsf3_fpx): Remove 'S' from printing asm
>   operand.
>   (subsf3_fpx): Likewise.
>   (mulsf3_fpx): Likewise.
>   * config/arc/simdext.md (vendrec_insn): Remove 'S' from printing
>   asm operand.
>   * doc/extend.texi (ARC): Document 'jli-always' and 'jli-fixed'
>   function attributes.
>   * doc/invoke.texi (ARC): Document mjli-always option.
> 
> gcc/testsuite
> 2017-02-10  Claudiu Zissulescu  
> 
>   * gcc.target/arc/jli-1.c: New file.
>   * gcc.target/arc/jli-2.c: Likewise.

This looks fine, but I wonder if there should be some documentation
that mentions the new .jlitab section added?

There's one whitespace issue I also spotted...

> @@ -5026,6 +5062,36 @@ static void arc_file_start (void)
>fprintf (asm_out_file, "\t.cpu %s\n", arc_cpu_string);
>  }
>  
> +/* Implement `TARGET_ASM_FILE_END'.  */
> +/* Outputs to the stdio stream FILE jli related text.  */
> +
> +void arc_file_end (void)
> +{
> +  arc_jli_section *sec = arc_jli_sections;
> +
> +  while (sec != NULL)
> +  {

I think the '{' is not indented correctly.
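
For reference, GNU style puts the braces of a loop body two columns in from the controlling statement and the body two columns further. A small self-contained sketch (struct node and count_nodes are made-up names, unrelated to the patch):

```c
#include <assert.h>
#include <stddef.h>

struct node
{
  const char *name;
  struct node *next;
};

/* Count the entries of LIST.  */
int
count_nodes (const struct node *list)
{
  int n = 0;

  while (list != NULL)
    {
      n++;
      list = list->next;
    }
  return n;
}
```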

Thanks,
Andrew

Re: [PATCH] Fix warn_if_not_align ICE (PR c/83844)

2018-01-16 Thread Richard Biener

On Tue, 16 Jan 2018, Jakub Jelinek wrote:

> On Tue, Jan 16, 2018 at 08:57:38AM +0100, Richard Biener wrote:
> > > -  unsigned HOST_WIDE_INT off
> > > -= (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> > > -   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
> > > -  if ((off % warn_if_not_align) != 0)
> > > -warning (opt_w, "%q+D offset %wu in %qT isn't aligned to %u",
> > > +  tree off = byte_position (field);
> > > +  if (!multiple_of_p (TREE_TYPE (off), off, size_int (warn_if_not_align)))
> > 
> > multiple_of_p also returns 0 if it doesn't know (for the non-constant
> > case obviously), so the warning should say "may be not aligned"?  Or
> > we don't want any false positives which means multiple_of_p should get
> > a worker factored out that returns a tri-state value?
> 
> Tri-state sounds like optimizing for the very uncommon case; I think it must be
> very rare in practice that we could prove it must not be aligned, and
> we'd need to extend it a lot to handle those cases.
> 
> Here is an updated patch which says "may not be aligned" if off is
> non-constant.  When extending the testcase, I noticed we don't handle what is,
> IMHO, a quite important case in multiple_of_p, so the patch handles that too.
> I've tried not to increase the asymptotic complexity of multiple_of_p, so except
> for the cases where both arguments are INTEGER_CSTs it shouldn't call
> multiple_of_p more times than before.
> 
> Ok for trunk if this passes bootstrap/regtest?

Ok.

Thanks,
Richard.

> 2018-01-16  Jakub Jelinek  
> 
>   PR c/83844
>   * stor-layout.c (handle_warn_if_not_align): Use byte_position and
>   multiple_of_p instead of unchecked tree_to_uhwi and UHWI check.
>   If off is not INTEGER_CST, issue a may not be aligned warning
>   rather than isn't aligned.  Use isn%'t rather than isn't.
>   * fold-const.c (multiple_of_p) <case BIT_AND_EXPR>: Don't fall through
>   into MULT_EXPR.
>   <case MULT_EXPR>: Improve the case when bottom and one of the
>   MULT_EXPR operands are INTEGER_CSTs and bottom is multiple of that
>   operand, in that case check if the other operand is multiple of
>   bottom divided by the INTEGER_CST operand.
> 
>   * gcc.dg/pr83844.c: New test.
> 
> --- gcc/stor-layout.c.jj  2018-01-15 22:40:14.009263280 +0100
> +++ gcc/stor-layout.c 2018-01-16 10:01:48.135111031 +0100
> @@ -1150,12 +1150,16 @@ handle_warn_if_not_align (tree field, un
>  warning (opt_w, "alignment %u of %qT is less than %u",
>record_align, context, warn_if_not_align);
>  
> -  unsigned HOST_WIDE_INT off
> -= (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> -   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
> -  if ((off % warn_if_not_align) != 0)
> -warning (opt_w, "%q+D offset %wu in %qT isn't aligned to %u",
> -  field, off, context, warn_if_not_align);
> +  tree off = byte_position (field);
> +  if (!multiple_of_p (TREE_TYPE (off), off, size_int (warn_if_not_align)))
> +{
> +  if (TREE_CODE (off) == INTEGER_CST)
> + warning (opt_w, "%q+D offset %E in %qT isn%'t aligned to %u",
> +  field, off, context, warn_if_not_align);
> +  else
> + warning (opt_w, "%q+D offset %E in %qT may not be aligned to %u",
> +  field, off, context, warn_if_not_align);
> +}
>  }
>  
>  /* Called from place_field to handle unions.  */
> --- gcc/fold-const.c.jj   2018-01-15 10:02:04.119181355 +0100
> +++ gcc/fold-const.c  2018-01-16 10:48:10.444360796 +0100
> @@ -12595,9 +12595,34 @@ multiple_of_p (tree type, const_tree top
>a multiple of BOTTOM then TOP is a multiple of BOTTOM.  */
>if (!integer_pow2p (bottom))
>   return 0;
> -  /* FALLTHRU */
> +  return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
> +   || multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
>  
>  case MULT_EXPR:
> +  if (TREE_CODE (bottom) == INTEGER_CST)
> + {
> +   op1 = TREE_OPERAND (top, 0);
> +   op2 = TREE_OPERAND (top, 1);
> +   if (TREE_CODE (op1) == INTEGER_CST)
> + std::swap (op1, op2);
> +   if (TREE_CODE (op2) == INTEGER_CST)
> + {
> +   if (multiple_of_p (type, op2, bottom))
> + return 1;
> +   /* Handle multiple_of_p ((x * 2 + 2) * 4, 8).  */
> +   if (multiple_of_p (type, bottom, op2))
> + {
> +   widest_int w = wi::sdiv_trunc (wi::to_widest (bottom),
> +  wi::to_widest (op2));
> +   if (wi::fits_to_tree_p (w, TREE_TYPE (bottom)))
> + {
> +   op2 = wide_int_to_tree (TREE_TYPE (bottom), w);
> +   return multiple_of_p (type, op1, op2);
> + }
> + }
> +   return multiple_of_p (type, op1, bottom);
> + }
> + }
>return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
> || multiple_of_p (type, TREE_OPERAND (top, 0)

Re: [PATCH][ARM] Fix test fail with conflicting -mfloat-abi

2018-01-16 Thread Sudakshina Das


Hi Christophe

On 12/01/18 18:32, Christophe Lyon wrote:

On 12 Jan 2018 15:26, "Sudakshina Das"  wrote:

Hi

This patch fixes my earlier test case, which fails for arm-none-eabi
when an explicit user option for -mfloat-abi conflicts with
the test case options. I have added a guard to skip the test
in those cases.

@Christophe:
Sorry about this. I think this should fix the test case.
Can you please confirm if this works for you?


Yes it does thanks


Thanks for checking that. I have added one more directive for armv5t as 
well to avoid any conflicts for mcpu options.


Sudi




Thanks
Sudi

gcc/testsuite/ChangeLog

2018-01-12  Sudakshina Das  

 * gcc.c-torture/compile/pr82096.c: Add dg-skip-if
 directive.



diff --git a/gcc/testsuite/gcc.c-torture/compile/pr82096.c b/gcc/testsuite/gcc.c-torture/compile/pr82096.c
index 9fed28c..35551f5 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr82096.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr82096.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target arm_arch_v5t_ok } */
+/* { dg-skip-if "Do not combine float-abi values" { arm*-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
 /* { dg-additional-options "-march=armv5t -mthumb -mfloat-abi=soft" { target arm*-*-* } } */
 
 static long long AL[24];

Re: [PATCH 2/6] [ARC] Add SJLI support.

2018-01-16 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-11-02 13:30:31 +0100]:

> gcc/
> 2017-02-20  Claudiu Zissulescu  
> 
>   * config/arc/arc-protos.h: Add arc_is_secure_call_p proto.
>   * config/arc/arc.c (arc_handle_secure_attribute): New function.
>   (arc_attribute_table): Add 'secure_call' attribute.
>   (arc_print_operand): Print secure call operand.
>   (arc_function_ok_for_sibcall): Don't optimize tail calls when
>   secure.
>   (arc_is_secure_call_p): New function.
>   * config/arc/arc.md (call_i): Add support for sjli instruction.
>   (call_value_i): Likewise.
>   * config/arc/constraints.md (Csc): New constraint.
> ---
>  gcc/config/arc/arc-protos.h   |   1 +
>  gcc/config/arc/arc.c  | 164 
> +++---
>  gcc/config/arc/arc.md |  32 +
>  gcc/config/arc/constraints.md |   7 ++
>  gcc/doc/extend.texi   |   6 ++
>  5 files changed, 155 insertions(+), 55 deletions(-)

Looks fine, a few comments inline below.

Thanks
Andrew

> 
> @@ -3939,6 +3985,9 @@ arc_print_operand (FILE *file, rtx x, int code)
>   : NULL_TREE);
> if (lookup_attribute ("jli_fixed", attrs))
>   {
> +   /* No special treatment for jli_fixed functions.  */
> +   if (code == 'j' )

Extra space before ')'.

> + break;
> fprintf (file, "%ld\t; @",
>  TREE_INT_CST_LOW (TREE_VALUE (TREE_VALUE (attrs))));
> assemble_name (file, XSTR (x, 0));
> @@ -3947,6 +3996,22 @@ arc_print_operand (FILE *file, rtx x, int code)
>   }
> fprintf (file, "@__jli.");
> assemble_name (file, XSTR (x, 0));
> +   if (code == 'j')
> + arc_add_jli_section (x);
> +   return;
> + }
> +  if (GET_CODE (x) == SYMBOL_REF
> +   && arc_is_secure_call_p (x))
> + {
> +   /* No special treatment for secure functions.  */
> +   if (code == 'j' )
> + break;
> +   tree attrs = (TREE_TYPE (SYMBOL_REF_DECL (x)) != error_mark_node
> + ? TYPE_ATTRIBUTES (TREE_TYPE (SYMBOL_REF_DECL (x)))
> + : NULL_TREE);
> +   fprintf (file, "%ld\t; @",
> +TREE_INT_CST_LOW (TREE_VALUE (TREE_VALUE (attrs))));
> +   assemble_name (file, XSTR (x, 0));
> return;
>   }
>break;
> @@ -6897,6 +6962,8 @@ arc_function_ok_for_sibcall (tree decl,
>   return false;
>if (lookup_attribute ("jli_fixed", attrs))
>   return false;
> +  if (lookup_attribute ("secure_call", attrs))
> + return false;
>  }
>  
>/* Everything else is ok.  */
> @@ -7594,46 +7661,6 @@ arc_reorg_loops (void)
>reorg_loops (true, &arc_doloop_hooks);
>  }
>  
> -/* Add the given function declaration to emit code in JLI section.  */
> -
> -static void
> -arc_add_jli_section (rtx pat)
> -{
> -  const char *name;
> -  tree attrs;
> -  arc_jli_section *sec = arc_jli_sections, *new_section;
> -  tree decl = SYMBOL_REF_DECL (pat);
> -
> -  if (!pat)
> -return;
> -
> -  if (decl)
> -{
> -  /* For fixed locations do not generate the jli table entry.  It
> -  should be provided by the user as an asm file.  */
> -  attrs = TYPE_ATTRIBUTES (TREE_TYPE (decl));
> -  if (lookup_attribute ("jli_fixed", attrs))
> - return;
> -}
> -
> -  name = XSTR (pat, 0);
> -
> -  /* Don't insert the same symbol twice.  */
> -  while (sec != NULL)
> -{
> -  if(strcmp (name, sec->name) == 0)
> - return;
> -  sec = sec->next;
> -}
> -
> -  /* New name, insert it.  */
> -  new_section = (arc_jli_section *) xmalloc (sizeof (arc_jli_section));
> -  gcc_assert (new_section != NULL);
> -  new_section->name = name;
> -  new_section->next = arc_jli_sections;
> -  arc_jli_sections = new_section;
> -}
> -
>  /* Scan all calls and add symbols to be emitted in the jli section if
> needed.  */
>  
> @@ -10968,6 +10995,63 @@ arc_handle_jli_attribute (tree *node 
> ATTRIBUTE_UNUSED,
> return NULL_TREE;
>  }
>  
> +/* Handle a "secure_call" attribute; arguments as in struct
> +   attribute_spec.handler.  */
> +
> +static tree
> +arc_handle_secure_attribute (tree *node ATTRIBUTE_UNUSED,
> +   tree name, tree args, int,
> +   bool *no_add_attrs)
> +{
> +  if (!TARGET_EM)
> +{
> +  warning (OPT_Wattributes,
> +"%qE attribute only valid for ARC EM architecture",
> +name);
> +  *no_add_attrs = true;
> +}
> +
> +  if (args == NULL_TREE)
> +{
> +  warning (OPT_Wattributes,
> +"argument of %qE attribute is missing",
> +name);
> +  *no_add_attrs = true;
> +}
> +  else
> +{
> +  if (TREE_CODE (TREE_VALUE (args)) == NON_LVALUE_EXPR)
> + TREE_VALUE (args) = TREE_OPERAND (TREE_VALUE (args), 0);
> +  tree arg = TREE_VALUE (args);
> +  if (TREE_CODE (arg) != INTEGER_CST)
> + {
> +   warning

[C++ Patch] PR 81054 ("[7/8 Regression] ICE with volatile variable in constexpr function")

2018-01-16 Thread Paolo Carlini


Hi,

in this error-recovery regression we ICE when we end up in an
inconsistent state after a meaningful diagnostic is emitted by
ensure_literal_type_for_constexpr_object and then a redundant /
slightly misleading one is emitted by check_static_variable_definition. I
think we can just return early from cp_finish_decl and solve both the primary
and the secondary issue. I also checked that clang doesn't emit an
error for line #28 of constexpr-diag3.C either, after the hard error for co1
itself at line #27. Tested x86_64-linux.


Thanks, Paolo.

//

/cp
2018-01-16  Paolo Carlini  

PR c++/81054
* decl.c (cp_finish_decl): Early return when the
ensure_literal_type_for_constexpr_object fails.

/testsuite
2018-01-16  Paolo Carlini  

PR c++/81054
* g++.dg/cpp0x/constexpr-ice19.C: New.
* g++.dg/cpp0x/constexpr-diag3.C: Adjust.
Index: cp/decl.c
===
--- cp/decl.c   (revision 256728)
+++ cp/decl.c   (working copy)
@@ -6811,7 +6811,11 @@ cp_finish_decl (tree decl, tree init, bool init_co
 }
 
   if (!ensure_literal_type_for_constexpr_object (decl))
-DECL_DECLARED_CONSTEXPR_P (decl) = 0;
+{
+  DECL_DECLARED_CONSTEXPR_P (decl) = 0;
+  TREE_TYPE (decl) = error_mark_node;
+  return;
+}
 
   if (VAR_P (decl)
   && DECL_CLASS_SCOPE_P (decl)
Index: testsuite/g++.dg/cpp0x/constexpr-diag3.C
===
--- testsuite/g++.dg/cpp0x/constexpr-diag3.C(revision 256728)
+++ testsuite/g++.dg/cpp0x/constexpr-diag3.C(working copy)
@@ -25,7 +25,7 @@ struct complex// { dg-message "no 
.constexpr.
 };
 
 constexpr complex co1(0, 1);  // { dg-error "not literal" }
-constexpr double dd2 = co1.real(); // { dg-error "|in .constexpr. expansion of " }
+constexpr double dd2 = co1.real();
 
 // 
 
Index: testsuite/g++.dg/cpp0x/constexpr-ice19.C
===
--- testsuite/g++.dg/cpp0x/constexpr-ice19.C(nonexistent)
+++ testsuite/g++.dg/cpp0x/constexpr-ice19.C(working copy)
@@ -0,0 +1,13 @@
+// PR c++/81054
+// { dg-do compile { target c++11 } }
+
+struct A
+{
+  volatile int i;
+  constexpr A() : i() {}
+};
+
+struct B
+{
+  static constexpr A a {};  // { dg-error "not literal" }
+};

Re: [PATCH PR82096] Fix ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi

2018-01-16 Thread Sudakshina Das


Hi Jeff

On 12/01/18 23:00, Jeff Law wrote:

On 01/12/2018 01:45 AM, Christophe Lyon wrote:

Hi,

On 11 January 2018 at 11:58, Sudakshina Das  wrote:

Hi Jeff


On 10/01/18 21:08, Jeff Law wrote:


On 01/10/2018 09:25 AM, Sudakshina Das wrote:


Hi Jeff

On 10/01/18 10:44, Sudakshina Das wrote:


Hi Jeff

On 09/01/18 23:43, Jeff Law wrote:


On 01/05/2018 12:25 PM, Sudakshina Das wrote:


Hi Jeff

On 05/01/18 18:44, Jeff Law wrote:


On 01/04/2018 08:35 AM, Sudakshina Das wrote:


Hi

The bug reported a particular test di-longlong64-sync-1.c failing when
run on arm-linux-gnueabi with options -mthumb -march=armv5t -O[g,1,2,3]
and -mthumb -march=armv6 -O[g,1,2,3].

According to what I could see, the crash was caused because of the
explicit VOIDmode argument that was sent to emit_store_flag_force ().
Since the comparing argument was a long long, it was being forced into a
VOID type register before the comparison (in prepare_cmp_insn()) is done.

As pointed out by Kyrill, there is a comment on emit_store_flag() which
says "MODE is the mode to use for OP0 and OP1 should they be CONST_INTs.
If it is VOIDmode, they cannot both be CONST_INT". This condition is
not true in this case and thus I think it is suitable to change the
argument.

Testing done: Checked for regressions on bootstrapped
arm-none-linux-gnueabi and arm-none-linux-gnueabihf and added new test
cases.

Sudi

ChangeLog entries:

*** gcc/ChangeLog ***

2018-01-04  Sudakshina Das  

PR target/82096
* optabs.c (expand_atomic_compare_and_swap): Change argument
to emit_store_flag_force.

*** gcc/testsuite/ChangeLog ***

2018-01-04  Sudakshina Das  

PR target/82096
* gcc.c-torture/compile/pr82096-1.c: New test.
* gcc.c-torture/compile/pr82096-2.c: Likewise.


In the case where both (op0/op1) to
emit_store_flag/emit_store_flag_force are constants, don't we know the
result of the comparison and shouldn't we have optimized the store flag
to something simpler?

I feel like I must be missing something here.



emit_store_flag_force () is comparing a register to op0.


?
/* Emit a store-flags instruction for comparison CODE on OP0 and OP1
   and storing in TARGET.  Normally return TARGET.
   Return 0 if that cannot be done.

   MODE is the mode to use for OP0 and OP1 should they be CONST_INTs.  If
   it is VOIDmode, they cannot both be CONST_INT.


So we're comparing op0 and op1 AFAICT.  One, but not both can be a
CONST_INT.  If both are a CONST_INT, then you need to address the
problem in the caller (by optimizing away the condition).  If you've got
a REG and a CONST_INT, then the mode should be taken from the REG
operand.







The 2 constant arguments are to the expand_atomic_compare_and_swap ()
function. emit_store_flag_force () is used in case when this function is
called by the bool variant of the built-in function where the bool
return value is computed by comparing the result register with the
expected op0.


So if only one of the two objects is a CONST_INT, then the mode should
come from the other object.  I think that's the fundamental problem here
and that you're just papering over it by changing the caller.


I think my earlier explanation was a bit misleading and I may have
rushed into quoting the comment about both operands being const for
emit_store_flag_force(). The problem is with the function, and I do
agree that your suggestion of changing the function to add the code
below is a better approach than changing the caller. I will
change the patch and test it.
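
The idea discussed here (make a lone constant the second operand, then take the mode from the first operand when none was given) can be sketched with toy types. Everything below is a hypothetical model, not GCC's rtx, machine_mode or rtx_code, and it is not the code that went into expmed.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for machine modes, comparison codes and operands.  */
enum toy_mode { TOY_VOIDMODE, TOY_SIMODE, TOY_DIMODE };
enum toy_cmp { TOY_LT, TOY_GT, TOY_LE, TOY_GE, TOY_EQ, TOY_NE };

struct toy_operand
{
  bool is_const_int;
  enum toy_mode mode;   /* TOY_VOIDMODE for constants.  */
};

/* Return the comparison that gives the same result once the two
   operands are exchanged.  */
enum toy_cmp
toy_swap_condition (enum toy_cmp code)
{
  switch (code)
    {
    case TOY_LT: return TOY_GT;
    case TOY_GT: return TOY_LT;
    case TOY_LE: return TOY_GE;
    case TOY_GE: return TOY_LE;
    default: return code;   /* EQ and NE are symmetric.  */
    }
}

/* Put a lone constant second and, if MODE is TOY_VOIDMODE, take the
   mode from the first operand.  Returns the mode to compare in.  */
enum toy_mode
toy_canonicalize (struct toy_operand *op0, struct toy_operand *op1,
                  enum toy_cmp *code, enum toy_mode mode)
{
  if (op0->is_const_int && !op1->is_const_int)
    {
      struct toy_operand tmp = *op0;
      *op0 = *op1;
      *op1 = tmp;
      *code = toy_swap_condition (*code);
    }
  if (mode == TOY_VOIDMODE)
    mode = op0->mode;
  return mode;
}
```

In this model, comparing a constant against a DImode register with TOY_LT becomes a TOY_GT comparison of the register against the constant, performed in DImode rather than VOIDmode.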



This is the updated patch according to your suggestions.

Testing: Checked for regressions on arm-none-linux-gnueabihf and added
new test case.

Thanks
Sudi

ChangeLog entries:

*** gcc/ChangeLog ***

2018-01-10  Sudakshina Das  

  PR target/82096
  * expmed.c (emit_store_flag_force): Swap if const op0
  and change VOIDmode to mode of op0.

*** gcc/testsuite/ChangeLog ***

2018-01-10  Sudakshina Das  

  PR target/82096
  * gcc.c-torture/compile/pr82096.c: New test.


OK.



Thanks. Committed as r256526.
Sudi



Could you add a guard like in other tests to skip it if the user added
-mfloat-abi=XXX when running the tests?

For instance, I have a configuration where I add
-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard
and the new test fails because:
xgcc: error: -mfloat-abi=soft and -mfloat-abi=hard may not be used together

It's starting to feel like the test should move into gcc.target/arm :-)
  I nearly suggested that already.  Consider moving it into
gcc.target/arm pre-approved along with adding the -O
to the options and whatever is needed to skip the test at the
appropriate time.


My initial thought was also to put the test in gcc.target/arm. But I 
wanted to put it in a torture suite as this was failing at different 
optimization levels. Creating several tests for different optimization 
levels or a new torture suite just for this test did not look like the 
better opt

Re: [PATCH 3/6] [ARC] Add support for "register file 16" reduced register set

2018-01-16 Thread Andrew Burgess

* Claudiu Zissulescu  [2017-11-02 13:30:32 +0100]:

> gcc/
> 2017-03-20  Claudiu Zissulescu  
> 
>   * config/arc/arc-arches.def: Option mrf16 valid for all
>   architectures.
>   * config/arc/arc-c.def (__ARC_RF16__): New predefined macro.
>   * config/arc/arc-cpus.def (em_mini): New cpu with rf16 on.
>   * config/arc/arc-options.def (FL_RF16): Add mrf16 option.
>   * config/arc/arc-tables.opt: Regenerate.
>   * config/arc/arc.c (arc_conditional_register_usage): Handle
>   reduced register file case.
>   (arc_file_start): Set must have build attributes.
>   * config/arc/arc.h (MAX_ARC_PARM_REGS): Conditional define using
>   mrf16 option value.
>   * config/arc/arc.opt (mrf16): Add new option.
>   * config/arc/elf.h (ATTRIBUTE_PCS): Define.
>   * config/arc/genmultilib.awk: Handle new mrf16 option.
>   * config/arc/linux.h (ATTRIBUTE_PCS): Define.
>   * config/arc/t-multilib: Regenerate.
>   * doc/invoke.texi (ARC Options): Document mrf16 option.
> 
> gcc/testsuite/
> 2017-03-20  Claudiu Zissulescu  
> 
>   * gcc.dg/builtin-apply2.c: Change for the ARC's reduced register
>   set file case.
> 
> libgcc/
> 2017-09-18  Claudiu Zissulescu  
> 
>   * config/arc/lib1funcs.S (__udivmodsi4): Use safe version for RF16
>   option.
>   (__divsi3): Use RF16 safe registers.
>   (__modsi3): Likewise.

Looks fine, except I think that the new 'em_mini' cpu needs to be
added to the -mcpu= description in doc/invoke.texi.
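
As an aside, user code could key off the new predefined macro from arc-c.def. A minimal sketch, assuming __ARC_RF16__ is defined only when the reduced register file is selected; the function name is made up:

```c
#include <assert.h>

/* Return 1 if this translation unit was built for the reduced
   16-entry register file, 0 otherwise.  On a non-ARC host the macro
   is simply undefined.  */
int
built_for_rf16 (void)
{
#ifdef __ARC_RF16__
  return 1;
#else
  return 0;
#endif
}
```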

Thanks,
Andrew




> ---
>  gcc/config/arc/arc-arches.def |  8 
>  gcc/config/arc/arc-c.def  |  1 +
>  gcc/config/arc/arc-cpus.def   |  1 +
>  gcc/config/arc/arc-options.def|  2 +-
>  gcc/config/arc/arc-tables.opt |  3 +++
>  gcc/config/arc/arc.c  | 27 +++
>  gcc/config/arc/arc.h  |  2 +-
>  gcc/config/arc/arc.opt|  4 
>  gcc/config/arc/elf.h  |  4 
>  gcc/config/arc/genmultilib.awk|  2 ++
>  gcc/config/arc/linux.h|  9 +
>  gcc/config/arc/t-multilib |  4 ++--
>  gcc/doc/invoke.texi   |  8 +++-
>  gcc/testsuite/gcc.dg/builtin-apply2.c |  8 +++-
>  libgcc/config/arc/lib1funcs.S | 22 +++---
>  15 files changed, 84 insertions(+), 21 deletions(-)
> 
> diff --git a/gcc/config/arc/arc-arches.def b/gcc/config/arc/arc-arches.def
> index 29cb9c4..a0d585b 100644
> --- a/gcc/config/arc/arc-arches.def
> +++ b/gcc/config/arc/arc-arches.def
> @@ -40,15 +40,15 @@
>  
>  ARC_ARCH ("arcem", em, FL_MPYOPT_1_6 | FL_DIVREM | FL_CD | FL_NORM   \
> | FL_BS | FL_SWAP | FL_FPUS | FL_SPFP | FL_DPFP   \
> -   | FL_SIMD | FL_FPUDA | FL_QUARK, 0)
> +   | FL_SIMD | FL_FPUDA | FL_QUARK | FL_RF16, 0)
>  ARC_ARCH ("archs", hs, FL_MPYOPT_7_9 | FL_DIVREM | FL_NORM | FL_CD   \
> | FL_ATOMIC | FL_LL64 | FL_BS | FL_SWAP   \
> -   | FL_FPUS | FL_FPUD,  \
> +   | FL_FPUS | FL_FPUD | FL_RF16,\
> FL_CD | FL_ATOMIC | FL_BS | FL_NORM | FL_SWAP)
>  ARC_ARCH ("arc6xx", 6xx, FL_BS | FL_NORM | FL_SWAP | FL_MUL64 | FL_MUL32x16 \
> -   | FL_SPFP | FL_ARGONAUT | FL_DPFP, 0)
> +   | FL_SPFP | FL_ARGONAUT | FL_DPFP | FL_RF16, 0)
>  ARC_ARCH ("arc700", 700, FL_ATOMIC | FL_BS | FL_NORM | FL_SWAP | FL_EA \
> -   | FL_SIMD | FL_SPFP | FL_ARGONAUT | FL_DPFP, \
> +   | FL_SIMD | FL_SPFP | FL_ARGONAUT | FL_DPFP | FL_RF16,   \
> FL_BS | FL_NORM | FL_SWAP)
>  
>  /* Local Variables: */
> diff --git a/gcc/config/arc/arc-c.def b/gcc/config/arc/arc-c.def
> index 8c5097e..c9443c9 100644
> --- a/gcc/config/arc/arc-c.def
> +++ b/gcc/config/arc/arc-c.def
> @@ -28,6 +28,7 @@ ARC_C_DEF ("__ARC_NORM__",  TARGET_NORM)
>  ARC_C_DEF ("__ARC_MUL64__",  TARGET_MUL64_SET)
>  ARC_C_DEF ("__ARC_MUL32BY16__", TARGET_MULMAC_32BY16_SET)
>  ARC_C_DEF ("__ARC_SIMD__",   TARGET_SIMD_SET)
> +ARC_C_DEF ("__ARC_RF16__",   TARGET_RF16)
>  
>  ARC_C_DEF ("__ARC_BARREL_SHIFTER__", TARGET_BARREL_SHIFTER)
>  
> diff --git a/gcc/config/arc/arc-cpus.def b/gcc/config/arc/arc-cpus.def
> index 60b4045..c2b0062 100644
> --- a/gcc/config/arc/arc-cpus.def
> +++ b/gcc/config/arc/arc-cpus.def
> @@ -46,6 +46,7 @@
> TUNETune value for the given configuration, otherwise NONE.  */
>  
>  ARC_CPU (em, em, 0, NONE)
> +ARC_CPU (em_mini,   em, FL_RF16, NONE)
>  ARC_CPU (arcem,  em, FL_MPYOPT_2|FL_CD|FL_BS, NONE)
>  ARC_CPU (em4,em, FL_CD, NONE)
>  ARC_CPU (em4_dmips, em, FL_MPYOPT_2|FL_CD|FL_DIVREM|FL_NORM|FL_SWAP|FL_BS, 
> NONE)
> diff --git a/gcc/config/arc/arc-options.def b/gcc/config/arc/arc-options.def
> index be51614..8fc7b50 100644
> --- a/gcc/config/arc/arc-options.def
> +++ b/gcc/config/arc/arc-options.def
> @@ -60,7 +60,7 @@
>  ARC_OPT (FL_CD,(1UL

Move pa.h FUNCTION_ARG_SIZE to pa.c (PR83858)

2018-01-16 Thread Richard Sandiford

The port-local FUNCTION_ARG_SIZE:

  ((((MODE) != BLKmode \
 ? (HOST_WIDE_INT) GET_MODE_SIZE (MODE) \
 : int_size_in_bytes (TYPE)) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)

is used by code in pa.c and by ASM_DECLARE_FUNCTION_NAME in som.h.
Treating GET_MODE_SIZE as a constant is OK for the former but not
the latter, which is used in target-independent code.  This caused
a build failure on hppa2.0w-hp-hpux11.11.
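
The arithmetic in the macro is an ordinary round-up to whole words. A minimal sketch with plain integers, assuming long in place of HOST_WIDE_INT and an explicit word-size parameter in place of UNITS_PER_WORD; arg_size_in_words is a hypothetical helper, not the pa_function_arg_size added by the patch:

```c
#include <assert.h>

/* Round SIZE_IN_BYTES up to whole words of UNITS_PER_WORD bytes.  */
long
arg_size_in_words (long size_in_bytes, long units_per_word)
{
  return (size_in_bytes + units_per_word - 1) / units_per_word;
}
```

For example, with 4-byte words a 5-byte argument occupies two words and a 4-byte argument occupies one.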

Tested with a cross build of hppa2.0w-hp-hpux11.11.  OK to install?

Richard


2018-01-16  Richard Sandiford  

gcc/
PR target/83858
* config/pa/pa.h (FUNCTION_ARG_SIZE): Delete.
* config/pa/pa-protos.h (pa_function_arg_size): Declare.
* config/pa/som.h (ASM_DECLARE_FUNCTION_NAME): Use
pa_function_arg_size instead of FUNCTION_ARG_SIZE.
* config/pa/pa.c (pa_function_arg_advance): Likewise.
(pa_function_arg, pa_arg_partial_bytes): Likewise.
(pa_function_arg_size): New function.

Index: gcc/config/pa/pa.h
===
--- gcc/config/pa/pa.h  2018-01-03 11:12:55.202783713 +
+++ gcc/config/pa/pa.h  2018-01-16 10:50:31.245063090 +
@@ -592,15 +592,6 @@ #define INIT_CUMULATIVE_INCOMING_ARGS(CU
   (CUM).indirect = 0,  \
   (CUM).nargs_prototype = 1000
 
-/* Figure out the size in words of the function argument.  The size
-   returned by this macro should always be greater than zero because
-   we pass variable and zero sized objects by reference.  */
-
-#define FUNCTION_ARG_SIZE(MODE, TYPE)  \
-  ((((MODE) != BLKmode \
- ? (HOST_WIDE_INT) GET_MODE_SIZE (MODE) \
- : int_size_in_bytes (TYPE)) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
-
 /* Determine where to put an argument to a function.
Value is zero to push the argument on the stack,
or a hard register in which to store the argument.
Index: gcc/config/pa/pa-protos.h
===
--- gcc/config/pa/pa-protos.h   2018-01-03 11:12:55.198783870 +
+++ gcc/config/pa/pa-protos.h   2018-01-16 10:50:31.244063125 +
@@ -107,5 +107,6 @@ extern void pa_asm_output_aligned_local
 unsigned int);
 extern void pa_hpux_asm_output_external (FILE *, tree, const char *);
 extern HOST_WIDE_INT pa_initial_elimination_offset (int, int);
+extern HOST_WIDE_INT pa_function_arg_size (machine_mode, const_tree);
 
 extern const int pa_magic_milli[];
Index: gcc/config/pa/som.h
===
--- gcc/config/pa/som.h 2018-01-03 11:12:55.191784145 +
+++ gcc/config/pa/som.h 2018-01-16 10:50:31.246063055 +
@@ -136,8 +136,8 @@ #define ASM_DECLARE_FUNCTION_NAME(FILE,
 else   \
   {\
 int arg_size = \
-  FUNCTION_ARG_SIZE (TYPE_MODE (DECL_ARG_TYPE (parm)),\
- DECL_ARG_TYPE (parm));\
+  pa_function_arg_size (TYPE_MODE (DECL_ARG_TYPE (parm)),\
+DECL_ARG_TYPE (parm)); \
 /* Passing structs by invisible reference uses \
one general register.  */   \
 if (arg_size > 2   \
Index: gcc/config/pa/pa.c
===
--- gcc/config/pa/pa.c  2018-01-03 11:12:55.201783752 +
+++ gcc/config/pa/pa.c  2018-01-16 10:50:31.245063090 +
@@ -9485,7 +9485,7 @@ pa_function_arg_advance (cumulative_args
 const_tree type, bool named ATTRIBUTE_UNUSED)
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
-  int arg_size = FUNCTION_ARG_SIZE (mode, type);
+  int arg_size = pa_function_arg_size (mode, type);
 
   cum->nargs_prototype--;
   cum->words += (arg_size
@@ -9517,7 +9517,7 @@ pa_function_arg (cumulative_args_t cum_v
   if (mode == VOIDmode)
 return NULL_RTX;
 
-  arg_size = FUNCTION_ARG_SIZE (mode, type);
+  arg_size = pa_function_arg_size (mode, type);
 
   /* If this arg would be passed partially or totally on the stack, then
  this routine should return zero.  pa_arg_partial_bytes will
@@ -9724,10 +9724,10 @@ pa_arg_partial_bytes (cumulative_args_t
   if (!TARGET_64BIT)
 return 0;
 
-  if (FUNCTION_ARG_SIZE (mode, type) > 1 && (cum->words & 1))
+  if (pa_function_arg_size (mode, type) > 1 && (cum->words & 1))
 offset = 1;
 
-  if (cum->words + offset + FUNCTION_ARG_SIZE (mode, type) <= max_arg_words)
+  if (cum->words + offset + pa_function_arg_size (mode, type) <= max_arg_words)
 /* Arg fits fully into registers.  */
 return 0;
   else if (cum->words + offset >= max_arg_words)
@@ -10835,4 +10835,16 @@ pa_starting_frame_offset (void)
   r

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread H.J. Lu

On Tue, Jan 16, 2018 at 12:34 AM, Jan Hubicka  wrote:
>> On Mon, Jan 15, 2018 at 5:53 PM, H.J. Lu  wrote:
>> > On Mon, Jan 15, 2018 at 3:38 AM, H.J. Lu  wrote:
>> >> On Mon, Jan 15, 2018 at 12:31 AM, Richard Biener
>> >>  wrote:
>> >>> On Sun, Jan 14, 2018 at 4:08 PM, H.J. Lu  wrote:
>> >>>> Now my patch set has been checked into trunk.  Here is a patch set
>> >>>> to move struct ix86_frame to machine_function on GCC 7, which is
>> >>>> needed to backport the patch set to GCC 7:
>> >>>>
>> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01239.html
>> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01240.html
>> >>>> https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01241.html
>> >>>>
>> >>>> OK for gcc-7-branch?
>> >>>
>> >>> Yes, backporting is ok - please watch for possible fallout on trunk and 
>> >>> make
>> >>> sure to adjust the backport accordingly.  I plan to do GCC 7.3 RC1 on
>> >>> Wednesday now with the final release about a week later if no issue shows
>> >>> up.
>> >>>
>> >>
>> >> Backport is blocked by
>> >>
>> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
>> >>
>> >> There are many test failures due to lack of comdat support in linker on 
>> >> Solaris.
>> >> I can limit these tests to Linux.
>> >
>> > These are testcase issues and shouldn't block backport to GCC 7.
>>
>> It makes the option using thunks unusable though, right?  Can you simply make
>> them hidden on systems without comdat support?  That duplicates them per TU
>> but at least the feature works.  Or those systems should provide the thunks 
>> via
>> libgcc.
>>
>> I agree we can followup with a fix for Solaris given lack of a public
>> testing machine.
>
> My memory is a bit dim, but I am convinced I was fixing specific errors for
> comdats on Solaris, so I think the toolchain supports them in some sort, just
> is more restrictive/different from GNU implementation.
>
> Indeed, I think just producing a sorry, unimplemented message is what we
> should do if we can't support retpoline on given target.
>

It still works without comdat.  GCC just generates a local thunk in each object
file.

-- 
H.J.

Re: [PATCH] Fix warn_if_not_align ICE (PR c/83844)

2018-01-16 Thread Richard Sandiford

Jakub Jelinek  writes:
> On Tue, Jan 16, 2018 at 08:57:38AM +0100, Richard Biener wrote:
>> > -  unsigned HOST_WIDE_INT off
>> > -= (tree_to_uhwi (DECL_FIELD_OFFSET (field))
>> > -   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
>> > -  if ((off % warn_if_not_align) != 0)
>> > -warning (opt_w, "%q+D offset %wu in %qT isn't aligned to %u",
>> > +  tree off = byte_position (field);
>> > +  if (!multiple_of_p (TREE_TYPE (off), off, size_int (warn_if_not_align)))
>> 
>> multiple_of_p also returns 0 if it doesn't know (for the non-constant
>> case obviously), so the warning should say "may be not aligned"?  Or
>> we don't want any false positives which means multiple_of_p should get
>> a worker factored out that returns a tri-state value?
>
> tri-state sounds like optimizing for the very uncommon case, I think it must be
> very rare in practice when we could prove it must be not aligned and
> especially we'd need to extend it a lot to handle those cases.
>
> Here is an updated patch which says may not be aligned if off is
> non-constant.  When extending the testcase, I've noticed we don't handle
> IMHO quite important case in multiple_of_p, so the patch handles that too.
> I've tried not to increase asymptotic complexity of multiple_of_p, so except
> for the cases where both arguments are INTEGER_CSTs it shouldn't call
> multiple_of_p more times than before.
>
> Ok for trunk if this passes bootstrap/regtest?
>
> 2018-01-16  Jakub Jelinek  
>
>   PR c/83844
>   * stor-layout.c (handle_warn_if_not_align): Use byte_position and
>   multiple_of_p instead of unchecked tree_to_uhwi and UHWI check.
>   If off is not INTEGER_CST, issue a may not be aligned warning
>   rather than isn't aligned.  Use isn%'t rather than isn't.
>   * fold-const.c (multiple_of_p) <case BIT_AND_EXPR>: Don't fall through
>   into MULT_EXPR.
>   <case MULT_EXPR>: Improve the case when bottom and one of the
>   MULT_EXPR operands are INTEGER_CSTs and bottom is multiple of that
>   operand, in that case check if the other operand is multiple of
>   bottom divided by the INTEGER_CST operand.
>
>   * gcc.dg/pr83844.c: New test.
>
> --- gcc/stor-layout.c.jj  2018-01-15 22:40:14.009263280 +0100
> +++ gcc/stor-layout.c 2018-01-16 10:01:48.135111031 +0100
> @@ -1150,12 +1150,16 @@ handle_warn_if_not_align (tree field, un
>  warning (opt_w, "alignment %u of %qT is less than %u",
>record_align, context, warn_if_not_align);
>  
> -  unsigned HOST_WIDE_INT off
> -= (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> -   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
> -  if ((off % warn_if_not_align) != 0)
> -warning (opt_w, "%q+D offset %wu in %qT isn't aligned to %u",
> -  field, off, context, warn_if_not_align);
> +  tree off = byte_position (field);
> +  if (!multiple_of_p (TREE_TYPE (off), off, size_int (warn_if_not_align)))
> +{
> +  if (TREE_CODE (off) == INTEGER_CST)
> + warning (opt_w, "%q+D offset %E in %qT isn%'t aligned to %u",
> +  field, off, context, warn_if_not_align);
> +  else
> + warning (opt_w, "%q+D offset %E in %qT may not be aligned to %u",
> +  field, off, context, warn_if_not_align);
> +}
>  }
>  
>  /* Called from place_field to handle unions.  */
> --- gcc/fold-const.c.jj   2018-01-15 10:02:04.119181355 +0100
> +++ gcc/fold-const.c  2018-01-16 10:48:10.444360796 +0100
> @@ -12595,9 +12595,34 @@ multiple_of_p (tree type, const_tree top
>a multiple of BOTTOM then TOP is a multiple of BOTTOM.  */
>if (!integer_pow2p (bottom))
>   return 0;
> -  /* FALLTHRU */
> +  return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom)
> +   || multiple_of_p (type, TREE_OPERAND (top, 0), bottom));
>  
>  case MULT_EXPR:
> +  if (TREE_CODE (bottom) == INTEGER_CST)
> + {
> +   op1 = TREE_OPERAND (top, 0);
> +   op2 = TREE_OPERAND (top, 1);
> +   if (TREE_CODE (op1) == INTEGER_CST)
> + std::swap (op1, op2);
> +   if (TREE_CODE (op2) == INTEGER_CST)
> + {
> +   if (multiple_of_p (type, op2, bottom))
> + return 1;
> +   /* Handle multiple_of_p ((x * 2 + 2) * 4, 8).  */
> +   if (multiple_of_p (type, bottom, op2))
> + {
> +   widest_int w = wi::sdiv_trunc (wi::to_widest (bottom),
> +  wi::to_widest (op2));
> +   if (wi::fits_to_tree_p (w, TREE_TYPE (bottom)))
> + {
> +   op2 = wide_int_to_tree (TREE_TYPE (bottom), w);
> +   return multiple_of_p (type, op1, op2);
> + }
> + }

It doesn't really matter since this isn't performance-critical code,
but FWIW, there's a wi::multiple_of_p that would avoid the recursion
and do the sdiv_trunc as a side-effect.

Thanks,
Richard
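
The divisibility rule that the new MULT_EXPR case relies on can be modelled with plain integers. This is a simplified sketch, not GCC code: the real multiple_of_p works symbolically on trees, and `product_is_multiple_of` is a hypothetical name. It returns true only when the rule can show that op1 * c is a multiple of bottom; false means "not shown", not "never a multiple".

```cpp
#include <cassert>

// Simplified integer model of the rule: is op1 * c a multiple of bottom?
bool product_is_multiple_of (long op1, long c, long bottom)
{
  assert (c != 0 && bottom != 0);
  // The constant operand alone is already a multiple of bottom.
  if (c % bottom == 0)
    return true;
  // bottom is a multiple of c: it is enough for the other operand to be
  // a multiple of bottom / c, e.g. ((x * 2 + 2) * 4, 8) reduces to
  // (x * 2 + 2, 2).
  if (bottom % c == 0)
    return op1 % (bottom / c) == 0;
  // Neither condition applies; the rule cannot decide.
  return false;
}
```

For x = 3 the example from the patch comment becomes `product_is_multiple_of (8, 4, 8)`, which reduces to checking that 8 is a multiple of 2.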

Re: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread Kyrill Tkachov


Hi Tamar,

On 16/01/18 10:04, Tamar Christina wrote:

Hi All,

This patch updates the GCC 8 release notes for ARM and AArch64.

Ok for cvs?

Thanks,
Tamar

--



+  
+  
+New Armv8.4-A FP16 Floating Point Multiplication Variant instructions have 
been added.  These instructions are
+mandatory in Armv8.4-A but available as an optional extension to Armv8.2-A 
and Armv8.3-A.  The new extension
+can be used by specifying the +fp16fml architectural 
extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
+the instructions can be enabled by specifying +fp16.
+  
+  
+   Support has been added for the following processors
+   (GCC identifiers in parentheses):
+   
+Arm Cortex-A75 (cortex-a75).
+Arm Cortex-A55 (cortex-a55).
+Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE 
(cortex-a75.cortex-a55).
+Arm Cortex-R52 for Armv8-R (cortex-r52).
+   
+   The GCC identifiers can be used
+   as arguments to the -mcpu or -mtune options,
+   for example: -mcpu=cortex-a75 or
+   -mtune=xgene1 or as arguments to the equivalent target

xgene1 was added a few releases ago, better to use one of the new additions 
from the above list.
For example -mtune=cortex-r52.

With that nit the arm changes look ok to me.
Thanks for compiling this!
Kyrill

[PATCH] i386: More use reference of struct ix86_frame to avoid copy

2018-01-16 Thread H.J. Lu

This patch has been used with my Spectre backport for GCC 7 for many
weeks and has been checked into GCC 7 branch.  Should I revert it on
GCC 7 branch or check it into trunk?

H.J.
---
When there is no need to make a copy of ix86_frame, we can use a reference
to struct ix86_frame to avoid the copy.

* config/i386/i386.c (ix86_expand_prologue): Use reference of
struct ix86_frame.
(ix86_expand_epilogue): Likewise.
---
 gcc/config/i386/i386.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index bfb31db8752..9eba3ffd5d6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13385,7 +13385,6 @@ ix86_expand_prologue (void)
 {
   struct machine_function *m = cfun->machine;
   rtx insn, t;
-  struct ix86_frame frame;
   HOST_WIDE_INT allocate;
   bool int_registers_saved;
   bool sse_registers_saved;
@@ -13413,7 +13412,7 @@ ix86_expand_prologue (void)
   m->fs.sp_valid = true;
   m->fs.sp_realigned = false;
 
-  frame = m->frame;
+  struct ix86_frame &frame = cfun->machine->frame;
 
   if (!TARGET_64BIT && ix86_function_ms_hook_prologue (current_function_decl))
 {
@@ -14291,7 +14290,6 @@ ix86_expand_epilogue (int style)
 {
   struct machine_function *m = cfun->machine;
   struct machine_frame_state frame_state_save = m->fs;
-  struct ix86_frame frame;
   bool restore_regs_via_mov;
   bool using_drap;
   bool restore_stub_is_tail = false;
@@ -14304,7 +14302,7 @@ ix86_expand_epilogue (int style)
 }
 
   ix86_finalize_stack_frame_flags ();
-  frame = m->frame;
+  struct ix86_frame &frame = cfun->machine->frame;
 
   m->fs.sp_realigned = stack_realign_fp;
   m->fs.sp_valid = stack_realign_fp
-- 
2.14.3
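
The behavioural difference between the old copy and the new reference can be sketched with stand-in types. This is a minimal illustration under assumptions: `frame_info` and `machine` are hypothetical simplifications of struct ix86_frame and struct machine_function, and the update of 16 is arbitrary. A copy is a snapshot, whereas a reference observes any later change to the underlying object.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the real structures.
struct frame_info { long stack_size; };
struct machine { frame_info frame; };

// Snapshot semantics: later updates to m.frame are not visible through f.
long read_via_copy (machine &m)
{
  frame_info f = m.frame;
  m.frame.stack_size += 16;
  return f.stack_size;
}

// Alias semantics: f names m.frame itself, so the update is visible.
long read_via_reference (machine &m)
{
  frame_info &f = m.frame;
  m.frame.stack_size += 16;
  return f.stack_size;
}
```

Whether this distinction matters for the prologue and epilogue code depends on whether the frame is recomputed while the local name is still in use, which the sketch does not try to answer.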

Avoid GCC 4.1 build failure in fold-const.c

2018-01-16 Thread Richard Sandiford

We had:

  tree t = fold_vec_perm (type, arg1, arg2,
  vec_perm_indices (sel, 2, nelts));

where fold_vec_perm takes a const vec_perm_indices &.  GCC 4.1 apparently
required a public copy constructor:

gcc/vec-perm-indices.h:85: error: 'vec_perm_indices::vec_perm_indices(const vec_perm_indices&)' is private
gcc/fold-const.c:11410: error: within this context

even though no copy should be made here.  This patch tries to work
around that by constructing the vec_perm_indices separately.

Tested on aarch64-linux-gnu.  OK to install?

Richard


2018-01-16  Richard Sandiford  

gcc/
* fold-const.c (fold_ternary_loc): Construct the vec_perm_indices
in a separate statement.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c2018-01-15 12:38:28.967896418 +
+++ gcc/fold-const.c2018-01-16 12:08:10.08501 +
@@ -11406,8 +11406,8 @@ fold_ternary_loc (location_t loc, enum t
  else /* Currently unreachable.  */
return NULL_TREE;
}
- tree t = fold_vec_perm (type, arg1, arg2,
- vec_perm_indices (sel, 2, nelts));
+ vec_perm_indices indices (sel, 2, nelts);
+ tree t = fold_vec_perm (type, arg1, arg2, indices);
  if (t != NULL_TREE)
return t;
}
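
The shape of the workaround can be shown with a small non-copyable class. This is a sketch under assumptions: `indices` and `count` are hypothetical stand-ins for vec_perm_indices and fold_vec_perm, and the claim that older compilers demanded an accessible copy constructor when binding a temporary to a const reference reflects the pre-C++11 rules as generally understood, not something verified against GCC 4.1 here.

```cpp
#include <cassert>

// Hypothetical non-copyable class in the C++98 style: the copy
// constructor and assignment operator are private and never defined.
class indices
{
public:
  explicit indices (int n) : n_ (n) { assert (n >= 0); }
  int size () const { return n_; }

private:
  indices (const indices &);
  void operator= (const indices &);
  int n_;
};

// Takes the object by const reference, like fold_vec_perm.
int count (const indices &i) { return i.size (); }

// Workaround shape: construct a named object first, then pass it.
// Binding a named lvalue never needs the copy constructor, whereas
// count (indices (4)) could be rejected by old compilers that checked
// copy-constructor accessibility for temporaries.
int workaround ()
{
  indices idx (4);
  return count (idx);
}
```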

Re: Avoid GCC 4.1 build failure in fold-const.c

2018-01-16 Thread Jakub Jelinek

On Tue, Jan 16, 2018 at 12:11:28PM +, Richard Sandiford wrote:
> We had:
> 
> tree t = fold_vec_perm (type, arg1, arg2,
> vec_perm_indices (sel, 2, nelts));
> 
> where fold_vec_perm takes a const vec_perm_indices &.  GCC 4.1 apparently
> required a public copy constructor:
> 
> gcc/vec-perm-indices.h:85: error: 'vec_perm_indices::vec_perm_indices(const 
> vec_perm_indices&)' is private
> gcc/fold-const.c:11410: error: within this context
> 
> even though no copy should be made here.  This patch tries to work
> around that by constructing the vec_perm_indices separately.
> 
> Tested on aarch64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2018-01-16  Richard Sandiford  
> 
> gcc/
>   * fold-const.c (fold_ternary_loc): Construct the vec_perm_indices
>   in a separate statement.

Ok, thanks.

> Index: gcc/fold-const.c
> ===
> --- gcc/fold-const.c  2018-01-15 12:38:28.967896418 +
> +++ gcc/fold-const.c  2018-01-16 12:08:10.08501 +
> @@ -11406,8 +11406,8 @@ fold_ternary_loc (location_t loc, enum t
> else /* Currently unreachable.  */
>   return NULL_TREE;
>   }
> -   tree t = fold_vec_perm (type, arg1, arg2,
> -   vec_perm_indices (sel, 2, nelts));
> +   vec_perm_indices indices (sel, 2, nelts);
> +   tree t = fold_vec_perm (type, arg1, arg2, indices);
> if (t != NULL_TREE)
>   return t;
>   }

Jakub

Re: [PATCH v3, rs6000] Add -mspeculate-indirect-jumps option and implement non-speculating bctr / bctrl

2018-01-16 Thread Segher Boessenkool

Hi!

On Tue, Jan 16, 2018 at 09:29:13AM +0100, Richard Biener wrote:
> Did you consider simply removing the tablejump/casesi support so
> expansion always
> expands to a balanced tree?  At least if we have any knobs to tune we
> should probably
> tweak them away from the indirect jump using variants with
> -mno-speculate-indirect-jumps,
> right?

We can generate indirect jumps for other situations so this patch will
still be needed.

> Performance optimization, so shouldn't block this patch - I just
> thought I should probably
> mention this.

Yeah let's get this done first :-)


Segher

[PATCH, committed] Add myself to MAINTAINERS

2018-01-16 Thread Sebastian Perta

Hi,

Just added myself to MAINTAINERS (write after approval)

Best Regards,
Sebastian

Index: ChangeLog
===
--- ChangeLog(revision 256737)
+++ ChangeLog(working copy)
@@ -1,3 +1,7 @@
+2018-01-16  Sebastian Perta  
+
+* MAINTAINERS (write after approval): Add myself.
+
 2018-01-03  Jakub Jelinek  

 Update copyright years.
Index: MAINTAINERS
===
--- MAINTAINERS(revision 256737)
+++ MAINTAINERS(working copy)
@@ -535,6 +535,7 @@
 Devang Patel
 Andris Pavenis
 Fernando Pereira
+Sebastian Perta
 Sebastian Peryt
 Kaushik Phatak
 Nicolas Pitre




Re: [PATCH v3, rs6000] Add -mspeculate-indirect-jumps option and implement non-speculating bctr / bctrl

2018-01-16 Thread Segher Boessenkool

Hi!

On Mon, Jan 15, 2018 at 05:09:06PM -0600, Bill Schmidt wrote:
> @@ -12933,9 +12974,27 @@
>""
>  {
>if (TARGET_32BIT)
> -emit_jump_insn (gen_tablejumpsi (operands[0], operands[1]));
> +{
> +  if (rs6000_speculate_indirect_jumps)
> + emit_jump_insn (gen_tablejumpsi (operands[0], operands[1]));
> +  else
> + {
> +   rtx ccreg = gen_reg_rtx (CCmode);
> +   rtx jump = gen_tablejumpsi_nospec (operands[0], operands[1], ccreg);
> +   emit_jump_insn (jump);
> + }
> +}
>else
> -emit_jump_insn (gen_tablejumpdi (operands[0], operands[1]));
> +{
> +  if (rs6000_speculate_indirect_jumps)
> + emit_jump_insn (gen_tablejumpdi (operands[0], operands[1]));
> +  else
> + {
> +   rtx ccreg = gen_reg_rtx (CCmode);
> +   rtx jump = gen_tablejumpdi_nospec (operands[0], operands[1], ccreg);
> +   emit_jump_insn (jump);
> + }
> +}
>DONE;
>  })

This is easier to read if you swap the "if"s (put the
rs6000_speculate_indirect_jumps test on the outside).

Okay for trunk with or without such a change.  Also okay for the branches
after some testing (esp. on other ABIs, it is easy to break those together
with -mno-speculate-indirect-jumps since no one sane would use that
combo on purpose).

Thanks!


Segher

Re: [PATCH] i386: More use reference of struct ix86_frame to avoid copy

2018-01-16 Thread H.J. Lu

On Tue, Jan 16, 2018 at 3:40 AM, H.J. Lu  wrote:
> This patch has been used with my Spectre backport for GCC 7 for many
> weeks and has been checked into GCC 7 branch.  Should I revert it on
> GCC 7 branch or check it into trunk?

Ada build failed with this on trunk:

raised STORAGE_ERROR : stack overflow or erroneous memory access
make[5]: *** [/export/gnu/import/git/sources/gcc/gcc/ada/Make-generated.in:45:
ada/sinfo.h] Error 1

Let me revert it on gcc-7-branch.

H.J.
> H.J.
> ---
> When there is no need to make a copy of ix86_frame, we can use reference
> of struct ix86_frame to avoid copy.
>
> * config/i386/i386.c (ix86_expand_prologue): Use reference of
> struct ix86_frame.
> (ix86_expand_epilogue): Likewise.
> ---
>  gcc/config/i386/i386.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index bfb31db8752..9eba3ffd5d6 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -13385,7 +13385,6 @@ ix86_expand_prologue (void)
>  {
>struct machine_function *m = cfun->machine;
>rtx insn, t;
> -  struct ix86_frame frame;
>HOST_WIDE_INT allocate;
>bool int_registers_saved;
>bool sse_registers_saved;
> @@ -13413,7 +13412,7 @@ ix86_expand_prologue (void)
>m->fs.sp_valid = true;
>m->fs.sp_realigned = false;
>
> -  frame = m->frame;
> +  struct ix86_frame &frame = cfun->machine->frame;
>
>if (!TARGET_64BIT && ix86_function_ms_hook_prologue 
> (current_function_decl))
>  {
> @@ -14291,7 +14290,6 @@ ix86_expand_epilogue (int style)
>  {
>struct machine_function *m = cfun->machine;
>struct machine_frame_state frame_state_save = m->fs;
> -  struct ix86_frame frame;
>bool restore_regs_via_mov;
>bool using_drap;
>bool restore_stub_is_tail = false;
> @@ -14304,7 +14302,7 @@ ix86_expand_epilogue (int style)
>  }
>
>ix86_finalize_stack_frame_flags ();
> -  frame = m->frame;
> +  struct ix86_frame &frame = cfun->machine->frame;
>
>m->fs.sp_realigned = stack_realign_fp;
>m->fs.sp_valid = stack_realign_fp
> --
> 2.14.3
>



-- 
H.J.

[PATCH] PR libstdc++/83834 replace wildcard pattern in linker script

2018-01-16 Thread Jonathan Wakely


The soon-to-be-released binutils 2.30 makes a small change to how
lambda functions are demangled, which causes some unwanted symbols to
match a wildcard pattern in the GLIBCXX_3.4 version node of our linker
script. The only symbol that is supposed to match the pattern is
std::cerr so we should just name that explicitly. That prevents other
new symbols matching and being added to the old version.

See PR 83893 for the general problem, which we should fix later.

PR libstdc++/83834
* config/abi/pre/gnu.ver (GLIBCXX_3.4): Replace std::c[a-g]* wildcard
pattern with exact match for std::cerr.

Tested powerpc64le-linux with binutils 2.25.1-32.base.el7_4.1 and on
x86_64-linux with a binutils-2.3.0.0 snapshot from 2018-01-13.

Committed to trunk, backports to follow.


commit f8896e7451cd61008e0ceb0ac9a770d5cb77d85b
Author: Jonathan Wakely 
Date:   Tue Jan 16 12:01:36 2018 +

PR libstdc++/83834 replace wildcard pattern in linker script

PR libstdc++/83834
* config/abi/pre/gnu.ver (GLIBCXX_3.4): Replace std::c[a-g]* wildcard
pattern with exact match for std::cerr.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 774bedec9bc..5e66dc5cc3f 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -60,7 +60,7 @@ GLIBCXX_3.4 {
   std::basic_[t-z]*;
   std::ba[t-z]*;
   std::b[b-z]*;
-  std::c[a-g]*;
+  std::cerr;
 # std::char_traits;
 # std::c[i-z]*;
   std::c[i-n]*;
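
The effect of the wildcard can be approximated with POSIX fnmatch. This is a sketch under assumptions: the linker's version-script matching is treated here as ordinary glob matching on demangled names, `matches` is a hypothetical helper, and `std::call_me_maybe` is an invented symbol name used only to show an unintended match.

```cpp
#include <cassert>
#include <fnmatch.h>

// True if NAME matches the glob PATTERN (no special flags).
bool matches (const char *pattern, const char *name)
{
  assert (pattern && name);
  return fnmatch (pattern, name, 0) == 0;
}
```

With this helper, `std::c[a-g]*` accepts both `std::cerr` and the invented `std::call_me_maybe`, while the exact pattern `std::cerr` accepts only the intended symbol.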

Re: GCC 8.0.0 Status Report (2018-01-15), Trunk in Regression and Documentation fixes only mode

2018-01-16 Thread Segher Boessenkool

On Mon, Jan 15, 2018 at 09:21:07AM +0100, Richard Biener wrote:
> We're still in pretty bad shape regression-wise.  Please also take
> the opportunity to check the state of your favorite host/target
> combination to make sure building and testing works appropriately.

I tested building Linux (the kernel) for all supported architectures.
Everything builds (with my usual tweaks, link with libgcc etc.);
except x86_64 and sh have more problems in the kernel, and mips has
an ICE.  I'll open a PR for that one.

Segher

Re: [PATCH] Fix store-merging for ~ of bswap (PR tree-optimization/83843)

2018-01-16 Thread Christophe Lyon

On 15 January 2018 at 22:44, Jakub Jelinek  wrote:
> Hi!
>
> When using the bswap pass infrastructure, BIT_NOT_EXPRs aren't allowed in
> the middle, but due to the way process_store handles those it can appear
> around the value, which is something output_merged_store didn't handle.
>
> Fixed thusly, where we handle not just the case when the bswap (or nop)
> value needs inversion as whole, but also cases where only a few portions of
> it need xoring with some mask.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-01-15  Jakub Jelinek  
>
> PR tree-optimization/83843
> * gimple-ssa-store-merging.c
> (imm_store_chain_info::output_merged_store): Handle bit_not_p on
> store_immediate_info for bswap/nop orig_stores.
>
> * gcc.dg/store_merging_18.c: New test.
>
Hi Jakub,

I've noticed that this new test fails on arm, eg:
arm-none-linux-gnueabihf
--with-mode arm
--with-cpu cortex-a9
--with-fpu neon-fp16
FAIL: gcc.dg/store_merging_18.c scan-tree-dump-times store-merging
"Merging successful" 3 (found 0 times)

Do you want me to file a PR?

Christophe



> --- gcc/gimple-ssa-store-merging.c.jj   2018-01-04 00:43:17.629703230 +0100
> +++ gcc/gimple-ssa-store-merging.c  2018-01-15 12:29:14.105789381 +0100
> @@ -3619,6 +3619,15 @@ imm_store_chain_info::output_merged_stor
>   gimple_seq_add_stmt_without_update (&seq, stmt);
>   src = gimple_assign_lhs (stmt);
> }
> + inv_op = invert_op (split_store, 2, int_type, xor_mask);
> + if (inv_op != NOP_EXPR)
> +   {
> + stmt = gimple_build_assign (make_ssa_name (int_type),
> + inv_op, src, xor_mask);
> + gimple_set_location (stmt, loc);
> + gimple_seq_add_stmt_without_update (&seq, stmt);
> + src = gimple_assign_lhs (stmt);
> +   }
>   break;
> default:
>   src = ops[0];
> --- gcc/testsuite/gcc.dg/store_merging_18.c.jj  2018-01-15 12:43:49.607227365 +0100
> +++ gcc/testsuite/gcc.dg/store_merging_18.c 2018-01-15 12:43:24.882245004 +0100
> @@ -0,0 +1,51 @@
> +/* PR tree-optimization/83843 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-store-merging" } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 3 "store-merging" { target store_merge } } } */
> +
> +__attribute__((noipa)) void
> +foo (unsigned char *buf, unsigned char *tab)
> +{
> +  unsigned v = tab[1] ^ (tab[0] << 8);
> +  buf[0] = ~(v >> 8);
> +  buf[1] = ~v;
> +}
> +
> +__attribute__((noipa)) void
> +bar (unsigned char *buf, unsigned char *tab)
> +{
> +  unsigned v = tab[1] ^ (tab[0] << 8);
> +  buf[0] = (v >> 8);
> +  buf[1] = ~v;
> +}
> +
> +__attribute__((noipa)) void
> +baz (unsigned char *buf, unsigned char *tab)
> +{
> +  unsigned v = tab[1] ^ (tab[0] << 8);
> +  buf[0] = ~(v >> 8);
> +  buf[1] = v;
> +}
> +
> +int
> +main ()
> +{
> +  volatile unsigned char l1 = 0;
> +  volatile unsigned char l2 = 1;
> +  unsigned char buf[2];
> +  unsigned char tab[2] = { l1 + 1, l2 * 2 };
> +  foo (buf, tab);
> +  if (buf[0] != (unsigned char) ~1 || buf[1] != (unsigned char) ~2)
> +__builtin_abort ();
> +  buf[0] = l1 + 7;
> +  buf[1] = l2 * 8;
> +  bar (buf, tab);
> +  if (buf[0] != 1 || buf[1] != (unsigned char) ~2)
> +__builtin_abort ();
> +  buf[0] = l1 + 9;
> +  buf[1] = l2 * 10;
> +  baz (buf, tab);
> +  if (buf[0] != (unsigned char) ~1 || buf[1] != 2)
> +__builtin_abort ();
> +  return 0;
> +}
>
> Jakub

Re: [C++ Patch] PR 81054 ("[7/8 Regression] ICE with volatile variable in constexpr function")

2018-01-16 Thread Paolo Carlini

.. never mind, this requires more work: my simple patchlet would cause a
few regressions in the libstdc++-v3 testsuite (the assert at the
beginning of finish_expr_stmt triggers)


Paolo.

Two fixes for live-out SLP inductions (PR 83857)

2018-01-16 Thread Richard Sandiford

vect_analyze_loop_operations was calling vectorizable_live_operation
for all live-out phis, which led to a bogus ncopies calculation in
the pure SLP case.  I think v_a_l_o should only be passing phis
that are vectorised using normal loop vectorisation, since
vect_slp_analyze_node_operations handles the SLP side (and knows
the correct slp_index and slp_node arguments to pass in, via
vect_analyze_stmt).

With that fixed we hit an older bug that vectorizable_live_operation
didn't handle live-out SLP inductions.  Fixed by using gimple_phi_result
rather than gimple_get_lhs for phis.

Tested on aarch64-linux-gnu.  OK to install?

Richard


2018-01-16  Richard Sandiford  

gcc/
PR tree-optimization/83857
* tree-vect-loop.c (vect_analyze_loop_operations): Don't call
vectorizable_live_operation for pure SLP statements.
(vectorizable_live_operation): Handle PHIs.

gcc/testsuite/
PR tree-optimization/83857
* gcc.dg/vect/pr83857.c: New test.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-01-13 18:02:00.950360196 +
+++ gcc/tree-vect-loop.c2018-01-16 13:24:33.022528019 +
@@ -1851,7 +1851,10 @@ vect_analyze_loop_operations (loop_vec_i
ok = vectorizable_reduction (phi, NULL, NULL, NULL, NULL);
 }
 
- if (ok && STMT_VINFO_LIVE_P (stmt_info))
+ /* SLP PHIs are tested by vect_slp_analyze_node_operations.  */
+ if (ok
+ && STMT_VINFO_LIVE_P (stmt_info)
+ && !PURE_SLP_STMT (stmt_info))
ok = vectorizable_live_operation (phi, NULL, NULL, -1, NULL);
 
   if (!ok)
@@ -8217,7 +8220,11 @@ vectorizable_live_operation (gimple *stm
   gcc_assert (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo));
 
   /* Get the correct slp vectorized stmt.  */
-  vec_lhs = gimple_get_lhs (SLP_TREE_VEC_STMTS (slp_node)[vec_entry]);
+  gimple *vec_stmt = SLP_TREE_VEC_STMTS (slp_node)[vec_entry];
+  if (gphi *phi = dyn_cast <gphi *> (vec_stmt))
+   vec_lhs = gimple_phi_result (phi);
+  else
+   vec_lhs = gimple_get_lhs (vec_stmt);
 
   /* Get entry to use.  */
   bitstart = bitsize_int (vec_index);
Index: gcc/testsuite/gcc.dg/vect/pr83857.c
===
--- /dev/null   2018-01-15 18:48:25.844002736 +
+++ gcc/testsuite/gcc.dg/vect/pr83857.c 2018-01-16 13:24:33.021528058 +
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ffast-math" } */
+
+#define N 100
+
+double __attribute__ ((noinline, noclone))
+f (double *x, double y)
+{
+  double a = 0;
+  for (int i = 0; i < N; ++i)
+{
+  a += y;
+  x[i * 2] += a;
+  x[i * 2 + 1] += a;
+}
+  return a - y;
+}
+
+double x[N * 2];
+
+int
+main (void)
+{
+  if (f (x, 5) != (N - 1) * 5)
+__builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" { target vect_double } } } */
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target vect_double } } } */

Re: [PATCH] rtlanal: dead_or_set_regno_p should handle CLOBBER (PR83424)

2018-01-16 Thread Segher Boessenkool

On Mon, Dec 18, 2017 at 12:16:13PM -0700, Jeff Law wrote:
> On 12/16/2017 02:03 PM, Segher Boessenkool wrote:
> > In PR83424 combine's move_deaths puts a REG_DEAD note in the wrong place
> > because dead_or_set_regno_p does not account for CLOBBER insns.  This
> > fixes it.
> > 
> > Bootstrapped and tested on powerpc64-linux {-m32,-m64} and on x86_64-linux.
> > Is this okay for trunk?
> > 
> > 
> > Segher
> > 
> > 
> > 2017-12-16  Segher Boessenkool  
> > 
> > PR rtl-optimization/83424
> > * rtlanal.c (dead_or_set_regno_p): Handle CLOBBER just like SET.
> > 
> > gcc/testsuite/
> > PR rtl-optimization/83424
> > * gcc.dg/pr83424.c: New testcase.
> OK.

Is this okay for backports to 7 and 6, too?


Segher

Re: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread Tamar Christina

Hi Kyrill,

> 
> xgene1 was added a few releases ago, better to use one of the new additions 
> from the above list.
> For example -mtune=cortex-r52.

Thanks, I have updated the patch. I'll wait for an ok from an AArch64 
maintainer and a Docs maintainer.

> 
> With that nit the arm changes look ok to me.
> Thanks for compiling this!
> Kyrill
> 

Cheers,
Tamar

-- 
Index: htdocs/gcc-8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.26
diff -u -r1.26 changes.html
--- htdocs/gcc-8/changes.html	11 Jan 2018 09:31:53 -	1.26
+++ htdocs/gcc-8/changes.html	16 Jan 2018 14:12:57 -
@@ -147,7 +147,51 @@
 
 AArch64
 
-  
+  
+The Armv8.4-A architecture is now supported.  It can be used by
+specifying the -march=armv8.4-a option.
+  
+  
+The Dot Product instructions are now supported as an optional extension to the
+Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The extension can be used by
+specifying the +dotprod architecture extension.  E.g. -march=armv8.2-a+dotprod.
+  
+  
+The Armv8-A +crypto extension has now been split into two extensions for finer grained control:
+
+   +aes which contains the Armv8-A AES cryptographic instructions.
+   +sha2 which contains the Armv8-A SHA2 and SHA1 cryptographic instructions.
+
+Using +crypto will now enable these two extensions.
+  
+  
+New Armv8.4-A FP16 Floating Point Multiplication Variant instructions have been added.  These instructions are
+mandatory in Armv8.4-A but available as an optional extension to Armv8.2-A and Armv8.3-A.  The new extension
+can be used by specifying the +fp16fml architectural extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
+the instructions can be enabled by specifying +fp16.
+  
+  
+New cryptographic instructions have been added as optional extensions to Armv8.2-A and newer.  These instructions can
+be enabled with:
+
+  +sha3 New SHA3 and SHA2 instructions from Armv8.4-A.  This implies +sha2.
+  +sm4 New SM3 and SM4 instructions from Armv8.4-A.
+
+ 
+  
+   Support has been added for the following processors
+   (GCC identifiers in parentheses):
+   
+ Arm Cortex-A75 (cortex-a75).
+	 Arm Cortex-A55 (cortex-a55).
+	 Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE (cortex-a75.cortex-a55).
+   
+   The GCC identifiers can be used
+   as arguments to the -mcpu or -mtune options,
+   for example: -mcpu=cortex-a75 or
+   -mtune=thunderx2t99p1 or as arguments to the equivalent target
+   attributes and pragmas.
+  
 
 
 ARM
@@ -169,14 +213,58 @@
 removed in a future release.
   
   
-The default link behavior for ARMv6 and ARMv7-R targets has been
+The default link behavior for Armv6 and Armv7-R targets has been
 changed to produce BE8 format when generating big-endian images.  A new
 flag -mbe32 can be used to force the linker to produce
 legacy BE32 format images.  There is no change of behavior for
-ARMv6-m and other ARMv7 or later targets: these already defaulted
+Armv6-M and other Armv7 or later targets: these already defaulted
 to BE8 format.  This change brings GCC into alignment with other
 compilers for the ARM architecture.
   
+  
+The Armv8-R architecture is now supported.  It can be used by specifying the
+-march=armv8-r option.
+  
+  
+The Armv8.3-A architecture is now supported.  It can be used by
+specifying the -march=armv8.3-a option.
+  
+  
+The Armv8.4-A architecture is now supported.  It can be used by
+specifying the -march=armv8.4-a option.
+  
+  
+ The Dot Product instructions are now supported as an optional extension to the
+ Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The extension can be used by
+ specifying the +dotprod architecture extension.  E.g. -march=armv8.2-a+dotprod.
+  
+
+  
+Support for setting extensions and architectures using the GCC target pragma and attribute has been added.
+It can be used by specifying #pragma GCC target ("arch=..."), #pragma GCC target ("+extension"),
+__attribute__((target("arch=..."))) or __attribute__((target("+extension"))).
+  
+  
+New Armv8.4-A FP16 Floating Point Multiplication Variant instructions have been added.  These instructions are
+mandatory in Armv8.4-A but available as an optional extension to Armv8.2-A and Armv8.3-A.  The new extension
+can be used by specifying the +fp16fml architectural extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
+the instructions can be enabled by specifying +fp16.
+  
+  
+   Support has been added for the following processors
+   (GCC identifiers in parentheses):
+   
+	 Arm Cortex-A75 (cortex-a75).
+	 Arm Cortex-A55 (cortex-a55).
+	 Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE (cortex-a75.cortex-a55).
+	 Arm Cortex-R52 for Armv8-R (cortex

Re: [PATCH v3, rs6000] Add -mspeculate-indirect-jumps option and implement non-speculating bctr / bctrl

2018-01-16 Thread Bill Schmidt

On Jan 16, 2018, at 6:13 AM, Segher Boessenkool  
wrote:
> 
> Hi!
> 
> On Tue, Jan 16, 2018 at 09:29:13AM +0100, Richard Biener wrote:
>> Did you consider simply removing the tablejump/casesi support so
>> expansion always
>> expands to a balanced tree?  At least if we have any knobs to tune we
>> should probably
>> tweak them away from the indirect jump using variants with
>> -mno-speculate-indirect-jumps,
>> right?
> 
> We can generate indirect jumps for other situations so this patch will
> still be needed.

Also, I'm not convinced that a balanced tree for a large jump table
is a slam dunk better performer than this (adding hundreds of poorly
predictable branches that can clog up hardware predictors for, say,
an interpreter loop).  I'd want to do some performance testing to look
for crossover points (as you say, tuning knobs).  But for smaller tables
this is a good idea.
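
To make the trade-off concrete, here is a minimal sketch of the two
dispatch shapes being compared.  The handler names and the four-entry
table are hypothetical, chosen only for illustration; this is not what
the expander emits, just the same idea written by hand.

```cpp
#include <cstddef>

// Hypothetical handlers standing in for the cases of a switch.
static int op_add (int x) { return x + 1; }
static int op_sub (int x) { return x - 1; }
static int op_dbl (int x) { return x * 2; }
static int op_neg (int x) { return -x; }

// Table form: one bounds check, then a single indirect call.
int
dispatch_table (unsigned op, int x)
{
  static int (*const handlers[]) (int) = { op_add, op_sub, op_dbl, op_neg };
  const std::size_t n = sizeof handlers / sizeof handlers[0];
  return op < n ? handlers[op] (x) : x;
}

// Tree form: only direct calls, but roughly log2(n) conditional
// branches on the way to each handler.
int
dispatch_tree (unsigned op, int x)
{
  if (op < 2)
    return op == 0 ? op_add (x) : op_sub (x);
  if (op < 4)
    return op == 2 ? op_dbl (x) : op_neg (x);
  return x;
}
```

With four cases the tree is cheap; with hundreds of cases the number of
conditional branches per dispatch grows, which is the crossover that
would need measuring.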

Thanks,
Bill
> 
>> Performance optimization, so shouldn't block this patch - I just
>> thought I should probably
>> mention this.
> 
> Yeah let's get this done first :-)
> 
> 
> Segher
>

Re: Move pa.h FUNCTION_ARG_SIZE to pa.c (PR83858)

2018-01-16 Thread John David Anglin


On 2018-01-16 5:52 AM, Richard Sandiford wrote:

2018-01-16  Richard Sandiford

gcc/
PR target/83858
* config/pa/pa.h (FUNCTION_ARG_SIZE): Delete.
* config/pa/pa-protos.h (pa_function_arg_size): Declare.
* config/pa/som.h (ASM_DECLARE_FUNCTION_NAME): Use
pa_function_arg_size instead of FUNCTION_ARG_SIZE.
* config/pa/pa.c (pa_function_arg_advance): Likewise.
(pa_function_arg, pa_arg_partial_bytes): Likewise.
(pa_function_arg_size): New function.

Thanks Richard.  I started a build yesterday evening with essentially
the same change.

Two little nits.  I believe a declaration for pa_function_arg_size needs
to be added to pa-protos.h.  Secondly, the comment for
pa_function_arg_size needs to be updated to say "function" instead of
"macro".  Otherwise, the change is okay.

I want to see if ASM_DECLARE_FUNCTION_NAME can be turned into a function in
pa.c as well. This would allow pa_function_arg_size to be static.

Dave

--
John David Anglin  dave.ang...@bell.net

Re: Move pa.h FUNCTION_ARG_SIZE to pa.c (PR83858)

2018-01-16 Thread Richard Sandiford

John David Anglin  writes:
> On 2018-01-16 5:52 AM, Richard Sandiford wrote:
>> 2018-01-16  Richard Sandiford
>>
>> gcc/
>>  PR target/83858
>>  * config/pa/pa.h (FUNCTION_ARG_SIZE): Delete.
>>  * config/pa/pa-protos.h (pa_function_arg_size): Declare.
>>  * config/pa/som.h (ASM_DECLARE_FUNCTION_NAME): Use
>>  pa_function_arg_size instead of FUNCTION_ARG_SIZE.
>>  * config/pa/pa.c (pa_function_arg_advance): Likewise.
>>  (pa_function_arg, pa_arg_partial_bytes): Likewise.
>>  (pa_function_arg_size): New function.
> Thanks Richard.  I started a build yesterday evening with essentially 
> the same change.
>
> Two little nits.  I believe a declaration for pa_function_arg_size needs 
> to be added to pa-protos.h.

The patch did have this.

> Secondly, the comment for pa_function_arg_size needs to be updated to
> say "function" instead of "macro". Otherwise, the change is okay.

Oops, yes.  Installed with that change, thanks.

Richard

[PATCH] Fix gimplify_one_sizepos (PR libgomp/83590, take 4)

2018-01-16 Thread Jakub Jelinek

Hi!

After lengthy IRC discussions, here is an updated patch, which should also
fix the problem that variably_modified_type_p on a REAL_TYPE returns true
even when it has constant maximum and minimum.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-01-16  Jakub Jelinek  
Richard Biener  

PR libgomp/83590
* gimplify.c (gimplify_one_sizepos): For is_gimple_constant (expr)
return early, inline manually is_gimple_sizepos.  Make sure if we
call gimplify_expr we don't end up with a gimple constant.
* tree.c (variably_modified_type_p): Don't return true for
is_gimple_constant (_t).  Inline manually is_gimple_sizepos.
* gimplify.h (is_gimple_sizepos): Remove.

--- gcc/gimplify.c.jj   2018-01-12 16:38:50.705238254 +0100
+++ gcc/gimplify.c  2018-01-16 12:21:15.895859416 +0100
@@ -12562,7 +12562,10 @@ gimplify_one_sizepos (tree *expr_p, gimp
  a VAR_DECL.  If it's a VAR_DECL from another function, the gimplifier
  will want to replace it with a new variable, but that will cause problems
  if this type is from outside the function.  It's OK to have that here.  */
-  if (is_gimple_sizepos (expr))
+  if (expr == NULL_TREE
+  || is_gimple_constant (expr)
+  || TREE_CODE (expr) == VAR_DECL
+  || CONTAINS_PLACEHOLDER_P (expr))
 return;
 
   *expr_p = unshare_expr (expr);
@@ -12570,6 +12573,12 @@ gimplify_one_sizepos (tree *expr_p, gimp
   /* SSA names in decl/type fields are a bad idea - they'll get reclaimed
  if the def vanishes.  */
   gimplify_expr (expr_p, stmt_p, NULL, is_gimple_val, fb_rvalue, false);
+
+  /* If expr wasn't already is_gimple_sizepos or is_gimple_constant from the
+ FE, ensure that it is a VAR_DECL, otherwise we might handle some decls
+ as gimplify_vla_decl even when they would have all sizes INTEGER_CSTs.  */
+  if (is_gimple_constant (*expr_p))
+*expr_p = get_initialized_tmp_var (*expr_p, stmt_p, NULL, false);
 }
 
 /* Gimplify the body of statements of FNDECL and return a GIMPLE_BIND node
--- gcc/tree.c.jj   2018-01-15 10:01:40.830186474 +0100
+++ gcc/tree.c  2018-01-16 12:24:11.254821615 +0100
@@ -8825,11 +8825,12 @@ variably_modified_type_p (tree type, tre
   do { tree _t = (T);  \
 if (_t != NULL_TREE    \
&& _t != error_mark_node\
-   && TREE_CODE (_t) != INTEGER_CST\
+   && !CONSTANT_CLASS_P (_t)   \
&& TREE_CODE (_t) != PLACEHOLDER_EXPR   \
&& (!fn \
|| (!TYPE_SIZES_GIMPLIFIED (type)   \
-   && !is_gimple_sizepos (_t)) \
+   && (TREE_CODE (_t) != VAR_DECL  \
+   && !CONTAINS_PLACEHOLDER_P (_t)))   \
|| walk_tree (&_t, find_var_from_fn, fn, NULL)))\
   return true;  } while (0)
 
--- gcc/gimplify.h.jj   2018-01-03 10:19:53.757533721 +0100
+++ gcc/gimplify.h  2018-01-16 12:24:51.995812831 +0100
@@ -85,23 +85,4 @@ extern enum gimplify_status gimplify_va_
  gimple_seq *);
 gimple *gimplify_assign (tree, tree, gimple_seq *);
 
-/* Return true if gimplify_one_sizepos doesn't need to gimplify
-   expr (when in TYPE_SIZE{,_UNIT} and similar type/decl size/bitsize
-   fields).  */
-
-static inline bool
-is_gimple_sizepos (tree expr)
-{
-  /* gimplify_one_sizepos doesn't need to do anything if the value isn't there,
- is constant, or contains A PLACEHOLDER_EXPR.  We also don't want to do
- anything if it's already a VAR_DECL.  If it's a VAR_DECL from another
- function, the gimplifier will want to replace it with a new variable,
- but that will cause problems if this type is from outside the function.
- It's OK to have that here.  */
-  return (expr == NULL_TREE
- || TREE_CODE (expr) == INTEGER_CST
- || TREE_CODE (expr) == VAR_DECL
- || CONTAINS_PLACEHOLDER_P (expr));
-}
-
 #endif /* GCC_GIMPLIFY_H */

Jakub

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Rainer Orth

Hi Richard,

>>> Backport is blocked by
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83838
>>>
>>> There are many test failures due to lack of comdat support in linker on
>>> Solaris.

actually this is lack of hidden .gnu.linkonce support right now.
Currently that's disabled for all but gld; I'm looking to make that
dynamic on newer versions of Solaris 11.

>>> I can limit these tests to Linux.
>>
>> These are testcase issues and shouldn't block backport to GCC 7.
>
> It makes the option using thunks unusable though, right?  Can you simply make
> them hidden on systems without comdat support?  That duplicates them per TU
> but at least the feature works.  Or those systems should provide the thunks 
> via
> libgcc.
>
> I agree we can followup with a fix for Solaris given lack of a public
> testing machine.

I do have both an x86 and sparc machine running Solaris 11 around to
serve as testing machines.  Still checking with legal how best to handle
external access, either locally or integrated into the compile farm.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Rainer Orth

Hi Jan,

>> It makes the option using thunks unusable though, right?  Can you simply make
>> them hidden on systems without comdat support?  That duplicates them per TU
>> but at least the feature works.  Or those systems should provide the
>> thunks via
>> libgcc.
>> 
>> I agree we can followup with a fix for Solaris given lack of a public
>> testing machine.
>
> My memory is bit dim, but I am convinced I was fixing specific errors for
> comdats
> on Solaris, so I think the toolchain supports them in some sort, just is more
> restrictive/different from GNU implementation.

comdat does work just fine in Solaris 11, but the Solaris 10 linker has
problems with what gcc generates.

> Indeed, i think just producing sorry, unimplemented message is what we should 
> do
> if we can't support retpoline on given target.

Certainly, coupled with an appropriate effective-target keyword to limit
testcases appropriately.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: Move pa.h FUNCTION_ARG_SIZE to pa.c (PR83858)

2018-01-16 Thread John David Anglin


On 2018-01-16 9:48 AM, Richard Sandiford wrote:

> Oops, yes.  Installed with that change, thanks.

Oops, I just realized the CEIL function needs to be applied to the
GET_MODE_SIZE return as well...
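
For illustration only, a minimal sketch of the rounding in question.
The 4-byte word size and the helper name are assumptions made for the
sketch; the point is just that a byte size has to be rounded up to
whole words, whichever source the byte size comes from.

```cpp
// Round x / y up to the next integer, written the way such a macro is
// usually defined.
#define CEIL(x, y) (((x) + (y) - 1) / (y))

// Hypothetical word size for the sketch; the real value is
// target-specific.
constexpr long kUnitsPerWord = 4;

// Number of words needed to hold SIZE_IN_BYTES bytes.
constexpr long
words_for_size (long size_in_bytes)
{
  return CEIL (size_in_bytes, kUnitsPerWord);
}
```

Without the rounding, a 5-byte value would be counted as one word
instead of two.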

Dave

--
John David Anglin  dave.ang...@bell.net

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Rainer Orth

Hi Richard,

> I'm quite sure Solaris supports comdats, after all it invented ELF ;)

true: gcc/configure.ac has

  # Sun ld has COMDAT group support since Solaris 9, but it doesn't
  # interoperate with GNU as until Solaris 11 build 130, i.e. ld
  # version 1.688.
  #
  # If using Sun as for COMDAT group as emitted by GCC, one needs at
  # least ld version 1.2267.

> I've also seen
> comdats in debugging early LTO issues.  We might run into Solaris as
> issues though.

The Solaris code has been taught to deal with that, so it should
hopefully be hidden from the rest of the compiler.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] i386: More use reference of struct ix86_frame to avoid copy

2018-01-16 Thread Martin Liška

On 01/16/2018 01:35 PM, H.J. Lu wrote:
> On Tue, Jan 16, 2018 at 3:40 AM, H.J. Lu  wrote:
>> This patch has been used with my Spectre backport for GCC 7 for many
>> weeks and has been checked into GCC 7 branch.  Should I revert it on
>> GCC 7 branch or check it into trunk?
> 
> Ada build failed with this on trunk:
> 
> raised STORAGE_ERROR : stack overflow or erroneous memory access
> make[5]: *** [/export/gnu/import/git/sources/gcc/gcc/ada/Make-generated.in:45:
> ada/sinfo.h] Error 1

Hello.

I know that you've already reverted the change, but it's possible to replace
struct ix86_frame &frame = cfun->machine->frame;

with:
struct ix86_frame *frame = &cfun->machine->frame;

And replace usages with the pointer member access operator (->).  That would
also avoid copying.

One other question.  After you switched to references, isn't the behavior
of function ix86_expand_epilogue changed, as it also contains a write to
the frame struct like:

  /* Special care must be taken for the normal return case of a function
     using eh_return: the eax and edx registers are marked as saved, but
     not restored along this path.  Adjust the save location to match.  */
  if (crtl->calls_eh_return && style != 2)
    frame.reg_save_offset -= 2 * UNITS_PER_WORD;
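
A minimal sketch of the concern, using hypothetical stand-in types
rather than the real struct ix86_frame and cfun->machine: with a local
copy the adjustment is private to one call, while with a reference (or
a pointer) the same statement writes through to the shared state and
accumulates across calls.

```cpp
// Hypothetical stand-ins for struct ix86_frame and cfun->machine.
struct frame_info { long reg_save_offset; };
struct machine_info { frame_info frame; };

// Copy style: adjust a local copy; the shared state is untouched.
long
adjust_copy (machine_info &m, long delta)
{
  frame_info frame = m.frame;
  frame.reg_save_offset -= delta;
  return frame.reg_save_offset;
}

// Reference style: the same statement now modifies the shared state,
// so a second call sees the first call's adjustment.
long
adjust_ref (machine_info &m, long delta)
{
  frame_info &frame = m.frame;
  frame.reg_save_offset -= delta;
  return frame.reg_save_offset;
}
```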

Thanks for clarification.
Martin

> 
> Let me revert it on gcc-7-branch.
> 
> H.J.
>> H.J.
>> ---
>> When there is no need to make a copy of ix86_frame, we can use reference
>> of struct ix86_frame to avoid copy.
>>
>> * config/i386/i386.c (ix86_expand_prologue): Use reference of
>> struct ix86_frame.
>> (ix86_expand_epilogue): Likewise.
>> ---
>>  gcc/config/i386/i386.c | 6 ++
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index bfb31db8752..9eba3ffd5d6 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -13385,7 +13385,6 @@ ix86_expand_prologue (void)
>>  {
>>struct machine_function *m = cfun->machine;
>>rtx insn, t;
>> -  struct ix86_frame frame;
>>HOST_WIDE_INT allocate;
>>bool int_registers_saved;
>>bool sse_registers_saved;
>> @@ -13413,7 +13412,7 @@ ix86_expand_prologue (void)
>>m->fs.sp_valid = true;
>>m->fs.sp_realigned = false;
>>
>> -  frame = m->frame;
>> +  struct ix86_frame &frame = cfun->machine->frame;
>>
>>if (!TARGET_64BIT && ix86_function_ms_hook_prologue 
>> (current_function_decl))
>>  {
>> @@ -14291,7 +14290,6 @@ ix86_expand_epilogue (int style)
>>  {
>>struct machine_function *m = cfun->machine;
>>struct machine_frame_state frame_state_save = m->fs;
>> -  struct ix86_frame frame;
>>bool restore_regs_via_mov;
>>bool using_drap;
>>bool restore_stub_is_tail = false;
>> @@ -14304,7 +14302,7 @@ ix86_expand_epilogue (int style)
>>  }
>>
>>ix86_finalize_stack_frame_flags ();
>> -  frame = m->frame;
>> +  struct ix86_frame &frame = cfun->machine->frame;
>>
>>m->fs.sp_realigned = stack_realign_fp;
>>m->fs.sp_valid = stack_realign_fp
>> --
>> 2.14.3
>>
> 
> 
>

Re: [PATCH] Fix gimplify_one_sizepos (PR libgomp/83590, take 4)

2018-01-16 Thread Richard Biener

On Tue, 16 Jan 2018, Jakub Jelinek wrote:

> Hi!
> 
> After lengthy IRC discussions, here is an updated patch, which should also
> fix the problem that variably_modified_type_p on a REAL_TYPE returns true
> even when it has constant maximum and minimum.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.

> 2018-01-16  Jakub Jelinek  
>   Richard Biener  
> 
>   PR libgomp/83590
>   * gimplify.c (gimplify_one_sizepos): For is_gimple_constant (expr)
>   return early, inline manually is_gimple_sizepos.  Make sure if we
>   call gimplify_expr we don't end up with a gimple constant.
>   * tree.c (variably_modified_type_p): Don't return true for
>   is_gimple_constant (_t).  Inline manually is_gimple_sizepos.
>   * gimplify.h (is_gimple_sizepos): Remove.
> 
> --- gcc/gimplify.c.jj 2018-01-12 16:38:50.705238254 +0100
> +++ gcc/gimplify.c2018-01-16 12:21:15.895859416 +0100
> @@ -12562,7 +12562,10 @@ gimplify_one_sizepos (tree *expr_p, gimp
>   a VAR_DECL.  If it's a VAR_DECL from another function, the gimplifier
>   will want to replace it with a new variable, but that will cause 
> problems
>   if this type is from outside the function.  It's OK to have that here.  
> */
> -  if (is_gimple_sizepos (expr))
> +  if (expr == NULL_TREE
> +  || is_gimple_constant (expr)
> +  || TREE_CODE (expr) == VAR_DECL
> +  || CONTAINS_PLACEHOLDER_P (expr))
>  return;
>  
>*expr_p = unshare_expr (expr);
> @@ -12570,6 +12573,12 @@ gimplify_one_sizepos (tree *expr_p, gimp
>/* SSA names in decl/type fields are a bad idea - they'll get reclaimed
>   if the def vanishes.  */
>gimplify_expr (expr_p, stmt_p, NULL, is_gimple_val, fb_rvalue, false);
> +
> +  /* If expr wasn't already is_gimple_sizepos or is_gimple_constant from the
> + FE, ensure that it is a VAR_DECL, otherwise we might handle some decls
> + as gimplify_vla_decl even when they would have all sizes INTEGER_CSTs.  
> */
> +  if (is_gimple_constant (*expr_p))
> +*expr_p = get_initialized_tmp_var (*expr_p, stmt_p, NULL, false);
>  }
>  
>  /* Gimplify the body of statements of FNDECL and return a GIMPLE_BIND node
> --- gcc/tree.c.jj 2018-01-15 10:01:40.830186474 +0100
> +++ gcc/tree.c2018-01-16 12:24:11.254821615 +0100
> @@ -8825,11 +8825,12 @@ variably_modified_type_p (tree type, tre
>    do { tree _t = (T);  \
>  if (_t != NULL_TREE    \
>   && _t != error_mark_node\
> - && TREE_CODE (_t) != INTEGER_CST\
> + && !CONSTANT_CLASS_P (_t)   \
>   && TREE_CODE (_t) != PLACEHOLDER_EXPR   \
>   && (!fn \
>   || (!TYPE_SIZES_GIMPLIFIED (type)   \
> - && !is_gimple_sizepos (_t)) \
> + && (TREE_CODE (_t) != VAR_DECL  \
> + && !CONTAINS_PLACEHOLDER_P (_t)))   \
>   || walk_tree (&_t, find_var_from_fn, fn, NULL)))\
>return true;  } while (0)
>  
> --- gcc/gimplify.h.jj 2018-01-03 10:19:53.757533721 +0100
> +++ gcc/gimplify.h2018-01-16 12:24:51.995812831 +0100
> @@ -85,23 +85,4 @@ extern enum gimplify_status gimplify_va_
> gimple_seq *);
>  gimple *gimplify_assign (tree, tree, gimple_seq *);
>  
> -/* Return true if gimplify_one_sizepos doesn't need to gimplify
> -   expr (when in TYPE_SIZE{,_UNIT} and similar type/decl size/bitsize
> -   fields).  */
> -
> -static inline bool
> -is_gimple_sizepos (tree expr)
> -{
> -  /* gimplify_one_sizepos doesn't need to do anything if the value isn't 
> there,
> - is constant, or contains A PLACEHOLDER_EXPR.  We also don't want to do
> - anything if it's already a VAR_DECL.  If it's a VAR_DECL from another
> - function, the gimplifier will want to replace it with a new variable,
> - but that will cause problems if this type is from outside the function.
> - It's OK to have that here.  */
> -  return (expr == NULL_TREE
> -   || TREE_CODE (expr) == INTEGER_CST
> -   || TREE_CODE (expr) == VAR_DECL
> -   || CONTAINS_PLACEHOLDER_P (expr));
> -}
> -
>  #endif /* GCC_GIMPLIFY_H */
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [C++ PATCH] Fix ICE in member_vec_dedup (PR c++/83825)

2018-01-16 Thread Nathan Sidwell


On 01/15/2018 04:46 PM, Jakub Jelinek wrote:

Hi!

As the testcase shows, calls to member_vec_dedup and qsort are just guarded
by the vector being non-NULL, which doesn't mean it must be non-empty,
so we can't do (*member_vec)[0] on it.  Fixed by the second hunk, the
rest is just a small cleanup to use the vec.h methods.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


Ok.  I'm a little surprised we get this case, but I think we've both found
other strange boundary cases here.  Thanks.
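
For readers following along, the boundary case can be sketched like
this.  It is an illustration with std::vector and int standing in for
the member vector and its elements, and a hypothetical function name;
it is not the code from the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sort and deduplicate *VEC in place; return the number of elements
// kept.  Hypothetical stand-in for the sort-then-dedup sequence.
std::size_t
sort_and_dedup (std::vector<int> *vec)
{
  // Non-null is not enough: an empty vector has no element 0 to read.
  if (!vec || vec->empty ())
    return 0;

  std::sort (vec->begin (), vec->end ());
  int prev = (*vec)[0];   // Safe only because of the check above.
  std::size_t kept = 1;
  for (std::size_t i = 1; i < vec->size (); ++i)
    if ((*vec)[i] != prev)
      {
        prev = (*vec)[i];
        (*vec)[kept++] = prev;
      }
  vec->resize (kept);
  return kept;
}
```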


nathan
--
Nathan Sidwell

Re: Two fixes for live-out SLP inductions (PR 83857)

2018-01-16 Thread Richard Biener

On Tue, Jan 16, 2018 at 2:29 PM, Richard Sandiford
 wrote:
> vect_analyze_loop_operations was calling vectorizable_live_operation
> for all live-out phis, which led to a bogus ncopies calculation in
> the pure SLP case.  I think v_a_l_o should only be passing phis
> that are vectorised using normal loop vectorisation, since
> vect_slp_analyze_node_operations handles the SLP side (and knows
> the correct slp_index and slp_node arguments to pass in, via
> vect_analyze_stmt).
>
> With that fixed we hit an older bug that vectorizable_live_operation
> didn't handle live-out SLP inductions.  Fixed by using gimple_phi_result
> rather than gimple_get_lhs for phis.
>
> Tested on aarch64-linux-gnu.  OK to install?

Ok.

Richard.

> Richard
>
>
> 2018-01-16  Richard Sandiford  
>
> gcc/
> PR tree-optimization/83857
> * tree-vect-loop.c (vect_analyze_loop_operations): Don't call
> vectorizable_live_operation for pure SLP statements.
> (vectorizable_live_operation): Handle PHIs.
>
> gcc/testsuite/
> PR tree-optimization/83857
> * gcc.dg/vect/pr83857.c: New test.
>
> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c2018-01-13 18:02:00.950360196 +
> +++ gcc/tree-vect-loop.c2018-01-16 13:24:33.022528019 +
> @@ -1851,7 +1851,10 @@ vect_analyze_loop_operations (loop_vec_i
> ok = vectorizable_reduction (phi, NULL, NULL, NULL, NULL);
>  }
>
> - if (ok && STMT_VINFO_LIVE_P (stmt_info))
> + /* SLP PHIs are tested by vect_slp_analyze_node_operations.  */
> + if (ok
> + && STMT_VINFO_LIVE_P (stmt_info)
> + && !PURE_SLP_STMT (stmt_info))
> ok = vectorizable_live_operation (phi, NULL, NULL, -1, NULL);
>
>if (!ok)
> @@ -8217,7 +8220,11 @@ vectorizable_live_operation (gimple *stm
>gcc_assert (!LOOP_VINFO_FULLY_MASKED_P (loop_vinfo));
>
>/* Get the correct slp vectorized stmt.  */
> -  vec_lhs = gimple_get_lhs (SLP_TREE_VEC_STMTS (slp_node)[vec_entry]);
> +  gimple *vec_stmt = SLP_TREE_VEC_STMTS (slp_node)[vec_entry];
> +  if (gphi *phi = dyn_cast <gphi *> (vec_stmt))
> +   vec_lhs = gimple_phi_result (phi);
> +  else
> +   vec_lhs = gimple_get_lhs (vec_stmt);
>
>/* Get entry to use.  */
>bitstart = bitsize_int (vec_index);
> Index: gcc/testsuite/gcc.dg/vect/pr83857.c
> ===
> --- /dev/null   2018-01-15 18:48:25.844002736 +
> +++ gcc/testsuite/gcc.dg/vect/pr83857.c 2018-01-16 13:24:33.021528058 +
> @@ -0,0 +1,30 @@
> +/* { dg-do run } */
> +/* { dg-additional-options "-ffast-math" } */
> +
> +#define N 100
> +
> +double __attribute__ ((noinline, noclone))
> +f (double *x, double y)
> +{
> +  double a = 0;
> +  for (int i = 0; i < N; ++i)
> +{
> +  a += y;
> +  x[i * 2] += a;
> +  x[i * 2 + 1] += a;
> +}
> +  return a - y;
> +}
> +
> +double x[N * 2];
> +
> +int
> +main (void)
> +{
> +  if (f (x, 5) != (N - 1) * 5)
> +__builtin_abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" { 
> target vect_double } } } */
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target 
> vect_double } } } */

[PATCH] Fix PR83867

2018-01-16 Thread Richard Biener


Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-01-16  Richard Biener  

PR tree-optimization/83867
* tree-vect-stmts.c (vect_transform_stmt): Precompute
nested_in_vect_loop_p since the scalar stmt may get invalidated.

* gcc.dg/vect/pr83867.c: New testcase.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 256722)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -9426,6 +9426,11 @@ vect_transform_stmt (gimple *stmt, gimpl
   gcc_assert (slp_node || !PURE_SLP_STMT (stmt_info));
   gimple *old_vec_stmt = STMT_VINFO_VEC_STMT (stmt_info);
 
+  bool nested_p = (STMT_VINFO_LOOP_VINFO (stmt_info)
+  && nested_in_vect_loop_p
+   (LOOP_VINFO_LOOP (STMT_VINFO_LOOP_VINFO (stmt_info)),
+stmt));
+
   switch (STMT_VINFO_TYPE (stmt_info))
 {
 case type_demotion_vec_info_type:
@@ -9525,9 +9530,7 @@ vect_transform_stmt (gimple *stmt, gimpl
   /* Handle inner-loop stmts whose DEF is used in the loop-nest that
  is being vectorized, but outside the immediately enclosing loop.  */
   if (vec_stmt
-  && STMT_VINFO_LOOP_VINFO (stmt_info)
-  && nested_in_vect_loop_p (LOOP_VINFO_LOOP (
-STMT_VINFO_LOOP_VINFO (stmt_info)), stmt)
+  && nested_p
   && STMT_VINFO_TYPE (stmt_info) != reduc_vec_info_type
   && (STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_outer
   || STMT_VINFO_RELEVANT (stmt_info) ==
Index: gcc/testsuite/gcc.dg/vect/pr83867.c
===
--- gcc/testsuite/gcc.dg/vect/pr83867.c (nonexistent)
+++ gcc/testsuite/gcc.dg/vect/pr83867.c (working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O -ftrapv" } */
+
+int
+k5 (int u5, int aw)
+{
+  int v6;
+
+  while (u5 < 1)
+{
+  while (v6 < 4)
+   ++v6;
+
+  v6 = 0;
+  aw += u5 > 0;
+  ++u5;
+}
+
+  return aw;
+}

Re: [PATCH] rtlanal: dead_or_set_regno_p should handle CLOBBER (PR83424)

2018-01-16 Thread Jeff Law

On 01/16/2018 06:41 AM, Segher Boessenkool wrote:
> On Mon, Dec 18, 2017 at 12:16:13PM -0700, Jeff Law wrote:
>> On 12/16/2017 02:03 PM, Segher Boessenkool wrote:
>>> In PR83424 combine's move_deaths puts a REG_DEAD note in the wrong place
>>> because dead_or_set_regno_p does not account for CLOBBER insns.  This
>>> fixes it.
>>>
>>> Bootstrapped and tested on powerpc64-linux {-m32,-m64} and on x86_64-linux.
>>> Is this okay for trunk?
>>>
>>>
>>> Segher
>>>
>>>
>>> 2017-12-16  Segher Boessenkool  
>>>
>>> PR rtl-optimization/83424
>>> * rtlanal.c (dead_or_set_regno_p): Handle CLOBBER just like SET.
>>>
>>> gcc/testsuite/
>>> PR rtl-optimization/83424
>>> * gcc.dg/pr83424.c: New testsuite.
>> OK.
> 
> Is this okay for backports to 7 and 6, too?
Yes.
jeff

Re: [PATCH v2] Change default to -fno-math-errno

2018-01-16 Thread Wilco Dijkstra

Joseph Myers wrote:

> Another question to consider: what about configurations (mostly 
> soft-float) where floating-point exceptions are not supported?  (glibc 
> wrongly defines math_errhandling to include MATH_ERREXCEPT there, but the 
> only option actually permitted by C99 in that case would be to define it 
> to MATH_ERRNO.)
> 
> If we wish to distinguish that case, the 
> targetm.float_exceptions_rounding_supported_p hook is the one to use (in 
> the absence of anyone identifying a target that supports exceptions but 
> not rounding modes) - possibly together with flag_iso.

I looked into this and the issue is that calling targetm functions is not
possible until the backend is fully initialized (whether the pattern exists
or not is not sufficient, the pattern condition must be valid to evaluate as
well), and that happens after option parsing.

In general soft-float is used on tiny targets which don't use errno at all
(as in remove all the code dealing with it, including the errno variable
itself!), so I believe it's best to let people explicitly enable -fmath-errno
in the rare case when they really want to.

>> lroundf in GLIBC doesn't set errno, so all the inefficiency was for nothing:
>
> (glibc bug 6797.)

I see, that explains it! A decade old bug - it shows the popularity of errno...

Wilco

Re: [PATCH] i386: More use reference of struct ix86_frame to avoid copy

2018-01-16 Thread H.J. Lu

On Tue, Jan 16, 2018 at 7:03 AM, Martin Liška  wrote:
> On 01/16/2018 01:35 PM, H.J. Lu wrote:
>> On Tue, Jan 16, 2018 at 3:40 AM, H.J. Lu  wrote:
>>> This patch has been used with my Spectre backport for GCC 7 for many
>>> weeks and has been checked into GCC 7 branch.  Should I revert it on
>>> GCC 7 branch or check it into trunk?
>>
>> Ada build failed with this on trunk:
>>
>> raised STORAGE_ERROR : stack overflow or erroneous memory access
>> make[5]: *** 
>> [/export/gnu/import/git/sources/gcc/gcc/ada/Make-generated.in:45:
>> ada/sinfo.h] Error 1
>
> Hello.
>
> I know that you've already reverted the change, but it's possible to replace
> struct ix86_frame &frame = cfun->machine->frame;
>
> with:
> struct ix86_frame *frame = &cfun->machine->frame;
>
> And replace usages with the pointer member access operator (->).  That would
> also avoid copying.

Won't it be equivalent to reference?

> One other question.  After you switched to references, isn't the behavior
> of function ix86_expand_epilogue changed, as it also contains a write to
> the frame struct like:
>
>   /* Special care must be taken for the normal return case of a function
>      using eh_return: the eax and edx registers are marked as saved, but
>      not restored along this path.  Adjust the save location to match.  */
>   if (crtl->calls_eh_return && style != 2)
>     frame.reg_save_offset -= 2 * UNITS_PER_WORD;

That could be the issue.  I will double check it.

Thanks.

H.J.
> Thanks for clarification.
> Martin
>
>>
>> Let me revert it on gcc-7-branch.
>>
>> H.J.
>>> H.J.
>>> ---
>>> When there is no need to make a copy of ix86_frame, we can use reference
>>> of struct ix86_frame to avoid copy.
>>>
>>> * config/i386/i386.c (ix86_expand_prologue): Use reference of
>>> struct ix86_frame.
>>> (ix86_expand_epilogue): Likewise.
>>> ---
>>>  gcc/config/i386/i386.c | 6 ++
>>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>> index bfb31db8752..9eba3ffd5d6 100644
>>> --- a/gcc/config/i386/i386.c
>>> +++ b/gcc/config/i386/i386.c
>>> @@ -13385,7 +13385,6 @@ ix86_expand_prologue (void)
>>>  {
>>>struct machine_function *m = cfun->machine;
>>>rtx insn, t;
>>> -  struct ix86_frame frame;
>>>HOST_WIDE_INT allocate;
>>>bool int_registers_saved;
>>>bool sse_registers_saved;
>>> @@ -13413,7 +13412,7 @@ ix86_expand_prologue (void)
>>>m->fs.sp_valid = true;
>>>m->fs.sp_realigned = false;
>>>
>>> -  frame = m->frame;
>>> +  struct ix86_frame &frame = cfun->machine->frame;
>>>
>>>if (!TARGET_64BIT && ix86_function_ms_hook_prologue 
>>> (current_function_decl))
>>>  {
>>> @@ -14291,7 +14290,6 @@ ix86_expand_epilogue (int style)
>>>  {
>>>struct machine_function *m = cfun->machine;
>>>struct machine_frame_state frame_state_save = m->fs;
>>> -  struct ix86_frame frame;
>>>bool restore_regs_via_mov;
>>>bool using_drap;
>>>bool restore_stub_is_tail = false;
>>> @@ -14304,7 +14302,7 @@ ix86_expand_epilogue (int style)
>>>  }
>>>
>>>ix86_finalize_stack_frame_flags ();
>>> -  frame = m->frame;
>>> +  struct ix86_frame &frame = cfun->machine->frame;
>>>
>>>m->fs.sp_realigned = stack_realign_fp;
>>>m->fs.sp_valid = stack_realign_fp
>>> --
>>> 2.14.3
>>>
>>
>>
>>
>



-- 
H.J.

VIEW_CONVERT_EXPR slots for strict-align targets (PR 83884)

2018-01-16 Thread Richard Sandiford

This PR is about a case in which we VIEW_CONVERT a variable-sized
unaligned record:

 
unit-size 
align:8 ...>

to an aligned 32-bit integer.  The strict-alignment handling of
this case creates an aligned temporary slot, moves the operand
into the slot in the operand's original mode, then accesses the
slot in the more-aligned result mode.

Previously the size of the temporary slot was calculated using:

  HOST_WIDE_INT temp_size
= MAX (int_size_in_bytes (inner_type),
   (HOST_WIDE_INT) GET_MODE_SIZE (mode));

int_size_in_bytes would return -1 for the variable-length type,
so we'd use the size of the result mode for the slot.  r256152 replaced
int_size_in_bytes with tree_to_poly_uint64, which triggered an ICE.

I'd assumed that variable-length types couldn't occur here, since it
seems strange to view-convert a variable-length type to a fixed-length
one.  It also seemed strange that (with the old code) we'd ignore the
size of the operand if it was a variable V but honour it if it was a
constant C, even though it's presumably possible for V to equal that
C at runtime.

If op0 has BLKmode we do a block copy of GET_MODE_SIZE (mode) bytes
and then convert the slot to "mode":

  poly_uint64 mode_size = GET_MODE_SIZE (mode);
  ...
  if (GET_MODE (op0) == BLKmode)
{
  rtx size_rtx = gen_int_mode (mode_size, Pmode);
  emit_block_move (new_with_op0_mode, op0, size_rtx,
   (modifier == EXPAND_STACK_PARM
? BLOCK_OP_CALL_PARM
: BLOCK_OP_NORMAL));
}
  else
...

  op0 = new_rtx;
}
}

  op0 = adjust_address (op0, mode, 0);

so I think in that case just the size of "mode" is enough, even if op0
is a fixed-size type.  For non-BLKmode op0 we first move in op0's mode
and then convert the slot to "mode":

emit_move_insn (new_with_op0_mode, op0);

  op0 = new_rtx;
}
}

  op0 = adjust_address (op0, mode, 0);

so I think we want the maximum of the two mode sizes in that case
(assuming they can be different sizes).
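
The sizing rule described above can be sketched outside GCC roughly as follows.  This is only a minimal model, not the expr.c code: ModeInfo and temp_slot_size are hypothetical names, plain 64-bit integers stand in for poly_uint64, and std::max stands in for upper_bound.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for a machine mode.  BLKmode has no fixed size.
struct ModeInfo {
  bool is_blk;         // true for BLKmode
  std::uint64_t size;  // GET_MODE_SIZE equivalent; ignored when is_blk
};

// Size of the aligned temporary slot for a view-convert whose operand has
// mode OP0 and whose result has mode RESULT.
std::uint64_t temp_slot_size(const ModeInfo &op0, const ModeInfo &result) {
  // A BLKmode operand is block-copied for exactly the result size.
  std::uint64_t size = result.size;
  // Otherwise the operand is first stored in its own mode, so the slot
  // must be able to hold whichever of the two modes is larger.
  if (!op0.is_blk)
    size = std::max(size, op0.size);
  return size;
}
```

With this model a BLKmode operand converted to a 4-byte mode gets a 4-byte slot, while an 8-byte operand converted to a 4-byte mode gets an 8-byte slot.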

But is this VIEW_CONVERT_EXPR really valid?  Maybe this is just
papering over a deeper issue.  There again, the MAX in the old
code was presumably there because the sizes can be different...

Richard


2018-01-16  Richard Sandiford  

gcc/
PR middle-end/83884
* expr.c (expand_expr_real_1): Use the size of GET_MODE (op0)
rather than the size of inner_type to determine the stack slot size
when handling VIEW_CONVERT_EXPRs on strict-alignment targets.

Index: gcc/expr.c
===
--- gcc/expr.c  2018-01-14 08:42:44.497155977 +
+++ gcc/expr.c  2018-01-16 16:07:22.737883774 +
@@ -11145,11 +11145,11 @@ expand_expr_real_1 (tree exp, rtx target
}
  else if (STRICT_ALIGNMENT)
{
- tree inner_type = TREE_TYPE (treeop0);
  poly_uint64 mode_size = GET_MODE_SIZE (mode);
- poly_uint64 op0_size
-   = tree_to_poly_uint64 (TYPE_SIZE_UNIT (inner_type));
- poly_int64 temp_size = upper_bound (op0_size, mode_size);
+ poly_uint64 temp_size = mode_size;
+ if (GET_MODE (op0) != BLKmode)
+   temp_size = upper_bound (temp_size,
+GET_MODE_SIZE (GET_MODE (op0)));
  rtx new_rtx
= assign_stack_temp_for_type (mode, temp_size, type);
  rtx new_with_op0_mode

[PATCH v2][AArch64] Remove remaining uses of * in patterns

2018-01-16 Thread Wilco Dijkstra

v2: Rebased after the big SVE commits

Remove the remaining uses of '*' from aarch64.md.
Using '*' in alternatives is typically incorrect as it tells the register
allocator to ignore those alternatives.  Also add a missing '?' so we
prefer a floating point register for same-size int<->fp conversions.

Passes regress & bootstrap, OK for commit?

ChangeLog:
2018-01-16  Wilco Dijkstra  

* config/aarch64/aarch64.md (mov): Remove '*' in alternatives.
(movsi_aarch64): Likewise.
(load_pairsi): Likewise.
(load_pairdi): Likewise.
(store_pairsi): Likewise.
(store_pairdi): Likewise.
(load_pairsf): Likewise.
(load_pairdf): Likewise.
(store_pairsf): Likewise.
(store_pairdf): Likewise.
(zero_extend): Likewise.
(fcvt_target): Add '?' to prefer w over r.

gcc/testsuite/
* gcc.target/aarch64/vfp-1.c: Update test.

--
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index e52e8350a203b288208c1acb12c8b881d5e8039a..088ed8cb0aad0be08a7e19064708ea14499230f2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -907,8 +907,8 @@ (define_expand "mov"
 )
 
 (define_insn "*mov_aarch64"
-  [(set (match_operand:SHORT 0 "nonimmediate_operand" "=r,r,   *w,r ,r,*w, m, m, r,*w,*w")
-   (match_operand:SHORT 1 "aarch64_mov_operand"  " r,M,D,Usv,m, m,rZ,*w,*w, r,*w"))]
+  [(set (match_operand:SHORT 0 "nonimmediate_operand" "=r,r, w,r ,r,w, m,m,r,w,w")
+   (match_operand:SHORT 1 "aarch64_mov_operand"  " r,M,D,Usv,m,m,rZ,w,w,r,w"))]
   "(register_operand (operands[0], mode)
 || aarch64_reg_or_zero (operands[1], mode))"
 {
@@ -974,7 +974,7 @@ (define_expand "mov"
 
 (define_insn_and_split "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m,  r,  r, w,r,w, w")
-   (match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,Usv,m,m,rZ,*w,Usa,Ush,rZ,w,w,Ds"))]
+   (match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,Usv,m,m,rZ,w,Usa,Ush,rZ,w,w,Ds"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@
@@ -1281,9 +1281,9 @@ (define_expand "movmemdi"
 ;; Operands 1 and 3 are tied together by the final condition; so we allow
 ;; fairly lax checking on the second memory operation.
 (define_insn "load_pairsi"
-  [(set (match_operand:SI 0 "register_operand" "=r,*w")
+  [(set (match_operand:SI 0 "register_operand" "=r,w")
(match_operand:SI 1 "aarch64_mem_pair_operand" "Ump,Ump"))
-   (set (match_operand:SI 2 "register_operand" "=r,*w")
+   (set (match_operand:SI 2 "register_operand" "=r,w")
(match_operand:SI 3 "memory_operand" "m,m"))]
   "rtx_equal_p (XEXP (operands[3], 0),
plus_constant (Pmode,
@@ -1297,9 +1297,9 @@ (define_insn "load_pairsi"
 )
 
 (define_insn "load_pairdi"
-  [(set (match_operand:DI 0 "register_operand" "=r,*w")
+  [(set (match_operand:DI 0 "register_operand" "=r,w")
(match_operand:DI 1 "aarch64_mem_pair_operand" "Ump,Ump"))
-   (set (match_operand:DI 2 "register_operand" "=r,*w")
+   (set (match_operand:DI 2 "register_operand" "=r,w")
(match_operand:DI 3 "memory_operand" "m,m"))]
   "rtx_equal_p (XEXP (operands[3], 0),
plus_constant (Pmode,
@@ -1317,9 +1317,9 @@ (define_insn "load_pairdi"
 ;; fairly lax checking on the second memory operation.
 (define_insn "store_pairsi"
   [(set (match_operand:SI 0 "aarch64_mem_pair_operand" "=Ump,Ump")
-   (match_operand:SI 1 "aarch64_reg_or_zero" "rZ,*w"))
+   (match_operand:SI 1 "aarch64_reg_or_zero" "rZ,w"))
(set (match_operand:SI 2 "memory_operand" "=m,m")
-   (match_operand:SI 3 "aarch64_reg_or_zero" "rZ,*w"))]
+   (match_operand:SI 3 "aarch64_reg_or_zero" "rZ,w"))]
   "rtx_equal_p (XEXP (operands[2], 0),
plus_constant (Pmode,
   XEXP (operands[0], 0),
@@ -1333,9 +1333,9 @@ (define_insn "store_pairsi"
 
 (define_insn "store_pairdi"
   [(set (match_operand:DI 0 "aarch64_mem_pair_operand" "=Ump,Ump")
-   (match_operand:DI 1 "aarch64_reg_or_zero" "rZ,*w"))
+   (match_operand:DI 1 "aarch64_reg_or_zero" "rZ,w"))
(set (match_operand:DI 2 "memory_operand" "=m,m")
-   (match_operand:DI 3 "aarch64_reg_or_zero" "rZ,*w"))]
+   (match_operand:DI 3 "aarch64_reg_or_zero" "rZ,w"))]
   "rtx_equal_p (XEXP (operands[2], 0),
plus_constant (Pmode,
   XEXP (operands[0], 0),
@@ -1350,9 +1350,9 @@ (define_insn "store_pairdi"
 ;; Operands 1 and 3 are tied together by the final condition; so we allow
 ;; fairly lax checking on the second memory operation.
 (define_insn "load_pairsf"
-  [(set (match_operand:SF 0 "register_operand" "=w,*r")
+  [(set (match_operand:SF 0 "register_operand" "=w,r")
(match_operand:SF 1 "aarch64_mem_pair_operand" "Ump,Ump"))
-   (set (match_operand:SF 2 "register_operand" "=w,*r")
+   (set (match_operand:SF 2 "reg

Re: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread James Greenhalgh

On Tue, Jan 16, 2018 at 02:21:30PM +, Tamar Christina wrote:
> Hi Kyrill,
> 
> > 
> > xgene1 was added a few releases ago, better to use one of the new additions 
> > from the above list.
> > For example -mtune=cortex-r52.
> 
> Thanks, I have updated the patch. I'll wait for an ok from an AArch64 
> maintainer and a Docs maintainer.

OK. But you have the same issue in the AArch64 part.

James

> Index: htdocs/gcc-8/changes.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
> retrieving revision 1.26
> diff -u -r1.26 changes.html
> --- htdocs/gcc-8/changes.html 11 Jan 2018 09:31:53 -  1.26
> +++ htdocs/gcc-8/changes.html 16 Jan 2018 14:12:57 -
> @@ -147,7 +147,51 @@
>  
>  AArch64
>  
> -  
> +  
> +The Armv8.4-A architecture is now supported.  It can be used by
> +specifying the -march=armv8.4-a option.
> +  
> +  
> +The Dot Product instructions are now supported as an optional extension 
> to the
> +Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The 
> extension can be used by
> +specifying the +dotprod architecture extension.  E.g. 
> -march=armv8.2-a+dotprod.
> +  
> +  
> +The Armv8-A +crypto extension has now been split into two 
> extensions for finer grained control:
> +
> +   +aes which contains the Armv8-A AES cryptographic 
> instructions.
> +   +sha2 which contains the Armv8-A SHA2 and SHA1 
> cryptographic instructions.
> +
> +Using +crypto will now enable these two extensions.
> +  
> +  
> +New Armv8.4-A FP16 Floating Point Multiplication Variant instructions 
> have been added.  These instructions are
> +mandatory in Armv8.4-A but available as an optional extension to 
> Armv8.2-A and Armv8.3-A.  The new extension
> +can be used by specifying the +fp16fml architectural 
> extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
> +the instructions can be enabled by specifying +fp16.
> +  
> +  
> +New cryptographic instructions have been added as optional extensions to 
> Armv8.2-A and newer.  These instructions can
> +be enabled with:
> +
> +  +sha3 New SHA3 and SHA2 instructions from Armv8.4-A.  
> This implies +sha2.
> +  +sm4 New SM3 and SM4 instructions from Armv8.4-A.
> +
> + 
> +  
> +   Support has been added for the following processors
> +   (GCC identifiers in parentheses):
> +   
> + Arm Cortex-A75 (cortex-a75).
> +  Arm Cortex-A55 (cortex-a55).
> +  Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE 
> (cortex-a75.cortex-a55).
> +   
> +   The GCC identifiers can be used
> +   as arguments to the -mcpu or -mtune options,
> +   for example: -mcpu=cortex-a75 or
> +   -mtune=thunderx2t99p1 or as arguments to the equivalent 
> target
> +   attributes and pragmas.
> +  
>  
>  
>  ARM
> @@ -169,14 +213,58 @@
>  removed in a future release.
>
>
> -The default link behavior for ARMv6 and ARMv7-R targets has been
> +The default link behavior for Armv6 and Armv7-R targets has been
>  changed to produce BE8 format when generating big-endian images.  A new
>  flag -mbe32 can be used to force the linker to produce
>  legacy BE32 format images.  There is no change of behavior for
> -ARMv6-m and other ARMv7 or later targets: these already defaulted
> +Armv6-M and other Armv7 or later targets: these already defaulted
>  to BE8 format.  This change brings GCC into alignment with other
>  compilers for the ARM architecture.
>
> +  
> +The Armv8-R architecture is now supported.  It can be used by specifying 
> the
> +-march=armv8-r option.
> +  
> +  
> +The Armv8.3-A architecture is now supported.  It can be used by
> +specifying the -march=armv8.3-a option.
> +  
> +  
> +The Armv8.4-A architecture is now supported.  It can be used by
> +specifying the -march=armv8.4-a option.
> +  
> +  
> + The Dot Product instructions are now supported as an optional extension 
> to the
> + Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The 
> extension can be used by
> + specifying the +dotprod architecture extension.  E.g. 
> -march=armv8.2-a+dotprod.
> +  
> +
> +  
> +Support for setting extensions and architectures using the GCC target 
> pragma and attribute has been added.
> +It can be used by specifying #pragma GCC target 
> ("arch=..."), #pragma GCC target ("+extension"),
> +__attribute__((target("arch=..."))) or 
> __attribute__((target("+extension"))).
> +  
> +  
> +New Armv8.4-A FP16 Floating Point Multiplication Variant instructions 
> have been added.  These instructions are
> +mandatory in Armv8.4-A but available as an optional extension to 
> Armv8.2-A and Armv8.3-A.  The new extension
> +can be used by specifying the +fp16fml architectural 
> extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
> +the instructions can be ena

Re: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread Tamar Christina

The 01/16/2018 16:36, James Greenhalgh wrote:
> On Tue, Jan 16, 2018 at 02:21:30PM +, Tamar Christina wrote:
> > Hi Kyrill,
> > 
> > > 
> > > xgene1 was added a few releases ago, better to use one of the new 
> > > additions from the above list.
> > > For example -mtune=cortex-r52.
> > 
> > Thanks, I have updated the patch. I'll wait for an ok from an AArch64 
> > maintainer and a Docs maintainer.
> 
> OK. But you have the same issue in the AArch64 part.

Thanks, I've updated the patch.  I'll wait a bit for a doc reviewer; if I
don't hear anything, I'll assume the patch is OK.

Thanks,
Tamar
> 
> James
> 
> > Index: htdocs/gcc-8/changes.html
> > ===
> > RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
> > retrieving revision 1.26
> > diff -u -r1.26 changes.html
> > --- htdocs/gcc-8/changes.html   11 Jan 2018 09:31:53 -  1.26
> > +++ htdocs/gcc-8/changes.html   16 Jan 2018 14:12:57 -
> > @@ -147,7 +147,51 @@
> >  
> >  AArch64
> >  
> > -  
> > +  
> > +The Armv8.4-A architecture is now supported.  It can be used by
> > +specifying the -march=armv8.4-a option.
> > +  
> > +  
> > +The Dot Product instructions are now supported as an optional 
> > extension to the
> > +Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The 
> > extension can be used by
> > +specifying the +dotprod architecture extension.  E.g. 
> > -march=armv8.2-a+dotprod.
> > +  
> > +  
> > +The Armv8-A +crypto extension has now been split into two 
> > extensions for finer grained control:
> > +
> > +   +aes which contains the Armv8-A AES cryptographic 
> > instructions.
> > +   +sha2 which contains the Armv8-A SHA2 and SHA1 
> > cryptographic instructions.
> > +
> > +Using +crypto will now enable these two extensions.
> > +  
> > +  
> > +New Armv8.4-A FP16 Floating Point Multiplication Variant instructions 
> > have been added.  These instructions are
> > +mandatory in Armv8.4-A but available as an optional extension to 
> > Armv8.2-A and Armv8.3-A.  The new extension
> > +can be used by specifying the +fp16fml architectural 
> > extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
> > +the instructions can be enabled by specifying +fp16.
> > +  
> > +  
> > +New cryptographic instructions have been added as optional extensions 
> > to Armv8.2-A and newer.  These instructions can
> > +be enabled with:
> > +
> > +  +sha3 New SHA3 and SHA2 instructions from 
> > Armv8.4-A.  This implies +sha2.
> > +  +sm4 New SM3 and SM4 instructions from Armv8.4-A.
> > +
> > + 
> > +  
> > +   Support has been added for the following processors
> > +   (GCC identifiers in parentheses):
> > +   
> > + Arm Cortex-A75 (cortex-a75).
> > +Arm Cortex-A55 (cortex-a55).
> > +Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE 
> > (cortex-a75.cortex-a55).
> > +   
> > +   The GCC identifiers can be used
> > +   as arguments to the -mcpu or -mtune 
> > options,
> > +   for example: -mcpu=cortex-a75 or
> > +   -mtune=thunderx2t99p1 or as arguments to the 
> > equivalent target
> > +   attributes and pragmas.
> > +  
> >  
> >  
> >  ARM
> > @@ -169,14 +213,58 @@
> >  removed in a future release.
> >
> >
> > -The default link behavior for ARMv6 and ARMv7-R targets has been
> > +The default link behavior for Armv6 and Armv7-R targets has been
> >  changed to produce BE8 format when generating big-endian images.  A new
> >  flag -mbe32 can be used to force the linker to produce
> >  legacy BE32 format images.  There is no change of behavior for
> > -ARMv6-m and other ARMv7 or later targets: these already defaulted
> > +Armv6-M and other Armv7 or later targets: these already defaulted
> >  to BE8 format.  This change brings GCC into alignment with other
> >  compilers for the ARM architecture.
> >
> > +  
> > +The Armv8-R architecture is now supported.  It can be used by 
> > specifying the
> > +-march=armv8-r option.
> > +  
> > +  
> > +The Armv8.3-A architecture is now supported.  It can be used by
> > +specifying the -march=armv8.3-a option.
> > +  
> > +  
> > +The Armv8.4-A architecture is now supported.  It can be used by
> > +specifying the -march=armv8.4-a option.
> > +  
> > +  
> > + The Dot Product instructions are now supported as an optional 
> > extension to the
> > + Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The 
> > extension can be used by
> > + specifying the +dotprod architecture extension.  E.g. 
> > -march=armv8.2-a+dotprod.
> > +  
> > +
> > +  
> > +Support for setting extensions and architectures using the GCC target 
> > pragma and attribute has been added.
> > +It can be used by specifying #pragma GCC target 
> > ("arch=..."), #pragma GCC target ("+extension"),
> > +__attribute__((target

[PATCH 2/2] GCC 6: i386: Use reference of struct ix86_frame to avoid copy

2018-01-16 Thread H.J. Lu

From: hjl 

When there is no need to make a copy of ix86_frame, we can use a reference
to struct ix86_frame to avoid the copy.
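
The difference between the two forms can be sketched with a small self-contained example.  Frame and MachineFunction below are hypothetical, much-reduced stand-ins for ix86_frame and machine_function, and the word size is passed in as a parameter instead of using UNITS_PER_WORD; this is an illustration, not the i386.c code.

```cpp
#include <cstdint>

// Hypothetical reduced stand-in for struct ix86_frame.
struct Frame {
  std::int64_t stack_pointer_offset;
  int nregs;
  int nsseregs;
};

// Hypothetical reduced stand-in for struct machine_function.
struct MachineFunction {
  Frame frame;
};

// Copying form: the whole struct is duplicated into a local on every call.
bool can_use_return_copy(const MachineFunction &m, std::int64_t word) {
  Frame frame = m.frame;
  return frame.stack_pointer_offset == word
         && (frame.nregs + frame.nsseregs) == 0;
}

// Reference form: the same fields are read in place and nothing is copied.
bool can_use_return_ref(const MachineFunction &m, std::int64_t word) {
  const Frame &frame = m.frame;
  return frame.stack_pointer_offset == word
         && (frame.nregs + frame.nsseregs) == 0;
}
```

Both functions return the same answer; the reference form only drops the struct copy, which is why it is safe wherever the local copy was never modified.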

Backport from mainline
2017-11-06  H.J. Lu  

* config/i386/i386.c (ix86_can_use_return_insn_p): Use reference
of struct ix86_frame.
(ix86_initial_elimination_offset): Likewise.
(ix86_expand_split_stack_prologue): Likewise.
---
 gcc/config/i386/i386.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a1ff32b648b..13ebf107e90 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10887,7 +10887,6 @@ symbolic_reference_mentioned_p (rtx op)
 bool
 ix86_can_use_return_insn_p (void)
 {
-  struct ix86_frame frame;
 
   if (! reload_completed || frame_pointer_needed)
 return 0;
@@ -10898,7 +10897,7 @@ ix86_can_use_return_insn_p (void)
 return 0;
 
   ix86_compute_frame_layout ();
-  frame = cfun->machine->frame;
+  struct ix86_frame &frame = cfun->machine->frame;
   return (frame.stack_pointer_offset == UNITS_PER_WORD
  && (frame.nregs + frame.nsseregs) == 0);
 }
@@ -11310,7 +11309,7 @@ HOST_WIDE_INT
 ix86_initial_elimination_offset (int from, int to)
 {
   ix86_compute_frame_layout ();
-  struct ix86_frame frame = cfun->machine->frame;
+  struct ix86_frame &frame = cfun->machine->frame;
 
   if (from == ARG_POINTER_REGNUM && to == HARD_FRAME_POINTER_REGNUM)
 return frame.hard_frame_pointer_offset;
@@ -13821,7 +13820,6 @@ static GTY(()) rtx split_stack_fn_large;
 void
 ix86_expand_split_stack_prologue (void)
 {
-  struct ix86_frame frame;
   HOST_WIDE_INT allocate;
   unsigned HOST_WIDE_INT args_size;
   rtx_code_label *label;
@@ -13834,7 +13832,7 @@ ix86_expand_split_stack_prologue (void)
 
   ix86_finalize_stack_realign_flags ();
   ix86_compute_frame_layout ();
-  frame = cfun->machine->frame;
+  struct ix86_frame &frame = cfun->machine->frame;
   allocate = frame.stack_pointer_offset - INCOMING_FRAME_SP_OFFSET;
 
   /* This is the label we will branch to if we have enough stack
-- 
2.14.3

Re: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread Tamar Christina

I seem to have forgotten the patch :)

The 01/16/2018 16:56, Tamar Christina wrote:
> The 01/16/2018 16:36, James Greenhalgh wrote:
> > On Tue, Jan 16, 2018 at 02:21:30PM +, Tamar Christina wrote:
> > > Hi Kyrill,
> > > 
> > > > 
> > > > xgene1 was added a few releases ago, better to use one of the new 
> > > > additions from the above list.
> > > > For example -mtune=cortex-r52.
> > > 
> > > Thanks, I have updated the patch. I'll wait for an ok from an AArch64 
> > > maintainer and a Docs maintainer.
> > 
> > OK. But you have the same issue in the AArch64 part.
> 
> Thanks, I've updated the patch.  I'll wait a bit for a doc reviewer; if I
> don't hear anything, I'll assume the patch is OK.
> 
> Thanks,
> Tamar
> > 
> > James
> > 
> > > Index: htdocs/gcc-8/changes.html
> > > ===
> > > RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
> > > retrieving revision 1.26
> > > diff -u -r1.26 changes.html
> > > --- htdocs/gcc-8/changes.html 11 Jan 2018 09:31:53 -  1.26
> > > +++ htdocs/gcc-8/changes.html 16 Jan 2018 14:12:57 -
> > > @@ -147,7 +147,51 @@
> > >  
> > >  AArch64
> > >  
> > > -  
> > > +  
> > > +The Armv8.4-A architecture is now supported.  It can be used by
> > > +specifying the -march=armv8.4-a option.
> > > +  
> > > +  
> > > +The Dot Product instructions are now supported as an optional 
> > > extension to the
> > > +Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  
> > > The extension can be used by
> > > +specifying the +dotprod architecture extension.  E.g. 
> > > -march=armv8.2-a+dotprod.
> > > +  
> > > +  
> > > +The Armv8-A +crypto extension has now been split into 
> > > two extensions for finer grained control:
> > > +
> > > +   +aes which contains the Armv8-A AES cryptographic 
> > > instructions.
> > > +   +sha2 which contains the Armv8-A SHA2 and SHA1 
> > > cryptographic instructions.
> > > +
> > > +Using +crypto will now enable these two extensions.
> > > +  
> > > +  
> > > +New Armv8.4-A FP16 Floating Point Multiplication Variant 
> > > instructions have been added.  These instructions are
> > > +mandatory in Armv8.4-A but available as an optional extension to 
> > > Armv8.2-A and Armv8.3-A.  The new extension
> > > +can be used by specifying the +fp16fml architectural 
> > > extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
> > > +the instructions can be enabled by specifying +fp16.
> > > +  
> > > +  
> > > +New cryptographic instructions have been added as optional 
> > > extensions to Armv8.2-A and newer.  These instructions can
> > > +be enabled with:
> > > +
> > > +  +sha3 New SHA3 and SHA2 instructions from 
> > > Armv8.4-A.  This implies +sha2.
> > > +  +sm4 New SM3 and SM4 instructions from Armv8.4-A.
> > > +
> > > + 
> > > +  
> > > +   Support has been added for the following processors
> > > +   (GCC identifiers in parentheses):
> > > +   
> > > + Arm Cortex-A75 (cortex-a75).
> > > +  Arm Cortex-A55 (cortex-a55).
> > > +  Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE 
> > > (cortex-a75.cortex-a55).
> > > +   
> > > +   The GCC identifiers can be used
> > > +   as arguments to the -mcpu or -mtune 
> > > options,
> > > +   for example: -mcpu=cortex-a75 or
> > > +   -mtune=thunderx2t99p1 or as arguments to the 
> > > equivalent target
> > > +   attributes and pragmas.
> > > +  
> > >  
> > >  
> > >  ARM
> > > @@ -169,14 +213,58 @@
> > >  removed in a future release.
> > >
> > >
> > > -The default link behavior for ARMv6 and ARMv7-R targets has been
> > > +The default link behavior for Armv6 and Armv7-R targets has been
> > >  changed to produce BE8 format when generating big-endian images.  A 
> > > new
> > >  flag -mbe32 can be used to force the linker to produce
> > >  legacy BE32 format images.  There is no change of behavior for
> > > -ARMv6-m and other ARMv7 or later targets: these already defaulted
> > > +Armv6-M and other Armv7 or later targets: these already defaulted
> > >  to BE8 format.  This change brings GCC into alignment with other
> > >  compilers for the ARM architecture.
> > >
> > > +  
> > > +The Armv8-R architecture is now supported.  It can be used by 
> > > specifying the
> > > +-march=armv8-r option.
> > > +  
> > > +  
> > > +The Armv8.3-A architecture is now supported.  It can be used by
> > > +specifying the -march=armv8.3-a option.
> > > +  
> > > +  
> > > +The Armv8.4-A architecture is now supported.  It can be used by
> > > +specifying the -march=armv8.4-a option.
> > > +  
> > > +  
> > > + The Dot Product instructions are now supported as an optional 
> > > extension to the
> > > + Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  
> > > The extension can be used by
> > > + specifying the +dotprod ar

[PATCH 0/2] GCC 6: i386: Move struct ix86_frame to machine_function

2018-01-16 Thread H.J. Lu

This patch set makes ix86_frame available to i386 code generation.  It
is needed to backport the patch set of -mindirect-branch= to mitigate
variant #2 of the speculative execution vulnerabilities on x86 processors
identified by CVE-2017-5715, aka Spectre.

Tested on Linux/i686 and Linux/x86-64.

hjl (2):
  i386: Move struct ix86_frame to machine_function
  i386: Use reference of struct ix86_frame to avoid copy

 gcc/config/i386/i386.c | 70 ++
 gcc/config/i386/i386.h | 53 +-
 2 files changed, 65 insertions(+), 58 deletions(-)

-- 
2.14.3

[PATCH 1/2] GCC 6: i386: Move struct ix86_frame to machine_function

2018-01-16 Thread H.J. Lu

From: hjl 

Make ix86_frame available to i386 code generation.  This is needed to
backport the patch set of -mindirect-branch= to mitigate variant #2 of
the speculative execution vulnerabilities on x86 processors identified
by CVE-2017-5715, aka Spectre.
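
As a rough illustration of the restructuring, the sketch below keeps the frame layout in per-function state so that the layout routine no longer needs a frame argument and any later code can read the result.  All names (Frame, MachineFunction, Function, cfun_sketch, kWordSize) and the numbers are hypothetical stand-ins chosen for the example, not the real GCC definitions.

```cpp
// Hypothetical reduced stand-ins for ix86_frame, machine_function and
// struct function.
struct Frame {
  long stack_pointer_offset = 0;
  int nregs = 0;
};

struct MachineFunction {
  Frame frame;  // the layout now lives with the function being compiled
};

struct Function {
  MachineFunction *machine;
};

// Stand-in for GCC's global cfun pointer.
Function *cfun_sketch = nullptr;

// Assumed word size, for the example only.
const long kWordSize = 8;

// Takes no frame argument: it fills in the current function's frame.
void compute_frame_layout() {
  Frame *frame = &cfun_sketch->machine->frame;
  frame->nregs = 2;  // made-up value for the example
  frame->stack_pointer_offset = kWordSize * (1 + frame->nregs);
}

// Any other code can read the layout without recomputing it or having a
// local frame passed to it.
long frame_offset_seen_by_codegen() {
  return cfun_sketch->machine->frame.stack_pointer_offset;
}
```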

Backport from mainline
2017-06-01  Bernd Edlinger  

* config/i386/i386.c (ix86_frame): Moved to ...
* config/i386/i386.h (ix86_frame): Here.
(machine_function): Add frame.
* config/i386/i386.c (ix86_compute_frame_layout): Replace the
frame argument with &cfun->machine->frame.
(ix86_can_use_return_insn_p): Don't pass &frame to
ix86_compute_frame_layout.  Copy frame from cfun->machine->frame.
(ix86_can_eliminate): Likewise.
(ix86_expand_prologue): Likewise.
(ix86_expand_epilogue): Likewise.
(ix86_expand_split_stack_prologue): Likewise.
---
 gcc/config/i386/i386.c | 68 ++
 gcc/config/i386/i386.h | 53 ++-
 2 files changed, 65 insertions(+), 56 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8b5faac5129..a1ff32b648b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2434,53 +2434,6 @@ struct GTY(()) stack_local_entry {
   struct stack_local_entry *next;
 };
 
-/* Structure describing stack frame layout.
-   Stack grows downward:
-
-   [arguments]
-   <- ARG_POINTER
-   saved pc
-
-   saved static chain  if ix86_static_chain_on_stack
-
-   saved frame pointer if frame_pointer_needed
-   <- HARD_FRAME_POINTER
-   [saved regs]
-   <- regs_save_offset
-   [padding0]
-
-   [saved SSE regs]
-   <- sse_regs_save_offset
-   [padding1]  |
-  |<- FRAME_POINTER
-   [va_arg registers]  |
-  |
-   [frame]|
-  |
-   [padding2] | = to_allocate
-   <- STACK_POINTER
-  */
-struct ix86_frame
-{
-  int nsseregs;
-  int nregs;
-  int va_arg_size;
-  int red_zone_size;
-  int outgoing_arguments_size;
-
-  /* The offsets relative to ARG_POINTER.  */
-  HOST_WIDE_INT frame_pointer_offset;
-  HOST_WIDE_INT hard_frame_pointer_offset;
-  HOST_WIDE_INT stack_pointer_offset;
-  HOST_WIDE_INT hfp_save_offset;
-  HOST_WIDE_INT reg_save_offset;
-  HOST_WIDE_INT sse_reg_save_offset;
-
-  /* When save_regs_using_mov is set, emit prologue using
- move instead of push instructions.  */
-  bool save_regs_using_mov;
-};
-
 /* Which cpu are we scheduling for.  */
 enum attr_cpu ix86_schedule;
 
@@ -2572,7 +2525,7 @@ static unsigned int ix86_function_arg_boundary 
(machine_mode,
const_tree);
 static rtx ix86_static_chain (const_tree, bool);
 static int ix86_function_regparm (const_tree, const_tree);
-static void ix86_compute_frame_layout (struct ix86_frame *);
+static void ix86_compute_frame_layout (void);
 static bool ix86_expand_vector_init_one_nonzero (bool, machine_mode,
 rtx, rtx, int);
 static void ix86_add_new_builtins (HOST_WIDE_INT);
@@ -10944,7 +10897,8 @@ ix86_can_use_return_insn_p (void)
   if (crtl->args.pops_args && crtl->args.size >= 32768)
 return 0;
 
-  ix86_compute_frame_layout (&frame);
+  ix86_compute_frame_layout ();
+  frame = cfun->machine->frame;
   return (frame.stack_pointer_offset == UNITS_PER_WORD
  && (frame.nregs + frame.nsseregs) == 0);
 }
@@ -11355,8 +11309,8 @@ ix86_can_eliminate (const int from, const int to)
 HOST_WIDE_INT
 ix86_initial_elimination_offset (int from, int to)
 {
-  struct ix86_frame frame;
-  ix86_compute_frame_layout (&frame);
+  ix86_compute_frame_layout ();
+  struct ix86_frame frame = cfun->machine->frame;
 
   if (from == ARG_POINTER_REGNUM && to == HARD_FRAME_POINTER_REGNUM)
 return frame.hard_frame_pointer_offset;
@@ -11395,8 +11349,9 @@ ix86_builtin_setjmp_frame_value (void)
 /* Fill structure ix86_frame about frame of currently computed function.  */
 
 static void
-ix86_compute_frame_layout (struct ix86_frame *frame)
+ix86_compute_frame_layout (void)
 {
+  struct ix86_frame *frame = &cfun->machine->frame;
   unsigned HOST_WIDE_INT stack_alignment_needed;
   HOST_WIDE_INT offset;
   unsigned HOST_WIDE_INT preferred_alignment;
@@ -12702,7 +12657,8 @@ ix86_expand_prologue (void)
   m->fs.sp_offset = INCOMING_FRAME_SP_OFFSET;
   m->fs.sp_valid = true;
 
-  ix86_compute_frame_layout (&frame);
+  ix86_compute_frame_layout ();
+  frame = m->frame;
 
   if (!TARGET_64BIT && ix86_function_ms_hook_prologue (current_function_decl))
 {
@@ -13379,7 +13335,8 @@ ix86_expand_epilogue (int style)
   bool using_drap;
 
   ix86_finalize_stack_realign_flags ();
-  ix86_compute_frame_layout (&fram

GCC 6: i386: Move struct ix86_frame to machine_function

2018-01-16 Thread H.J. Lu

This is needed for GCC 6 backport of Spectre patches:

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01465.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01466.html
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01464.html

-- 
H.J.

Re: GCC 8.0.0 Status Report (2018-01-15), Trunk in Regression and Documentation fixes only mode

2018-01-16 Thread Joseph Myers

On Tue, 16 Jan 2018, Segher Boessenkool wrote:

> On Mon, Jan 15, 2018 at 09:21:07AM +0100, Richard Biener wrote:
> > We're still in pretty bad shape regression-wise.  Please also take
> > the opportunity to check the state of your favorite host/target
> > combination to make sure building and testing works appropriately.
> 
> I tested building Linux (the kernel) for all supported architectures.
> Everything builds (with my usual tweaks, link with libgcc etc.);
> except x86_64 and sh have more problems in the kernel, and mips has
> an ICE.  I'll open a PR for that one.

And all glibc architectures compile (and compile the testsuite) OK except 
for the sh4eb ICE reported in bug 83760 (and the longstanding coldfire 
issue, bug 68467).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] PR82964: Fix 128-bit immediate ICEs

2018-01-16 Thread James Greenhalgh

On Mon, Jan 15, 2018 at 11:34:19AM +, Wilco Dijkstra wrote:
> This fixes PR82964 which reports ICEs for some CONST_WIDE_INT immediates.
> It turns out decimal floating point CONST_DOUBLE get changed into
> CONST_WIDE_INT without checking the constraint on the operand, which 
> results in failures.  Avoid this by only allowing SF/DF/TF mode floating
> point constants in aarch64_legitimate_constant_p.  A similar issue can
> occur with 128-bit immediates which may be emitted even when disallowed
> in aarch64_legitimate_constant_p, and the constraints in movti_aarch64
> don't match.  Fix this with a new constraint and allowing valid immediates
> in aarch64_legitimate_constant_p.
> 
> Rather than allowing all 128-bit immediates and expanding in up to 8
> MOV/MOVK instructions, limit them to 4 instructions and use a literal
> load for other cases.  Improve the pr79041-2.c test to use a literal and
> skip it for -fpic.
> 
> This fixes all reported failures. OK for commit?

Most of this makes sense, but I don't understand this relaxation in
aarch64_legitimate_constant_p

> -  /* Do not allow wide int constants - this requires support in movti.  */
> +  /* Only allow simple 128-bit immediates.  */
>if (CONST_WIDE_INT_P (x))
> -return false;
> +return aarch64_mov128_immediate (x);

I can see why this could be correct, but it is unclear why it is necessary
to fix the bug. What goes wrong if we leave this as "return false"?

I think the patch looks OK otherwise, but I'd appreciate an answer on that
point before you commit.

Thanks,
James
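
For readers following the thread, the four-instruction limit described in the
quoted patch text can be sketched in C as below.  This is not the GCC
implementation: mov_chunks_64 and mov128_immediate_ok are hypothetical names,
and the cost model (one MOV/MOVK per non-zero 16-bit chunk) is a simplifying
assumption that ignores MOVN and bitmask-immediate encodings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: number of MOV/MOVK instructions needed to build a
   64-bit value 16 bits at a time (one per non-zero chunk, at least one).  */
static int
mov_chunks_64 (uint64_t v)
{
  int n = 0;
  for (int shift = 0; shift < 64; shift += 16)
    if ((v >> shift) & 0xffff)
      n++;
  return n ? n : 1;
}

/* Hypothetical stand-in for the predicate the patch calls
   aarch64_mov128_immediate: accept a 128-bit constant only if both
   halves together need at most four instructions.  */
static bool
mov128_immediate_ok (uint64_t lo, uint64_t hi)
{
  return mov_chunks_64 (lo) + mov_chunks_64 (hi) <= 4;
}
```

Under this simplified model, a constant that needs more than four
instructions would be left to a literal load instead.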

[PATCH, rs6000] Fix ICE caused by recent patch: Generate lvx and stvx without swaps for aligned vector loads and stores

2018-01-16 Thread Kelvin Nilsen


A patch committed on 2018-01-10 is causing an ICE with existing test
program $GCC_SRC/gcc/testsuite/gcc.target/powerpc/pr83399.c, when
compiled with the -m32 option.  At the time of the commit, it was
thought that this was a problem with the recent resolution of PR83399.
However, further investigation revealed a problem with the patch that
was just committed.  The generated code did not distinguish between 32-
and 64-bit targets.

This patch corrects that problem.

This has been bootstrapped and tested without regressions on
powerpc64le-unknown-linux (P8) and on powerpc64-unknown-linux (P7) with
both -m32 and -m64 target options.  Is this ok for trunk?


gcc/ChangeLog:

2018-01-16  Kelvin Nilsen  

* config/rs6000/rs6000-p8swap.c (rs6000_gen_stvx): Generate
different rtl trees depending on TARGET_64BIT.
(rs6000_gen_lvx): Likewise.

Index: gcc/config/rs6000/rs6000-p8swap.c
===
--- gcc/config/rs6000/rs6000-p8swap.c   (revision 256710)
+++ gcc/config/rs6000/rs6000-p8swap.c   (working copy)
@@ -1554,23 +1554,31 @@ rs6000_gen_stvx (enum machine_mode mode, rtx dest_
   op1 = XEXP (memory_address, 0);
   op2 = XEXP (memory_address, 1);
   if (mode == V16QImode)
-   stvx = gen_altivec_stvx_v16qi_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v16qi_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v16qi_2op_si (src_exp, op1, op2);
   else if (mode == V8HImode)
-   stvx = gen_altivec_stvx_v8hi_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v8hi_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v8hi_2op_si (src_exp, op1, op2);
 #ifdef HAVE_V8HFmode
   else if (mode == V8HFmode)
-   stvx = gen_altivec_stvx_v8hf_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v8hf_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v8hf_2op_si (src_exp, op1, op2);
 #endif
   else if (mode == V4SImode)
-   stvx = gen_altivec_stvx_v4si_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v4si_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v4si_2op_si (src_exp, op1, op2);
   else if (mode == V4SFmode)
-   stvx = gen_altivec_stvx_v4sf_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v4sf_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v4sf_2op_si (src_exp, op1, op2);
   else if (mode == V2DImode)
-   stvx = gen_altivec_stvx_v2di_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v2di_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v2di_2op_si (src_exp, op1, op2);
   else if (mode == V2DFmode)
-   stvx = gen_altivec_stvx_v2df_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v2df_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v2df_2op_si (src_exp, op1, op2);
   else if (mode == V1TImode)
-   stvx = gen_altivec_stvx_v1ti_2op (src_exp, op1, op2);
+   stvx = TARGET_64BIT ? gen_altivec_stvx_v1ti_2op (src_exp, op1, op2)
+ : gen_altivec_stvx_v1ti_2op_si (src_exp, op1, op2);
   else
/* KFmode, TFmode, other modes not expected in this context.  */
gcc_unreachable ();
@@ -1578,23 +1586,39 @@ rs6000_gen_stvx (enum machine_mode mode, rtx dest_
   else /* REG_P (memory_address) */
 {
   if (mode == V16QImode)
-   stvx = gen_altivec_stvx_v16qi_1op (src_exp, memory_address);
+   stvx = TARGET_64BIT ?
+ gen_altivec_stvx_v16qi_1op (src_exp, memory_address)
+ : gen_altivec_stvx_v16qi_1op_si (src_exp, memory_address);
   else if (mode == V8HImode)
-   stvx = gen_altivec_stvx_v8hi_1op (src_exp, memory_address);
+   stvx = TARGET_64BIT ?
+ gen_altivec_stvx_v8hi_1op (src_exp, memory_address)
+ : gen_altivec_stvx_v8hi_1op_si (src_exp, memory_address);
 #ifdef HAVE_V8HFmode
   else if (mode == V8HFmode)
-   stvx = gen_altivec_stvx_v8hf_1op (src_exp, memory_address);
+   stvx = TARGET_64BIT ?
+ gen_altivec_stvx_v8hf_1op (src_exp, memory_address)
+ : gen_altivec_stvx_v8hf_1op_si (src_exp, memory_address);
 #endif
   else if (mode == V4SImode)
-   stvx = gen_altivec_stvx_v4si_1op (src_exp, memory_address);
+   stvx =TARGET_64BIT ?
+ gen_altivec_stvx_v4si_1op (src_exp, memory_address)
+ : gen_altivec_stvx_v4si_1op_si (src_exp, memory_address);
   else if (mode == V4SFmode)
-   stvx = gen_altivec_stvx_v4sf_1op (src_exp, memory_address);
+   stvx = TARGET_64BIT ?
+ gen_altivec_stvx_v4sf_1op (src_exp, memory_address)
+ : gen_altivec_stvx_v4sf_1op_si (src_exp, memory_address);
   else if (mode == V2DImode)
-   stvx = gen_altivec_stvx_v2di_1op (src_exp, memory_address);
+   stvx = TARGET_64BIT ?
+ gen_altivec_stvx_v2di_1op (src_exp, memory_address)
+ : gen_altivec_stvx_v2di_1op_si (src_exp, memory_address)

Re: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread Kyrill Tkachov


Hi Tamar,

On 16/01/18 16:56, Tamar Christina wrote:

Th 01/16/2018 16:36, James Greenhalgh wrote:

On Tue, Jan 16, 2018 at 02:21:30PM +, Tamar Christina wrote:

Hi Kyrill,


xgene1 was added a few releases ago, better to use one of the new additions 
from the above list.
For example -mtune=cortex-r52.

Thanks, I have updated the patch. I'll wait for an ok from an AArch64 
maintainer and a Docs maintainer.

OK. But you have the same issue in the AArch64 part.

Thanks, I've updated the patch.  I'll wait a bit for a doc reviewer; if I
don't hear anything, I'll assume the patch is OK.


Gerald has confirmed a few times in the past that port maintainers can approve
target-specific changes to the web pages, and there are words to that effect at:
https://gcc.gnu.org/svnwrite.html .
So I'd recommend you commit your patch once you've got approval for aarch64 and 
arm.
Unless there's some specific part of the patch you'd like the docs maintainer 
to give you feedback on...

Thanks again for working on this.
Kyrill



Thanks,
Tamar

James


Index: htdocs/gcc-8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.26
diff -u -r1.26 changes.html
--- htdocs/gcc-8/changes.html   11 Jan 2018 09:31:53 -  1.26
+++ htdocs/gcc-8/changes.html   16 Jan 2018 14:12:57 -
@@ -147,7 +147,51 @@
  
  AArch64

  
-  
+  
+The Armv8.4-A architecture is now supported.  It can be used by
+specifying the -march=armv8.4-a option.
+  
+  
+The Dot Product instructions are now supported as an optional extension to the
+Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The extension can be used by
+specifying the +dotprod architecture extension.  E.g. -march=armv8.2-a+dotprod.
+  
+  
+The Armv8-A +crypto extension has now been split into two extensions for finer grained control:
+
+   +aes which contains the Armv8-A AES cryptographic instructions.
+   +sha2 which contains the Armv8-A SHA2 and SHA1 cryptographic instructions.
+
+Using +crypto will now enable these two extensions.
+  
+  
+New Armv8.4-A FP16 Floating Point Multiplication Variant instructions have been added.  These instructions are
+mandatory in Armv8.4-A but available as an optional extension to Armv8.2-A and Armv8.3-A.  The new extension
+can be used by specifying the +fp16fml architectural extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
+the instructions can be enabled by specifying +fp16.
+  
+  
+New cryptographic instructions have been added as optional extensions to Armv8.2-A and newer.  These instructions can
+be enabled with:
+
+  +sha3 New SHA3 and SHA2 instructions from Armv8.4-A.  This implies +sha2.
+  +sm4 New SM3 and SM4 instructions from Armv8.4-A.
+
+ 
+  
+   Support has been added for the following processors
+   (GCC identifiers in parentheses):
+   
+ Arm Cortex-A75 (cortex-a75).
+Arm Cortex-A55 (cortex-a55).
+Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE (cortex-a75.cortex-a55).
+   
+   The GCC identifiers can be used
+   as arguments to the -mcpu or -mtune options,
+   for example: -mcpu=cortex-a75 or
+   -mtune=thunderx2t99p1 or as arguments to the equivalent 
target
+   attributes and pragmas.
+  
  
  
  ARM

@@ -169,14 +213,58 @@
  removed in a future release.


-The default link behavior for ARMv6 and ARMv7-R targets has been
+The default link behavior for Armv6 and Armv7-R targets has been
  changed to produce BE8 format when generating big-endian images.  A new
  flag -mbe32 can be used to force the linker to produce
  legacy BE32 format images.  There is no change of behavior for
-ARMv6-m and other ARMv7 or later targets: these already defaulted
+Armv6-M and other Armv7 or later targets: these already defaulted
  to BE8 format.  This change brings GCC into alignment with other
  compilers for the ARM architecture.

+  
+The Armv8-R architecture is now supported.  It can be used by specifying the
+-march=armv8-r option.
+  
+  
+The Armv8.3-A architecture is now supported.  It can be used by
+specifying the -march=armv8.3-a option.
+  
+  
+The Armv8.4-A architecture is now supported.  It can be used by
+specifying the -march=armv8.4-a option.
+  
+  
+ The Dot Product instructions are now supported as an optional extension to the
+ Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.  The extension can be used by
+ specifying the +dotprod architecture extension.  E.g. -march=armv8.2-a+dotprod.
+  
+
+  
+Support for setting extensions and architectures using the GCC target pragma and attribute has been added.
+It can be used by specifying #pragma GCC target ("arch=..."), #pragma GCC target ("+extension"),
+__attribute__((target("arch=...")

Compilation warning in simple-object-xcoff.c

2018-01-16 Thread Eli Zaretskii

Compiling GDB 8.0.91 with mingw.org's MinGW GCC 6.0.3 produces this
warning in libiberty:

 gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS  -I. -I./../include   -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic  -D_GNU_SOURCE ./simple-object-xcoff.c -o simple-object-xcoff.o
 ./simple-object-xcoff.c: In function 'simple_object_xcoff_find_sections':
 ./simple-object-xcoff.c:605:25: warning: left shift count >= width of type [-Wshift-count-overflow]
  x_scnlen = x_scnlen << 32
 ^~

And indeed x_scnlen is declared as a 32-bit data type off_t.

I'm willing to test patches if needed.

Thanks.
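
For context on the warning itself: in C, shifting a value by a count greater
than or equal to the width of its (promoted) type is undefined, which is what
-Wshift-count-overflow reports when a 32-bit object is shifted by 32.  The
sketch below shows the usual way to combine two 32-bit halves without
triggering it.  It is only an illustration, not a proposed libiberty fix, and
combine_halves is a hypothetical name.

```c
#include <stdint.h>

/* Combine two 32-bit halves into one 64-bit value.  Widening HI before
   the shift keeps the shift count below the width of the shifted type.  */
static uint64_t
combine_halves (uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) | lo;
}
```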

Re: [PATCH, rs6000] Fix ICE caused by recent patch: Generate lvx and stvx without swaps for aligned vector loads and stores

2018-01-16 Thread Segher Boessenkool

Hi Kelvin,

On Tue, Jan 16, 2018 at 11:15:12AM -0600, Kelvin Nilsen wrote:
> 
> A patch committed on 2018-01-10 is causing an ICE with existing test
> program $GCC_SRC/gcc/testsuite/gcc.target/powerpc/pr83399.c, when
> compiled with the -m32 option.  At the time of the commit, it was
> thought that this was a problem with the recent resolution of PR83399.
> However, further investigation revealed a problem with the patch that
> was just committed.  The generated code did not distinguish between 32-
> and 64-bit targets.
> 
> This patch corrects that problem.
> 
> This has been bootstrapped and tested without regressions on
> powerpc64le-unknown-linux (P8) and on powerpc64-unknown-linux (P7) with
> both -m32 and -m64 target options.  Is this ok for trunk?
> 
> 
> gcc/ChangeLog:
> 
> 2018-01-16  Kelvin Nilsen  
> 

PR target/83399
?  Or is there another PR?

>   * config/rs6000/rs6000-p8swap.c (rs6000_gen_stvx): Generate
>   different rtl trees depending on TARGET_64BIT.
>   (rs6000_gen_lvx): Likewise.
> 
> Index: gcc/config/rs6000/rs6000-p8swap.c
> ===
> --- gcc/config/rs6000/rs6000-p8swap.c (revision 256710)
> +++ gcc/config/rs6000/rs6000-p8swap.c (working copy)
> @@ -1554,23 +1554,31 @@ rs6000_gen_stvx (enum machine_mode mode, rtx dest_
>op1 = XEXP (memory_address, 0);
>op2 = XEXP (memory_address, 1);
>if (mode == V16QImode)
> - stvx = gen_altivec_stvx_v16qi_2op (src_exp, op1, op2);
> + stvx = TARGET_64BIT ? gen_altivec_stvx_v16qi_2op (src_exp, op1, op2)
> +   : gen_altivec_stvx_v16qi_2op_si (src_exp, op1, op2);

Please indent this like

stvx = TARGET_64BIT
   ? gen_altivec_stvx_v16qi_2op (src_exp, op1, op2)
   : gen_altivec_stvx_v16qi_2op_si (src_exp, op1, op2);

>if (mode == V16QImode)
> - stvx = gen_altivec_stvx_v16qi_1op (src_exp, memory_address);
> + stvx = TARGET_64BIT ?
> +   gen_altivec_stvx_v16qi_1op (src_exp, memory_address)
> +   : gen_altivec_stvx_v16qi_1op_si (src_exp, memory_address);

You should never have ? at the end of line; and ? and : indent with the
controlling expression.  So:

stvx = TARGET_64BIT
   ? gen_altivec_stvx_v16qi_1op (src_exp, memory_address)
   : gen_altivec_stvx_v16qi_1op_si (src_exp, memory_address);

Similar everywhere.  Okay with that changed.  Thanks!


Segher

Re: VIEW_CONVERT_EXPR slots for strict-align targets (PR 83884)

2018-01-16 Thread Richard Biener

On January 16, 2018 5:14:50 PM GMT+01:00, Richard Sandiford wrote:
>This PR is about a case in which we VIEW_CONVERT a variable-sized
>unaligned record:
>
>sizes-gimplified type_7 BLK
>size 
>unit-size 
>align:8 ...>
>
>to an aligned 32-bit integer.  The strict-alignment handling of
>this case creates an aligned temporary slot, moves the operand
>into the slot in the operand's original mode, then accesses the
>slot in the more-aligned result mode.
>
>Previously the size of the temporary slot was calculated using:
>
>  HOST_WIDE_INT temp_size
>= MAX (int_size_in_bytes (inner_type),
>   (HOST_WIDE_INT) GET_MODE_SIZE (mode));
>
>int_size_in_bytes would return -1 for the variable-length type,
>so we'd use the size of the result mode for the slot.  r256152 replaced
>int_size_in_bytes with tree_to_poly_uint64, which triggered an ICE.
>
>I'd assumed that variable-length types couldn't occur here, since it
>seems strange to view-convert a variable-length type to a fixed-length
>one.  It also seemed strange that (with the old code) we'd ignore the
>size of the operand if it was a variable V but honour it if it was a
>constant C, even though it's presumably possible for V to equal that
>C at runtime.
>
>If op0 has BLKmode we do a block copy of GET_MODE_SIZE (mode) bytes
>and then convert the slot to "mode":
>
> poly_uint64 mode_size = GET_MODE_SIZE (mode);
> ...
> if (GET_MODE (op0) == BLKmode)
>   {
> rtx size_rtx = gen_int_mode (mode_size, Pmode);
> emit_block_move (new_with_op0_mode, op0, size_rtx,
>  (modifier == EXPAND_STACK_PARM
>   ? BLOCK_OP_CALL_PARM
>   : BLOCK_OP_NORMAL));
>   }
> else
>   ...
>
> op0 = new_rtx;
>   }
>   }
>
> op0 = adjust_address (op0, mode, 0);
>
>so I think in that case just the size of "mode" is enough, even if op0
>is a fixed-size type.  For non-BLKmode op0 we first move in op0's mode
>and then convert the slot to "mode":
>
>   emit_move_insn (new_with_op0_mode, op0);
>
> op0 = new_rtx;
>   }
>   }
>
> op0 = adjust_address (op0, mode, 0);
>
>so I think we want the maximum of the two mode sizes in that case
>(assuming they can be different sizes).
>
>But is this VIEW_CONVERT_EXPR really valid?  

IMHO it is on the border of being invalid (verify_gimple doesn't diagnose
it). Using a BIT_FIELD_REF would be much better here.

Richard. 

>Maybe this is just
>papering over a deeper issue.  There again, the MAX in the old
>code was presumably there because the sizes can be different...
>
>Richard
>
>
>2018-01-16  Richard Sandiford  
>
>gcc/
>   PR middle-end/83884
>   * expr.c (expand_expr_real_1): Use the size of GET_MODE (op0)
>   rather than the size of inner_type to determine the stack slot size
>   when handling VIEW_CONVERT_EXPRs on strict-alignment targets.
>
>Index: gcc/expr.c
>===
>--- gcc/expr.c 2018-01-14 08:42:44.497155977 +
>+++ gcc/expr.c 2018-01-16 16:07:22.737883774 +
>@@ -11145,11 +11145,11 @@ expand_expr_real_1 (tree exp, rtx target
>   }
> else if (STRICT_ALIGNMENT)
>   {
>-tree inner_type = TREE_TYPE (treeop0);
> poly_uint64 mode_size = GET_MODE_SIZE (mode);
>-poly_uint64 op0_size
>-  = tree_to_poly_uint64 (TYPE_SIZE_UNIT (inner_type));
>-poly_int64 temp_size = upper_bound (op0_size, mode_size);
>+poly_uint64 temp_size = mode_size;
>+if (GET_MODE (op0) != BLKmode)
>+  temp_size = upper_bound (temp_size,
>+   GET_MODE_SIZE (GET_MODE (op0)));
> rtx new_rtx
>   = assign_stack_temp_for_type (mode, temp_size, type);
> rtx new_with_op0_mode
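
The slot-size rule argued for in the quoted message can be summarised with a
small C sketch.  It is an illustration only: plain size_t values stand in for
GCC's poly_uint64, a simple maximum stands in for upper_bound, and
view_convert_slot_size and its parameters are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Size of the temporary slot for a VIEW_CONVERT on a strict-alignment
   target, following the reasoning in the quoted message:
   - a BLKmode operand is block-copied using the size of the result mode,
     so the result mode's size is enough;
   - otherwise the operand is first stored in its own mode, so the slot
     must hold the larger of the two mode sizes.  */
static size_t
view_convert_slot_size (size_t mode_size, bool op0_is_blk,
                        size_t op0_mode_size)
{
  size_t temp_size = mode_size;
  if (!op0_is_blk && op0_mode_size > temp_size)
    temp_size = op0_mode_size;
  return temp_size;
}
```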

Re: Compilation warning in simple-object-xcoff.c

2018-01-16 Thread DJ Delorie


I think that warning is valid - the host has a 32-bit limit to file
sizes (off_t) but it's trying to read a 64-bit offset (in that clause).
It's warning you that you won't be able to handle files as large as the
field implies.

Can we hide the warning?  Probably.  Should we?  Debatable, as long as
we want 64-bit xcoff support in 32-bit filesystems.

Otherwise, we'd need to detect off_t overflow somehow, down the slippery
slope of reporting the error to the caller...
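
As a rough idea of what detecting the overflow could look like, here is a
minimal sketch in C.  It assumes a 32-bit signed offset type, modelled by the
hypothetical typedef narrow_off_t, and a hypothetical helper to_narrow_off
that returns -1 when the 64-bit value does not fit; how the error would be
reported to the caller is left open, as in the message above.

```c
#include <stdint.h>

/* Stand-in for a 32-bit off_t (an assumption for this sketch).  */
typedef int32_t narrow_off_t;

/* Return VALUE as a narrow offset, or -1 if it does not fit.  */
static narrow_off_t
to_narrow_off (uint64_t value)
{
  if (value > (uint64_t) INT32_MAX)
    return -1;
  return (narrow_off_t) value;
}
```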

RE: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread Tamar Christina

Hi Kyrill,

> 
> Hi Tamar,
> 
> On 16/01/18 16:56, Tamar Christina wrote:
> > Th 01/16/2018 16:36, James Greenhalgh wrote:
> >> On Tue, Jan 16, 2018 at 02:21:30PM +, Tamar Christina wrote:
> >>> Hi Kyrill,
> >>>
>  xgene1 was added a few releases ago, better to use one of the new
> additions from the above list.
>  For example -mtune=cortex-r52.
> >>> Thanks, I have updated the patch. I'll wait for an ok from an AArch64
> maintainer and a Docs maintainer.
> >> OK. But you have the same issue in the AArch64 part.
> > Thanks, I've updated the patch.  I'll wait a bit for a doc reviewer;
> > if I don't hear anything, I'll assume the patch is OK.
> 
> Gerald has confirmed a few times in the past that port maintainers can
> approve target-specific changes to the web pages, and there are words to
> that effect at:
> https://gcc.gnu.org/svnwrite.html .
> So I'd recommend you commit your patch once you've got approval for
> aarch64 and arm.
> Unless there's some specific part of the patch you'd like the docs maintainer
> to give you feedback on...

Ah, thanks! I'll commit the patch then. 

> Thanks again for working on this.
> Kyrill
> 
> >
> > Thanks,
> > Tamar
> >> James
> >>
> >>> Index: htdocs/gcc-8/changes.html
> >>>
> ==
> =
> >>> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
> >>> retrieving revision 1.26
> >>> diff -u -r1.26 changes.html
> >>> --- htdocs/gcc-8/changes.html 11 Jan 2018 09:31:53 -  1.26
> >>> +++ htdocs/gcc-8/changes.html 16 Jan 2018 14:12:57 -
> >>> @@ -147,7 +147,51 @@
> >>>
> >>>   AArch64
> >>>   
> >>> -  
> >>> +  
> >>> +The Armv8.4-A architecture is now supported.  It can be used by
> >>> +specifying the -march=armv8.4-a option.
> >>> +  
> >>> +  
> >>> +The Dot Product instructions are now supported as an optional
> extension to the
> >>> +Armv8.2-A architecture and newer and are mandatory on Armv8.4-A.
> The extension can be used by
> >>> +specifying the +dotprod architecture extension.  E.g.
> -march=armv8.2-a+dotprod.
> >>> +  
> >>> +  
> >>> +The Armv8-A +crypto extension has now been split
> into two extensions for finer grained control:
> >>> +
> >>> +   +aes which contains the Armv8-A AES
> cryptographic instructions.
> >>> +   +sha2 which contains the Armv8-A SHA2 and
> SHA1 cryptographic instructions.
> >>> +
> >>> +Using +crypto will now enable these two extensions.
> >>> +  
> >>> +  
> >>> +New Armv8.4-A FP16 Floating Point Multiplication Variant instructions
> have been added.  These instructions are
> >>> +mandatory in Armv8.4-A but available as an optional extension to
> Armv8.2-A and Armv8.3-A.  The new extension
> >>> +can be used by specifying the +fp16fml architectural
> extension on Armv8.2-A and Armv8.3-A. On Armv8.4-A
> >>> +the instructions can be enabled by specifying +fp16.
> >>> +  
> >>> +  
> >>> +New cryptographic instructions have been added as optional
> extensions to Armv8.2-A and newer.  These instructions can
> >>> +be enabled with:
> >>> +
> >>> +  +sha3 New SHA3 and SHA2 instructions from
> Armv8.4-A.  This implies +sha2.
> >>> +  +sm4 New SM3 and SM4 instructions from
> Armv8.4-A.
> >>> +
> >>> + 
> >>> +  
> >>> +   Support has been added for the following processors
> >>> +   (GCC identifiers in parentheses):
> >>> +   
> >>> + Arm Cortex-A75 (cortex-a75).
> >>> +  Arm Cortex-A55 (cortex-a55).
> >>> +  Arm Cortex-A55/Cortex-A75 DynamIQ big.LITTLE (cortex-
> a75.cortex-a55).
> >>> +   
> >>> +   The GCC identifiers can be used
> >>> +   as arguments to the -mcpu or -
> mtune options,
> >>> +   for example: -mcpu=cortex-a75 or
> >>> +   -mtune=thunderx2t99p1 or as arguments to the
> equivalent target
> >>> +   attributes and pragmas.
> >>> +  
> >>>   
> >>>
> >>>   ARM
> >>> @@ -169,14 +213,58 @@
> >>>   removed in a future release.
> >>> 
> >>> 
> >>> -The default link behavior for ARMv6 and ARMv7-R targets has been
> >>> +The default link behavior for Armv6 and Armv7-R targets has
> >>> + been
> >>>   changed to produce BE8 format when generating big-endian images.
> A new
> >>>   flag -mbe32 can be used to force the linker to
> produce
> >>>   legacy BE32 format images.  There is no change of behavior for
> >>> -ARMv6-m and other ARMv7 or later targets: these already defaulted
> >>> +Armv6-M and other Armv7 or later targets: these already
> >>> + defaulted
> >>>   to BE8 format.  This change brings GCC into alignment with other
> >>>   compilers for the ARM architecture.
> >>> 
> >>> +  
> >>> +The Armv8-R architecture is now supported.  It can be used by
> specifying the
> >>> +-march=armv8-r option.
> >>> +  
> >>> +  
> >>> +The Armv8.3-A architecture is now supported.  It can be used by
> >>> +specifying the -march=armv8.3-a option.
> >>> +

Re: Compilation warning in simple-object-xcoff.c

2018-01-16 Thread Eli Zaretskii

> From: DJ Delorie 
> Cc: gcc-patches@gcc.gnu.org, gdb-patc...@sourceware.org
> Date: Tue, 16 Jan 2018 13:00:48 -0500
> 
> 
> I think that warning is valid - the host has a 32-bit limit to file
> sizes (off_t) but it's trying to read a 64-bit offset (in that clause).
> It's warning you that you won't be able to handle files as large as the
> field implies.

If 32-bit off_t cannot handle this, then perhaps this file (or that
function) should not be compiled for a 32-bit host?

Re: Compilation warning in simple-object-xcoff.c

2018-01-16 Thread DJ Delorie


Well, it should all work fine as long as the xcoff64 file is less than 4
Gb.

And it's not the host's bit size that counts; there are usually ways to
get 64-bit file operations on 32-bit hosts.

[PATCH, rs6000] Implement ABI_AIX indirect call handling for -mno-speculate-indirect-jumps

2018-01-16 Thread Bill Schmidt

Hi,

This patch fills in a gap from the previous -mno-speculate-indirect-jumps
patch.  That patch didn't provide support for indirect calls using ABI_AIX
as the default ABI.  This fills in that missing support and changes the
one related powerpc64le-only test case to be compiled for all subtargets.

After some analysis, it doesn't appear possible for sibcalls to be
generated for ELFv1 or ELFv2 using a bctr, given the need for a local
call to avoid the required TOC restore afterwards.  I haven't been able
to find a way to get a bctr generated even when one could theoretically
prove the bctr must go to a local function.

This has been bootstrapped and tested on powerpc64le-linux-gnu with no
regressions.  Testing is still ongoing for powerpc64-linux-gnu.  Provided
that testing completes with no surprises, is this okay for trunk (and
shortly for backport to 7)?

Thanks!
Bill


[gcc]

2018-01-16  Bill Schmidt  

* config/rs6000/rs6000.md (*call_indirect_aix): Disable for
-mno-speculate-indirect-jumps.
(*call_indirect_aix_nospec): New define_insn.
(*call_value_indirect_aix): Disable for
-mno-speculate-indirect-jumps.
(*call_value_indirect_aix_nospec): New define_insn.

[gcc/testsuite]

2018-01-16  Bill Schmidt  

* gcc.target/powerpc/safe-indirect-jump-1.c: Remove
powerpc64le-only restriction.


Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 256753)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -10669,11 +10669,22 @@
(use (match_operand:P 2 "memory_operand" ","))
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
-  "DEFAULT_ABI == ABI_AIX"
+  "DEFAULT_ABI == ABI_AIX && rs6000_speculate_indirect_jumps"
   " 2,%2\;b%T0l\; 2,%3(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
+(define_insn "*call_indirect_aix_nospec"
+  [(call (mem:SI (match_operand:P 0 "register_operand" "c,*l"))
+(match_operand 1 "" "g,g"))
+   (use (match_operand:P 2 "memory_operand" ","))
+   (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 3 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && !rs6000_speculate_indirect_jumps"
+  "crset eq\; 2,%2\;beq%T0l-\; 2,%3(1)"
+  [(set_attr "type" "jmpreg")
+   (set_attr "length" "16")])
+
 (define_insn "*call_value_indirect_aix"
   [(set (match_operand 0 "" "")
(call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
@@ -10681,11 +10692,23 @@
(use (match_operand:P 3 "memory_operand" ","))
(set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 4 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
(clobber (reg:P LR_REGNO))]
-  "DEFAULT_ABI == ABI_AIX"
+  "DEFAULT_ABI == ABI_AIX && rs6000_speculate_indirect_jumps"
   " 2,%3\;b%T1l\; 2,%4(1)"
   [(set_attr "type" "jmpreg")
(set_attr "length" "12")])
 
+(define_insn "*call_value_indirect_aix_nospec"
+  [(set (match_operand 0 "" "")
+   (call (mem:SI (match_operand:P 1 "register_operand" "c,*l"))
+ (match_operand 2 "" "g,g")))
+   (use (match_operand:P 3 "memory_operand" ","))
+   (set (reg:P TOC_REGNUM) (unspec:P [(match_operand:P 4 "const_int_operand" "n,n")] UNSPEC_TOCSLOT))
+   (clobber (reg:P LR_REGNO))]
+  "DEFAULT_ABI == ABI_AIX && !rs6000_speculate_indirect_jumps"
+  "crset eq\; 2,%3\;beq%T1l-\; 2,%4(1)"
+  [(set_attr "type" "jmpreg")
+   (set_attr "length" "16")])
+
 ;; Call to indirect functions with the ELFv2 ABI.
 ;; Operand0 is the addresss of the function to call
 ;; Operand2 is the offset of the stack location holding the current TOC pointer
Index: gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-1.c
===
--- gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-1.c (revision 256753)
+++ gcc/testsuite/gcc.target/powerpc/safe-indirect-jump-1.c (working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-do compile } */
 /* { dg-additional-options "-mno-speculate-indirect-jumps" } */
 
 /* Test for deliberate misprediction of indirect calls for ELFv2.  */

[PATCH] avoid assuming known string length is constant (PR 83896)

2018-01-16 Thread Martin Sebor


Recent improvements to the strlen pass introduced the assumption
that when the length of a string has been recorded by the pass
the length is necessarily constant.  Bug 83896 shows that this
assumption is not always true, and that GCC fails with an ICE
when it doesn't hold.  To avoid the ICE the attached patch
removes the assumption.

x86_64-linux bootstrap successful, regression test in progress.

Martin
PR tree-optimization/83896 - ice in get_string_len on a call to strlen with non-constant length

gcc/ChangeLog:

	PR tree-optimization/83896
	* tree-ssa-strlen.c (get_string_len): Avoid assuming length is constant.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83896
	* gcc.dg/strlenopt-43.c: New test.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c	(revision 256752)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -2772,7 +2772,9 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
 }
 }
 
-/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
+/* If RHS, either directly or indirectly, refers to a string of constant
+   length, return it.  Otherwise return a negative value.  */
+
 static int
 get_string_len (tree rhs)
 {
@@ -2789,7 +2791,8 @@ get_string_len (tree rhs)
 	  if (idx > 0)
 		{
 		  strinfo *si = get_strinfo (idx);
-		  if (si && si->full_string_p)
+		  if (si && si->full_string_p
+		  && TREE_CODE (si->nonzero_chars) == INTEGER_CST)
 		return tree_to_shwi (si->nonzero_chars);
 		}
 	}
Index: gcc/testsuite/gcc.dg/strlenopt-43.c
===
--- gcc/testsuite/gcc.dg/strlenopt-43.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/strlenopt-43.c	(working copy)
@@ -0,0 +1,13 @@
+/* PR tree-optimization/83896 - ice in get_string_len on a call to strlen
+   with non-constant length
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+extern char a[5];
+extern char b[];
+
+void f (void)
+{
+  if (__builtin_strlen (b) != 4)
+__builtin_memcpy (a, b, sizeof a);
+}
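
The idea behind the fix, checking that a recorded length really is a constant
before converting it, can be sketched outside GCC as follows.  This is an
illustration under stated assumptions: struct recorded_len and
constant_len_or_negative are hypothetical, with the is_constant flag standing
in for the patch's test that si->nonzero_chars is an INTEGER_CST.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a recorded string length: either a known
   constant or something only known symbolically.  */
struct recorded_len
{
  bool is_constant;
  long value;           /* Meaningful only if IS_CONSTANT.  */
};

/* Return the length if it is recorded for the full string and is a
   constant; otherwise return a negative value, as the updated comment
   in the patch describes.  */
static long
constant_len_or_negative (const struct recorded_len *len, bool full_string)
{
  if (len != NULL && full_string && len->is_constant)
    return len->value;
  return -1;
}
```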

Re: [PATCH] avoid assuming known string length is constant (PR 83896)

2018-01-16 Thread Richard Sandiford

Martin Sebor  writes:
> Recent improvements to the strlen pass introduced the assumption
> that when the length of a string has been recorded by the pass
> the length is necessarily constant.  Bug 83896 shows that this
> assumption is not always true, and that GCC fails with an ICE
> when it doesn't hold.  To avoid the ICE the attached patch
> removes the assumption.
>
> x86_64-linux bootstrap successful, regression test in progress.
>
> Martin
>
> PR tree-optimization/83896 - ice in get_string_len on a call to strlen with 
> non-constant length
>
> gcc/ChangeLog:
>
>   PR tree-optimization/83896
>   * tree-ssa-strlen.c (get_string_len): Avoid assuming length is constant.
>
> gcc/testsuite/ChangeLog:
>
>   PR tree-optimization/83896
>   * gcc.dg/strlenopt-43.c: New test.
>
> Index: gcc/tree-ssa-strlen.c
> ===
> --- gcc/tree-ssa-strlen.c (revision 256752)
> +++ gcc/tree-ssa-strlen.c (working copy)
> @@ -2772,7 +2772,9 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
>  }
>  }
>  
> -/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
> +/* If RHS, either directly or indirectly, refers to a string of constant
> +   length, return it.  Otherwise return a negative value.  */
> +
>  static int
>  get_string_len (tree rhs)
>  {

I think this should be returning HOST_WIDE_INT given the unconstrained
tree_to_shwi return.  Same type change for rhslen in the caller.

(Not my call, but it might be better to have a more specific function name,
given that the file already had "get_string_length" before this function
was added.)

> @@ -2789,7 +2791,8 @@ get_string_len (tree rhs)
> if (idx > 0)
>   {
> strinfo *si = get_strinfo (idx);
> -   if (si && si->full_string_p)
> +   if (si && si->full_string_p
> +   && TREE_CODE (si->nonzero_chars) == INTEGER_CST)
>   return tree_to_shwi (si->nonzero_chars);

tree_fits_shwi_p?

Thanks,
Richard

>   }
>   }
> Index: gcc/testsuite/gcc.dg/strlenopt-43.c
> ===
> --- gcc/testsuite/gcc.dg/strlenopt-43.c   (nonexistent)
> +++ gcc/testsuite/gcc.dg/strlenopt-43.c   (working copy)
> @@ -0,0 +1,13 @@
> +/* PR tree-optimization/83896 - ice in get_string_len on a call to strlen
> +   with non-constant length
> +   { dg-do compile }
> +   { dg-options "-O2 -Wall" } */
> +
> +extern char a[5];
> +extern char b[];
> +
> +void f (void)
> +{
> +  if (__builtin_strlen (b) != 4)
> +__builtin_memcpy (a, b, sizeof a);
> +}

[PATCH, rs6000] (v2) Support for gimple folding of mergeh, mergel intrinsics

2018-01-16 Thread Will Schmidt

Hi,
  Add support for gimple folding of the mergeh, mergel intrinsics.  Since the
low and high versions are almost identical, a new helper function is added
so that code can be shared.

The changes introduced here affect the existing target testcases
gcc.target/powerpc/builtins-1-be.c and builtins-1-le.c, such that
a number of the scan-assembler tests would fail due to instruction counts
changing.  Since the purpose of that test is to primarily ensure those
intrinsics are accepted by the compiler, I have disabled gimple-folding for
the existing tests that count instructions, and created new variants of those
tests with folding enabled and a higher optimization level, that do not count
instructions.

V2 updates,
  * thanks for the feedback & hints in how to make these improvements :-)
  * Reworked to merge the xxmrg* instructions into the existing define_insn
  stanzas.
  * Reworked to use the tree-vector-builder.h helpers, eliminating some
  constructor and assign statements.
  * a few more cosmetic touch-ups in nearby define_insns.
  * update target stanza for builtins-1-be-folded.c test.

Sniff-tests of the target tests on a single system look OK.  Full regtests are
currently running across assorted power systems.
OK for trunk, pending successful results?

Thanks,
-Will

[gcc]

2018-01-16  Will Schmidt  

* config/rs6000/rs6000.c: (rs6000_gimple_builtin) Add gimple folding
support for merge[hl].
(fold_mergehl_helper): New helper function.
(tree-vector-builder.h): New #include for tree_vector_builder usage.
* config/rs6000/altivec.md (altivec_vmrghw_direct): Add xxmrghw insn.
(altivec_vmrglw_direct): Add xxmrglw insn.

[testsuite]

2018-01-16  Will Schmidt  

* gcc.target/powerpc/fold-vec-mergehl-char.c: New.
* gcc.target/powerpc/fold-vec-mergehl-double.c: New.
* gcc.target/powerpc/fold-vec-mergehl-float.c: New.
* gcc.target/powerpc/fold-vec-mergehl-int.c: New.
* gcc.target/powerpc/fold-vec-mergehl-longlong.c: New.
* gcc.target/powerpc/fold-vec-mergehl-pixel.c: New.
* gcc.target/powerpc/fold-vec-mergehl-short.c: New.
* gcc.target/powerpc/builtins-1-be.c: Disable gimple-folding.
* gcc.target/powerpc/builtins-1-le.c: Disable gimple-folding.
* gcc.target/powerpc/builtins-1-be-folded.c: New.
* gcc.target/powerpc/builtins-1-le-folded.c: New.
* gcc.target/powerpc/builtins-1.fold.h: New.

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 733d920..bb00583 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -995,12 +995,12 @@
 }
   [(set_attr "type" "vecperm")])
 
 (define_insn "altivec_vmrghb_direct"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
-(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
-   (match_operand:V16QI 2 "register_operand" "v")]
+   (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
+  (match_operand:V16QI 2 "register_operand" "v")]
  UNSPEC_VMRGH_DIRECT))]
   "TARGET_ALTIVEC"
   "vmrghb %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
@@ -1102,16 +1102,18 @@
 return "vmrglw %0,%2,%1";
 }
   [(set_attr "type" "vecperm")])
 
 (define_insn "altivec_vmrghw_direct"
-  [(set (match_operand:V4SI 0 "register_operand" "=v")
-(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
-  (match_operand:V4SI 2 "register_operand" "v")]
- UNSPEC_VMRGH_DIRECT))]
+  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
+   (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v,wa")
+ (match_operand:V4SI 2 "register_operand" "v,wa")]
+UNSPEC_VMRGH_DIRECT))]
   "TARGET_ALTIVEC"
-  "vmrghw %0,%1,%2"
+  "@
+  vmrghw %0,%1,%2
+  xxmrghw %x0,%x1,%x2"
   [(set_attr "type" "vecperm")])
 
 (define_insn "*altivec_vmrghsf"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
 (vec_select:V4SF
@@ -1184,13 +1186,13 @@
 }
   [(set_attr "type" "vecperm")])
 
 (define_insn "altivec_vmrglb_direct"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
-(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
-  (match_operand:V16QI 2 "register_operand" "v")]
-  UNSPEC_VMRGL_DIRECT))]
+   (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
+  (match_operand:V16QI 2 "register_operand" "v")]
+ UNSPEC_VMRGL_DIRECT))]
   "TARGET_ALTIVEC"
   "vmrglb %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
 (define_expand "altivec_vmrglh"
@@ -1242,11 +1244,11 @@
 
 (define_insn "altivec_vmrglh_direct"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
 (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
  (match_operand:V8HI 2 "register_operand" "v")]
-

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-16 Thread Uros Bizjak

On Sun, Jan 14, 2018 at 5:43 PM, Uros Bizjak  wrote:
> On Sun, Jan 14, 2018 at 5:35 PM, H.J. Lu  wrote:
>> On Sun, Jan 14, 2018 at 8:19 AM, Uros Bizjak  wrote:
>>> On Fri, Jan 12, 2018 at 9:01 AM, Uros Bizjak  wrote:
 On Thu, Jan 11, 2018 at 2:28 PM, H.J. Lu  wrote:

> Hi Uros,
>
> Can you take a look at my x86 backend changes so that they are ready
> to check in once we have consensus.

 Please finish the talks about the correct approach first. Once the
 consensus is reached, please post the final version of the patches for
 review.

 BTW: I have no detailed insight in these issues, so I'll look mostly
 at the implementation details, probably early next week.
>>>
>>> One general remark is on the usage of -1 as an invalid register
>>
>> This has been rewritten.  The checked in patch no longer does that.
>
> I'm looking directly into current indirect_thunk_name,
> output_indirect_thunk and output_indirect_thunk_function functions in
> i386.c which have plenty of the mentioned checks.

Improved with attached patch.

2018-01-16  Uros Bizjak  

* config/i386/i386.c (indirect_thunk_name): Declare regno
as unsigned int.  Compare regno with INVALID_REGNUM.
(output_indirect_thunk): Ditto.
(output_indirect_thunk_function): Ditto.
(ix86_code_end): Declare regno as unsigned int.  Use INVALID_REGNUM
in the call to output_indirect_thunk_function.
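
A minimal sketch of the idea behind the change, assuming a sentinel defined as
the all-ones unsigned value (SKETCH_INVALID_REGNUM and both function names are
hypothetical stand-ins, not the i386.c code).  Once regno is unsigned, a test
such as "regno >= 0" is always true, so the "no register" case has to be
detected by comparing against the sentinel instead.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the "no register" sentinel.  */
#define SKETCH_INVALID_REGNUM (~(unsigned int) 0)

/* Old style: a signed register number, with -1 meaning "none".  */
bool
has_regno_signed (int regno)
{
  return regno >= 0;
}

/* New style: an unsigned register number compared against the sentinel.
   Passing -1 here would convert to the same all-ones value.  */
bool
has_regno_unsigned (unsigned int regno)
{
  return regno != SKETCH_INVALID_REGNUM;
}
```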

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ea9c462..7f233d1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10765,16 +10765,16 @@ static int indirect_thunks_bnd_used;
 /* Fills in the label name that should be used for the indirect thunk.  */
 
 static void
-indirect_thunk_name (char name[32], int regno, bool need_bnd_p,
-bool ret_p)
+indirect_thunk_name (char name[32], unsigned int regno,
+bool need_bnd_p, bool ret_p)
 {
-  if (regno >= 0 && ret_p)
+  if (regno != INVALID_REGNUM && ret_p)
 gcc_unreachable ();
 
   if (USE_HIDDEN_LINKONCE)
 {
   const char *bnd = need_bnd_p ? "_bnd" : "";
-  if (regno >= 0)
+  if (regno != INVALID_REGNUM)
{
  const char *reg_prefix;
  if (LEGACY_INT_REGNO_P (regno))
@@ -10792,7 +10792,7 @@ indirect_thunk_name (char name[32], int regno, bool need_bnd_p,
 }
   else
 {
-  if (regno >= 0)
+  if (regno != INVALID_REGNUM)
{
  if (need_bnd_p)
ASM_GENERATE_INTERNAL_LABEL (name, "LITBR", regno);
@@ -10844,7 +10844,7 @@ indirect_thunk_name (char name[32], int regno, bool need_bnd_p,
  */
 
 static void
-output_indirect_thunk (bool need_bnd_p, int regno)
+output_indirect_thunk (bool need_bnd_p, unsigned int regno)
 {
   char indirectlabel1[32];
   char indirectlabel2[32];
@@ -10874,7 +10874,7 @@ output_indirect_thunk (bool need_bnd_p, int regno)
 
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel2);
 
-  if (regno >= 0)
+  if (regno != INVALID_REGNUM)
 {
   /* MOV.  */
   rtx xops[2];
@@ -10898,12 +10898,12 @@ output_indirect_thunk (bool need_bnd_p, int regno)
 }
 
 /* Output a funtion with a call and return thunk for indirect branch.
-   If BND_P is true, the BND prefix is needed.   If REGNO != -1,  the
-   function address is in REGNO.  Otherwise, the function address is
+   If BND_P is true, the BND prefix is needed.  If REGNO != INVALID_REGNUM,
+   the function address is in REGNO.  Otherwise, the function address is
on the top of stack.  */
 
 static void
-output_indirect_thunk_function (bool need_bnd_p, int regno)
+output_indirect_thunk_function (bool need_bnd_p, unsigned int regno)
 {
   char name[32];
   tree decl;
@@ -10952,7 +10952,7 @@ output_indirect_thunk_function (bool need_bnd_p, int regno)
ASM_OUTPUT_LABEL (asm_out_file, name);
   }
 
-  if (regno < 0)
+  if (regno == INVALID_REGNUM)
 {
   /* Create alias for __x86.return_thunk/__x86.return_thunk_bnd.  */
   char alias[32];
@@ -11026,16 +11026,16 @@ static void
 ix86_code_end (void)
 {
   rtx xops[2];
-  int regno;
+  unsigned int regno;
 
   if (indirect_thunk_needed)
-output_indirect_thunk_function (false, -1);
+output_indirect_thunk_function (false, INVALID_REGNUM);
   if (indirect_thunk_bnd_needed)
-output_indirect_thunk_function (true, -1);
+output_indirect_thunk_function (true, INVALID_REGNUM);
 
   for (regno = FIRST_REX_INT_REG; regno <= LAST_REX_INT_REG; regno++)
 {
-  int i = regno - FIRST_REX_INT_REG + LAST_INT_REG + 1;
+  unsigned int i = regno - FIRST_REX_INT_REG + LAST_INT_REG + 1;
   if ((indirect_thunks_used & (1 << i)))
output_indirect_thunk_function (false, regno);

[PATCH, rs6000] Bug fixes for the Power 9 stxvl and lxvl instructions.

2018-01-16 Thread Carl Love

GCC maintainers:

The following patch contains fixes for the stxvl and lxvl instructions
and XL_LEN_R builtin that were found while adding additional Power 9
test cases for the various load and store builtins.  The new tests in
builtins-5-p9-runnable.c and builtins-6-p9-runnable.c that exposed the
bugs are included.

The test cases have been run and verified by hand on Power 9 without
error.  The full regressions on Power 8 LE, Power 8 BE and Power 9 are
currently running.

Please let me know if the patch is acceptable provided the regression
testing completes cleanly.  Thanks.

 Carl Love


---


gcc/ChangeLog:

2018-01-16 Carl Love  
* config/rs6000/vsx.md (define_expand xl_len_r,
define_expand stxvl, define_expand *stxvl): Add match_dup
argument.

gcc/testsuite/ChangeLog:

2018-01-16  Carl Love  
* gcc.target/powerpc/builtins-6-p9-runnable.c: Add additional tests.
Add debug print statements.
* gcc.target/powerpc/builtins-5-p9-runnable.c: Add test to do
16 byte vector load followed by a partial vector load.
---
 gcc/config/rs6000/rs6000-builtin.def   |4 +-
 gcc/config/rs6000/vsx.md   |   26 +-
 .../gcc.target/powerpc/builtins-5-p9-runnable.c|  150 +-
 .../gcc.target/powerpc/builtins-6-p9-runnable.c| 1759 
 4 files changed, 1214 insertions(+), 725 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 0f7da6a4a..b17036c5a 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2197,8 +2197,8 @@ BU_P9V_OVERLOAD_2 (VIEDP, "insert_exp_dp")
 BU_P9V_OVERLOAD_2 (VIESP,  "insert_exp_sp")
 
 /* 2 argument vector functions added in ISA 3.0 (power9).  */
-BU_P9V_64BIT_VSX_2 (LXVL,  "lxvl", CONST,  lxvl)
-BU_P9V_64BIT_VSX_2 (XL_LEN_R,  "xl_len_r", CONST,  xl_len_r)
+BU_P9V_64BIT_VSX_2 (LXVL,  "lxvl", PURE,   lxvl)
+BU_P9V_64BIT_VSX_2 (XL_LEN_R,  "xl_len_r", PURE,   xl_len_r)
 
 BU_P9V_AV_2 (VEXTUBLX, "vextublx", CONST,  vextublx)
 BU_P9V_AV_2 (VEXTUBRX, "vextubrx", CONST,  vextubrx)
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0323e866f..03f8ec2d6 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4624,10 +4624,12 @@ (define_expand "first_mismatch_or_eos_index_"
 ;; Load VSX Vector with Length
 (define_expand "lxvl"
   [(set (match_dup 3)
-(match_operand:DI 2 "register_operand"))
+(ashift:DI (match_operand:DI 2 "register_operand")
+   (const_int 56)))
(set (match_operand:V16QI 0 "vsx_register_operand")
(unspec:V16QI
 [(match_operand:DI 1 "gpc_reg_operand")
+  (mem:V16QI (match_dup 1))
  (match_dup 3)]
 UNSPEC_LXVL))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
@@ -4639,16 +4641,17 @@ (define_insn "*lxvl"
   [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
(unspec:V16QI
 [(match_operand:DI 1 "gpc_reg_operand" "b")
- (match_operand:DI 2 "register_operand" "+r")]
+ (mem:V16QI (match_dup 1))
+ (match_operand:DI 2 "register_operand" "r")]
 UNSPEC_LXVL))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
-  "sldi %2,%2, 56\; lxvl %x0,%1,%2"
-  [(set_attr "length" "8")
-   (set_attr "type" "vecload")])
+  "lxvl %x0,%1,%2"
+  [(set_attr "type" "vecload")])
 
 (define_insn "lxvll"
   [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
(unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b")
+   (mem:V16QI (match_dup 1))
   (match_operand:DI 2 "register_operand" "r")]
  UNSPEC_LXVLL))]
   "TARGET_P9_VECTOR"
@@ -4677,6 +4680,7 @@ (define_expand "xl_len_r"
 (define_insn "stxvll"
   [(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand" "b"))
(unspec:V16QI [(match_operand:V16QI 0 "vsx_register_operand" "wa")
+  (mem:V16QI (match_dup 1))
   (match_operand:DI 2 "register_operand" "r")]
  UNSPEC_STXVLL))]
   "TARGET_P9_VECTOR"
@@ -4686,10 +4690,12 @@ (define_insn "stxvll"
 ;; Store VSX Vector with Length
 (define_expand "stxvl"
   [(set (match_dup 3)
-   (match_operand:DI 2 "register_operand"))
+   (ashift:DI (match_operand:DI 2 "register_operand")
+  (const_int 56)))
(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand"))
(unspec:V16QI
 [(match_operand:V16QI 0 "vsx_register_operand")
+ (mem:V16QI (match_dup 1))
  (match_dup 3)]
 UNSPEC_STXVL))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
@@ -4701,12 +4707,12 @@ (define_insn "*stxvl"
   [(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand" "b"))
(unspec:V16QI
 [(match_operand:V16QI 0 "vsx_register_operand" "wa")
- (match_

Re: [PATCH] i386: More use reference of struct ix86_frame to avoid copy

2018-01-16 Thread H.J. Lu

On Tue, Jan 16, 2018 at 8:09 AM, H.J. Lu  wrote:
> On Tue, Jan 16, 2018 at 7:03 AM, Martin Liška  wrote:
>> On 01/16/2018 01:35 PM, H.J. Lu wrote:
>>> On Tue, Jan 16, 2018 at 3:40 AM, H.J. Lu  wrote:
 This patch has been used with my Spectre backport for GCC 7 for many
 weeks and has been checked into GCC 7 branch.  Should I revert it on
 GCC 7 branch or check it into trunk?
>>>
>>> Ada build failed with this on trunk:
>>>
>>> raised STORAGE_ERROR : stack overflow or erroneous memory access
>>> make[5]: *** 
>>> [/export/gnu/import/git/sources/gcc/gcc/ada/Make-generated.in:45:
>>> ada/sinfo.h] Error 1
>>
>> Hello.
>>
>> I know that you've already reverted the change, but it's possible to replace
>> struct ix86_frame &frame = cfun->machine->frame;
>>
>> with:
>> struct ix86_frame *frame = &cfun->machine->frame;
>>
>> And replace usages with point access operator (->). That would also avoid 
>> copying.
>
> Won't it be equivalent to reference?
>
>> One other question. After you switched to references, isn't the behavior
>> of function ix86_expand_epilogue changed, as it also contains a write to
>> the frame struct like:
>>
>>  14799/* Special care must be taken for the normal return case of a 
>> function
>>  14800   using eh_return: the eax and edx registers are marked as saved, 
>> but
>>  14801   not restored along this path.  Adjust the save location to 
>> match.  */
>>  14802if (crtl->calls_eh_return && style != 2)
>>  14803  frame.reg_save_offset -= 2 * UNITS_PER_WORD;
>
> That could be the issue.  I will double check it.
>

Reverting the ix86_expand_epilogue change fixes the Ada build.  I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83905
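
A minimal C sketch of the copy-versus-alias difference discussed above, under
the assumption (suggested in the thread but not confirmed there) that the
write to reg_save_offset is what matters.  The struct and function names are
hypothetical; C has no references, so a pointer stands in for the aliasing
case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the frame layout record.  */
struct frame_sketch
{
  long reg_save_offset;
};

/* Works on a local copy: the shared record is left untouched.  */
long
adjust_on_copy (const struct frame_sketch *shared, long delta)
{
  assert (shared != NULL);
  struct frame_sketch frame = *shared;
  frame.reg_save_offset -= delta;
  return frame.reg_save_offset;
}

/* Works through an alias: the shared record itself is modified, so a
   later reader of *SHARED sees the adjusted offset.  */
long
adjust_through_alias (struct frame_sketch *shared, long delta)
{
  assert (shared != NULL);
  struct frame_sketch *frame = shared;
  frame->reg_save_offset -= delta;
  return frame->reg_save_offset;
}
```

Both functions return the same adjusted value; they differ only in whether
the adjustment remains visible in the shared record afterwards.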


-- 
H.J.

Re: VIEW_CONVERT_EXPR slots for strict-align targets (PR 83884)

2018-01-16 Thread Eric Botcazou

> I'd assumed that variable-length types couldn't occur here, since it
> seems strange to view-convert a variable-length type to a fixed-length
> one.

This happens all the time in Ada when you convert an unconstrained type into 
one of its constrained subtypes (but the run-time sizes must match).

> But is this VIEW_CONVERT_EXPR really valid?  Maybe this is just
> papering over a deeper issue.  There again, the MAX in the old
> code was presumably there because the sizes can be different...

The problem is that Ada exposes VIEW_CONVERT_EXPR to the user and the user can 
do very weird things with it so you need to be prepared for the worst.

> 2018-01-16  Richard Sandiford  
> 
> gcc/
>   PR middle-end/83884
>   * expr.c (expand_expr_real_1): Use the size of GET_MODE (op0)
>   rather than the size of inner_type to determine the stack slot size
>   when handling VIEW_CONVERT_EXPRs on strict-alignment targets.

This looks good to me, thanks for fixing the problem.  Unexpectedly enough, I 
don't see the failures on SPARC (32-bit or 64-bit).

-- 
Eric Botcazou

[C++ Patch] PR 81054 ("[7/8 Regression] ICE with volatile variable in constexpr function") [Take 2]

2018-01-16 Thread Paolo Carlini


Hi again,

thus I figured out what was badly wrong in my first try: I misread 
ensure_literal_type_for_constexpr_object and missed that it can return 
NULL_TREE without emitting a hard error. Thus my first try even caused 
miscompilations :( Anyway, when DECL_DECLARED_CONSTEXPR_P is true we are 
safe and indeed we want to clear it as a matter of error recovery. Then, 
in this safe case the only change in the below is returning early, thus 
avoiding any internal inconsistencies later and also the redundant / 
misleading diagnostic which I already mentioned. Testing on x86_64-linux 
is still in progress - in libstdc++ - but I separately checked that all 
the regressions are gone.


Thanks! Paolo.

/cp
2018-01-16  Paolo Carlini  

PR c++/81054
* decl.c (cp_finish_decl): Early return when
ensure_literal_type_for_constexpr_object returns NULL_TREE 
and DECL_DECLARED_CONSTEXPR_P is true.

/testsuite
2018-01-16  Paolo Carlini  

PR c++/81054
* g++.dg/cpp0x/constexpr-ice19.C: New.
Index: cp/decl.c
===
--- cp/decl.c   (revision 256753)
+++ cp/decl.c   (working copy)
@@ -6810,8 +6810,12 @@ cp_finish_decl (tree decl, tree init, bool init_co
   cp_apply_type_quals_to_decl (cp_type_quals (type), decl);
 }
 
-  if (!ensure_literal_type_for_constexpr_object (decl))
-DECL_DECLARED_CONSTEXPR_P (decl) = 0;
+  if (!ensure_literal_type_for_constexpr_object (decl)
+  && DECL_DECLARED_CONSTEXPR_P (decl))
+{
+  DECL_DECLARED_CONSTEXPR_P (decl) = 0;
+  return;
+}
 
   if (VAR_P (decl)
   && DECL_CLASS_SCOPE_P (decl)
Index: testsuite/g++.dg/cpp0x/constexpr-ice19.C
===
--- testsuite/g++.dg/cpp0x/constexpr-ice19.C(nonexistent)
+++ testsuite/g++.dg/cpp0x/constexpr-ice19.C(working copy)
@@ -0,0 +1,13 @@
+// PR c++/81054
+// { dg-do compile { target c++11 } }
+
+struct A
+{
+  volatile int i;
+  constexpr A() : i() {}
+};
+
+struct B
+{
+  static constexpr A a {};  // { dg-error "not literal" }
+};

[PATCH] make -Wrestrict for strcat more meaningful (PR 83698)

2018-01-16 Thread Martin Sebor


PR 83698 - bogus offset in -Wrestrict messages for strcat of
unknown strings, points out that the offsets printed by
-Wrestrict for possibly overlapping strcat calls with unknown
strings don't look meaningful in some cases.  The root cause
of the bogus values is wrapping during the conversion from
offset_int in which the pass tracks numerical values to
HOST_WIDE_INT for printing.  (The problem will go away once
GCC's pretty-printer supports wide int formatting.)  For
instance, the following:

  extern char *d;
  strcat (d + 3, d + 5);

results in

  warning: ‘strcat’ accessing 0 or more bytes at offsets 3 and 5 may 
overlap 1 byte at offset [3, -9223372036854775806]


which, besides printing the bogus negative offset on LP64
targets, isn't correct because strcat always accesses at least
one byte (the nul) and there can be no overlap at offset 3.
To be more accurate, the warning should say something like:

  warning: ‘strcat’ accessing 3 or more bytes at offsets 3 and 5 may 
overlap 1 byte at offset 5 [-Wrestrict]


because the function must access at least 3 bytes in order to
cause an overlap, and when it does, the overlap starts at the
higher of the two offsets, i.e., 5.  (Though it's virtually
impossible to have a single sentence and a single set of
numbers cover all the cases with perfect accuracy.)
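
The negative offset quoted above is consistent with a sum slightly larger
than the largest signed 64-bit value wrapping around when narrowed.  The
following is only a sketch of that effect and of clamping before the
conversion; the function names are hypothetical and this is not the pass's
code.  It also assumes the wrap-around behaviour GCC documents for
converting an out-of-range unsigned value to a signed type.

```c
#include <assert.h>
#include <stdint.h>

/* Add two values with wrap-around and narrow the result to a signed
   64-bit integer, mimicking an unchecked conversion for printing.  */
int64_t
wrapped_sum_sketch (int64_t a, int64_t b)
{
  return (int64_t) ((uint64_t) a + (uint64_t) b);
}

/* Add two non-negative values, clamping the result to MAX so that it
   always stays representable.  */
int64_t
saturating_sum_sketch (int64_t a, int64_t b, int64_t max)
{
  assert (a >= 0 && b >= 0 && b <= max);
  if (a > max - b)
    return max;
  return a + b;
}
```

With the largest signed 64-bit value and an offset of 3, the wrapping sum
is -9223372036854775806, the number in the warning above, while the clamped
sum stays at the maximum.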

The attached patch fixes these issues to make the printed values
make more sense.  (It doesn't affect when diagnostics are printed.)

Although this isn't strictly a regression, it has an impact on
the readability of the warnings.  If left unchanged, the original
messages are likely to confuse users and lead to bug reports.

Martin
PR tree-optimization/83698 - bogus offset in -Wrestrict messages for strcat of unknown strings

gcc/ChangeLog:

	PR tree-optimization/83698
	* gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): For
	arrays constrain the offset range to their bounds.
	(builtin_access::strcat_overlap): Adjust the bounds of overlap offset.
	(builtin_access::overlap): Avoid setting the size of overlap if it's
	already been set.
	(maybe_diag_overlap): Also consider arrays when deciding what values
	of offsets to include in diagnostics.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83698
	* gcc.dg/Wrestrict-7.c: New test.
	* c-c++-common/Wrestrict.c: Adjust expected values for strcat.
	* gcc.target/i386/chkp-stropt-17.c: Same.

Index: gcc/gimple-ssa-warn-restrict.c
===
--- gcc/gimple-ssa-warn-restrict.c	(revision 256752)
+++ gcc/gimple-ssa-warn-restrict.c	(working copy)
@@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
 	  base = SSA_NAME_VAR (base);
   }
 
+  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
+{
+  if (offrange[0] < 0 && offrange[1] > 0)
+	offrange[0] = 0;
+}
+
   if (size)
 {
   tree range[2];
@@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
 return false;
 
   /* When strcat overlap is certain it is always a single byte:
- the terminatinn NUL, regardless of offsets and sizes.  When
+ the terminating NUL, regardless of offsets and sizes.  When
  overlap is only possible its range is [0, 1].  */
   acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
   acs.ovlsiz[1] = 1;
-  acs.ovloff[0] = (dstref->sizrange[0] + dstref->offrange[0]).to_shwi ();
-  acs.ovloff[1] = (dstref->sizrange[1] + dstref->offrange[1]).to_shwi ();
 
-  acs.sizrange[0] = wi::smax (acs.dstsiz[0], srcref->sizrange[0]).to_shwi ();
+  offset_int endoff = dstref->offrange[0] + dstref->sizrange[0];
+  if (endoff <= srcref->offrange[0])
+acs.ovloff[0] = wi::smin (maxobjsize, srcref->offrange[0]).to_shwi ();
+  else
+acs.ovloff[0] = wi::smin (maxobjsize, endoff).to_shwi ();
+
+  acs.sizrange[0] = wi::smax (wi::abs (endoff - srcref->offrange[0]) + 1,
+			  srcref->sizrange[0]).to_shwi ();
+  if (dstref->offrange[0] == dstref->offrange[1])
+{
+  if (srcref->offrange[0] == srcref->offrange[1])
+	acs.ovloff[1] = acs.ovloff[0];
+  else
+	acs.ovloff[1]
+	  = wi::smin (maxobjsize,
+		  srcref->offrange[1] + srcref->sizrange[1]).to_shwi ();
+}
+  else
+acs.ovloff[1]
+  = wi::smin (maxobjsize,
+		  dstref->offrange[1] + dstref->sizrange[1]).to_shwi ();
+
+  if (acs.sizrange[0] == 0)
+acs.sizrange[0] = 1;
   acs.sizrange[1] = wi::smax (acs.dstsiz[1], srcref->sizrange[1]).to_shwi ();
   return true;
 }
@@ -1224,8 +1251,12 @@ builtin_access::overlap ()
   /* Call the appropriate function to determine the overlap.  */
   if ((this->*detect_overlap) ())
 {
-  sizrange[0] = wi::smax (acs.dstsiz[0], srcref->sizrange[0]).to_shwi ();
-  sizrange[1] = wi::smax (acs.dstsiz[1], srcref->sizrange[1]).to_shwi ();
+  if (!sizrange[1])
+	{
+	  /* Unless the access size range has already been set, do so here.  */
+	  sizrange[0] = wi::smax (acs.dstsiz[0], srcref->sizrange[0]).to_shwi ();
+	  sizrange[1] = wi::smax

[libstdc++] Fix 17_intro/names.cc on SPARC/Linux

2018-01-16 Thread Eric Botcazou

The SPARC-V8 architecture contains a Y register so  defines 
a structure with a 'y' field on Linux.

Tested on SPARC64/Linux, applied on the mainline and 7 branch as obvious.


2018-01-16  Eric Botcazou  

* testsuite/17_intro/names.cc: Undefine 'y' on SPARC/Linux.

-- 
Eric Botcazou
Index: testsuite/17_intro/names.cc
===
--- testsuite/17_intro/names.cc	(revision 256562)
+++ testsuite/17_intro/names.cc	(working copy)
@@ -112,4 +112,8 @@
 #undef r
 #endif
 
+#if defined (__linux__) && defined (__sparc__)
+#undef y
+#endif
+
 #include

Fix PR testsuite/77734 on SPARC

2018-01-16 Thread Eric Botcazou

We need to enable delayed-branch scheduling to have sibling calls on SPARC.

Tested on SPARC64/Linux, applied on the mainline and 7 branch.


2018-01-16  Eric Botcazou  

PR testsuite/77734
* gcc.dg/plugin/must-tail-call-1.c: Pass -fdelayed-branch on SPARC.

-- 
Eric Botcazou
Index: gcc.dg/plugin/must-tail-call-1.c
===
--- gcc.dg/plugin/must-tail-call-1.c	(revision 256562)
+++ gcc.dg/plugin/must-tail-call-1.c	(working copy)
@@ -1,3 +1,5 @@
+/* { dg-options "-fdelayed-branch" { target sparc*-*-* } } */
+
 extern void abort (void);
 
 int __attribute__((noinline,noclone))

[visium] Very minor tweak

2018-01-16 Thread Eric Botcazou

Tested on visium-elf, applied on the mainline.


2018-01-16  Eric Botcazou  

* config/visium/visium.md (nop): Tweak comment.
(hazard_nop): Likewise.

-- 
Eric Botcazou
Index: config/visium/visium.md
===
--- config/visium/visium.md	(revision 256562)
+++ config/visium/visium.md	(working copy)
@@ -2962,13 +2962,13 @@ (define_insn "dsi"
 (define_insn "nop"
   [(const_int 0)]
   ""
-  "nop			;generated nop"
+  "nop			;generated"
   [(set_attr "type" "nop")])
 
 (define_insn "hazard_nop"
   [(unspec_volatile [(const_int 0)] UNSPEC_NOP)]
   ""
-  "nop			;hazard avoidance nop"
+  "nop			;hazard avoidance"
   [(set_attr "type" "nop")])
 
 (define_insn "blockage"

[testsuite] Skip loop tests on Visium

2018-01-16 Thread Eric Botcazou

They either use too much space in the data segment or on the stack.

Tested on visium-elf, applied on the mainline.


2018-01-16  Eric Botcazou  

* gcc.dg/tree-ssa/ldist-27.c: Skip on Visium.
* gcc.dg/tree-ssa/loop-interchange-1.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-1b.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-2.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-3.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-4.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-5.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-6.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-7.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-8.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-9.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-10.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-11.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-14.c: Likewise.
* gcc.dg/tree-ssa/loop-interchange-15.c: Likewise.

-- 
Eric Botcazou
Index: gcc.dg/tree-ssa/ldist-27.c
===
--- gcc.dg/tree-ssa/ldist-27.c	(revision 256562)
+++ gcc.dg/tree-ssa/ldist-27.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+/* { dg-skip-if "too big data segment" { visium-*-* } } */
 
 #define M (300)
 #define N (200)
Index: gcc.dg/tree-ssa/loop-interchange-1.c
===
--- gcc.dg/tree-ssa/loop-interchange-1.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-1.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -floop-interchange -fassociative-math -fno-signed-zeros -fno-trapping-math -fdump-tree-linterchange-details" } */
+/* { dg-skip-if "too big data segment" { visium-*-* } } */
 
 /* Copied from graphite/interchange-4.c */
 
Index: gcc.dg/tree-ssa/loop-interchange-10.c
===
--- gcc.dg/tree-ssa/loop-interchange-10.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-10.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */
+/* { dg-skip-if "too big data segment" { visium-*-* } } */
 
 #define M 256
 int a[M][M], b[M][M];
Index: gcc.dg/tree-ssa/loop-interchange-11.c
===
--- gcc.dg/tree-ssa/loop-interchange-11.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-11.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */
+/* { dg-skip-if "too big data segment" { visium-*-* } } */
 
 #define M 256
 int a[M][M], b[M][M];
Index: gcc.dg/tree-ssa/loop-interchange-14.c
===
--- gcc.dg/tree-ssa/loop-interchange-14.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-14.c	(working copy)
@@ -1,6 +1,7 @@
 /* PR tree-optimization/83337 */
 /* { dg-do run { target int32plus } } */
 /* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */
+/* { dg-skip-if "too big data segment" { visium-*-* } } */
 
 /* Copied from graphite/interchange-5.c */
 
Index: gcc.dg/tree-ssa/loop-interchange-15.c
===
--- gcc.dg/tree-ssa/loop-interchange-15.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-15.c	(working copy)
@@ -2,6 +2,7 @@
 /* { dg-do run { target int32plus } } */
 /* { dg-options "-O2 -floop-interchange" } */
 /* { dg-require-effective-target alloca }  */
+/* { dg-skip-if "too big stack" { visium-*-* } } */
 
 /* Copied from graphite/interchange-5.c */
 
Index: gcc.dg/tree-ssa/loop-interchange-1b.c
===
--- gcc.dg/tree-ssa/loop-interchange-1b.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-1b.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */
+/* { dg-skip-if "too big data segment" { visium-*-* } } */
 
 /* Copied from graphite/interchange-4.c */
 
Index: gcc.dg/tree-ssa/loop-interchange-2.c
===
--- gcc.dg/tree-ssa/loop-interchange-2.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-2.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */
+/* { dg-skip-if "too big data segment" { visium-*-* } } */
 
 /* Copied from graphite/interchange-5.c */
 
Index: gcc.dg/tree-ssa/loop-interchange-3.c
===
--- gcc.dg/tree-ssa/loop-interchange-3.c	(revision 256562)
+++ gcc.dg/tree-ssa/loop-interchange-3.c	(working copy)
@@ -1,5 +1,6 @

[testsuite] Tweak patchable function tests

2018-01-16 Thread Eric Botcazou

On Visium, the compiler sometimes emits a NOP to avoid a pipeline hazard.

Tested on visium-elf and x86_64-suse-linux, applied on the mainline.


2018-01-16  Eric Botcazou  

* c-c++-common/patchable_function_entry-decl.c: Use 3 NOPs on Visium.
* c-c++-common/patchable_function_entry-default.c: Use 4 NOPs on Visium.
* c-c++-common/patchable_function_entry-definition.c: Use 2 NOPs on 
Visium.

-- 
Eric Botcazou
Index: c-c++-common/patchable_function_entry-decl.c
===
--- c-c++-common/patchable_function_entry-decl.c	(revision 256562)
+++ c-c++-common/patchable_function_entry-decl.c	(working copy)
@@ -1,7 +1,8 @@
 /* { dg-do compile { target { ! nvptx*-*-* } } } */
 /* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
-/* { dg-final { scan-assembler-times "nop" 2 { target { ! alpha*-*-* } } } } */
+/* { dg-final { scan-assembler-times "nop" 2 { target { ! { alpha*-*-* visium-*-* } } } } } */
 /* { dg-final { scan-assembler-times "bis" 2 { target alpha*-*-* } } } */
+/* { dg-final { scan-assembler-times "nop" 3 { target visium-*-* } } } */
 
 extern int a;
 
Index: c-c++-common/patchable_function_entry-default.c
===
--- c-c++-common/patchable_function_entry-default.c	(revision 256562)
+++ c-c++-common/patchable_function_entry-default.c	(working copy)
@@ -1,7 +1,8 @@
 /* { dg-do compile { target { ! nvptx*-*-* } } } */
 /* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
-/* { dg-final { scan-assembler-times "nop" 3 { target { ! alpha*-*-* } } } } */
+/* { dg-final { scan-assembler-times "nop" 3 { target { ! { alpha*-*-* visium-*-* } } } } } */
 /* { dg-final { scan-assembler-times "bis" 3 { target alpha*-*-* } } } */
+/* { dg-final { scan-assembler-times "nop" 4 { target visium-*-* } } } */
 
 extern int a;
 
Index: c-c++-common/patchable_function_entry-definition.c
===
--- c-c++-common/patchable_function_entry-definition.c	(revision 256562)
+++ c-c++-common/patchable_function_entry-definition.c	(working copy)
@@ -1,7 +1,8 @@
 /* { dg-do compile { target { ! nvptx*-*-* } } } */
 /* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
-/* { dg-final { scan-assembler-times "nop" 1 { target { ! alpha*-*-* } } } } */
+/* { dg-final { scan-assembler-times "nop" 1 { target { ! { alpha*-*-* visium-*-* } } } } } */
 /* { dg-final { scan-assembler-times "bis" 1 { target alpha*-*-* } } } */
+/* { dg-final { scan-assembler-times "nop" 2 { target visium-*-* } } } */
 
 extern int a;

Re: [PATCH] avoid assuming known string length is constant (PR 83896)

2018-01-16 Thread Jakub Jelinek

On Tue, Jan 16, 2018 at 07:37:30PM +, Richard Sandiford wrote:
> > -/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
> > +/* If RHS, either directly or indirectly, refers to a string of constant
> > +   length, return it.  Otherwise return a negative value.  */
> > +
> >  static int
> >  get_string_len (tree rhs)
> >  {
> 
> I think this should be returning HOST_WIDE_INT given the unconstrained
> tree_to_shwi return.  Same type change for rhslen in the caller.
> 
> (Not my call, but it might be better to have a more specific function name,
> given that the file already had "get_string_length" before this function
> was added.)

Yeah, certainly for both.

> > @@ -2789,7 +2791,8 @@ get_string_len (tree rhs)
> >   if (idx > 0)
> > {
> >   strinfo *si = get_strinfo (idx);
> > - if (si && si->full_string_p)
> > + if (si && si->full_string_p
> > + && TREE_CODE (si->nonzero_chars) == INTEGER_CST)
> > return tree_to_shwi (si->nonzero_chars);
> 
> tree_fits_shwi_p?

Surely that instead of TREE_CODE check, but even that will not make sure it
fits into host int, so yes, it should be HOST_WIDE_INT and the code should
make sure it is also >= 0.

Jakub
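
As an illustration of the point above (not code from the patch), the following minimal C sketch shows the kind of range check being asked for before a wide length is narrowed to a host int. The type name wide_len_t and the helper len_fits_int are hypothetical stand-ins; int64_t only models HOST_WIDE_INT and is not GCC's definition.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HOST_WIDE_INT; an assumption for this sketch.  */
typedef int64_t wide_len_t;

/* Store LEN in *OUT and return true only if LEN is non-negative and
   representable as a host int.  Otherwise leave *OUT untouched and
   return false.  */
static bool
len_fits_int (wide_len_t len, int *out)
{
  if (len < 0 || len > INT_MAX)
    return false;
  *out = (int) len;
  return true;
}
```

A caller would test the return value before using the narrowed length, which mirrors the "fits and is >= 0" requirement discussed in the review.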

Re: [PATCH] make -Wrestrict for strcat more meaningful (PR 83698)

2018-01-16 Thread Jakub Jelinek

On Tue, Jan 16, 2018 at 01:36:26PM -0700, Martin Sebor wrote:
> --- gcc/gimple-ssa-warn-restrict.c(revision 256752)
> +++ gcc/gimple-ssa-warn-restrict.c(working copy)
> @@ -384,6 +384,12 @@ builtin_memref::builtin_memref (tree expr, tree si
> base = SSA_NAME_VAR (base);
>}
>  
> +  if (DECL_P (base) && TREE_CODE (TREE_TYPE (base)) == ARRAY_TYPE)
> +{
> +  if (offrange[0] < 0 && offrange[1] > 0)
> + offrange[0] = 0;
> +}

Why the 2 nested ifs?
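
For illustration only, here is a self-contained C sketch of the same clamping logic written with a single condition instead of two nested ifs. The struct off_range and the flag base_is_array are hypothetical simplifications of the offrange array and the DECL_P/ARRAY_TYPE test in the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical model of an offset range [lo, hi].  */
struct off_range
{
  long lo;
  long hi;
};

/* If the base object is an array and the range straddles zero, raise the
   lower bound to zero, since a negative offset into an array is invalid.  */
static void
clamp_array_offsets (struct off_range *r, bool base_is_array)
{
  if (base_is_array && r->lo < 0 && r->hi > 0)
    r->lo = 0;
}
```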

> @@ -1079,14 +1085,35 @@ builtin_access::strcat_overlap ()
>  return false;
>  
>/* When strcat overlap is certain it is always a single byte:
> - the terminatinn NUL, regardless of offsets and sizes.  When
> + the terminating NUL, regardless of offsets and sizes.  When
>   overlap is only possible its range is [0, 1].  */
>acs.ovlsiz[0] = dstref->sizrange[0] == dstref->sizrange[1] ? 1 : 0;
>acs.ovlsiz[1] = 1;
> -  acs.ovloff[0] = (dstref->sizrange[0] + dstref->offrange[0]).to_shwi ();
> -  acs.ovloff[1] = (dstref->sizrange[1] + dstref->offrange[1]).to_shwi ();

You use to_shwi many times in the patch; do the callers or something earlier
in this function guarantee that you aren't throwing away any bits?  (Unlike
tree_to_shwi, the to_shwi method doesn't ICE, it just throws away upper bits.)
Especially when you perform additions like here, even if both wide_ints fit
into a shwi, the result might not.

Jakub
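
The concern about additions can be sketched in plain C, assuming int64_t as a model for a signed host-wide integer. The helper checked_add is hypothetical and is not a GCC or wide_int API; it only shows that two operands which each fit can still produce a sum that does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Add A and B.  Return true and store the sum in *SUM if it is
   representable in int64_t; return false without touching *SUM if the
   addition would overflow.  */
static bool
checked_add (int64_t a, int64_t b, int64_t *sum)
{
  if ((b > 0 && a > INT64_MAX - b)
      || (b < 0 && a < INT64_MIN - b))
    return false;
  *sum = a + b;
  return true;
}
```

With a check of this shape the caller can fall back to a conservative answer instead of silently using a truncated offset.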

Re: [C++ Patch] PR 81054 ("[7/8 Regression] ICE with volatile variable in constexpr function") [Take 2]

2018-01-16 Thread Jason Merrill

On Tue, Jan 16, 2018 at 3:32 PM, Paolo Carlini  wrote:
> thus I figured out what was badly wrong in my first try: I misread
> ensure_literal_type_for_constexpr_object and missed that it can return
> NULL_TREE without emitting a hard error. Thus my first try even caused
> miscompilations :( Anyway, when DECL_DECLARED_CONSTEXPR_P is true we are
> safe and indeed we want to clear it as a matter of error recovery. Then, in
> this safe case the only change in the below is returning early, thus
> avoiding any internal inconsistencies later and also the redundant /
> misleading diagnostic which I already mentioned.

I can't see how this could be right.  In the cases where we don't give
an error (e.g. because we're dealing with an instantiation of a
variable template) there is no error, so we need to proceed with the
rest of cp_finish_decl as normal.

Jason

Re: [C++ Patch] PR 81054 ("[7/8 Regression] ICE with volatile variable in constexpr function") [Take 2]

2018-01-16 Thread Paolo Carlini


Hi Jason

On 16/01/2018 22:35, Jason Merrill wrote:

> On Tue, Jan 16, 2018 at 3:32 PM, Paolo Carlini  wrote:
> > thus I figured out what was badly wrong in my first try: I misread
> > ensure_literal_type_for_constexpr_object and missed that it can return
> > NULL_TREE without emitting a hard error. Thus my first try even caused
> > miscompilations :( Anyway, when DECL_DECLARED_CONSTEXPR_P is true we are
> > safe and indeed we want to clear it as a matter of error recovery. Then, in
> > this safe case the only change in the below is returning early, thus
> > avoiding any internal inconsistencies later and also the redundant /
> > misleading diagnostic which I already mentioned.
>
> I can't see how this could be right.  In the cases where we don't give
> an error (e.g. because we're dealing with an instantiation of a
> variable template) there is no error, so we need to proceed with the
> rest of cp_finish_decl as normal.

The cases where we don't give an error all fall under
DECL_DECLARED_CONSTEXPR_P == false, thus aren't affected at all.


Unless I'm again misreading ensure_literal_type_for_constexpr_object, I 
hope not.


Paolo.

One more patch for PR80481

2018-01-16 Thread Vladimir Makarov


The patch changes the test to exclude solaris for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80481

The first patch solved the problem on Solaris too, but GCC on Solaris still
generates vmovaps in a different part of the code (unrelated to the
problem) where GCC on Linux does not.


Committed as rev. 256761.


Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 256760)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2018-01-16  Vladimir Makarov  
+
+	PR rtl-optimization/80481
+	* g++.dg/pr80481.C: Exclude solaris.
+
 2018-01-16  Eric Botcazou  
 
 	* c-c++-common/patchable_function_entry-decl.c: Use 3 NOPs on Visium.
Index: testsuite/g++.dg/pr80481.C
===
--- testsuite/g++.dg/pr80481.C	(revision 256760)
+++ testsuite/g++.dg/pr80481.C	(working copy)
@@ -1,4 +1,4 @@
-// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-do compile { target { i?86-*-* x86_64-*-* }  && { ! *-*-solaris* } } }
 // { dg-options "-Ofast -funroll-loops -fopenmp -march=knl" }
 // { dg-final { scan-assembler-not "vmovaps" } }

Re: Compilation warning in simple-object-xcoff.c

2018-01-16 Thread Andreas Schwab

On Jan 16 2018, DJ Delorie  wrote:

> And it's not the host's bit size that counts; there are usually ways to
> get 64-bit file operations on 32-bit hosts.

If ACX_LARGEFILE doesn't succeed in enabling those 64-bit file
operations (thus making off_t a 64-bit type) then you are out of luck
(or AC_SYS_LARGEFILE doesn't support your host yet).

Andreas.
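
As a rough illustration of what large-file support amounts to on many hosts, the sketch below requests a 64-bit off_t and reports whether the request took effect. It assumes a glibc-style system where defining _FILE_OFFSET_BITS before any system header selects the 64-bit interfaces; other hosts may use a different mechanism, and the helper name off_t_is_at_least_64_bits is hypothetical.

```c
/* Must come before any system header to have an effect (assumption:
   a host that honours _FILE_OFFSET_BITS, such as glibc on Linux).  */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <sys/types.h>

/* Return nonzero if off_t can hold 64-bit file offsets on this host.  */
static int
off_t_is_at_least_64_bits (void)
{
  return sizeof (off_t) * CHAR_BIT >= 64;
}
```

If this returns zero, file offsets beyond 2GB cannot be represented in off_t, which is the "out of luck" case described above.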

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: [PATCH] avoid assuming known string length is constant (PR 83896)

2018-01-16 Thread Martin Sebor


On 01/16/2018 12:37 PM, Richard Sandiford wrote:

> Martin Sebor  writes:
> > Recent improvements to the strlen pass introduced the assumption
> > that when the length of a string has been recorded by the pass
> > the length is necessarily constant.  Bug 83896 shows that this
> > assumption is not always true, and that GCC fails with an ICE
> > when it doesn't hold.  To avoid the ICE the attached patch
> > removes the assumption.
> >
> > x86_64-linux bootstrap successful, regression test in progress.
> >
> > Martin
> >
> > PR tree-optimization/83896 - ice in get_string_len on a call to strlen with
> > non-constant length
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/83896
> > * tree-ssa-strlen.c (get_string_len): Avoid assuming length is constant.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/83896
> > * gcc.dg/strlenopt-43.c: New test.
> >
> > Index: gcc/tree-ssa-strlen.c
> > ===
> > --- gcc/tree-ssa-strlen.c   (revision 256752)
> > +++ gcc/tree-ssa-strlen.c   (working copy)
> > @@ -2772,7 +2772,9 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
> >  }
> >  }
> >
> > -/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
> > +/* If RHS, either directly or indirectly, refers to a string of constant
> > +   length, return it.  Otherwise return a negative value.  */
> > +
> >  static int
> >  get_string_len (tree rhs)
> >  {
>
> I think this should be returning HOST_WIDE_INT given the unconstrained
> tree_to_shwi return.  Same type change for rhslen in the caller.


Thanks for looking at it!  I confess it's not completely clear
to me in what type the pass tracks string lengths.  For string
constants, get_stridx() returns an int with their length
bit-flipped.  I tried to maintain that invariant in the change
I introduced in the block toward the end of the function (in
a different patch).  But then in other places the pass works
with HOST_WIDE_INT, so it looks like it would be appropriate
to use here as well.

I tried to come up with a test case that would exercise this
conversion but couldn't.  If you (or someone else) have an idea
for one I'd be more than happy to add it to the test suite.


> (Not my call, but it might be better to have a more specific function name,
> given that the file already had "get_string_length" before this function
> was added.)


I renamed it (again), this time to get_string_cst_length().
Nothing better came to mind.




> > @@ -2789,7 +2791,8 @@ get_string_len (tree rhs)
> >   if (idx > 0)
> > {
> >   strinfo *si = get_strinfo (idx);
> > - if (si && si->full_string_p)
> > + if (si && si->full_string_p
> > + && TREE_CODE (si->nonzero_chars) == INTEGER_CST)
> > return tree_to_shwi (si->nonzero_chars);
>
> tree_fits_shwi_p?


Sigh.  Yes.  I still keep forgetting about all these gotchas.
Dealing with integers is so painfully error-prone in GCC (as
evident from all the bug reports with ICEs for these things).

It would be much simpler and safer if tree_to_shwi() returned
true on success and false for error (e.g., null, non-const,
or overflow) and took an extra argument for the result.  Then
the code would become:

  HOST_WIDE_INT result;
  if (si && tree_to_shwi (&result, si->nonzero_chars))
return result;

and it would be nearly impossible to forget to check for bad
input.
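
To make the proposed interface shape concrete, here is a self-contained C sketch under stated assumptions. It is not GCC code: struct val, enum val_kind and val_to_shwi are hypothetical models of a tree node and of the bool-returning conversion suggested above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a value that may or may not be a constant.  */
enum val_kind { VAL_CONST, VAL_SYMBOLIC };

struct val
{
  enum val_kind kind;
  uint64_t bits;   /* Meaningful only when kind == VAL_CONST.  */
};

/* Convert V to a signed 64-bit integer.  Return true and set *RESULT on
   success; return false if V is null, not a constant, or too large to
   be represented.  */
static bool
val_to_shwi (int64_t *result, const struct val *v)
{
  if (v == NULL || v->kind != VAL_CONST || v->bits > (uint64_t) INT64_MAX)
    return false;
  *result = (int64_t) v->bits;
  return true;
}
```

Because the converted value is only reachable through the out-parameter, a caller has to look at the return value to get anything useful, which is the property argued for above.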

Anyway, attached is an updated patch.

Martin



> Thanks,
> Richard
>
> > }
> > }
> > Index: gcc/testsuite/gcc.dg/strlenopt-43.c
> > ===
> > --- gcc/testsuite/gcc.dg/strlenopt-43.c (nonexistent)
> > +++ gcc/testsuite/gcc.dg/strlenopt-43.c (working copy)
> > @@ -0,0 +1,13 @@
> > +/* PR tree-optimization/83896 - ice in get_string_len on a call to strlen
> > +   with non-constant length
> > +   { dg-do compile }
> > +   { dg-options "-O2 -Wall" } */
> > +
> > +extern char a[5];
> > +extern char b[];
> > +
> > +void f (void)
> > +{
> > +  if (__builtin_strlen (b) != 4)
> > +__builtin_memcpy (a, b, sizeof a);
> > +}


PR tree-optimization/83896 - ice in get_string_len on a call to strlen with non-constant length

gcc/ChangeLog:

	PR tree-optimization/83896
	* tree-ssa-strlen.c (get_string_len): Rename...
	(get_string_cst_length): ...to this.  Return HOST_WIDE_INT.
	Avoid assuming length is constant.
	(handle_char_store): Use HOST_WIDE_INT for string length.

gcc/testsuite/ChangeLog:

	PR tree-optimization/83896
	* gcc.dg/strlenopt-43.c: New test.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c	(revision 256752)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -2772,16 +2772,20 @@ handle_pointer_plus (gimple_stmt_iterator *gsi)
 }
 }
 
-/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
-static int
-get_string_len (tree rhs)
+/* If RHS, either directly or indirectly, refers to a string of constant
+   length, return it.  Otherwise return a negative value.  */
+
+static HOST_WIDE_INT
+get_string_cst_length (tree rhs)
 {
   if (TREE_CODE (rhs) == MEM

Re: [PATCH] avoid assuming known string length is constant (PR 83896)

2018-01-16 Thread Martin Sebor


On 01/16/2018 02:26 PM, Jakub Jelinek wrote:

> On Tue, Jan 16, 2018 at 07:37:30PM +, Richard Sandiford wrote:
> > > -/* Check if RHS is string_cst possibly wrapped by mem_ref.  */
> > > +/* If RHS, either directly or indirectly, refers to a string of constant
> > > +   length, return it.  Otherwise return a negative value.  */
> > > +
> > >  static int
> > >  get_string_len (tree rhs)
> > >  {
> >
> > I think this should be returning HOST_WIDE_INT given the unconstrained
> > tree_to_shwi return.  Same type change for rhslen in the caller.
> >
> > (Not my call, but it might be better to have a more specific function name,
> > given that the file already had "get_string_length" before this function
> > was added.)
>
> Yeah, certainly for both.
>
> > > @@ -2789,7 +2791,8 @@ get_string_len (tree rhs)
> > >   if (idx > 0)
> > > {
> > >   strinfo *si = get_strinfo (idx);
> > > - if (si && si->full_string_p)
> > > + if (si && si->full_string_p
> > > + && TREE_CODE (si->nonzero_chars) == INTEGER_CST)
> > > return tree_to_shwi (si->nonzero_chars);
> >
> > tree_fits_shwi_p?
>
> Surely that instead of TREE_CODE check, but even that will not make sure it
> fits into host int, so yes, it should be HOST_WIDE_INT and the code should
> make sure it is also >= 0.


I made these changes except for the last part:  How/when can
the length be negative?

Martin

Re: [PATCH, rs6000] (v2) Support for gimple folding of mergeh, mergel intrinsics

2018-01-16 Thread Segher Boessenkool

Hi!

On Tue, Jan 16, 2018 at 01:39:28PM -0600, Will Schmidt wrote:
> Sniff-tests of the target tests on a single system look OK.  Full regtests are
> currently running across assorted power systems.
> OK for trunk, pending successful results?

Just a few little things:

> 2018-01-16  Will Schmidt  
> 
>   * config/rs6000/rs6000.c: (rs6000_gimple_builtin) Add gimple folding
>   support for merge[hl].

The : goes after the ).

>  (define_insn "altivec_vmrghw_direct"
> -  [(set (match_operand:V4SI 0 "register_operand" "=v")
> -(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
> -  (match_operand:V4SI 2 "register_operand" "v")]
> - UNSPEC_VMRGH_DIRECT))]
> +  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
> + (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v,wa")
> +   (match_operand:V4SI 2 "register_operand" "v,wa")]
> +  UNSPEC_VMRGH_DIRECT))]
>"TARGET_ALTIVEC"
> -  "vmrghw %0,%1,%2"
> +  "@
> +  vmrghw %0,%1,%2
> +  xxmrghw %x0,%x1,%x2"

Those last two lines should be indented one more space, so that everything
aligns (with the @).

> +  "@
> +  vmrglw %0,%1,%2
> +  xxmrglw %x0,%x1,%x2"

Same here of course.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-be-folded.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { powerpc-*-* } } } */

Do you want powerpc*-*-*?  That is default in gcc.target/powerpc; dg-do
compile is default, too, so you can either say

/* { dg-do compile } */

or nothing at all, to taste.

But it looks like you want to restrict to BE?  We still don't have a
dejagnu thingy for that; you could put some #ifdef around it all (there
are some examples in other testcases).  Not ideal, but works.
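
One possible shape for such a guard, shown purely as a sketch: GCC predefines __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__, so a test body can be wrapped in a preprocessor check on them. The macro TEST_IS_BIG_ENDIAN and the helper runtime_is_big_endian are hypothetical names for this example, and existing testcases may use a different macro.

```c
#include <string.h>

/* Compile-time view of the target byte order, using GCC's predefined
   macros.  TEST_IS_BIG_ENDIAN is a hypothetical name for this sketch.  */
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define TEST_IS_BIG_ENDIAN 1
#else
# define TEST_IS_BIG_ENDIAN 0
#endif

/* Run-time cross-check: on a big-endian host the first byte of the
   integer 1 is zero.  */
static int
runtime_is_big_endian (void)
{
  unsigned int one = 1;
  unsigned char first;
  memcpy (&first, &one, 1);
  return first == 0;
}
```

In a testcase, the BE-only code would then sit inside `#if TEST_IS_BIG_ENDIAN ... #endif`, so that it compiles to nothing on little-endian targets.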

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-float.c
> @@ -0,0 +1,26 @@
> +/* Verify that overloaded built-ins for vec_splat with float
> +   inputs produce the right code.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-maltivec -O2" } */

Either powerpc_altivec_ok or -mvsx?

> new file mode 100644
> index 000..ab5f54e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-longlong.c
> @@ -0,0 +1,48 @@
> +/* Verify that overloaded built-ins for vec_merge* with long long
> +   inputs produce the right code.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mvsx -O2" } */

Either powerpc_vsx_ok or -mpower8-vector?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-pixel.c
> @@ -0,0 +1,24 @@
> +/* Verify that overloaded built-ins for vec_splat with pixel
> +   inputs produce the right code.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-maltivec -mvsx -O2" } */

-mvsx implies -maltivec (not wrong of course, just a bit weird).

Okay for trunk with those nits fixed.  Thanks!


Segher

Re: [PATCH] avoid assuming known string length is constant (PR 83896)

2018-01-16 Thread Jakub Jelinek

On Tue, Jan 16, 2018 at 03:20:24PM -0700, Martin Sebor wrote:
> Thanks for looking at it!  I confess it's not completely clear
> to me in what type the pass tracks string lengths.  For string
> constants, get_stridx() returns an int with their length
> bit-flipped.  I tried to maintain that invariant in the change

That is because TREE_STRING_LENGTH is an int, so gcc doesn't allow
string literals longer than 2GB.  All other lengths are tracked as trees.

Jakub
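
The bit-flipped encoding mentioned in the quoted text can be sketched in a few lines of C. This is only a model of the idea described in the thread: the helper names are hypothetical, and the assumption that a negative value marks a constant length while a positive value is an index into the pass's tables is inferred from the discussion rather than taken from the source file.

```c
#include <stdbool.h>

/* Encode a known constant string length LEN (>= 0) as a negative int
   by flipping all its bits, so that length 0 becomes -1.  */
static int
encode_const_len (int len)
{
  return ~len;
}

/* A negative value denotes a bit-flipped constant length (assumption for
   this sketch); positive values would be table indices.  */
static bool
is_const_len (int idx)
{
  return idx < 0;
}

/* Recover the original length from an encoded value.  */
static int
decode_const_len (int idx)
{
  return ~idx;
}
```

Since the encoded value is an int, the scheme can only represent lengths up to INT_MAX, which matches the 2GB limit on string literals noted above.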
