Re: Move -Wmaybe-uninitialized to -Wextra

2019-02-03 Thread Marc Glisse

On Sat, 2 Feb 2019, Martin Sebor wrote:


I don't feel too strongly about whether -Wmaybe-uninitialized should
be in -Wall or in -Wextra, and I might even be somewhat more inclined
to expect to find it in the latter.  But since you sound like you are
gearing up for proposing other changes in the same vein down the line
where I might have stronger opinions, I should comment.

[It's a bit of a long-winded response because I've been thinking about
this topic a lot lately.]


Thank you for taking the time to write this down.


In general, I think a discussion of warning groupings is worth having
(even needed) at the right time, but stage 4 doesn't strike me as
the most opportune occasion.

Specifically for -Wmaybe-uninitialized, the option has been in -Wall
for many releases, and no major changes to it have been made recently
that I know of.  So what I'm missing in this proposal is: why now?  What
has changed to make this a pressing issue now?  Has its rate of false
positives gone up significantly?  If so, by how much and why?


I've been wanting to post this for years, but I only got motivated enough 
recently after seeing negative effects of -Wmaybe-uninitialized in two 
projects within a few days. I don't think the rate of false positives has 
gone up significantly in gcc-9, they just randomly appear or disappear 
from one release to the next. In some sense, for projects that support 
multiple compiler versions, each new gcc release makes the number of false 
positives (on at least one version) go up, but that doesn't count as the 
warning getting worse.


The discussion and/or the change don't need to happen now, but if I didn't 
post this patch while motivated, it might be years before I actually do 
it, so I ignored stage 4.



Please be clear about what you mean by false positives.  Is it that
the warning triggers contrary to the documentation ("a path exists
where the variable is uninitialized along with one where it is"),
or that the path to the use of the variable does exist but we
(though not GCC) can tell from the context that it cannot be reached?


The latter.
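
A typical shape (a made-up example, not taken from either of the two
projects, and the helper name is invented) looks like this:

extern void do_something (void);

int
f (int n)
{
  int x;                     /* deliberately not initialized */
  if (n > 0)
    x = 42;
  do_something ();           /* opaque call the compiler cannot see through */
  if (n > 0)
    return x;                /* may warn, yet reaching this use uninitialized
                                would require n > 0 and !(n > 0) at once */
  return 0;
}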


(The latter wouldn't qualify as a false positive as the term is defined
in literature; i.e., the warning works as designed, we just don't like
or agree with it.)


Indeed, the definition of false positive is important. In some sense, no 
warning is ever a false positive if its definition is "the compiler 
decided to warn", but that definition makes the term useless. 
-Wmaybe-uninitialized does not say there is an uninitialized use; it says 
the control flow is too complicated or the compiler is not clever enough 
to prove the variable initialized (or there is indeed an uninitialized use 
for some input, and only this subcase is a bug). Another warning, 
-Wstrict-overflow (after it moved to the middle-end), was more of an 
optimization note than a true warning: "warning: assuming your code is 
valid when generating code". Duh! Was I supposed to make the code invalid 
to quiet the warning?
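
As a (reconstructed, not verbatim) illustration of that kind of note:

int
later (int i)
{
  /* With optimization and a high -Wstrict-overflow level, older releases
     would emit something along the lines of "assuming signed overflow does
     not occur when assuming that (X + 1) > X is always true" here; exact
     wording and options vary between releases.  */
  return i + 1 > i;          /* folded to 1, since signed overflow is UB */
}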



In practice, false positives (and negatives) of both kinds, whether
they fit the formal definition or the informal one, are the nature
of virtually all non-trivial static diagnostics, certainly all those
that depend on control or data flow analysis.  Some are due to bugs
or limitations in the implementation of the warning.  Others are
inherent in the technology.


Yes, and I argue that these warnings belong in a different "level" of 
warnings than the trivial warnings.


Is this warning more prone to one kind than others?  If so, is it 
because it's implemented poorly, or that its implementation hasn't kept 
up with the improvements to the optimizers, or has the infrastructure it 
depends on become ill-suited for it to avoid some of the false positives 
(as formally defined), or is it something else?


I am mostly looking at what I see in practice. I regularly see false 
positives of -Wstrict-overflow (less now that I neutered parts of it) and 
-Wmaybe-uninitialized. And when I have strange crashes that later turn out 
to be caused by uninitialized memory uses, it is very seldom that 
-Wmaybe-uninitialized helps detect them. IIRC, last time, I had slightly 
more success with -Wreturn-local-addr -O3 -fkeep-inline-functions 
-fkeep-static-functions.


I think several of the new middle-end warnings you added are about string 
manipulation. I don't do any of that, which may explain why I am not 
seeing them. Also, we are not using gcc-9 much yet.


These false positives are not easy to avoid, which is supposed to be a 
requirement for inclusion in -Wall. Avoiding them, when it is possible at 
all, requires not just a syntactic tweak like adding parentheses, but a 
semantic change that can make the code worse. Initializing something that 
does not need it is extra code (it increases code size and running time). 
It also prevents better tools, whether static analyzers or runtime 
checkers (sanitizers, valgrind), from detecting true uninitialized uses.
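
Concretely, silencing the warning in a case like the earlier sketch means
writing something like:

extern void do_something (void);

int
f (int n)
{
  int x = 0;                 /* dead store added only to silence the warning */
  if (n > 0)
    x = 42;
  do_something ();
  if (n > 0)
    return x;                /* if a later change forgets a real assignment,
                                valgrind/MSan now see a well-defined 0 instead
                                of catching the bug */
  return 0;
}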


I don't find the argument very compelling that diagnosing potential
bugs should be avoided

Re: [PATCH 24/46] i386: Emulate MMX maskmovq with SSE2 maskmovdqu

2019-02-03 Thread H.J. Lu
On Fri, Feb 01, 2019 at 01:17:47PM -0800, H.J. Lu wrote:
> Emulate MMX maskmovq with SSE2 maskmovdqu by zeroing out the upper 64
> bits of the mask operand.  A warning is issued since invalid memory
> access may happen when bits 64:127 at memory location are unmapped:
> 
> xmmintrin.h:1168:3: note: Emulate MMX maskmovq with SSE2 maskmovdqu may 
> result in invalid memory access
>  1168 |   __builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P);
>   |   ^~~
> 
> Only SSE register source operand is allowed.
> 

Here is the updated patch, which handles unmapped bits 64:127 at the memory
address by adjusting the source and mask operands together with the memory
address.
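
Roughly, the idea is the following (a standalone sketch with made-up names,
using the SSE2 intrinsics; the real code in xmmintrin.h below works on the
__m64 values directly):

#include <emmintrin.h>
#include <stdint.h>
#include <string.h>

/* Slide the 16-byte maskmovdqu window down by up to 8 bytes and move the
   data and mask up by the same amount, so the bytes the original 8-byte
   maskmovq would touch stay inside the window and the padding bytes carry
   a zero mask.  */
static void
maskmove_q_sketch (uint64_t a, uint64_t n, char *p)
{
  uintptr_t off = (uintptr_t) p & 0xf;
  if (off > 8)
    off = 8;
  p -= off;

  unsigned char buf[16] = { 0 };
  memcpy (buf + off, &a, 8);
  __m128i a128 = _mm_loadu_si128 ((const __m128i *) buf);

  memset (buf, 0, sizeof buf);
  memcpy (buf + off, &n, 8);
  __m128i n128 = _mm_loadu_si128 ((const __m128i *) buf);

  _mm_maskmoveu_si128 (a128, n128, p);  /* SSE2 maskmovdqu */
}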


H.J.
---
Emulate MMX maskmovq with SSE2 maskmovdqu by zero-extending source and
mask operands to 128 bits.  Handle unmapped bits 64:127 at the memory
address by adjusting the source and mask operands together with the
memory address.

PR target/89021
* config/i386/i386.c (ix86_init_mmx_sse_builtins): Don't
provide __builtin_ia32_maskmovq for TARGET_MMX_WITH_SSE.
* config/i386/mmx.md (mmx_maskmovq): Add "&& !TARGET_MMX_WITH_SSE".
(*mmx_maskmovq): Likewise.
* config/i386/xmmintrin.h: Emulate MMX maskmovq with SSE2
maskmovdqu.
---
 gcc/config/i386/i386.c  | 15 --
 gcc/config/i386/mmx.md  |  4 ++--
 gcc/config/i386/xmmintrin.h | 39 +
 3 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 5f4f7e9ddde..b7cbc3f8a2d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -31048,12 +31048,15 @@ ix86_init_mmx_sse_builtins (void)
   def_builtin_pure (OPTION_MASK_ISA_SSE, 0, "__builtin_ia32_stmxcsr",
UNSIGNED_FTYPE_VOID, IX86_BUILTIN_STMXCSR);
 
-  /* SSE or 3DNow!A */
-  def_builtin (OPTION_MASK_ISA_SSE | OPTION_MASK_ISA_3DNOW_A
-  /* As it uses V4HImode, we have to require -mmmx too.  */
-  | OPTION_MASK_ISA_MMX, 0,
-  "__builtin_ia32_maskmovq", VOID_FTYPE_V8QI_V8QI_PCHAR,
-  IX86_BUILTIN_MASKMOVQ);
+  /* SSE or 3DNow!A.  NB: We can't emulate MMX maskmovq directly with
+ SSE2 maskmovdqu since invalid memory access may happen when bits
+ 64:127 at memory location are unmapped.  */
+  if (!TARGET_MMX_WITH_SSE)
+def_builtin (OPTION_MASK_ISA_SSE | OPTION_MASK_ISA_3DNOW_A
+/* As it uses V4HImode, we have to require -mmmx too.  */
+| OPTION_MASK_ISA_MMX, 0,
+"__builtin_ia32_maskmovq", VOID_FTYPE_V8QI_V8QI_PCHAR,
+IX86_BUILTIN_MASKMOVQ);
 
   /* SSE2 */
   def_builtin (OPTION_MASK_ISA_SSE2, 0, "__builtin_ia32_maskmovdqu",
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index f90574a7255..a1b732ad7be 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1748,7 +1748,7 @@
  (match_operand:V8QI 2 "register_operand")
  (match_dup 0)]
 UNSPEC_MASKMOV))]
-  "TARGET_SSE || TARGET_3DNOW_A")
+  "(TARGET_SSE || TARGET_3DNOW_A) && !TARGET_MMX_WITH_SSE")
 
 (define_insn "*mmx_maskmovq"
   [(set (mem:V8QI (match_operand:P 0 "register_operand" "D"))
@@ -1756,7 +1756,7 @@
  (match_operand:V8QI 2 "register_operand" "y")
  (mem:V8QI (match_dup 0))]
 UNSPEC_MASKMOV))]
-  "TARGET_SSE || TARGET_3DNOW_A"
+  "(TARGET_SSE || TARGET_3DNOW_A) && !TARGET_MMX_WITH_SSE"
   ;; @@@ check ordering of operands in intel/nonintel syntax
   "maskmovq\t{%2, %1|%1, %2}"
   [(set_attr "type" "mmxcvt")
diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h
index 58284378514..680256f5fab 100644
--- a/gcc/config/i386/xmmintrin.h
+++ b/gcc/config/i386/xmmintrin.h
@@ -1165,7 +1165,46 @@ _m_pshufw (__m64 __A, int const __N)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_maskmove_si64 (__m64 __A, __m64 __N, char *__P)
 {
+#ifdef __x86_64__
+# ifdef __MMX__
+  __builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P);
+# else
+  /* Emulate MMX maskmovq with SSE2 maskmovdqu and handle unmapped bits
+ 64:127 at address __P.  */
+  typedef long long __v2di __attribute__ ((__vector_size__ (16)));
+  typedef char __v16qi __attribute__ ((__vector_size__ (16)));
+  __v2di __A128, __N128;
+
+  /* Check the alignment of __P.  */
+  __SIZE_TYPE__ offset = ((__SIZE_TYPE__) __P) & 0xf;
+  if (offset)
+{
+  /* If the misalignment of __P > 8, subtract __P by 8 bytes.
+Otherwise, subtract __P by the misalignment.  */
+  if (offset > 8)
+   offset = 8;
+  __P = (char *) (((__SIZE_TYPE__) __P) - offset);
+
+  /* Zero-extend __A and __N to 128 bits and shift right by the
+adjustment.  */
+  unsigned __int128 __a128 = ((__v1di) __A)[0];
+  unsigned __int128 __n128 = ((__v1di) __N)[0];
+  __a128 <<= offset * 8

Re: [PATCH 00/46] Implement MMX intrinsics with SSE

2019-02-03 Thread H.J. Lu
On Sat, Feb 02, 2019 at 09:12:12AM -0800, H.J. Lu wrote:
> On Sat, Feb 2, 2019 at 9:07 AM Florian Weimer  wrote:
> >
> > * H. J. Lu:
> >
> > > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> > > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > > mask operand.  A warning is issued since invalid memory access may
> > > happen when bits 64:127 at memory location are unmapped:
> > >
> > > xmmintrin.h:1168:3: note: Emulate MMX maskmovq with SSE2 maskmovdqu may 
> > > result i
> > > n invalid memory access
> > >  1168 |   __builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P);
> > >   |   ^~~
> >
> > Would it be possible to shift the mask according to the misalignment in
> > the address?  I think this should allow avoiding crossing a page
> > boundary if the original 64-bit load would not.
> 
> I guess it is possible.  But it may be quite a bit complex for no
> apparent gains, since we also need to shift the implicit memory address.
> 

I updated MMX maskmovq emulation to handle it:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00139.html

and added tests to verify that unmapped bits 64:127 at memory address
are properly handled:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00140.html


H.J.



[patch, fortran] Fix part of PR 67679, spurious undefined warning

2019-02-03 Thread Thomas Koenig

Hello world,

the attached patch fixes a 7/8/9 regression where a spurious undefined
warning was issued about compiler-generated variables when allocating
with a stat= argument.

The rest of the PR (ALLOCATE in an IF statement, later use
of the array) is a dup of PR 66459, which has now been reclassified
as a middle-end bug.

If I get this part committed, I intend to resolve 67679 as a duplicate
of 66459.

Regression-tested. OK for trunk?

Regards

Thomas

2019-02-03  Thomas Koenig  

PR fortran/67679
* trans-array.c (gfc_array_allocate):  For setting the bounds on
the new array, add a condition for a not previously allocated
variable.

2019-02-03  Thomas Koenig  

PR fortran/67679
* gfortran.dg/warn_undefined_1.f90: New test.
* gfortran.dg/coarray_lock_7.f90: Fix patterns in test.
Index: fortran/trans-array.c
===
--- fortran/trans-array.c	(Revision 268432)
+++ fortran/trans-array.c	(Arbeitskopie)
@@ -5736,6 +5736,7 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr,
   tree var_overflow = NULL_TREE;
   tree cond;
   tree set_descriptor;
+  tree not_prev_allocated = NULL_TREE;
   stmtblock_t set_descriptor_block;
   stmtblock_t elseblock;
   gfc_expr **lower;
@@ -5881,8 +5882,6 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr,
 	}
 }
 
-  gfc_start_block (&elseblock);
-
   /* Allocate memory to store the data.  */
   if (POINTER_TYPE_P (TREE_TYPE (se->expr)))
 se->expr = build_fold_indirect_ref_loc (input_location, se->expr);
@@ -5898,6 +5897,19 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr,
 pointer = gfc_conv_descriptor_data_get (se->expr);
   STRIP_NOPS (pointer);
 
+  if (allocatable)
+{
+  not_prev_allocated = gfc_create_var (logical_type_node,
+	   "not_prev_allocated");
+  tmp = fold_build2_loc (input_location, EQ_EXPR,
+			 logical_type_node, pointer,
+			 build_int_cst (TREE_TYPE (pointer), 0));
+
+  gfc_add_modify (&se->pre, not_prev_allocated, tmp);
+}
+
+  gfc_start_block (&elseblock);
+
   /* The allocatable variant takes the old pointer as first argument.  */
   if (allocatable)
 gfc_allocate_allocatable (&elseblock, pointer, size, token,
@@ -5965,6 +5977,11 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr,
   cond = fold_build2_loc (input_location, EQ_EXPR,
 			  logical_type_node, status,
 			  build_int_cst (TREE_TYPE (status), 0));
+
+  if (not_prev_allocated != NULL_TREE)
+	cond = fold_build2_loc (input_location, TRUTH_OR_EXPR,
+logical_type_node, cond, not_prev_allocated);
+
   gfc_add_expr_to_block (&se->pre,
 		 fold_build3_loc (input_location, COND_EXPR, void_type_node,
   cond,
Index: testsuite/gfortran.dg/coarray_lock_7.f90
===
--- testsuite/gfortran.dg/coarray_lock_7.f90	(Revision 268432)
+++ testsuite/gfortran.dg/coarray_lock_7.f90	(Arbeitskopie)
@@ -35,8 +35,8 @@ end
 ! { dg-final { scan-tree-dump-times "_gfortran_caf_lock \\(caf_token.., 0, 0, 0B, 0B, 0B, 0\\);" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "_gfortran_caf_unlock \\(caf_token.., 0, 0, 0B, 0B, 0\\);" 1 "original" } }
 
-! { dg-final { scan-tree-dump-times "_gfortran_caf_lock \\(caf_token.., .*\\(\\(3 - parm...dim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parm...dim\\\[1\\\].lbound\\)\\), 0, 0B, &ii, 0B, 0\\);|_gfortran_caf_lock \\(caf_token.1, \\(3 - parm...dim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parm...dim\\\[1\\\].lbound\\), 0, 0B, &ii, 0B, 0\\);" 1 "original" } }
-! { dg-final { scan-tree-dump-times "_gfortran_caf_unlock \\(caf_token.., .*\\(\\(2 - parm...dim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parm...dim\\\[1\\\].lbound\\)\\), 0, &ii, 0B, 0\\);|_gfortran_caf_unlock \\(caf_token.., \\(2 - parm...dim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parm...dim\\\[1\\\].lbound\\), 0, &ii, 0B, 0\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_lock \\(caf_token.., .*\\(\\(3 - parmdim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parmdim\\\[1\\\].lbound\\)\\), 0, 0B, &ii, 0B, 0\\);|_gfortran_caf_lock \\(caf_token.1, \\(3 - parmdim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parmdim\\\[1\\\].lbound\\), 0, 0B, &ii, 0B, 0\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "_gfortran_caf_unlock \\(caf_token.., .*\\(\\(2 - parmdim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parmdim\\\[1\\\].lbound\\)\\), 0, &ii, 0B, 0\\);|_gfortran_caf_unlock \\(caf_token.., \\(2 - parmdim\\\[0\\\].lbound\\) \\+ \\(MAX_EXPR  \\+ 1\\) \\* \\(3 - parmdim\\\[1\\\].lbound\\), 0, &ii, 0B, 0\\);" 1 "original" } }
 
 ! { dg-final { scan-tree-dump-times "_gfortran_caf_lock \\(three.token, 0, \\(integer\\(kind=4\\)\\) \\(5 - three.dim\\\[0\\\].lbound\\), &acquired.\[0-9\]+, 0B, 0B, 0\\);|_gfortran_caf_lock \\(three.token, 0, 5 - three.dim\\\[0\

[PATCH, i386]: (Partially) fix PR89074, break SSE reg dependency for a few scalar insns

2019-02-03 Thread Uros Bizjak
The following patch may help with partial SSE reg dependencies for the
{R,}SQRTS{S,D}, RCPS{S,D} and ROUNDS{S,D} instructions. It takes the
same strategy as both ICC and clang, that is:

a) load from memory with MOVS{S,D} and
b) in the case of SSE, match the input and output registers.

The implementation uses the preferred_for_speed attribute, so in cold
sections or when compiled with -Os, the compiler is still able to
create a direct load from memory (SSE, AVX) and use unmatched registers
on SSE targets.
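
For reference, the asm examples below correspond to source along these
lines (my reconstruction, not part of the patch; the function name is
made up):

extern double z;

double
test_sqrt_mem (void)
{
  /* Compile with e.g. -O2 -mfpmath=sse (hypothetical test, not from the
     patch's testsuite).  */
  return __builtin_sqrt (z);
}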

The sqrt from memory is now compiled to:

movsd   z(%rip), %xmm0
sqrtsd  %xmm0, %xmm0

(SSE) or

vmovsd  z(%rip), %xmm1
vsqrtsd %xmm1, %xmm1, %xmm0

(AVX).

And sqrt from an unmatched input register will compile to:

sqrtsd  %xmm1, %xmm1
movapd  %xmm1, %xmm0

(SSE) or

   vsqrtsd %xmm1, %xmm1, %xmm0

(AVX).

The patch doesn't touch conversion instructions, where XOR clearing is
preferred (pending patch for PR 87007).

2019-02-03  Uroš Bizjak  

PR target/89071
* config/i386/i386.md (*sqrt2_sse): Add (v,0) alternative.
Do not prefer (v,v) alternative for non-AVX targets and (m,v)
alternative for speed when TARGET_SSE_PARTIAL_REG_DEPENDENCY is set.
(*rcpsf2_sse): Ditto.
(*rsqrtsf2_sse): Ditto.
(sse4_1_round

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 744f155fca6f..9948f77fca53 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4472,9 +4472,9 @@
(set (match_dup 0) (float_extend:DF (match_dup 2)))]
   "operands[2] = lowpart_subreg (SFmode, operands[0], DFmode);")
 
-;; Break partial reg stall for cvtss2sd.  This splitter should split
-;; late in the pass sequence (after register rename pass),
-;; so allocated registers won't change anymore.
+;; Break partial SSE register dependency stall.  This splitter should split
+;; late in the pass sequence (after register rename pass), so allocated
+;; registers won't change anymore
 
 (define_split
   [(set (match_operand:DF 0 "sse_reg_operand")
@@ -4632,9 +4632,9 @@
(set (match_dup 0) (float_truncate:SF (match_dup 2)))]
   "operands[2] = lowpart_subreg (DFmode, operands[0], SFmode);")
 
-;; Break partial reg stall for cvtsd2ss.  This splitter should split
-;; late in the pass sequence (after register rename pass),
-;; so allocated registers won't change anymore.
+;; Break partial SSE register dependency stall.  This splitter should split
+;; late in the pass sequence (after register rename pass), so allocated
+;; registers won't change anymore
 
 (define_split
   [(set (match_operand:SF 0 "sse_reg_operand")
@@ -5137,7 +5137,7 @@
(set_attr "unit" "i387")
(set_attr "fp_int_src" "true")])
 
-;; Avoid partial SSE register dependency stalls.  This splitter should split
+;; Break partial SSE register dependency stall.  This splitter should split
 ;; late in the pass sequence (after register rename pass), so allocated
 ;; registers won't change anymore
 
@@ -14765,18 +14765,26 @@
 (symbol_ref "false"])
 
 (define_insn "*rcpsf2_sse"
-  [(set (match_operand:SF 0 "register_operand" "=x,x")
-   (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "x,m")]
+  [(set (match_operand:SF 0 "register_operand" "=x,x,x")
+   (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")]
   UNSPEC_RCP))]
   "TARGET_SSE && TARGET_SSE_MATH"
   "@
+   %vrcpss\t{%d1, %0|%0, %d1}
%vrcpss\t{%d1, %0|%0, %d1}
%vrcpss\t{%1, %d0|%d0, %1}"
   [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "rcp")
(set_attr "btver2_sse_attr" "rcp")
(set_attr "prefix" "maybe_vex")
-   (set_attr "mode" "SF")])
+   (set_attr "mode" "SF")
+   (set (attr "preferred_for_speed")
+ (cond [(eq_attr "alternative" "1")
+ (symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
+  (eq_attr "alternative" "2")
+ (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
+  ]
+  (symbol_ref "true")))])
 
 (define_insn "*fop_xf_1_i387"
   [(set (match_operand:XF 0 "register_operand" "=f,f")
@@ -15003,18 +15011,26 @@
(set_attr "bdver1_decode" "direct")])
 
 (define_insn "*rsqrtsf2_sse"
-  [(set (match_operand:SF 0 "register_operand" "=x,x")
-   (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "x,m")]
+  [(set (match_operand:SF 0 "register_operand" "=x,x,x")
+   (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")]
   UNSPEC_RSQRT))]
   "TARGET_SSE && TARGET_SSE_MATH"
   "@
+   %vrsqrtss\t{%d1, %0|%0, %d1}
%vrsqrtss\t{%d1, %0|%0, %d1}
%vrsqrtss\t{%1, %d0|%d0, %1}"
   [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "rcp")
(set_attr "btver2_sse_attr" "rcp")
(set_attr "prefix" "maybe_vex")
-   (set_attr "mode" "SF")])
+   (set_attr "mode" "SF")
+   (set (attr "preferred_for_speed")
+ (cond [(eq_attr "alternative" "1")
+ (symbol_ref "TARGET_AVX || !TARGET_SSE_PARTIAL_REG_DEPENDENCY")
+  (eq_attr "alternative" "2")
+   

Re: [patch, fortran] Fix part of PR 67679, spurious undefined warning

2019-02-03 Thread Steve Kargl
On Sun, Feb 03, 2019 at 05:20:34PM +0100, Thomas Koenig wrote:
> +  if (allocatable)
> +{
> +  not_prev_allocated = gfc_create_var (logical_type_node,
> +"not_prev_allocated");

Can't remember semantics of gfc_create_var.  Can not_prev_allocated
conflict with a user defined variable of the same name?  If yes,
you may want to add leading underscore.  If not, then OK to 
commit.

-- 
Steve


Re: [patch, fortran] Fix part of PR 67679, spurious undefined warning

2019-02-03 Thread Thomas Koenig

Hi Steve,


On Sun, Feb 03, 2019 at 05:20:34PM +0100, Thomas Koenig wrote:

+  if (allocatable)
+{
+  not_prev_allocated = gfc_create_var (logical_type_node,
+  "not_prev_allocated");


Can't remember semantics of gfc_create_var.  Can not_prev_allocated
conflict with a user defined variable of the same name?  If yes,
you may want to add leading underscore.  If not, then OK to
commit.


Committed as r268502.  I will wait for a bit for the other open
branches.

Regarding gfc_create_var: This will create a variable name like
not_prev_allocated.3 (note the dot), so this is not a variable
name that can be generated from C or Fortran.

Regards

Thomas



[PR fortran/89077, patch] - ICE using * as len specifier for character parameter

2019-02-03 Thread Harald Anlauf
The attached patch fixes an ICE-on-valid that probably goes back to
rev. 128130.  Apparently the patch applied back then did not check
this code path, which resulted in a NULL pointer dereference.  This
is remedied by the new testcase based on comment #0 in this PR.

The PR mentions another wrong-code issue to be addressed separately.

OK for trunk?  And shall this fix be backported?

Thanks,
Harald

2019-02-03  Harald Anlauf  

PR fortran/89077
* decl.c (add_init_expr_to_sym): Copy length of string initializer
to declared symbol.

2019-02-03  Harald Anlauf  

PR fortran/89077
* gfortran.dg/pr89077.f90: New test.

Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c  (revision 268502)
+++ gcc/fortran/decl.c  (working copy)
@@ -1921,7 +1921,7 @@
}
  else if (init->ts.u.cl && init->ts.u.cl->length)
sym->ts.u.cl->length =
-   gfc_copy_expr (sym->value->ts.u.cl->length);
+   gfc_copy_expr (init->ts.u.cl->length);
}
}
  /* Update initializer character length according symbol.  */
Index: gcc/testsuite/gfortran.dg/pr89077.f90
===
--- gcc/testsuite/gfortran.dg/pr89077.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr89077.f90   (working copy)
@@ -0,0 +1,11 @@
+! { dg-do run }
+!
+! PR fortran/89077 - ICE using * as len specifier for character parameter
+
+program test
+  implicit none
+  integer :: i
+  character(*), parameter :: s = 'abcdef'
+  character(*), parameter :: t = transfer ([(s(i:i), i=1,len(s))], s)
+  if (len (t) /= len (s) .or. t /= s) stop 1
+end


Re: [PATCH 00/46] Implement MMX intrinsics with SSE

2019-02-03 Thread Andi Kleen
On Fri, Feb 01, 2019 at 06:46:53PM -0800, H.J. Lu wrote:
> On Fri, Feb 1, 2019 at 4:50 PM Andi Kleen  wrote:
> >
> > "H.J. Lu"  writes:
> >
> > > To support it, we disable
> > > MMX by default in 64-bit mode so that MMX registers won't be available
> >
> > Wouldn't that break inline assembler that references MMX register clobbers?
> 
> Yes.  You need to use -mmmx explicitly to enable MMX.

Such a breaking change needs to be clearly spelled out in the documentation /
NEWS.

-Andi


[Patch]Bug 84762 - GCC for PowerPC32 violates the SysV ABI spec for small struct returns

2019-02-03 Thread Lokesh Janghel
Hi,

Please find attached the patch for the subject issue.
Please let me know your thoughts and comments on the same.

>-  if (!global_options_set.x_aix_struct_return)
>+  if (!global_options_set.x_aix_struct_return
>+&& !rs6000_current_svr4_struct_return)
Based on the value of aix_struct_return, it decides whether to return in
registers or in memory.  After that, it checks whether the value is LSB-
or MSB-aligned in the register, according to the given option.

ChangeLogs
/gcc/ChangeLog
2019-01-10  Lokesh Janghel  

PR target/84762
* config/rs6000/rs6000.c (rs6000_return_in_msb): Return in svr4
for small struct values.
(rs6000_option_override_internal): Add the condition for aix or
svr4 (LSB/MSB aligned).
* config/rs6000/rs6000.opt: Extend the -msvr4-struct-return option
for LSB aligned value and MSB aligned value.

/gcc/testsuite/ChangeLog
2019-01-10  Lokesh Janghel  

PR target/84762
* gcc.target/pr84762-1.c: New testcase.
* gcc.target/pr84762-2.c: New testcase.
* gcc.target/pr84762-3.c: New testcase.

-- 
Thanks
Lokesh Janghel


84762.patch
Description: Binary data