[Bug c++/77285] New: extern thread_local linkage

2016-08-18 Thread jan.willem.ps at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

Bug ID: 77285
   Summary: extern thread_local linkage
   Product: gcc
   Version: 5.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jan.willem.ps at gmail dot com
  Target Milestone: ---

Minimal test case:

cat > a.h <<EOF
extern thread_local std::string gFeelingLucky;
EOF

cat > a.cpp < main.cpp <

int main() {
std::cout << "I'm feeling " << gFeelingLucky << '\n';
}
EOF

c++ -std=c++11 main.cpp a.cpp -o main
Output:
/tmp/cc1AJnUy.o: In function `_ZTW13gFeelingLucky':
main.cpp:(.text._ZTW13gFeelingLucky[_ZTW13gFeelingLucky]+0x5): undefined
reference to `_ZTH13gFeelingLucky'
collect2: error: ld returned 1 exit status

I've used the docker images on dockerhub to verify:
docker run -it --rm gcc:5.1 bash

This does work on gcc 4.9.0 (docker run --rm -it gcc:4.9.0 bash), so it looks
like a regression (see bug #55800).

It also doesn't work with gcc 6.1 (docker run -it --rm gcc:6.1 bash).

[Bug c++/77284] [5/6/7 Regression] ICE on valid C++11 code using initializer list: in potential_constant_expression_1, at cp/constexpr.c:5480

2016-08-18 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77284

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-08-18
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org
   Target Milestone|--- |5.5
Summary|ICE on valid C++11 code using initializer list: in potential_constant_expression_1, at cp/constexpr.c:5480 |[5/6/7 Regression] ICE on valid C++11 code using initializer list: in potential_constant_expression_1, at cp/constexpr.c:5480
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Started with r218653.

[Bug fortran/49792] OpenMP workshare: Wrong result with array assignment

2016-08-18 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49792

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jakub Jelinek  ---
.

[Bug tree-optimization/77283] [7 Regression] Revision 238005 disables loop unrolling

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77283

Richard Biener  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
I don't like (or see the point of) path-splitting, but the patch was to avoid a
regression with PRE code-hoisting.  The path-splitting code is incredibly stupid
when it comes to cost modeling.  That is, it tries to expose CSE but doesn't
actually verify that there is any likelihood of achieving that.  In fact, if the
block we duplicate (the latch) doesn't contain any stmts (in this case just
the IV increment) then there is _zero_ CSE possibility.  Improving path-splitting
for this case is welcome.  Or just remove that stupid pass and wire it into
a pass that can actually (opportunistically) perform the CSE before deciding
to duplicate the tail.

Note that unrolling likely gives up because of the loop now having two latches?

Note that there was a plan to move RTL unrolling (-funroll-loops) to GIMPLE,
but that wouldn't affect fallout like SMS not being able to unroll (not sure
if that can handle conditional code at all).

[Bug fortran/71687] ICE in omp_add_variable, at gimplify.c:5821

2016-08-18 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71687

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Jakub Jelinek  ---
Fixed.

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-08-18
  Known to work||4.9.4
Summary|extern thread_local linkage |[5/6/7 Regression] extern thread_local linkage
 Ever confirmed|0   |1
  Known to fail||5.4.0, 6.1.0, 7.0

[Bug tree-optimization/77283] [7 Regression] Revision 238005 disables loop unrolling

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77283

--- Comment #5 from Richard Biener  ---
Patch to fix the testcase, but will eventually regress the path-splitting
testcase again.  The IV increment may be combined with a stmt in the
path thus more analysis would be required here.  I guess if there are
no non-virtual PHIs in the joiner in addition to <= 2 stmts then we can
be reasonably sure of no CSE possibilities.

Index: gcc/gimple-ssa-split-paths.c
===================================================================
--- gcc/gimple-ssa-split-paths.c (revision 239560)
+++ gcc/gimple-ssa-split-paths.c (working copy)
@@ -200,6 +200,12 @@ is_feasible_trace (basic_block bb)
         }
     }

+  /* If the join block has only the conditional and possibly an IV
+     increment there are no possible CSE/DCE opportunities to expose by
+     duplicating it.  */
+  if (num_stmts_in_join <= 2)
+    return false;
+
   /* We may want something here which looks at dataflow and tries
      to guess if duplication of BB is likely to result in simplification
      of instructions in BB in either the original or the duplicate.  */


Or simply look at all PHIs (no PHIs == no CSE/DCE possibility anyway) and see
if any of their results has a use in the joiner.  If not -> fail.

Index: gcc/gimple-ssa-split-paths.c
===================================================================
--- gcc/gimple-ssa-split-paths.c (revision 239560)
+++ gcc/gimple-ssa-split-paths.c (working copy)
@@ -32,6 +32,9 @@ along with GCC; see the file COPYING3.
 #include "tracer.h"
 #include "predict.h"
 #include "params.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"

 /* Given LATCH, the latch block in a loop, see if the shape of the
    path reaching LATCH is suitable for being split by duplication.
@@ -200,6 +203,34 @@ is_feasible_trace (basic_block bb)
         }
     }

+  /* If the joiner has no PHIs with uses inside it there is zero chance
+     of CSE/DCE possibilities exposed by duplicating it.  */
+  bool found_phi_with_uses_in_bb = false;
+  for (gphi_iterator si = gsi_start_phis (bb); ! gsi_end_p (si);
+       gsi_next (&si))
+    {
+      gphi *phi = si.phi ();
+      use_operand_p use_p;
+      imm_use_iterator iter;
+      FOR_EACH_IMM_USE_FAST (use_p, iter, gimple_phi_result (phi))
+        if (gimple_bb (USE_STMT (use_p)) == bb)
+          {
+            found_phi_with_uses_in_bb = true;
+            break;
+          }
+      if (found_phi_with_uses_in_bb)
+        break;
+    }
+  if (! found_phi_with_uses_in_bb)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file,
+                 "Block %d is a join that does not expose CSE/DCE "
+                 "opportunities when duplicated.\n",
+                 bb->index);
+      return false;
+    }
+
   /* We may want something here which looks at dataflow and tries
      to guess if duplication of BB is likely to result in simplification
      of instructions in BB in either the original or the duplicate.  */


This one FAILs gcc.dg/tree-ssa/split-path-7.c though.  But it looks quite
artificial (with empty if blocks, etc.).  Looking at the IL definitely only
one path is profitable to split.  Jeff?
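
The PHI-based check in the second patch can be modelled outside the compiler. The following is a minimal sketch, assuming a toy representation in which a statement is only the list of SSA names it reads; `ToyStmt`, `ToyBlock` and `has_phi_used_in_block` are hypothetical names, not GCC APIs, and virtual-operand PHIs are not modelled.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for GIMPLE: a statement is just the
// list of SSA names it reads, and a block knows the names its PHIs define.
struct ToyStmt {
  std::vector<std::string> uses;
};

struct ToyBlock {
  std::vector<std::string> phi_results;  // names defined by PHIs in this block
  std::vector<ToyStmt> stmts;            // non-PHI statements in this block
};

// Returns true if some PHI result is read by a statement in the same block.
// In the spirit of the patch, a block for which this returns false would
// not be considered worth duplicating.
bool has_phi_used_in_block(const ToyBlock& bb) {
  for (const std::string& phi : bb.phi_results)
    for (const ToyStmt& stmt : bb.stmts)
      for (const std::string& use : stmt.uses)
        if (use == phi)
          return true;
  return false;
}
```

A block whose only statement reads a name that no PHI in the block defines is rejected; a block with a statement reading one of its own PHI results is accepted.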

[Bug bootstrap/77279] build error in isl/ctx.h

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77279

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
  Component|c++ |bootstrap
 Resolution|--- |FIXED
   Target Milestone|--- |6.2
  Known to fail||6.1.0

--- Comment #5 from Richard Biener  ---
Fixed for GCC 6.2 by patching the downloaded tarball.

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |5.5

--- Comment #1 from Richard Biener  ---
I suppose it works when you remove #pragma once?

[Bug tree-optimization/77283] [7 Regression] Revision 238005 disables loop unrolling

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77283

--- Comment #6 from Richard Biener  ---
I'm reasonably happy with the PHI logic so I am going to test it.  It's quite
on the do-more-duplicating side for memory references (just uses the virtual
operand PHI to see if there are any loads/stores in the joiner).

For the case of

  if (a)
    i = i + 1;
  else
    ...;
  j = i + 1;

that is, CSE opportunities on one or both paths, this is something that PRE
already catches.  So we're not actually looking for CSE opportunities but for
opportunities for simplification or DCE/DSE.
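
At source level, duplicating the join into both arms looks roughly like the sketch below. The function names and the `i * 2` arm (standing in for the elided `...`) are assumptions made for illustration only.

```cpp
// Original shape: both arms fall into a shared join block that computes j.
int joined(bool a, int i) {
  if (a)
    i = i + 1;
  else
    i = i * 2;    // placeholder for the elided "..." arm
  int j = i + 1;  // join block
  return j;
}

// Shape after the join has been duplicated into each arm.  In the first
// arm the two additions are now adjacent and can be simplified together.
int split(bool a, int i) {
  if (a) {
    i = i + 1;
    return i + 1;
  }
  i = i * 2;
  return i + 1;
}
```

Both functions compute the same result; only the control-flow shape differs.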

[Bug tree-optimization/77282] [7 regression] test case gcc.dg/autopar/pr46193.c fails starting with r239414

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77282

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-08-18
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|--- |7.0
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.  Must have slipped through my testing.

[Bug tree-optimization/77282] [7 regression] test case gcc.dg/autopar/pr46193.c fails starting with r239414

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77282

--- Comment #2 from Richard Biener  ---
Ok, so autopar is limited by the same issues as vectorization (loop-carried
dependences via non-reduction PHIs) and thus should enable the PRE code that
inhibits such transforms.  If the testcase had vectorization enabled, it
would "work".

Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c  (revision 239560)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -4270,7 +4275,7 @@ eliminate_dom_walker::before_dom_childre
          if (sprime
              && TREE_CODE (sprime) == SSA_NAME
              && do_pre
-             && flag_tree_loop_vectorize
+             && (flag_tree_loop_vectorize || flag_tree_parallelize_loops)
              && loop_outer (b->loop_father)
              && has_zero_uses (sprime)
              && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))

fixes this.
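
To illustrate what a loop-carried dependence through a non-reduction PHI means, here is a minimal sketch. It is not the pr46193.c testcase; the function names are hypothetical, and the second function only shows the general shape a PRE-style transform can produce when it reuses a load from the previous iteration.

```cpp
#include <cstddef>
#include <vector>

// Straightforward form: every iteration reads b[i] and b[i + 1] itself,
// so iterations are independent and can run in parallel.
std::vector<int> sum_neighbours(const std::vector<int>& b) {
  std::vector<int> a(b.empty() ? 0 : b.size() - 1);
  for (std::size_t i = 0; i + 1 < b.size(); ++i)
    a[i] = b[i] + b[i + 1];
  return a;
}

// Shape with one load per iteration: the value loaded as b[i + 1] is kept
// in `prev` and reused as b[i] in the next iteration.  `prev` is a
// loop-carried value (a non-reduction PHI in SSA terms), which is the kind
// of dependence that gets in the way of vectorizing or parallelizing.
std::vector<int> sum_neighbours_carried(const std::vector<int>& b) {
  std::vector<int> a(b.empty() ? 0 : b.size() - 1);
  if (b.empty())
    return a;
  int prev = b[0];
  for (std::size_t i = 0; i + 1 < b.size(); ++i) {
    int next = b[i + 1];
    a[i] = prev + next;
    prev = next;
  }
  return a;
}
```

Both functions return the same vector; the difference is only whether a value flows from one iteration to the next.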

[Bug tree-optimization/77282] [7 regression] test case gcc.dg/autopar/pr46193.c fails starting with r239414

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77282

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Richard Biener  ---
Fixed.

--- Comment #4 from Richard Biener  ---
Author: rguenth
Date: Thu Aug 18 10:06:03 2016
New Revision: 239565

URL: https://gcc.gnu.org/viewcvs?rev=239565&root=gcc&view=rev
Log:
2016-08-18  Richard Biener  

PR tree-optimization/77282
* tree-ssa-pre.c (eliminate_dom_walker::before_dom_children):
When doing auto-parallelizing also prevent use of PHIs that
carry dependences across loop backedges.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-pre.c

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

--- Comment #2 from Jonathan Wakely  ---
No, using header guards makes no difference.

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

--- Comment #3 from Jonathan Wakely  ---
Without the header:

cat > a.cpp <<EOF
#include <string>
thread_local std::string gFeelingLucky;
EOF

cat > main.cpp <<EOF
#include <string>
extern thread_local std::string gFeelingLucky;

int main() {
 return gFeelingLucky.length();
}
EOF

g++11 main.cpp a.cpp -o main
/tmp/ccpboHS6.o: In function `TLS wrapper function for gFeelingLucky':
main.cpp:(.text._ZTW13gFeelingLucky[_ZTW13gFeelingLucky]+0x5): undefined
reference to `TLS init function for gFeelingLucky'
collect2: error: ld returned 1 exit status

[Bug tree-optimization/77286] New: [7 Regression] ICE in fold_convert_loc, at fold-const.c:2248 building 435.gromacs

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77286

Bug ID: 77286
   Summary: [7 Regression] ICE in fold_convert_loc, at
fold-const.c:2248 building 435.gromacs
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---
Target: x86_64-*-*

/home/gcc/spec/sb-czerny-head-64-2006/x86_64/install-hack/bin/../libexec/gcc/x86_64-pc-linux-gnu/7.0.0/cc1
-fpreprocessed coupling.i -quiet -dumpbase coupling.c -mavx2 -mtune=generic
-march=x86-64 -auxbase-strip coupling.o -Ofast -version -o coupling.s
coupling.c: In function 'nosehoover_tcoupl':
coupling.c:364:6: internal compiler error: in fold_convert_loc, at
fold-const.c:2248
 void nosehoover_tcoupl(t_grpopts *opts,t_groups *grps,
  ^
0x7d2f67 fold_convert_loc(unsigned int, tree_node*, tree_node*)
/gcc/spec/sb-czerny-head-64-2006/gcc/gcc/fold-const.c:2247
0x102ee2a chrec_convert_1
/gcc/spec/sb-czerny-head-64-2006/gcc/gcc/tree-chrec.c:1364
0xab6b71 interpret_rhs_expr
/gcc/spec/sb-czerny-head-64-2006/gcc/gcc/tree-scalar-evolution.c:1795
0xab460c interpret_gimple_assign
/gcc/spec/sb-czerny-head-64-2006/gcc/gcc/tree-scalar-evolution.c:2008
0xab460c analyze_scalar_evolution_1
/gcc/spec/sb-czerny-head-64-2006/gcc/gcc/tree-scalar-evolution.c:2090

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

Jonathan Wakely  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org

--- Comment #4 from Jonathan Wakely  ---
The reference to the TLS init function in main.o doesn't have the abi_tag:

nm a.o
 U _GLOBAL_OFFSET_TABLE_
 B _Z13gFeelingLuckyB5cxx11
 U _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1Ev
 U _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev
 r _ZStL19piecewise_construct
 T _ZTH13gFeelingLuckyB5cxx11
 U __cxa_thread_atexit
 U __dso_handle
0020 b __tls_guard
 t __tls_init

nm main.o
 U _GLOBAL_OFFSET_TABLE_
 U _Z13gFeelingLuckyB5cxx11
 U _ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6lengthEv
 r _ZStL19piecewise_construct
 U _ZTH13gFeelingLucky
 W _ZTW13gFeelingLucky
 T main

Compare _ZTH13gFeelingLuckyB5cxx11 and _ZTH13gFeelingLucky

The TLS wrapper function (_ZTW) also doesn't have the abi tag.
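
The division of labour between the wrapper and the init function can be imitated in portable C++ with an accessor around a function-local `thread_local`. This is only a sketch of the idea, not what GCC emits; the `sketch` namespace, `feeling_lucky` and `init_calls` are hypothetical names.

```cpp
#include <string>

namespace sketch {

// Counts how often the dynamic initialiser ran on the current thread.
thread_local int init_calls = 0;

// Plays the role of the TLS wrapper: callers never touch the variable
// directly.  They go through this accessor, which guarantees that the
// dynamic initialisation has run on the calling thread before the
// reference is handed out.
std::string& feeling_lucky() {
  thread_local std::string value = (++init_calls, std::string("lucky"));
  return value;
}

}  // namespace sketch
```

Because the accessor and the initialisation live in one function with one mangled name, there is no separate init symbol for a caller in another translation unit to get wrong.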

[Bug tree-optimization/77286] [7 Regression] ICE in fold_convert_loc, at fold-const.c:2248 building 435.gromacs

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77286

--- Comment #1 from Richard Biener  ---
Seen with r239549 and checking disabled.  Current TOT leads to

coupling.c: In function ‘nosehoover_tcoupl’:
coupling.c:364:6: error: invalid operands in binary operation
 void nosehoover_tcoupl(t_grpopts *opts,t_groups *grps,
  ^
_182 = .MEM_183 + _179;
coupling.c:364:6: error: invalid operands in gimple comparison
if (i_184 < .MEM_183)
coupling.c:364:6: error: invalid operands in binary operation
_192 = .MEM_183 + _179;
coupling.c:364:6: error: invalid PHI argument
.MEM_183
coupling.c:364:6: error: incompatible types in PHI argument 0
int

void
...

not in my dev tree though.

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

--- Comment #5 from Jonathan Wakely  ---
Reduced:

cat >a.cpp 

[Bug tree-optimization/77286] [7 Regression] ICE in fold_convert_loc, at fold-const.c:2248 building 435.gromacs

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77286

--- Comment #2 from Richard Biener  ---
Ok, looks like my if-conversion patch is at least partly responsible for the
current TOT fallout.

[Bug c/7652] -Wswitch-break : Warn if a switch case falls through

2016-08-18 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7652

--- Comment #59 from Marek Polacek  ---
Author: mpolacek
Date: Thu Aug 18 10:28:03 2016
New Revision: 239566

URL: https://gcc.gnu.org/viewcvs?rev=239566&root=gcc&view=rev
Log:
PR c/7652
gcc/cp/
* call.c (add_builtin_candidate): Add gcc_fallthrough.
* cxx-pretty-print.c (pp_cxx_unqualified_id): Likewise.
* parser.c (cp_parser_skip_to_end_of_statement): Likewise.
(cp_parser_cache_defarg): Likewise.
libcpp/
* pch.c (write_macdef): Add CPP_FALLTHRU.

Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/call.c
trunk/gcc/cp/cxx-pretty-print.c
trunk/gcc/cp/parser.c
trunk/libcpp/ChangeLog
trunk/libcpp/pch.c
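
In user code the same intent can be expressed with the standard attribute. A minimal sketch, assuming a C++17 compiler; the function `score` is a hypothetical example and is not taken from the GCC sources touched by this commit.

```cpp
// Hypothetical example: level 2 deliberately earns the level 1 points too.
int score(int level) {
  int points = 0;
  switch (level) {
    case 2:
      points += 10;
      [[fallthrough]];  // marks the fall-through as intentional
    case 1:
      points += 1;
      break;
    default:
      break;
  }
  return points;
}
```

Without the attribute, a fall-through warning would flag the transition from `case 2` to `case 1`.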

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

--- Comment #6 from Jonathan Wakely  ---
Oops, sorry that's the nm output for the wrong objects, the output for the
reduced examples is:

nm a.o 
 U _GLOBAL_OFFSET_TABLE_
 B _Z13gFeelingLuckyB3tag
 W _ZN1XB3tagD1Ev
 W _ZN1XB3tagD2Ev
 n _ZN1XB3tagD5Ev
 T _ZTH13gFeelingLuckyB3tag
 U __cxa_thread_atexit
 U __dso_handle
0004 b __tls_guard
 t __tls_init

nm main.o
 U _GLOBAL_OFFSET_TABLE_
 U _Z13gFeelingLuckyB3tag
 U _ZTH13gFeelingLucky
 W _ZTW13gFeelingLucky
 T main


The definition is _ZTH13gFeelingLuckyB3tag but the extern reference is
_ZTH13gFeelingLucky

The wrapper function should probably be tagged too, so it doesn't conflict with
a symbol for the untagged object, otherwise a library that contains both
_ZTH13gFeelingLuckyB3tag and _ZTH13gFeelingLucky would get the wrong one via
the wrapper function.
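
Judging only from the symbol names above (`_ZN1XB3tagD1Ev`, `_Z13gFeelingLuckyB3tag`), the reduced example involves a type `X` carrying the ABI tag `tag` with a non-trivial destructor. The following single-file sketch has that shape; it is an assumption, not the reporter's actual files, and the `value` member and `read_lucky` function are hypothetical additions that give the example something to observe.

```cpp
// A tagged type with a user-provided destructor, so a thread_local object
// of this type needs a per-thread dynamic step (destructor registration).
struct __attribute__((abi_tag("tag"))) X {
  int value = 7;
  ~X() {}
};

// The tag of X propagates into the mangled name of the variable.
thread_local X gFeelingLucky;

// Access from the defining translation unit.
int read_lucky() { return gFeelingLucky.value; }
```

The link failure discussed in this bug needs the `extern` declaration and the use to sit in a different translation unit from the definition, which a single file cannot show.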

[Bug c++/77287] New: Much worse code generated compared to clang (stack alignment and spills)

2016-08-18 Thread kobalicek.petr at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77287

Bug ID: 77287
   Summary: Much worse code generated compared to clang (stack
alignment and spills)
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kobalicek.petr at gmail dot com
  Target Milestone: ---

A simple function (artificial code):

#include <immintrin.h>

int fn(
  const int* px, const int* py,
  const int* pz, const int* pw,
  const int* pa, const int* pb,
  const int* pc, const int* pd) {

  __m256i a0 = _mm256_loadu_si256((__m256i*)px);
  __m256i a1 = _mm256_loadu_si256((__m256i*)py);
  __m256i a2 = _mm256_loadu_si256((__m256i*)pz);
  __m256i a3 = _mm256_loadu_si256((__m256i*)pw);
  __m256i a4 = _mm256_loadu_si256((__m256i*)pa);
  __m256i b0 = _mm256_loadu_si256((__m256i*)pb);
  __m256i b1 = _mm256_loadu_si256((__m256i*)pc);
  __m256i b2 = _mm256_loadu_si256((__m256i*)pd);
  __m256i b3 = _mm256_loadu_si256((__m256i*)pc + 1);
  __m256i b4 = _mm256_loadu_si256((__m256i*)pd + 1);

  __m256i x0 = _mm256_packus_epi16(a0, b0);
  __m256i x1 = _mm256_packus_epi16(a1, b1);
  __m256i x2 = _mm256_packus_epi16(a2, b2);
  __m256i x3 = _mm256_packus_epi16(a3, b3);
  __m256i x4 = _mm256_packus_epi16(a4, b4);

  x0 = _mm256_add_epi16(x0, a0);
  x1 = _mm256_add_epi16(x1, a1);
  x2 = _mm256_add_epi16(x2, a2);
  x3 = _mm256_add_epi16(x3, a3);
  x4 = _mm256_add_epi16(x4, a4);

  x0 = _mm256_sub_epi16(x0, b0);
  x1 = _mm256_sub_epi16(x1, b1);
  x2 = _mm256_sub_epi16(x2, b2);
  x3 = _mm256_sub_epi16(x3, b3);
  x4 = _mm256_sub_epi16(x4, b4);

  x0 = _mm256_packus_epi16(x0, x1);
  x0 = _mm256_packus_epi16(x0, x2);
  x0 = _mm256_packus_epi16(x0, x3);
  x0 = _mm256_packus_epi16(x0, x4);
  return _mm256_extract_epi32(x0, 1);
}

Produces the following asm when compiled by GCC (annotated by me):

  ; GCC 6.1 -O2 -Wall -mavx2 -m32 -fomit-frame-pointer
  lea   ecx, [esp+4]  ; Return address
  and   esp, -32  ; Align the stack to 32 bytes
  push  DWORD PTR [ecx-4] ; Push returned address
  push  ebp   ; Save frame-pointer even though I told GCC not to
  mov   ebp, esp
  push  edi   ; Save GP regs
  push  esi
  push  ebx
  push  ecx
  sub   esp, 296  ; Reserve stack for YMM spills
  mov   eax, DWORD PTR [ecx+16]   ; LOAD 'pa'
  mov   esi, DWORD PTR [ecx+4]; LOAD 'py'
  mov   edi, DWORD PTR [ecx]  ; LOAD 'px'
  mov   ebx, DWORD PTR [ecx+8]; LOAD 'pz'
  mov   edx, DWORD PTR [ecx+12]   ; LOAD 'pw'
  mov   DWORD PTR [ebp-120], eax  ; SPILL 'pa'
  mov   eax, DWORD PTR [ecx+20]   ; LOAD 'pb'
  mov   DWORD PTR [ebp-152], eax  ; SPILL 'pb'
  mov   eax, DWORD PTR [ecx+24]   ; LOAD 'pc'
  vmovdqu   ymm4, YMMWORD PTR [esi]
  mov   ecx, DWORD PTR [ecx+28]   ; LOAD 'pd'
  vmovdqu   ymm7, YMMWORD PTR [edi]
  vmovdqa   YMMWORD PTR [ebp-56], ymm4; SPILL VEC
  vmovdqu   ymm4, YMMWORD PTR [ebx]
  mov   ebx, DWORD PTR [ebp-152]  ; LOAD 'pb'
  vmovdqa   YMMWORD PTR [ebp-88], ymm4; SPILL VEC
  vmovdqu   ymm4, YMMWORD PTR [edx]
  mov   edx, DWORD PTR [ebp-120]  ; LOAD 'pa'
  vmovdqu   ymm6, YMMWORD PTR [edx]
  vmovdqa   YMMWORD PTR [ebp-120], ymm6   ; SPILL VEC
  vmovdqu   ymm0, YMMWORD PTR [ecx]
  vmovdqu   ymm6, YMMWORD PTR [ebx]
  vmovdqa   ymm5, ymm0 ; Why move anything when using AVX?
  vmovdqu   ymm0, YMMWORD PTR [eax+32]
  vmovdqu   ymm2, YMMWORD PTR [eax]
  vmovdqa   ymm1, ymm0 ; Why move anything when using AVX?
  vmovdqu   ymm0, YMMWORD PTR [ecx+32]
  vmovdqa   YMMWORD PTR [ebp-152], ymm2
  vmovdqa   ymm3, ymm0 ; Why move anything when using AVX?
  vpackuswb ymm0, ymm7, ymm6
  vmovdqa   YMMWORD PTR [ebp-184], ymm5   ; SPILL VEC
  vmovdqa   YMMWORD PTR [ebp-248], ymm3   ; SPILL VEC
  vmovdqa   YMMWORD PTR [ebp-280], ymm0   ; SPILL VEC
  vmovdqa   ymm0, YMMWORD PTR [ebp-56]; ALLOC VEC
  vmovdqa   YMMWORD PTR [ebp-216], ymm1   ; SPILL VEC
  vpackuswb ymm2, ymm0, YMMWORD PTR [ebp-152] ; Uses SPILL slot
  vmovdqa   ymm0, YMMWORD PTR [ebp-88]; ALLOC VEC
  vpackuswb ymm1, ymm4, YMMWORD PTR [ebp-216] ; Uses SPILL slot
  vpackuswb ymm5, ymm0, YMMWORD PTR [ebp-184] ; Uses SPILL slot
  vmovdqa   ymm0, YMMWORD PTR [ebp-120]   ; ALLOC VEC
  vpaddw    ymm2, ymm2, YMMWORD PTR [ebp-56]  ; Uses SPILL slot
  vpsubw    ymm2, ymm2, YMMWORD PTR [ebp-152] ; Uses SPILL slot
  vpackuswb ymm3, ymm0, YMMWORD PTR [ebp-248] ; Uses SPILL slot
  vpaddw    ymm0, ymm7, YMMWORD PTR [ebp-280] ; Uses SPILL slot
  vpsubw    ymm0, ymm0, ymm6
  vmovdqa   ymm7, YM

[Bug c++/77285] [5/6/7 Regression] extern thread_local linkage

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77285

--- Comment #7 from Jonathan Wakely  ---
The examples only work with GCC 4.9.4 because the abi_tag doesn't have any
effect before GCC 5, but it's still a regression because of the effect on
std::string variables, which are now tagged.

[Bug tree-optimization/77286] [7 Regression] ICE in fold_convert_loc, at fold-const.c:2248 building 435.gromacs

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77286

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-08-18
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
Created attachment 39468
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39468&action=edit
autoreduced testcase

[Bug target/77287] Much worse code generated compared to clang (stack alignment and spills)

2016-08-18 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77287

Andrew Pinski  changed:

   What|Removed |Added

  Component|c++ |target

--- Comment #1 from Andrew Pinski  ---
Try adding -march=intel or -mtune=intel.  The default tuning for GCC is generic,
which is a combination of Intel and AMD tuning.  Because AMD tuning needs to
avoid using GPRs and SIMD registers at the same time, spilling is faster there,
and generic tuning accounts for that.

[Bug tree-optimization/77286] [7 Regression] ICE in fold_convert_loc, at fold-const.c:2248 building 435.gromacs

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77286

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |7.0

--- Comment #4 from Richard Biener  ---
Actually it's the vectorizer messing up the IL.  Seems to be related to the
tree-ssa-loop-manip.c change in

2016-08-17  Richard Biener  

* tree-ssa.c: Include tree-cfg.h and tree-dfa.h.
(verify_vssa): New function verifying virtual SSA form.
(verify_ssa): Call it.
* tree-ssa-loop-manip.c (slpeel_update_phi_nodes_for_guard2):
Do not apply loop-closed SSA handling to virtuals.
* ssa-iterators.h (op_iter_init): Handle GIMPLE_TRANSACTION.
* tree-into-ssa.c (prepare_use_sites_for): Skip virtual SSA names
when rewriting their symbol.
(prepare_def_site_for): Likewise.
* tree-chkp-opt.c (chkp_reduce_bounds_lifetime): Clear virtual
operands of moved stmts.

[Bug target/77287] Much worse code generated compared to clang (stack alignment and spills)

2016-08-18 Thread kobalicek.petr at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77287

--- Comment #2 from Petr  ---
With '-mtune=intel' the push/pop sequence is gone, but YMM register management
remains the same - 24 memory accesses more than clang.

[Bug tree-optimization/77286] [7 Regression] ICE in fold_convert_loc, at fold-const.c:2248 building 435.gromacs

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77286

--- Comment #5 from Richard Biener  ---
Ok, again slpeel's incredibly fragile manual SSA updating, which relies on PHI
nodes appearing in exactly the same order...  which is AFAIK why I at some point
removed that lc-PHI handling bail-out I re-added.  We can't rely on update-ssa
to perform it.

[Bug target/62180] (RX600) - compiler doesn't honor -fstrict-volatile-bitfields and generates incorrect machine code for I/O register access

2016-08-18 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62180

Oleg Endo  changed:

   What|Removed |Added

 CC||olegendo at gcc dot gnu.org

--- Comment #6 from Oleg Endo  ---
(In reply to DJ Delorie from comment #4)
> Perhaps you need this patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00993.html

DJ, did you actually commit that patch?

In any case, I think this issue can be closed.  Using bitfields for hardware
access is not really a safe thing to do.  E.g. see also PR 67644.
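
A common alternative to bitfields for register access is explicit masking on a full-width volatile object, so that each access is one load or store of known width. A minimal sketch, assuming a made-up 32-bit register layout (bit 0 enables the device, bits 4 to 6 select a mode); all names and the layout are hypothetical and unrelated to the RX600 hardware.

```cpp
#include <cstdint>

// Hypothetical register layout: bit 0 = enable, bits 4..6 = mode.
constexpr std::uint32_t kEnableBit = 1u << 0;
constexpr std::uint32_t kModeShift = 4;
constexpr std::uint32_t kModeMask = 0x7u << kModeShift;

// Read-modify-write using exactly one 32-bit load and one 32-bit store.
void set_mode(volatile std::uint32_t& reg, std::uint32_t mode) {
  std::uint32_t v = reg;                                      // single load
  v = (v & ~kModeMask) | ((mode << kModeShift) & kModeMask);  // update field
  reg = v;                                                    // single store
}

// Sets the enable bit, again with one full-width load and store.
void enable(volatile std::uint32_t& reg) {
  std::uint32_t v = reg;
  reg = v | kEnableBit;
}
```

On real hardware `reg` would refer to a memory-mapped address; here any `volatile std::uint32_t` object can stand in for it.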

[Bug rtl-optimization/77287] Much worse code generated compared to clang (stack alignment and spills)

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77287

Richard Biener  changed:

   What|Removed |Added

   Keywords||ra
 Target||x86_64-*-*, i?86-*-*
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-08-18
 CC||vmakarov at gcc dot gnu.org
  Component|target  |rtl-optimization
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
-fschedule-insns improves things here - and LRA seems to be more happy to
spill/reload rather than rematerialize.  But in the end the testcase requires
careful scheduling of the operations to reduce register lifetime and thus
allow optimal RA with the limited number of registers available.

We force a frame pointer because we have to re-align the stack for possible
spills.

[Bug tree-optimization/71077] [7 Regression] gcc -lto raises ICE

2016-08-18 Thread ysrumyan at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71077

--- Comment #7 from Yuri Rumyantsev  ---
I checked that the proposed patch fixes the runtime failure (RF) for 176.gcc.

Please, go ahead and commit your patch to trunk.
Thanks.
Yuri.

2016-08-12 20:14 GMT+03:00 patrick at parcs dot ath.cx
:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71077
>
> --- Comment #6 from patrick at parcs dot ath.cx ---
> On Fri, 12 Aug 2016, ysrumyan at gmail dot com wrote:
>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71077
>>
>> Yuri Rumyantsev  changed:
>>
>>What|Removed |Added
>> 
>>  CC||ysrumyan at gmail dot com
>>
>> --- Comment #5 from Yuri Rumyantsev  ---
>> We found out that after r235653, with a minor change of int->bool type,
>> 176.gcc still fails at runtime (RF) on an HSW machine in 32-bit mode at
>> optimization level 3. If we turn off the VRP phase with the -fno-tree-vrp
>> option, the benchmark passes. We need to understand why this
>> simplification affects it.
>
> My only guess is that the combining step still doesn't handle
> VECTOR_CSTs correctly. Could you please check if this patch fixes the
> runtime failure?
>
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index 170e456..0db7bda 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -577,6 +577,7 @@ simplify_control_stmt_condition_1 (edge e,
>if (handle_dominating_asserts
>&& (cond_code == EQ_EXPR || cond_code == NE_EXPR)
>&& TREE_CODE (op0) == SSA_NAME
> +  && INTEGRAL_TYPE_P (TREE_TYPE (op0))
>&& integer_zerop (op1))
>  {
>gimple *def_stmt = SSA_NAME_DEF_STMT (op0);

[Bug tree-optimization/77283] [7 Regression] Revision 238005 disables loop unrolling

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77283

--- Comment #7 from Richard Biener  ---
Created attachment 39469
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39469&action=edit
patch

Ok, causes

FAIL: gcc.dg/tree-ssa/pr69270.c scan-tree-dump-not dom3 "bit_xor"
FAIL: gcc.dg/tree-ssa/pr69270.c scan-tree-dump-times dom3 "Folded to: _[0-9]+ = 0;" 1
FAIL: gcc.dg/tree-ssa/pr69270.c scan-tree-dump-times dom3 "Folded to: _[0-9]+ = 1;" 1
FAIL: gcc.dg/tree-ssa/pr69270.c scan-tree-dump-times dom3 "Replaced .bufferstep_[0-9]+. with constant .0." 1
FAIL: gcc.dg/tree-ssa/pr69270.c scan-tree-dump-times dom3 "Replaced .bufferstep_[0-9]+. with constant .1." 1

which is where path-splitting exposes a jump threading opportunity it seems
as jump threading is not happy to perform the operation in one go.

Also my adjustment of gcc.dg/tree-ssa/split-path-7.c was only good in my dev
tree for some reason.  Otherwise bootstrapped / tested on
x86_64-unknown-linux-gnu.

[Bug target/71338] [RL78] mulu instruction not used on G10

2016-08-18 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71338

--- Comment #6 from Oleg Endo  ---
Author: olegendo
Date: Thu Aug 18 12:14:33 2016
New Revision: 239568

URL: https://gcc.gnu.org/viewcvs?rev=239568&root=gcc&view=rev
Log:
gcc/
Backport from mainline
2016-06-17  DJ Delorie  

PR target/71338
* config/rl78/rl78-expand.c (umulqihi3): Enable for G10.
* config/rl78/rl78-virtual.c (umulhi3_shift_virt): Likewise.
(umulqihi3_virt): Likewise.
* config/rl78/rl78-real.c (umulhi3_shift_real): Likewise.
(umulqihi3_real): Likewise.


Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/rl78/rl78-expand.md
branches/gcc-5-branch/gcc/config/rl78/rl78-real.md
branches/gcc-5-branch/gcc/config/rl78/rl78-virt.md

[Bug tree-optimization/77283] [7 Regression] Revision 238005 disables loop unrolling

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77283

--- Comment #8 from Richard Biener  ---
(In reply to Richard Biener from comment #7)
> Also my adjustment of gcc.dg/tree-ssa/split-path-7.c was only good in my dev
> tree for some reason.  Otherwise bootstrapped / tested on
> x86_64-unknown-linux-gnu.

Ah, that's because my dev tree has store commoning during sinking which turns

  if ()
    *dp = ...
  else
    *dp = ...
  if (++dp)
    goto loop;

into

  if ()
    ...
  else
    ...
  *dp = ...
  if (++dp)
    goto loop;

making the block artificially interesting.
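
The effect of store commoning can be shown at source level. A minimal sketch with hypothetical function names and arbitrary arm values (`x + 1`, `x - 1`) standing in for the elided `...`:

```cpp
// Before: each arm performs its own store through dp.
void store_in_arms(bool c, int x, int* dp) {
  if (c)
    *dp = x + 1;
  else
    *dp = x - 1;
}

// After commoning: the arms only compute the value, and a single store
// sits in the join block.  The join now contains a statement that uses a
// value merged from both arms, which makes it look worth duplicating.
void store_in_join(bool c, int x, int* dp) {
  int v;
  if (c)
    v = x + 1;
  else
    v = x - 1;
  *dp = v;
}
```

Both functions leave the same value in `*dp`; only the placement of the store changes.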

[Bug target/71338] [RL78] mulu instruction not used on G10

2016-08-18 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71338

--- Comment #7 from Oleg Endo  ---
Author: olegendo
Date: Thu Aug 18 12:25:47 2016
New Revision: 239569

URL: https://gcc.gnu.org/viewcvs?rev=239569&root=gcc&view=rev
Log:
gcc/
Backport from mainline
2016-06-17  DJ Delorie  

PR target/71338
* config/rl78/rl78-expand.c (umulqihi3): Enable for G10.
* config/rl78/rl78-virtual.c (umulhi3_shift_virt): Likewise.
(umulqihi3_virt): Likewise.
* config/rl78/rl78-real.c (umulhi3_shift_real): Likewise.
(umulqihi3_real): Likewise.


Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/rl78/rl78-expand.md
branches/gcc-6-branch/gcc/config/rl78/rl78-real.md
branches/gcc-6-branch/gcc/config/rl78/rl78-virt.md

[Bug c++/59980] Diagnostics relating to template-specialisations using enums.

2016-08-18 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59980

Manuel López-Ibáñez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-08-18
 CC||manu at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #5 from Manuel López-Ibáñez  ---
Clang prints:

prog.cc:19:9: error: no member named 'fn' in 's1'; did you mean
's1::fn'?
s1::fn();   // Causes an error to be emitted that I'd like to
modify the format of.
^~
s1::fn
prog.cc:8:21: note: 's1::fn' declared here
static void fn() {}
^
prog.cc:21:9: error: no member named 'fn' in 's2'; did you mean
's2::fn'?
s2::fn();   // Causes an error to be emitted that I'd like
to modify the format of.
^~
s2::fn
prog.cc:14:21: note: 's2::fn' declared here
static void fn() {}
^

It should be possible to do this in g++ nowadays.

[Bug middle-end/49774] [meta-bug] restrict qualification aliasing issues

2016-08-18 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49774

Manuel López-Ibáñez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-08-18
 Depends on||50419, 65330
 Ever confirmed|0   |1
  Alias||restrict

--- Comment #1 from Manuel López-Ibáñez  ---
confirm so that it doesn't show up in the list of UNCONFIRMED.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50419
[Bug 50419] Bad interaction between data-ref and disambiguation with restrict
pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65330
[Bug 65330] restrict should be considered when folding through references from
global vars

[Bug target/71338] [RL78] mulu instruction not used on G10

2016-08-18 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71338

Oleg Endo  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Oleg Endo  ---
Fixed and backported.

[Bug tree-optimization/50419] Casts to restrict pointers have no effect

2016-08-18 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50419

Richard Biener  changed:

   What|Removed |Added

   Keywords||alias, missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-08-18
Summary|Bad interaction between |Casts to restrict pointers
   |data-ref and disambiguation |have no effect
   |with restrict pointers  |
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
Note that "cast-restrict" support has been removed and thus it is now expected
that only the restrict argument case is handled.

But - confirmed.  Nothing to do with dataref analysis though.

[Bug libstdc++/77288] New: Std::experimental::optional::operator= implementation is broken in gcc 6.1

2016-08-18 Thread dawid_jurek at vp dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77288

Bug ID: 77288
   Summary: Std::experimental::optional::operator= implementation
is broken in gcc 6.1
   Product: gcc
   Version: 6.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dawid_jurek at vp dot pl
  Target Milestone: ---

Created attachment 39470
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39470&action=edit
Unit tests triggering bug in operator=

0. Gcc version.

./gcc -v

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/6.1.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc-multilib/src/gcc/configure --prefix=/usr
--libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared
--enable-threads=posix --enable-libmpx --with-system-zlib --with-isl
--enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu
--disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object
--enable-linker-build-id --enable-lto --enable-plugin
--enable-install-libiberty --with-linker-hash-style=gnu
--enable-gnu-indirect-function --enable-multilib --disable-werror
--enable-checking=release
Thread model: posix
gcc version 6.1.1 20160707 (GCC) 


1. Issue.

C++ code snippet:
   
std::experimental::optional>
nested_element;
std::experimental::optional element = {};
nested_element = element;
assert(bool(nested_element));

In the above snippet the assertion passes with gcc 5.1 but fails with gcc 6.1.
Analogous code also passes with Boost.Optional.

From http://en.cppreference.com/w/cpp/experimental/optional:
"The optional object (of type T) contains a value in the following conditions:
 -   The object is initialized with a value of type T (...)"

so nested_element should contain a value which means operator= implementation
in gcc 6.1 is broken.

2. Analysis

According to http://en.cppreference.com/w/cpp/experimental/optional/operator%3D
there are 4 overloads of optional::operator=. 
We are interested in 2 versions here:
[2] optional& operator=( const optional& other );
[4] template< class U >
optional& operator=( U&& value );

Let's go back to our problematic expression:
nested_element = element;

The root cause of the problem is that with gcc 6.1 version [2] is called, while
with gcc 5.1 and Boost.Optional version [4] is called.

Looking more closely at the broken std::experimental::optional::operator= [4]
implementation, we find:

  template>,
   __not_<__is_optional<_Up>>>::value,
 bool> = true>
optional&
operator=(_Up&& __u)
{
   ...
}

which means overload resolution omits this candidate because of the following
expression in the enable_if condition:
__not_<__is_optional<_Up>> 
After the substitution failure another overload candidate (version [2]) is
chosen at compile time:

  template>>::value,
   bool> = true>
optional&
operator=(const optional<_Up>& __u)
{
if (__u)
{
...
}
else
{
this->_M_reset();
}
return *this;
}

At run time, evaluating nested_element = element enters
operator=(const optional<_Up>& __u) 
and then this->_M_reset(), because __u is nullopt. Now it's clear that the
this->_M_engaged field is not set, and finally
bool(nested_element) returns the content of _M_engaged, which is false.
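
The overload selection described in this analysis can be reproduced with a toy type. This is only a sketch under stated assumptions: `Opt`, `is_opt` and `nested_engaged_after_assigning_empty` are hypothetical names, the element type `int` is chosen arbitrarily, and the class is not the libstdc++ implementation. It only mirrors the idea of a value assignment that is disabled for optional arguments next to a converting assignment that copies the engaged state.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Toy stand-in for an optional-like type (NOT libstdc++'s class).
template <class T> class Opt;

template <class T> struct is_opt : std::false_type {};
template <class T> struct is_opt<Opt<T>> : std::true_type {};

template <class T>
class Opt {
public:
  Opt() : engaged_(false), value_() {}

  // Value assignment, removed from overload resolution when the
  // argument is itself an Opt.
  template <class U,
            typename std::enable_if<
              !is_opt<typename std::decay<U>::type>::value, bool>::type = true>
  Opt& operator=(U&& u)
  {
    value_ = std::forward<U>(u);
    engaged_ = true;
    return *this;
  }

  // Converting assignment from Opt<U>: copies the engaged state.
  template <class U,
            typename std::enable_if<
              !std::is_same<T, U>::value, bool>::type = true>
  Opt& operator=(const Opt<U>& other)
  {
    if (other)
      {
        value_ = *other;
        engaged_ = true;
      }
    else
      engaged_ = false;   // plays the role of _M_reset()
    return *this;
  }

  explicit operator bool() const { return engaged_; }
  const T& operator*() const { return value_; }

private:
  bool engaged_;
  T value_;
};

// Reports whether the outer object is engaged after an empty inner
// object has been assigned to it.
bool nested_engaged_after_assigning_empty()
{
  Opt<Opt<int>> nested_element;
  Opt<int> element;
  nested_element = element;   // picks operator=(const Opt<U>&) with U = int
  return bool(nested_element);
}
```

In this toy the value assignment drops out because the argument is an `Opt`, so the converting assignment runs and leaves the outer object disengaged, which is the behaviour the report attributes to gcc 6.1.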

3. Solution

As we saw, version [2] of operator= is not ready to handle assignment when
_Tp (the type of this) is a nested optional.
IMO the best we can do is force overload resolution to choose version [4]
in such a case, as was done in the gcc 5.1 implementation.
Based on the gcc 5.1 implementation, we can relax the overload resolution rules
for the operator= [4] version.

It's enough to replace the is_optional trait with the decay trait in
enable_if_t, from: 
__not_<__is_optional<_Up>>>::value
to:
is_same< std::decay_t<_Up>, _Tp>>::value

Now behaviour is the same as in gcc 5.1 - operator=(_Up&& __u) is chosen in 2
situations:
1. _Up is NOT optional
2. _Up is optional AND _Up is same as _Tp modulo cv-qualifiers, references etc.

I prepared impl-fix.patch for gcc 6.1 containing the above change. After applying
this patch the code snippet from the beginning of my email is fixed and the
assertion passes.

4. Tests

Unfortunately the existing libstdc++-v3 unit tests are not able to detect the bug
I describe here. That's why I prepared another patch - extra-tests.patch. 
This patch just adds 2 simple tests in the form of an extra 7.cc file under the
libstdc++-v3/testsuite/experimental/optional/assignment/ path.
With those patches the following workflow is recommended - a

[Bug libstdc++/77288] Std::experimental::optional::operator= implementation is broken in gcc 6.1

2016-08-18 Thread dawid_jurek at vp dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77288

--- Comment #1 from dawid_jurek at vp dot pl ---
Created attachment 39471
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39471&action=edit
Fix bug in operator=

[Bug libstdc++/77288] Std::experimental::optional::operator= implementation is broken in gcc 6.1

2016-08-18 Thread ville.voutilainen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77288

Ville Voutilainen  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-08-18
 CC||ville.voutilainen at gmail dot com
   Assignee|unassigned at gcc dot gnu.org  |ville.voutilainen at gmail dot com
 Ever confirmed|0   |1

--- Comment #2 from Ville Voutilainen  ---
Mine.

[Bug libstdc++/77288] Std::experimental::optional::operator= implementation is broken in gcc 6.1

2016-08-18 Thread ville.voutilainen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77288

--- Comment #3 from Ville Voutilainen  ---
>Now behaviour is the same as in gcc 5.1 - operator=(_Up&& __u) is chosen in 2 
>situations:
>1. _Up is NOT optional
>2. _Up is optional AND _Up is same as _Tp modulo cv-qualifiers, references etc.

How do you expect converting assignments to work if that signature
sfinaes away if decay<_Up> is not _Tp? As in, code like

optional os;
os = "meow";

[Bug rtl-optimization/77289] New: ICE in extract_constrain_insn, at recog.c:2212 on powerpc64

2016-08-18 Thread pthaugen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77289

Bug ID: 77289
   Summary: ICE in extract_constrain_insn, at recog.c:2212 on
powerpc64
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pthaugen at gcc dot gnu.org
CC: bergner at gcc dot gnu.org, dje.gcc at gmail dot com,
seurer at gcc dot gnu.org, wschmidt at gcc dot gnu.org
  Target Milestone: ---
  Host: powerpc64-unknown-linux-gnu
Target: powerpc64-unknown-linux-gnu
 Build: powerpc64-unknown-linux-gnu

Following ICE hit while building 456.hmmer from cpu2006. Bisected to r239105.

[pthaugen@igoo delta]$ cat modelmakers.c
typedef struct msa_struct
{
  int alen;
  int nseq;
}
MSA;
extern int Alphabet_size;
double log (double);
float FDot (float *, float *, int);
void free (void *);
void
P7Maxmodelmaker (MSA * msa, float *null)
{
  int idx;
  int i, j;
  int x;
  float **matc;
  float cij[8], tij[8];
  float insp[20];
  float insc[20];
  float *sc;
  int first, last;
  float new, bestsc;
  int code;
  for (x = 0; x < Alphabet_size; x++)
insp[x] =
  ((insp[x] / null[x]) >
   0 ? log (insp[x] / null[x]) * 1.44269504 : -.);
  for (last = msa->alen; i > 0; i--)
{
  for (idx = 0; idx < msa->nseq; j++)
{
  if (code == 1)
{
  new =
sc[j] + FDot (tij, cij, 7) + FDot (insp, insc, Alphabet_size);
}
}
}
  for (i = 1; i <= msa->alen; i++)
free (matc[i]);
}


[pthaugen@igoo delta]$ ~/install/gcc/trunk/bin/gcc -c -m32 -O3 -mcpu=power7
-funroll-loops -ffast-math modelmakers.c
modelmakers.c: In function 'P7Maxmodelmaker':
modelmakers.c:42:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 452 451 453 3 (parallel [
(set (reg:SF 61 29 [orig:258 MEM[base: _68, offset: 0B] ] [258])
(mem:SF (plus:SI (reg/f:SI 1 1)
(const_int 88 [0x58])) [1 MEM[base: _68, offset: 0B]+0
S4 A32]))
(set (reg:SI 31 31 [orig:328 ivtmp.25 ] [328])
(plus:SI (reg/f:SI 1 1)
(const_int 88 [0x58])))
]) modelmakers.c:27 617 {*movsf_update1}
 (nil))
modelmakers.c:42:1: internal compiler error: in extract_constrain_insn, at
recog.c:2212
0x107ce193 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/pthaugen/src/gcc/trunk/gcc/gcc/rtl-error.c:108
0x107ce1eb _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/pthaugen/src/gcc/trunk/gcc/gcc/rtl-error.c:119
0x10792daf extract_constrain_insn(rtx_insn*)
/home/pthaugen/src/gcc/trunk/gcc/gcc/recog.c:2212
0x1066d10b check_rtl
/home/pthaugen/src/gcc/trunk/gcc/gcc/lra.c:2108
0x106710bb lra(_IO_FILE*)
/home/pthaugen/src/gcc/trunk/gcc/gcc/lra.c:2516
0x10615317 do_reload
/home/pthaugen/src/gcc/trunk/gcc/gcc/ira.c:5385
0x10615317 execute
/home/pthaugen/src/gcc/trunk/gcc/gcc/ira.c:5569
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug middle-end/70895] OpenACC: loop reduction does not work. Output is zero.

2016-08-18 Thread cltang at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70895

--- Comment #6 from Chung-Lin Tang  ---
Author: cltang
Date: Thu Aug 18 14:46:19 2016
New Revision: 239576

URL: https://gcc.gnu.org/viewcvs?rev=239576&root=gcc&view=rev
Log:
2016-08-18  Chung-Lin Tang  

PR middle-end/70895
gcc/
* gimplify.c (omp_add_variable): Adjust/add variable mapping on
enclosing parallel construct for reduction variables on OpenACC loop
directives.

gcc/testsuite/
* gfortran.dg/goacc/loop-tree-1.f90: Add gimple scan-tree-dump test.
* c-c++-common/goacc/reduction-1.c: Likewise.
* c-c++-common/goacc/reduction-2.c: Likewise.
* c-c++-common/goacc/reduction-3.c: Likewise.
* c-c++-common/goacc/reduction-4.c: Likewise.

libgomp/
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Add explicit
firstprivate clauses.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Remove explicit
copy clauses.
* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-dbl.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-dbl.c: Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimplify.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/c-c++-common/goacc/reduction-1.c
trunk/gcc/testsuite/c-c++-common/goacc/reduction-2.c
trunk/gcc/testsuite/c-c++-common/goacc/reduction-3.c
trunk/gcc/testsuite/c-c++-common/goacc/reduction-4.c
trunk/gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90
trunk/libgomp/ChangeLog
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-2.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-4.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-dbl.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-dbl.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-flt.c
trunk/libgomp/testsuite/libgomp.oacc-fortran/reduction-6.f90
trunk/libgomp/testsuite/libgomp.oacc-fortran/reduction-7.f90

[Bug tree-optimization/77282] [7 regression] test case gcc.dg/autopar/pr46193.c fails starting with r239414

2016-08-18 Thread seurer at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77282

--- Comment #5 from Bill Seurer  ---
Just did a test run and it works.  Thanks!

[Bug middle-end/70895] OpenACC: loop reduction does not work. Output is zero.

2016-08-18 Thread cltang at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70895

--- Comment #7 from Chung-Lin Tang  ---
Author: cltang
Date: Thu Aug 18 14:56:11 2016
New Revision: 239577

URL: https://gcc.gnu.org/viewcvs?rev=239577&root=gcc&view=rev
Log:
Backport from mainline:

2016-08-18  Chung-Lin Tang  

PR middle-end/70895
gcc/
* gimplify.c (omp_add_variable): Adjust/add variable mapping on
enclosing parallel construct for reduction variables on OpenACC loop
directives.

gcc/testsuite/
* gfortran.dg/goacc/loop-tree-1.f90: Add gimple scan-tree-dump test.
* c-c++-common/goacc/reduction-1.c: Likewise.
* c-c++-common/goacc/reduction-2.c: Likewise.
* c-c++-common/goacc/reduction-3.c: Likewise.
* c-c++-common/goacc/reduction-4.c: Likewise.

libgomp/
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Add explicit
firstprivate clauses.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Remove explicit
copy clauses.
* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-dbl.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-dbl.c: Likewise.


Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/gimplify.c
branches/gcc-6-branch/gcc/testsuite/ChangeLog
branches/gcc-6-branch/gcc/testsuite/c-c++-common/goacc/reduction-1.c
branches/gcc-6-branch/gcc/testsuite/c-c++-common/goacc/reduction-2.c
branches/gcc-6-branch/gcc/testsuite/c-c++-common/goacc/reduction-3.c
branches/gcc-6-branch/gcc/testsuite/c-c++-common/goacc/reduction-4.c
branches/gcc-6-branch/gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90
branches/gcc-6-branch/libgomp/ChangeLog
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-2.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-4.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-dbl.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-dbl.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-flt.c
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-fortran/reduction-6.f90
branches/gcc-6-branch/libgomp/testsuite/libgomp.oacc-fortran/reduction-7.f90

[Bug target/72839] MOVE_RATIO is too small for Lakemont

2016-08-18 Thread hjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72839

--- Comment #1 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Thu Aug 18 14:59:46 2016
New Revision: 239578

URL: https://gcc.gnu.org/viewcvs?rev=239578&root=gcc&view=rev
Log:
Increase MOVE_RATIO to 17 for Lakemont

Larger MOVE_RATIO will always make code faster.  17 is the number with
smaller code sizes for Lakemont.

gcc/

PR target/72839
* config/i386/i386.c (lakemont_cost): Set MOVE_RATIO to 17.

gcc/testsuite/

PR target/72839
* gcc.target/i386/pr72839.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr72839.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/70895] OpenACC: loop reduction does not work. Output is zero.

2016-08-18 Thread cltang at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70895

--- Comment #8 from Chung-Lin Tang  ---
Author: cltang
Date: Thu Aug 18 15:01:50 2016
New Revision: 239579

URL: https://gcc.gnu.org/viewcvs?rev=239579&root=gcc&view=rev
Log:
Backport from mainline:

2016-08-18  Chung-Lin Tang  

PR middle-end/70895
gcc/
* gimplify.c (omp_add_variable): Adjust/add variable mapping on
enclosing parallel construct for reduction variables on OpenACC loop
directives.

gcc/testsuite/
* gfortran.dg/goacc/loop-tree-1.f90: Add gimple scan-tree-dump test.
* c-c++-common/goacc/reduction-1.c: Likewise.
* c-c++-common/goacc/reduction-2.c: Likewise.
* c-c++-common/goacc/reduction-3.c: Likewise.
* c-c++-common/goacc/reduction-4.c: Likewise.

libgomp/
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Add explicit
firstprivate clauses.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Remove explicit
copy clauses.
* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-dbl.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-dbl.c: Likewise.


Modified:
branches/gomp-4_0-branch/gcc/gimplify.c
branches/gomp-4_0-branch/gcc/testsuite/ChangeLog.gomp
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/goacc/reduction-1.c
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/goacc/reduction-2.c
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/goacc/reduction-3.c
branches/gomp-4_0-branch/gcc/testsuite/c-c++-common/goacc/reduction-4.c
branches/gomp-4_0-branch/gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90
branches/gomp-4_0-branch/libgomp/ChangeLog.gomp
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-2.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/collapse-4.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-dbl.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-dbl.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-flt.c
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-fortran/reduction-6.f90
branches/gomp-4_0-branch/libgomp/testsuite/libgomp.oacc-fortran/reduction-7.f90

[Bug c/71514] ICE on C11 code with atomic exchange at -O1 and above on x86_64-linux-gnu: in copy_reference_ops_from_ref, at tree-ssa-sccvn.c:879

2016-08-18 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71514

--- Comment #14 from Marek Polacek  ---
Author: mpolacek
Date: Thu Aug 18 16:38:49 2016
New Revision: 239581

URL: https://gcc.gnu.org/viewcvs?rev=239581&root=gcc&view=rev
Log:
PR c/71514
* c-common.c (get_atomic_generic_size): Disallow pointer-to-function
and pointer-to-VLA.

* gcc.dg/pr71514.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr71514.c
Modified:
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-common.c
trunk/gcc/testsuite/ChangeLog

[Bug c/71514] ICE on C11 code with atomic exchange at -O1 and above on x86_64-linux-gnu: in copy_reference_ops_from_ref, at tree-ssa-sccvn.c:879

2016-08-18 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71514

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Marek Polacek  ---
Fixed.

[Bug libstdc++/77288] Std::experimental::optional::operator= implementation is broken in gcc 6.1

2016-08-18 Thread dawid_jurek at vp dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77288

--- Comment #4 from dawid_jurek at vp dot pl ---
After applying my patch, the code snippet you provided compiles, runs and works
as expected. To be more precise, I'm talking about this snippet:

std::experimental::optional os;
os = "meow";
assert(bool(os) && *os == "meow");

You are right that operator=(_Up&& __u) won't be chosen here as an overload
candidate. 
However, there is a converting constructor:

template 
constexpr optional(_Up&& __t) 

and generated move constructor:

optional<_Tp>::optional(optional<_Tp>::optional<_Tp>&& )

Let's see what happens when execution reaches the expression: os = "meow" 

1. Call optional::optional<_Tp>::optional(_Up&& __t) where _Tp = std::string,
_Up = char const (&) [13]  
   for r-value "meow" and create a temporary optional containing
"meow".
2. Call the generated
optional::optional<_Tp>::optional(optional<_Tp>::optional<_Tp>&& ) where _Tp =
std::string
   and just transfer ownership from the temporary optional to os.
3. Now os is engaged, os contains "meow" and the assertion is happy. 

Regards,
Dawid

[Bug fortran/72698] [5/6/7 Regression] ICE in lhd_incomplete_type_error, at langhooks.c:205

2016-08-18 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72698

vehre at gcc dot gnu.org changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from vehre at gcc dot gnu.org ---
No regressions reported so far, closing.

[Bug fortran/71936] [6/7 Regression] ICE in wide_int_to_tree, at tree.c:1487

2016-08-18 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71936

vehre at gcc dot gnu.org changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from vehre at gcc dot gnu.org ---
No regressions reported so far, closing.

[Bug libstdc++/37986] std::tr1::variate_generator does not conform to TR1.

2016-08-18 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37986

--- Comment #7 from Jonathan Wakely  ---
(In reply to Manuel Holtgrewe from comment #0)
>   std::tr1::variate_generator<
> std::tr1::mt19937&,
> std::tr1::uniform_real
> > g(mt, dist);

This case is only fixed for -std=gnu++98 mode. With -std=c++98 there is no
reference collapsing and the std::tr1::__detail::_Adaptor class template tries
to form a reference to a reference.
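
For comparison, C++11 and later specify reference collapsing in the language itself, so a typedef of the following shape is well-formed when instantiated with a reference type. This is a minimal sketch: `Adaptor` and `Mt` are made-up stand-ins and do not reproduce the std::tr1::__detail::_Adaptor template.

```cpp
#include <type_traits>

// Hypothetical adaptor that names a reference to its engine type.
template <class Engine>
struct Adaptor {
  typedef Engine& reference;   // Engine = X& collapses to X& in C++11 and later
};

struct Mt {};   // placeholder engine type

static_assert(std::is_same<Adaptor<Mt>::reference, Mt&>::value,
              "plain type: reference is Mt&");
static_assert(std::is_same<Adaptor<Mt&>::reference, Mt&>::value,
              "reference type: Mt& & collapses to Mt&");
```

In strict C++98 mode the second instantiation would be the reference-to-reference case described above.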

From a look at the code, it appears that our variate_generator is not
conforming, because its engine_value_type is the adaptor class, not the Engine
argument.

I don't intend to fix this, I'm just noting it here for posterity.

[Bug tree-optimization/77290] New: [7 regression] test case gcc.dg/tree-ssa/pr71347.c fails starting with r239565

2016-08-18 Thread seurer at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77290

Bug ID: 77290
   Summary: [7 regression] test case gcc.dg/tree-ssa/pr71347.c
fails starting with r239565
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at linux dot vnet.ibm.com
  Target Milestone: ---

PASS: gcc.dg/tree-ssa/pr71347.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/pr71347.c scan-tree-dump-not optimized ".* = MEM.*;"

=== gcc Summary ===

# of expected passes            1
# of unexpected failures        1

This test previously passed on powerpc but began failing with r239565.  From
looking at the test results it looks like it has been failing on other
architectures for quite some time (mid June at least).

Diff of generated .s files:

23c23
<   addi 9,10,8
---
>   addi 9,10,16
30,31c30,32
<   fmul 12,0,0
<   stfd 12,.LANCHOR0+8@toc@l(8)
---
>   fmul 0,0,0
>   fmr 12,0
>   stfd 0,.LANCHOR0+8@toc@l(8)
37,38d37
<   lfd 0,0(9)
<   addi 9,9,8
40c39,40
<   stfd 0,0(9)
---
>   addi 9,9,8
>   stfd 0,-8(9)
61c61
<   .ident  "GCC: (GNU) 7.0.0 20160818 (experimental) [trunk revision
239565]"
---
>   .ident  "GCC: (GNU) 7.0.0 20160818 (experimental) [trunk revision 
> 239564]"
seurer@genoa:~/gcc/build/gcc-test3$ diff ../gcc-test2/pr71347.s pr71347.s
23c23
<   addi 9,10,16
---
>   addi 9,10,8
30,32c30,31
<   fmul 0,0,0
<   fmr 12,0
<   stfd 0,.LANCHOR0+8@toc@l(8)
---
>   fmul 12,0,0
>   stfd 12,.LANCHOR0+8@toc@l(8)
38c37
<   fmul 0,0,12
---
>   lfd 0,0(9)
40c39,40
<   stfd 0,-8(9)
---
>   fmul 0,0,12
>   stfd 0,0(9)
61c61
<   .ident  "GCC: (GNU) 7.0.0 20160818 (experimental) [trunk revision
239564]"
---
>   .ident  "GCC: (GNU) 7.0.0 20160818 (experimental) [trunk revision 
> 239565]"

[Bug libstdc++/77288] Std::experimental::optional::operator= implementation is broken in gcc 6.1

2016-08-18 Thread ville.voutilainen at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77288

--- Comment #5 from Ville Voutilainen  ---
Ah. That would indeed mean that every converting assignment introduces
a temporary. Design-wise I'd rather have it so that optional doesn't convert
at all in the assignment. :)
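
The temporary in question can be made visible with a toy wrapper. This is a sketch under assumptions: `Holder`, its `conversions` counter and `conversions_for_literal_assignment` are hypothetical, and the class is not libstdc++'s optional. It has a converting constructor but no operator=(U&&), so assigning a literal to an existing object has to build a temporary first and then use the implicitly generated move assignment.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Toy wrapper with a converting constructor and no value assignment.
struct Holder {
  static int conversions;   // how many times the converting ctor ran

  std::string value;
  bool engaged;

  Holder() : value(), engaged(false) {}

  template <class U,
            typename std::enable_if<
              std::is_constructible<std::string, U&&>::value
              && !std::is_same<typename std::decay<U>::type, Holder>::value,
              bool>::type = true>
  Holder(U&& u) : value(std::forward<U>(u)), engaged(true) { ++conversions; }
};

int Holder::conversions = 0;

// Assigns a string literal and reports how many temporaries were built.
int conversions_for_literal_assignment()
{
  Holder::conversions = 0;
  Holder os;
  os = "meow";   // Holder("meow") temporary, then implicit move assignment
  assert(os.engaged && os.value == "meow");
  return Holder::conversions;
}
```

The counter shows one converting construction per assignment, which is the cost being pointed out; a dedicated operator=(U&&) would avoid it.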

[Bug c++/71973] c++ handles built-in functions inconsistently

2016-08-18 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71973

--- Comment #3 from Bernd Edlinger  ---
Created attachment 39472
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39472&action=edit
possible patch

Oh, yeah!

Boot-strap OK.

The eh-code is OK.

BUT: the warning triggers a few hundred times, within the g++.dg testsuite,
and all of the places look like real bugs, that deserve a warning...

FAIL: g++.dg/charset/asm2.c  -std=c++98 (test for excess errors)
FAIL: g++.dg/charset/asm2.c  -std=c++11 (test for excess errors)
FAIL: g++.dg/charset/asm2.c  -std=c++14 (test for excess errors)
FAIL: g++.dg/debug/dwarf2/template-func-params-7.C  -std=gnu++11 (test for
excess errors)
FAIL: g++.dg/debug/dwarf2/template-func-params-7.C  -std=gnu++14 (test for
excess errors)
FAIL: g++.dg/addr_builtin-1.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/addr_builtin-1.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/addr_builtin-1.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-builtin1.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-builtin1.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-function2.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-function2.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-pos1.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-pos1.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/gen-attrs-40.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/cpp0x/gen-attrs-40.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/lambda-generic-udt.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/cpp1y/lambda-generic-xudt.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/diagnostic/pr70105.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/diagnostic/pr70105.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/diagnostic/pr70105.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ext/attrib40.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/ext/attrib40.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/ext/attrib40.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/init/new15.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/init/new15.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/init/new15.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/init/ref9.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/init/ref9.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/init/ref9.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/ipa/inline-1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/ipa/inline-1.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/ipa/inline-1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ipa/inline-2.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/ipa/inline-2.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/ipa/inline-2.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/lookup/builtin1.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/lookup/builtin1.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/lookup/builtin1.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/lookup/builtin3.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/lookup/builtin3.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/lookup/builtin3.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/lookup/builtin5.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/lookup/builtin5.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/lookup/builtin7.C  -std=c++98 (test for excess errors)
FAIL: g++.dg/lookup/builtin7.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/lookup/builtin7.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/lookup/extern-c-redecl4.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/lookup/extern-c-redecl4.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/lookup/extern-c-redecl4.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/cfg1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/cfg1.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/opt/cfg1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/conj1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/conj1.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/opt/conj1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/conj2.C  -std=c++11 (test for excess errors)
FAIL: g++.dg/opt/conj2.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/opt/copysign-1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/copysign-1.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/opt/copysign-1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/pr17724-5.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/pr17724-5.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/opt/pr17724-5.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/pr17724-6.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/pr17724-6.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/opt/pr17724-6.C  -std=gnu++14 (t

[Bug c++/77291] New: False positive for -Warray-bounds

2016-08-18 Thread abbeyj+gcc at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77291

Bug ID: 77291
   Summary: False positive for -Warray-bounds
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: abbeyj+gcc at gmail dot com
  Target Milestone: ---

Created attachment 39473
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39473&action=edit
Test case

Testcase:

struct Rec {
  unsigned char data[1];  // actually variable length
};

union U {
  unsigned char buf[42];
  struct Rec rec;
};

int Load()
{
  union U u;
  return u.rec.data[1];
}

==

When compiled with either of
$ gcc -c -O2 -Warray-bounds array_bounds.c
or
$ g++ -c -O2 -Warray-bounds array_bounds.cpp

produces

array_bounds.c: In function ‘Load’:
array_bounds.c:13:20: warning: array subscript is above array bounds
[-Warray-bounds]
   return u.rec.data[1];
  ~~^~~


There's an exception for accessing beyond the end of an array if that array is
the last member in a struct.  In that case it is assumed the user is using the
so-called "struct hack".  So, for instance, this doesn't warn:

int F(struct Rec *r) {
  return r->data[1];
}

But the warning *is* issued if the variable is allocated as an auto stack
variable since then the compiler knows the exact size allocated.  So this
example does warn:

int G() {
  struct Rec r;
  return r.data[1];
}

I'm assuming that the compiler is treating the test case more like this last
example since u.rec is on the stack.  But I believe the warning is incorrect
since the union should provide enough space.  Would it be possible to disable
this warning if the struct is on the stack but is also part of a union?  I
guess you could even try to figure out from the union how much space is
available and what the largest allowable index is but that seems like a lot of
effort to spend on what is going to be a rare case.

Please ignore the reading of uninitialized data in the testcase.  This can be
remedied by initializing before reading but the initialization must be done in
a separate translation unit.  If it is done in this translation unit the
optimizer causes the warning to disappear.  To keep the testcase to one file, I
left the initialization out.
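
Until the compiler changes, code shaped like the test case can reach the same bytes through the union's full-size buffer, whose declared bound covers the access. What follows is only a sketch under the test case's layout: load_via_buf and fill are hypothetical helper names, and whether this form silences -Warray-bounds on a given GCC release has not been verified here.

```c
#include <stddef.h>

struct Rec {
  unsigned char data[1];  /* actually variable length */
};

union U {
  unsigned char buf[42];
  struct Rec rec;
};

/* Hypothetical helper: read element i of the variable-length data by
   indexing the union's buffer instead of the one-element array.  */
static int load_via_buf(const union U *u, size_t i)
{
  return u->buf[offsetof(struct Rec, data) + i];
}

/* Hypothetical helper: give every byte a known value so that the
   reads above are well-defined.  */
static void fill(union U *u)
{
  for (size_t k = 0; k < sizeof u->buf; k++)
    u->buf[k] = (unsigned char)(k * 3);
}
```

The index stays within buf[42] for i up to 41 - offsetof(struct Rec, data), so the declared bound and the real storage agree.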

[Bug c/77292] New: Spurious "warning: logical not is only applied to the left hand side of comparison"

2016-08-18 Thread terra at gnome dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77292

Bug ID: 77292
   Summary: Spurious "warning: logical not is only applied to the
left hand side of comparison"
   Product: gcc
   Version: 5.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: terra at gnome dot org
  Target Milestone: ---

The warning introduced in bug 62183 is trigger-happy.

int
foo (int a, int b)
{
  // Make it obvious that these are booleans.
  a = !!a;
  b = !!b;

  return !a == b;
}


# gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

# gcc -c -Wall -O2 bbb.c 
bbb.c: In function ‘foo’:
bbb.c:8:13: warning: logical not is only applied to the left hand side of
comparison [-Wlogical-not-parentheses]
   return !a == b;
 ^
This is silly and outright illogical when "a" and "b" are booleans.
"!a == b" means the same as "!(a == b)", so the warning is pointing out
that I might have meant something else that, upon inspection, is seen
to be the same thing.

Note: by "boolean" I mean an integer with value 0 or 1.
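
The equivalence claimed above can be checked by brute force. The sketch below is not part of the report; count_disagreements is a hypothetical name. It counts the (a, b) pairs in a range for which the two readings of "!a == b" differ. The left operand is parenthesized so that the snippet itself does not trip the warning.

```c
/* Count the pairs (a, b) with lo <= a, b <= hi for which
   "(!a) == b" and "!(a == b)" give different results.  */
static int count_disagreements(int lo, int hi)
{
  int n = 0;
  for (int a = lo; a <= hi; a++)
    for (int b = lo; b <= hi; b++)
      if (((!a) == b) != !(a == b))
        n++;
  return n;
}
```

Over the range 0..1 the count is zero, which is the boolean case described in the report. Widening the range to 0..2 yields a nonzero count (for example a = 2, b = 1), which is the situation the warning is aimed at.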

[Bug middle-end/77293] New: __builtin_object_size inconsistent for multidimensional arrays

2016-08-18 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77293

Bug ID: 77293
   Summary: __builtin_object_size inconsistent for
multidimensional arrays
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

While testing a patch for bug 71831 (as per
https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01363.html) I noticed some
inconsistencies in how __builtin_object_size handles multidimensional arrays. 
The test case below shows that the built-in returns a different result
depending on whether or not the offset in the expression is a constant.  In
particular, for type 1 the built-in evaluates to 1 when the offset is a
constant but to 4 when it is not.  The manual suggests
that 4 should be expected: "If the least significant bit is clear, objects are
whole variables, if it is set, a closest surrounding subobject is considered
the object a pointer points to."
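
For reference, the two values correspond to two different readings of "the object". The sketch below only does the size arithmetic and does not call the built-in; left_in_row and left_in_array are hypothetical names, and the array is declared without static purely to keep the snippet warning-free.

```c
#include <stddef.h>

char x2x3[2][3];

/* Bytes from &x2x3[0][1] + off to the end of the row x2x3[0],
   i.e. treating the inner array as the surrounding subobject.  */
static size_t left_in_row(size_t off)
{
  return sizeof x2x3[0] - (1 + off);
}

/* Bytes from the same address to the end of the whole array.  */
static size_t left_in_array(size_t off)
{
  return sizeof x2x3 - (1 + off);
}
```

With off = 1, as in the test case, the row reading gives 1 and the whole-array reading gives 4.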

I don't need a fix for what I'm doing but it seems to me that the (type & 1)
functionality has been a source of confusion (see also bug 44384) and would be
best either deprecated or documented as less than fully reliable.  (The
documentation itself is ambiguous: what is "a closest surrounding subobject" of
an element of a multidimensional array?  Is it the higher-ranked subarray, the
entire array object, or, when the array object is a member of a structure, is
it the structure object?)

$ (set -x && cat xyz.c && for N in a 1; do /build/gcc-trunk-svn/gcc/xgcc -B
/build/gcc-trunk-svn/gcc -DN=$N -O2 xyz.c && ./a.out; done)
+ cat xyz.c
static char x2x3[2][3];

#define strcpy(d, s) \
  __builtin___strcpy_chk (d, s, __builtin_object_size (d, 1))

int main (void)
{
  int a = 1;

  strcpy (&x2x3[0][1] + N, "abc"); 
}

+ for N in a 1
+ /build/gcc-trunk-svn/gcc/xgcc -B /build/gcc-trunk-svn/gcc -DN=a -O2 xyz.c
+ ./a.out
+ for N in a 1
+ /build/gcc-trunk-svn/gcc/xgcc -B /build/gcc-trunk-svn/gcc -DN=1 -O2 xyz.c
xyz.c: In function ‘main’:
xyz.c:4:3: warning: call to __builtin___memcpy_chk will always overflow
destination buffer
   __builtin___strcpy_chk (d, s, __builtin_object_size (d, 1))
   ^~~
xyz.c:10:3: note: in expansion of macro ‘strcpy’
   strcpy (&x2x3[0][1] + N, "abc");
   ^~
+ ./a.out
*** buffer overflow detected ***: ./a.out terminated
=== Backtrace: =
/lib64/libc.so.6(+0x77d9e)[0x7f983241dd9e]
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f98324b7db7]
...

[Bug middle-end/44384] builtin_object_size_ treatment of multidimensional arrays is unexpected

2016-08-18 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44384

Martin Sebor  changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-08-19
  Component|c   |middle-end
 Ever confirmed|0   |1
  Known to fail||4.5.3, 4.8.3, 4.9.3, 5.3.0,
   ||6.1.0, 7.0

--- Comment #5 from Martin Sebor  ---
I confirm this on the basis of the test case submitted in bug 77293.  Type-1
__builtin_object_size results for multidimensional arrays are inconsistent
within the same program and incongruous with the manual.  The inconsistency
affects not just users who directly rely on the built-in to implement checking
for their own functions (other than the Glibc subset) but also users of
functions checked under _FORTIFY_SOURCE (such as strcpy, as shown in bug 77293).

[Bug middle-end/77294] New: __builtin_object_size inconsistent for member arrays

2016-08-18 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77294

Bug ID: 77294
   Summary: __builtin_object_size inconsistent for member arrays
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

Besides bug 77293, further testing of my patch for bug 71831 also revealed
that __builtin_object_size yields inconsistent results for member arrays
depending on how an element of the array is referenced (using slightly
different but equivalent expressions) and on whether or not an offset into the
array is an integer constant.  The following test case shows the inconsistency
both between the two iterations of the loop and within the first one.  The output
is expected to be consistent both between the two iterations and within
each one of them (i.e., I would expect each line of output to show the same two
numbers).  When (type & 1) is set, I would also expect to see a larger result
than when the bit is clear based on the manual saying "if [the least
significant bit] is set, a closest surrounding subobject is considered the
object a pointer points to."

$ (set -x && cat xyz.c && for N in 1 i; do /build/gcc-trunk-svn/gcc/xgcc -B
/build/gcc-trunk-svn/gcc -DN=$N -O2 xyz.c && ./a.out; done)
+ cat xyz.c
struct __attribute__ ((packed)) A { char a [3]; char b [5]; };

struct A a;

int main (void)
{
  int i = 1;
  __builtin_printf ("type 0: %zu %zu\n"
"type 1: %zu %zu\n"
"type 2: %zu %zu\n"
"type 3: %zu %zu\n",
__builtin_object_size (&a.a[0] + N, 0),
__builtin_object_size (&a.a[N] + 0, 0),
__builtin_object_size (&a.a[0] + N, 1),
__builtin_object_size (&a.a[N] + 0, 1),
__builtin_object_size (&a.a[0] + N, 2),
__builtin_object_size (&a.a[N] + 0, 2),
__builtin_object_size (&a.a[0] + N, 3),
__builtin_object_size (&a.a[N] + 0, 3));
}


+ for N in 1 i
+ /build/gcc-trunk-svn/gcc/xgcc -B /build/gcc-trunk-svn/gcc -DN=1 -O2 xyz.c
+ ./a.out
type 0: 7 7
type 1: 2 2
type 2: 7 7
type 3: 7 2
+ for N in 1 i
+ /build/gcc-trunk-svn/gcc/xgcc -B /build/gcc-trunk-svn/gcc -DN=i -O2 xyz.c
+ ./a.out
type 0: 7 7
type 1: 7 7
type 2: 7 7
type 3: 7 7
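
The 7 and the 2 in the output above match the two possible extents of the object. The sketch below reproduces only that arithmetic from the struct layout and does not call the built-in; left_in_member and left_in_struct are hypothetical names.

```c
#include <stddef.h>

struct __attribute__ ((packed)) A { char a [3]; char b [5]; };

struct A a;

/* Bytes from &a.a[n] to the end of the member array a.a.  */
static size_t left_in_member(size_t n)
{
  return sizeof a.a - n;
}

/* Bytes from &a.a[n] to the end of the enclosing struct.  */
static size_t left_in_struct(size_t n)
{
  return sizeof a - (offsetof(struct A, a) + n);
}
```

For n = 1 the member reading gives 2 and the whole-struct reading gives 7.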

[Bug target/71776] relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `__gnu_h2f_internal'

2016-08-18 Thread malithyapa at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71776

--- Comment #5 from malithyapa at gmail dot com ---
On Wed, Jul 6, 2016 at 1:39 PM ktkachov at gcc dot gnu.org <
gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71776
>
> ktkachov at gcc dot gnu.org changed:
>
>What|Removed |Added
> 
> 
>  CC||ktkachov at gcc dot
> gnu.org
>
> --- Comment #1 from ktkachov at gcc dot gnu.org ---
> Can you try the latest 4.9.3 release and if it still fails please provide
> the
> full configure line
>
> --
> You are receiving this mail because:
> You reported the bug.


I tried this on the 4.9.3 build and it still fails. I've updated the bug
with the configure line. Anything else I can do at the moment to get past
this?

Thanks,
Malith

[Bug tree-optimization/62171] restrict pointer to struct with restrict pointers parm doesn't prevent aliases

2016-08-18 Thread ncahill_alt at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62171

ncahill_alt at yahoo dot com changed:

   What|Removed |Added

 CC||ncahill_alt at yahoo dot com

--- Comment #14 from ncahill_alt at yahoo dot com ---
This test is failing for me in GCC 6.1.0 (i386).  It complains about having no
vectype.

Why that is, I don't know.  Otherwise it doesn't seem to be a problem; it
seems pretty safe to ignore, except that GCC could vectorize the loop but
doesn't.

I wonder if it would be easier just to have this be UNSUPPORTED on i386.

[Bug middle-end/77295] New: missed optimisation when copying/moving union members

2016-08-18 Thread a...@cloudius-systems.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77295

Bug ID: 77295
   Summary: missed optimisation when copying/moving union members
   Product: gcc
   Version: 6.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: a...@cloudius-systems.com
  Target Milestone: ---

Discriminated unions of class types are becoming popular; for example
std::variant<...> or std::future.

gcc doesn't optimize copies or moves of discriminated unions well:


// Will usually be a template with user-defined types
// as members of the union
struct discriminated_union {
  unsigned which;
  union {
    int v1;
    long v2;
  };
  discriminated_union(discriminated_union&&);
};

discriminated_union::discriminated_union(discriminated_union&& x) {
  which = x.which;
  switch (x.which) {
  case 1:
    v1 = x.v1;
    break;
  case 2:
    v2 = x.v2;
    break;
  }
}


compiles into

   0:   8b 06                   mov    (%rsi),%eax
   2:   89 07                   mov    %eax,(%rdi)
   4:   8b 06                   mov    (%rsi),%eax
   6:   83 f8 01                cmp    $0x1,%eax
   9:   74 1d                   je     28
   b:   83 f8 02                cmp    $0x2,%eax
   e:   75 10                   jne    20
  10:   48 8b 46 08             mov    0x8(%rsi),%rax
  14:   48 89 47 08             mov    %rax,0x8(%rdi)
  18:   c3                      retq
  19:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  20:   f3 c3                   repz retq
  22:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  28:   8b 46 08                mov    0x8(%rsi),%eax
  2b:   89 47 08                mov    %eax,0x8(%rdi)
  2e:   c3                      retq

instead of just copying the 12 bytes from (%rsi) into (%rdi); unconditionally
copying the long is cheaper than the branching.
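
To make the suggested transformation concrete, here is a plain C analogue of the two strategies. It is a sketch, not the reporter's code: copy_branchy and copy_flat are hypothetical names, the C++ move constructor is replaced by ordinary functions, and the anonymous union relies on C11.

```c
#include <string.h>

struct discriminated_union {
  unsigned which;
  union {
    int v1;
    long v2;
  };
};

/* Branching copy, mirroring the switch in the report.  */
static void copy_branchy(struct discriminated_union *dst,
                         const struct discriminated_union *src)
{
  dst->which = src->which;
  switch (src->which) {
  case 1:
    dst->v1 = src->v1;
    break;
  case 2:
    dst->v2 = src->v2;
    break;
  }
}

/* Branch-free copy: move the whole object, whichever member is active.  */
static void copy_flat(struct discriminated_union *dst,
                      const struct discriminated_union *src)
{
  memcpy(dst, src, sizeof *dst);
}
```

The two agree whenever which is 1 or 2. They differ when which holds any other value: the branching version leaves the destination payload untouched, while the flat version overwrites it. An optimizer has to prove that difference is unobservable, and the flat form is only valid when all members are trivially copyable.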