date:20190118

[Bug ipa/88900] New: [9 Regression] 502.gcc_r SPEC benchmark miscompiles with LTO and PGO

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88900

Bug ID: 88900
   Summary: [9 Regression] 502.gcc_r SPEC benchmark miscompiles
with LTO and PGO
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: needs-bisection, needs-reduction
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marxin at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Can be seen on a Ryzen machine and on Haswell with the following flags:

OPTIMIZE= -Ofast -march=native -g -flto=8
CXXOPTIMIZE  = -fpermissive
FOPTIMIZE= -std=legacy

PASS1_OPTIMIZE= -fprofile-generate
PASS2_OPTIMIZE= -fprofile-use -fprofile-correction

I'm bisecting to find the problematic revision.

[Bug ipa/88900] [9 Regression] 502.gcc_r SPEC benchmark miscompiles with LTO and PGO

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88900

Martin Liška  changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-18
  Known to work||8.2.0
 Blocks||26163
   Target Milestone|--- |9.0
 Ever confirmed|0   |1
  Known to fail||9.0


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug tree-optimization/88896] [8/9 Regression] integer overflow check optimized away

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88896

--- Comment #4 from Jakub Jelinek  ---
The compiler optimizes the program with the assumption that undefined behavior
doesn't happen.  So, e.g. it can remove loop condition if it proves that
undefined behavior happens before the last iteration and many other
possibilities.
Use -fsanitize=undefined to discover the UB if unsure (though that doesn't
catch e.g. aliasing bugs).
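
To illustrate the point about undefined behavior, here is a minimal sketch of an overflow check that the optimizer is allowed to remove, next to two well-defined alternatives. This is a hypothetical example and not the code from the bug report; the function names are made up for illustration. `__builtin_add_overflow` is the GCC/Clang builtin.

```c
#include <limits.h>
#include <stdbool.h>

/* Relies on signed overflow, which is undefined behavior: the compiler
   may assume a + 1 never wraps and fold the comparison to false.  */
bool next_overflows_ub (int a)
{
  return a + 1 < a;
}

/* Well-defined alternative: compare against the limit before adding.  */
bool next_overflows_checked (int a)
{
  return a == INT_MAX;
}

/* Well-defined alternative using the builtin: returns true when a + b
   does not fit in int; *sum receives the wrapped result.  */
bool add_overflows (int a, int b, int *sum)
{
  return __builtin_add_overflow (a, b, sum);
}
```

Building the first variant with -fsanitize=undefined and calling it with INT_MAX should report the signed overflow at run time, which is the diagnostic route suggested above.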

[Bug c++/88901] New: [9.0 Regression] ICE when using -fsanitize=pointer-compare

2019-01-18 Thread dominik.stras...@onespin-solutions.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88901

Bug ID: 88901
   Summary: [9.0 Regression] ICE when using
-fsanitize=pointer-compare
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dominik.stras...@onespin-solutions.com
  Target Milestone: ---

Created attachment 45454
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45454&action=edit
Source code triggering the crash

The attached source code yields a
t.C:4:50: internal compiler error: tree check: did not expect class 'type',
have 'type' (template_type_parm) in contains_placeholder_p, at tree.c:3795
when "-fsanitize=address -fsanitize=pointer-compare" is used.
gcc 8.2 works fine.

g++ -c -fsanitize=address -fsanitize=pointer-compare   t.C

This is g++ from git 2019/01/17

[Bug fortran/88898] [Regression 9] gomp is broken by r268045

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88898

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Richard Biener  ---
Fixed I assume.

[Bug target/88892] [8/9 Regression] Double-to-float conversion uses wrong rounding mode when followed by memcpy

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88892

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
   Target Milestone|--- |8.3

[Bug sanitizer/88901] [9 Regression] ICE when using -fsanitize=pointer-compare

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88901

Richard Biener  changed:

   What|Removed |Added

 CC||dodji at gcc dot gnu.org,
   ||dvyukov at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org,
   ||kcc at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org
  Component|c++ |sanitizer
   Target Milestone|--- |9.0
Summary|[9.0 Regression] ICE when   |[9 Regression] ICE when
   |using   |using
   |-fsanitize=pointer-compare  |-fsanitize=pointer-compare

[Bug sanitizer/88901] [9 Regression] ICE when using -fsanitize=pointer-compare

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88901

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-01-18
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Let me take a look.

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

--- Comment #8 from Jakub Jelinek  ---
Author: jakub
Date: Fri Jan 18 09:14:18 2019
New Revision: 268063

URL: https://gcc.gnu.org/viewcvs?rev=268063&root=gcc&view=rev
Log:
Reapply:
2018-12-15  Jakub Jelinek  

PR target/88489
* gcc.target/i386/avx512vl-vfixupimmsd-2.c: New test.
* gcc.target/i386/avx512vl-vfixupimmss-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/avx512vl-vfixupimmsd-2.c
trunk/gcc/testsuite/gcc.target/i386/avx512vl-vfixupimmss-2.c
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug target/88734] [8/9 Regression] AArch64's ACLE intrinsics give an ICE instead of compile error when option mismatch.

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88734

--- Comment #8 from Jakub Jelinek  ---
Author: jakub
Date: Fri Jan 18 09:15:36 2019
New Revision: 268064

URL: https://gcc.gnu.org/viewcvs?rev=268064&root=gcc&view=rev
Log:
PR target/88734
* config/arm/arm_neon.h: Fix #pragma GCC target syntax - replace
(("..."))) with ("...").

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm_neon.h

[Bug bootstrap/88714] [9 regression] bootstrap comparison failure on armv7l since r265398

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88714

--- Comment #26 from Jakub Jelinek  ---
Created attachment 45455
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45455&action=edit
gcc9-pr88714.patch

I needed a temporary solution for our distro packages and with this patch
armv7hl passes profiledbootstrap.  That said, I think preserving the
MEM_ALIAS_SET and MEM_EXPR is important for proper scheduling etc. decisions
and so it would be better to add new patterns.

[Bug target/85596] aarch64 --with-multilib-list documentation missing

2019-01-18 Thread clyon at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85596

--- Comment #2 from Christophe Lyon  ---
Author: clyon
Date: Fri Jan 18 09:20:41 2019
New Revision: 268065

URL: https://gcc.gnu.org/viewcvs?rev=268065&root=gcc&view=rev
Log:
PR target/85596 Add --with-multilib-list doc for aarch64

2019-01-18  Christophe Lyon  

PR target/85596
* doc/install.texi (with-multilib-list): Document for aarch64.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/doc/install.texi

[Bug target/85596] aarch64 --with-multilib-list documentation missing

2019-01-18 Thread clyon at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85596

Christophe Lyon  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||clyon at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon  ---
Fixed on trunk.

Since this is reported against gcc-8, shall I backport it to gcc-8-branch?

[Bug target/88734] [8 Regression] AArch64's ACLE intrinsics give an ICE instead of compile error when option mismatch.

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88734

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[8/9 Regression] AArch64's  |[8 Regression] AArch64's
   |ACLE intrinsics give an ICE |ACLE intrinsics give an ICE
   |instead of compile error|instead of compile error
   |when option mismatch.   |when option mismatch.

--- Comment #9 from Jakub Jelinek  ---
Should be fixed on the trunk now.

[Bug lto/88902] New: [9 Regression] ICE: Segmentation fault (in DFS::DFS_write_tree_body)

2019-01-18 Thread asolokha at gmx dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88902

Bug ID: 88902
   Summary: [9 Regression] ICE: Segmentation fault (in
DFS::DFS_write_tree_body)
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: GC, ice-on-valid-code, lto
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

gfortran-9.0.0-alpha20190113 snapshot (r267906) ICEs when compiling
gcc/testsuite/gfortran.dg/pr50069_2.f90 w/ -flto --param ggc-min-heapsize=0:

% powerpc-e300c3-linux-gnu-gfortran-9.0.0-alpha20190113 -flto --param
ggc-min-heapsize=0 -c gcc/testsuite/gfortran.dg/pr50069_2.f90
during IPA pass: fnsummary
gcc/testsuite/gfortran.dg/pr50069_2.f90:11: internal compiler error:
Segmentation fault
   11 | end function reverse
  | 
0xd9fad6 crash_signal
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/toplev.c:326
0xc46ef8 DFS::DFS_write_tree_body(output_block*, tree_node*, DFS::sccs*, bool)
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/lto-streamer-out.c:759
0xc4f69d DFS::DFS(output_block*, tree_node*, bool, bool, bool)
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/lto-streamer-out.c:587
0xc50620 lto_output_tree(output_block*, tree_node*, bool, bool)
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/lto-streamer-out.c:1628
0xc47ebc write_global_stream
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/lto-streamer-out.c:2511
0xc47ebc lto_output_decl_state_streams(output_block*, lto_out_decl_state*)
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/lto-streamer-out.c:2558
0xc4e504 produce_asm_for_decls()
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/lto-streamer-out.c:2898
0xcc1927 write_lto
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/passes.c:2596
0xcc5290 ipa_write_summaries_1
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/passes.c:2657
0xcc5290 ipa_write_summaries()
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/passes.c:2720
0x9813fc ipa_passes
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/cgraphunit.c:2530
0x9813fc symbol_table::compile()
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/cgraphunit.c:2618
0x983db8 symbol_table::finalize_compilation_unit()
   
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-9.0.0_alpha20190113/work/gcc-9-20190113/gcc/cgraphunit.c:2863

(While my target here is powerpc, the ICE is not target-specific.)

[Bug inline-asm/52813] %rsp in clobber list is silently ignored

2019-01-18 Thread clyon at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52813

--- Comment #9 from Christophe Lyon  ---
Author: clyon
Date: Fri Jan 18 09:57:41 2019
New Revision: 268066

URL: https://gcc.gnu.org/viewcvs?rev=268066&root=gcc&view=rev
Log:
[ARM][testsuite] follow-up to PR target/52813 and target/11807 fix.

2019-01-18  Christophe Lyon  

* gcc.target/arm/pr77904.c: Add dg-warning for sp clobber.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/arm/pr77904.c

[Bug lto/88902] [9 Regression] ICE: Segmentation fault (in DFS::DFS_write_tree_body)

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88902

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Known to work||8.2.0
   Keywords||needs-bisection
   Last reconfirmed||2019-01-18
   Host||x86_64-pc-linux-gnu
 Ever confirmed|0   |1
   Target Milestone|--- |9.0
  Known to fail||9.0

--- Comment #1 from Martin Liška  ---
Confirmed on x86_64, let me bisect that.

[Bug tree-optimization/86214] [8/9 Regression] Strongly increased stack usage

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #17 from Jakub Jelinek  ---
Author: jakub
Date: Fri Jan 18 10:07:27 2019
New Revision: 268067

URL: https://gcc.gnu.org/viewcvs?rev=268067&root=gcc&view=rev
Log:
PR tree-optimization/86214
* tree-inline.h (struct copy_body_data): Add
add_clobbers_to_eh_landing_pads member.
* tree-inline.c (add_clobbers_to_eh_landing_pad): New function.
(copy_edges_for_bb): Call it if EH edge destination is <
id->add_clobbers_to_eh_landing_pads.  Fix a comment typo.
(expand_call_inline): Set id->add_clobbers_to_eh_landing_pads
if flag_stack_reuse != SR_NONE and clear it afterwards.

* g++.dg/opt/pr86214-1.C: New test.
* g++.dg/opt/pr86214-2.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/opt/pr86214-1.C
trunk/gcc/testsuite/g++.dg/opt/pr86214-2.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-inline.c
trunk/gcc/tree-inline.h

[Bug tree-optimization/86214] [8 Regression] Strongly increased stack usage

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[8/9 Regression] Strongly   |[8 Regression] Strongly
   |increased stack usage   |increased stack usage

--- Comment #18 from Jakub Jelinek  ---
Should be fixed now on the trunk.
Likely not going to backport.

[Bug ipa/65873] Failure to inline always_inline memcpy

2019-01-18 Thread hubicka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65873

--- Comment #17 from Jan Hubicka  ---
The underlying problem of inlining across target/optimization boundary not
being fully reliable is still there. I am not quite sure how we would want to
fix it w/o allowing to attach different optimization parameters to
regions/statements within one function which would be a lot of work to
implement.

[Bug tree-optimization/86214] [8 Regression] Strongly increased stack usage

2019-01-18 Thread steinar+gcc at gunderson dot no

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #19 from Steinar H. Gunderson  ---
Thanks for fixing. IIRC we just added a noinline attribute somewhere in the
code, so we already have a workaround.
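
As a rough sketch of the kind of workaround mentioned here (the report does not say where the attribute was placed, so the functions below are hypothetical), marking a helper with a large local buffer as noinline keeps that buffer in the helper's own stack frame instead of letting the inliner merge it into every caller:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with a large local buffer.  The noinline attribute
   stops GCC from inlining it, so the 4 KiB buffer lives only in this
   function's frame and is released on return.  */
__attribute__((noinline)) static unsigned
checksum_block (const unsigned char *src, size_t n)
{
  unsigned char scratch[4096];
  if (n > sizeof scratch)
    n = sizeof scratch;
  memcpy (scratch, src, n);

  unsigned sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += scratch[i];
  return sum;
}

/* Caller that would otherwise absorb two copies of the buffer if both
   calls were inlined and their stack slots were not shared.  */
unsigned
process (const unsigned char *a, size_t na,
         const unsigned char *b, size_t nb)
{
  return checksum_block (a, na) + checksum_block (b, nb);
}
```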

[Bug tree-optimization/88903] [8/9 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-01-18
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|--- |8.3
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Mine.

[Bug tree-optimization/88903] New: [8/9 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

Bug ID: 88903
   Summary: [8/9 Regression] wrong-code with SLP vectorized shift
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

int x[1024];

void __attribute__((noinline)) foo()
{
  for (int i = 0; i < 512; ++i)
    {
      x[2*i] = x[2*i] << (i+1);
      x[2*i+1] = x[2*i+1] << (i+1);
    }
}

int main()
{
  for (int i = 0; i < 1024; ++i)
    x[i] = i;
  foo ();
  for (int i = 0; i < 1024; ++i)
    if (x[i] != i << (i/2+1))
      __builtin_abort ();
  return 0;
}

is vectorized using a scalar shift because

  /* In SLP, need to check whether the shift count is the same,
     in loops if it is a constant or invariant, it is always
     a scalar shift.  */
  if (slp_node)
    {
      vec<stmt_vec_info> stmts = SLP_TREE_SCALAR_STMTS (slp_node);
      stmt_vec_info slpstmt_info;

      FOR_EACH_VEC_ELT (stmts, k, slpstmt_info)
        {
          gassign *slpstmt = as_a <gassign *> (slpstmt_info->stmt);
          if (!operand_equal_p (gimple_assign_rhs2 (slpstmt), op1, 0))
            scalar_shift_arg = false;
        }
    }

only checks the scalar stmts covering half of the vector elements, missing
that the other two will use a different shift value.
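
The gap described above can be modelled with a small stand-alone sketch. This is a toy model, not GCC code: it assumes a group of two scalar statements per iteration and a four-lane vector (so one vector spans two loop iterations), and all function names are hypothetical.

```c
#include <stdbool.h>

enum { GROUP_SIZE = 2, VECTOR_LANES = 4 };

/* Shift count used by vector lane LANE when the vector starts at loop
   iteration ITER of the testcase, where iteration i shifts by i + 1.  */
static int lane_shift_count (int iter, int lane)
{
  return (iter + lane / GROUP_SIZE) + 1;
}

/* Models the quoted check: compares counts only across the GROUP_SIZE
   scalar statements of a single iteration.  */
bool same_count_within_group (int iter)
{
  for (int k = 1; k < GROUP_SIZE; ++k)
    if (lane_shift_count (iter, k) != lane_shift_count (iter, 0))
      return false;
  return true;
}

/* Compares counts across every lane of the vector, which is what a
   scalar (uniform) shift actually requires.  */
bool same_count_across_vector (int iter)
{
  for (int lane = 1; lane < VECTOR_LANES; ++lane)
    if (lane_shift_count (iter, lane) != lane_shift_count (iter, 0))
      return false;
  return true;
}
```

In this model the per-group check succeeds for every iteration while the per-vector check fails, which matches the observation that a scalar shift is chosen although the upper two lanes need a different count.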

[Bug sanitizer/88901] ICE when using -fsanitize=pointer-compare

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88901

Martin Liška  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|marxin at gcc dot gnu.org  |unassigned at gcc dot gnu.org
Summary|[9 Regression] ICE when |ICE when using
   |using   |-fsanitize=pointer-compare
   |-fsanitize=pointer-compare  |
  Known to fail||8.2.0, 9.0

--- Comment #2 from Martin Liška  ---
It has been there since the beginning, so it started with r255404.
Jakub, can you please take a look?

[Bug rtl-optimization/84842] [7/8/9 Regression] ICE in verify_target_availability, at sel-sched.c:1569

2019-01-18 Thread asolokha at gmx dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84842

--- Comment #16 from Arseny Solokha  ---
(In reply to Arseny Solokha from comment #15)
> Finally.

As of r267906 it doesn't ICE for me anymore, but
gcc/testsuite/gfortran.dg/dependency_36.f90 does:

% powerpc-e300c3-linux-gnu-gfortran-9.0.0-alpha20190113 -m32 -mcpu=power8 -O2
-fselective-scheduling2 -c gcc/testsuite/gfortran.dg/dependency_36.f90

[Bug ipa/88900] [9 Regression] 502.gcc_r SPEC benchmark miscompiles with LTO and PGO

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88900

Martin Liška  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #1 from Martin Liška  ---
What a surprise, started with r267883. I'll carry on bisection with --param
inline-unit-growth=40.

[Bug sanitizer/88901] ICE when using -fsanitize=pointer-compare

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88901

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Created attachment 45456
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45456&action=edit
gcc9-pr88901.patch

Untested fix.

[Bug tree-optimization/88903] [8/9 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Started with r248909.

[Bug libgomp/87835] nvptx offloading: libgomp.oacc-c-c++-common/asyncwait-1.c execution test intermittently fails at -O2

2019-01-18 Thread tschwinge at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87835

--- Comment #3 from Thomas Schwinge  ---
Created attachment 45457
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45457&action=edit
[WIP] libgomp.oacc-c-c++-common/asyncwait-1.c debug

(In reply to Tom de Vries from comment #2)
> (In reply to Tom de Vries from comment #1)
> > (In reply to Thomas Schwinge from comment #0)
> > > After r264397 "[nvptx] Remove use of CUDA unified memory in libgomp", I'm
> > > seeing (intermittently only, and only on some systems):
> > 
> > I see the failure reproduced consistently with a Quadro M1200.

Oh, good -- in a way ;-) -- that it's consistently reproducible for you.  For
me, the failure is rather rare.

> > > I have not yet analyzed what's causing this, but I have some ideas about
> > > pending patches that might cure it.

Unfortunately, the patches I've been thinking of either are on trunk already,
or can't possibly be related to this problem.

The 'async'/'wait' clauses/directives in the test case look correct.

> do you intend to address this before stage4 closes?

I'd like to, yes.


Here is my current status.


With "-O2":

[...]
  nvptx_exec: kernel main$_omp_fn$37: launch gangs=32, workers=1, vectors=32
  nvptx_exec: kernel main$_omp_fn$37: finished
  GOACC_data_end: restore mappings
  GOACC_data_end: mappings restored
[abort]

In addition to "main$_omp_fn$37", sometimes also seen with "main$_omp_fn$25",
"main$_omp_fn$29", "main$_omp_fn$33".

So far only seen with OpenACC 'kernels' constructs, but not with the very
similar 'parallel' ones earlier in the file.

For example, without "DEBUG_K":

[...]
  nvptx_exec: kernel main$_omp_fn$37: launch gangs=32, workers=1, vectors=32
  nvptx_exec: kernel main$_omp_fn$37: finished
GOACC_wait -2 1
goacc_wait -2 1
goacc_wait   1
  GOACC_data_end: restore mappings
  GOACC_data_end: mappings restored
1007 c[64] 0
1019 e[64] 13
1007 c[65] 0
1019 e[65] 13
1007 c[66] 0
1019 e[66] 13
[...]
1007 c[125] 0
1019 e[125] 13
1007 c[126] 0
1019 e[126] 13
1007 c[127] 0
1019 e[127] 13

With "DEBUG_K":

[...]
  nvptx_exec: kernel main$_omp_fn$37: launch gangs=1, workers=1, vectors=32
  nvptx_exec: kernel main$_omp_fn$37: finished
GOACC_wait -2 1
goacc_wait -2 1
goacc_wait   1
966 c[64] 0
966 c[65] 0
966 c[66] 0
[...]
966 c[125] 0
966 c[126] 0
966 c[127] 0

So, the compute kernel ("main$_omp_fn$37") doesn't find the "c" array properly
initialized, even though they're enqueued on the same 'async', so have to
execute in proper order by definition.
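
For readers unfamiliar with the ordering guarantee referred to here, the following is a minimal sketch of the pattern, not the asyncwait-1.c test case itself: a producer region and a consumer 'kernels' region are enqueued on the same async queue, so the consumer is expected to see the producer's writes. The function name and array sizes are assumptions made for illustration. Without -fopenacc the pragmas are ignored and the loops simply run in order on the host.

```c
#define N 128

/* Returns the number of elements of e that do not hold the expected
   value; 0 means the consumer saw a fully initialised c.  */
int run_async_pair (void)
{
  int c[N], e[N];
  for (int i = 0; i < N; ++i)
    {
      c[i] = -1;
      e[i] = -1;
    }

#pragma acc data copy (c[0:N], e[0:N])
  {
    /* Producer: initialise c on async queue 1.  */
#pragma acc parallel loop async (1)
    for (int i = 0; i < N; ++i)
      c[i] = 0;

    /* Consumer: enqueued on the same queue, so it must run after the
       producer has finished.  */
#pragma acc kernels async (1)
    for (int i = 0; i < N; ++i)
      e[i] = c[i] + 13;

    /* Block until everything on queue 1 has completed.  */
#pragma acc wait (1)
  }

  int bad = 0;
  for (int i = 0; i < N; ++i)
    if (e[i] != 13)
      ++bad;
  return bad;
}
```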

I've only ever seen this with the "c" array.

Sometimes that's starting already with index 0 (often seen with
"main$_omp_fn$29"), or as late as index 100 (rarely).


When running under "valgrind", repeatedly until there's an "abort", that
doesn't print anything suspicious.


Might this perhaps be a latent issue in OpenACC 'kernels' plus 'async', now
uncovered by the r264397 "[nvptx] Remove use of CUDA unified memory in libgomp"
commit?

[Bug rtl-optimization/88904] New: Basic block incorrectly skipped in jump threading.

2019-01-18 Thread matmal01 at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88904

Bug ID: 88904
   Summary: Basic block incorrectly skipped in jump threading.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: matmal01 at gcc dot gnu.org
  Target Milestone: ---

When compiling the attached code, with an arm-none-eabi cross compiler from
trunk, 

arm-none-eabi-gcc -march=armv6-m -S test.c -o test.s -Os

incorrect assembly is generated, which leads to the second assert always being
triggered.


This happens since revision r266734 which introduced a new pass running
jump-threading just after reload.

For the attached testcase this triggers a latent bug in the `thread_jump`
function.

The combine pass can modify a jump_insn so that its pattern is of the form
(parallel [
(set (pc) ...)
(clobber (scratch))])

which after reload can end up in the form 
(parallel [
(set (pc) (if_then_else (

[Bug rtl-optimization/88904] Basic block incorrectly skipped in jump threading.

2019-01-18 Thread matmal01 at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88904

--- Comment #1 from Matthew Malcomson  ---
Created attachment 45458
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45458&action=edit
Problematic testcase

[Bug target/88794] [9 Regression] fixupimm intrinsics are unusable

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88794

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jakub Jelinek  ---
Fixed now.

[Bug tree-optimization/88903] [7/8/9 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
   Target Milestone|8.3 |7.5
Summary|[8/9 Regression] wrong-code |[7/8/9 Regression]
   |with SLP vectorized shift   |wrong-code with SLP
   ||vectorized shift

--- Comment #3 from Richard Biener  ---
Variant that also fails with GCC 7 (no SLP induction support):

int x[1024];
int y[1024];
int z[1024];

void __attribute__((noinline)) foo()
{
  for (int i = 0; i < 512; ++i)
    {
      x[2*i] = x[2*i] << y[2*i];
      x[2*i+1] = x[2*i+1] << y[2*i];
      z[2*i] = y[2*i];
      z[2*i+1] = y[2*i+1];
    }
}

int main()
{
  for (int i = 0; i < 1024; ++i)
    x[i] = i, y[i] = i % 8;
  foo ();
  for (int i = 0; i < 1024; ++i)
    if (x[i] != i << ((i & ~1) % 8))
      __builtin_abort ();
  return 0;
}

[Bug lto/88902] [8/9 Regression] ICE: Segmentation fault (in DFS::DFS_write_tree_body)

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88902

Martin Liška  changed:

   What|Removed |Added

 CC||lkrupp at gcc dot gnu.org
  Known to work|8.2.0   |
Summary|[9 Regression] ICE: |[8/9 Regression] ICE:
   |Segmentation fault (in  |Segmentation fault (in
   |DFS::DFS_write_tree_body)   |DFS::DFS_write_tree_body)
  Known to fail||8.2.0

--- Comment #2 from Martin Liška  ---
So it's there since the fix of PR50069 (r244601). Thus I guess it's a Fortran
error.

[Bug tree-optimization/88903] [7/8/9 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

--- Comment #4 from Martin Liška  ---
Started with:

SVN revision: r224221
Author: rguenth
2015-06-08  Richard Biener  

* tree-vect-stmts.c (vectorizable_load): Compute the pointer
adjustment for gaps at the end of a SLP load group properly.
* tree-vect-slp.c (vect_supported_load_permutation_p): Allow
all permutations we can generate.
(vect_transform_slp_perm_load): Use the correct group-size.

* gcc.dg/vect/slp-perm-10.c: New testcase.
* gcc.dg/vect/slp-23.c: Adjust.
* gcc.dg/torture/pr53366-2.c: Also verify cross-iteration
vector pointer update.

[Bug fortran/88902] [8/9 Regression] ICE: Segmentation fault (in DFS::DFS_write_tree_body)

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88902

Martin Liška  changed:

   What|Removed |Added

   Keywords|lto, needs-bisection|
  Component|lto |fortran

--- Comment #3 from Martin Liška  ---
So the FE creates an identifier node:

$ (gdb) p debug_tree((tree_node *) 0x76d340c8)
 

which is then removed by GGC:

Old value = IDENTIFIER_NODE
New value = 2779096485
__memset_avx2_unaligned_erms () at
../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:166
166 VZEROUPPER
(gdb) bt
#0  __memset_avx2_unaligned_erms () at
../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:166
#1  0x0098f8f5 in poison_pages () at ../../gcc/ggc-page.c:2113
#2  0x0098fa79 in ggc_collect () at ../../gcc/ggc-page.c:2208
#3  0x00e66c14 in execute_one_pass (pass=) at ../../gcc/passes.c:2479
#4  0x00e66c6a in execute_pass_list_1 (pass=) at ../../gcc/passes.c:2494
#5  0x00e66cf3 in execute_pass_list (fn=0x76d35000, pass=) at ../../gcc/passes.c:2505
#6  0x00a42c3b in cgraph_node::analyze (this=) at ../../gcc/cgraphunit.c:637
#7  0x00a4448d in analyze_functions (first_time=true) at
../../gcc/cgraphunit.c:1087
#8  0x00a48d6c in symbol_table::finalize_compilation_unit
(this=0x76b56100) at ../../gcc/cgraphunit.c:2562
#9  0x00f8b7ee in compile_file () at ../../gcc/toplev.c:488
#10 0x00f8dcf1 in do_compile () at ../../gcc/toplev.c:1983
#11 0x00f8dfce in toplev::main (this=0x7fffd9de, argc=16,
argv=0x7fffdad8) at ../../gcc/toplev.c:2117
#12 0x01b75230 in main (argc=16, argv=0x7fffdad8) at
../../gcc/main.c:39

So the tree is somehow lost.

[Bug fortran/58200] Option fcheck is misleadingly located in option descriptions

2019-01-18 Thread dominiq at lps dot ens.fr

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58200

--- Comment #5 from Dominique d'Humieres  ---
RFA posted at https://gcc.gnu.org/ml/fortran/2019-01/msg00142.html.

[Bug ipa/88900] [9 Regression] 502.gcc_r SPEC benchmark miscompiles with LTO and PGO

2019-01-18 Thread hubicka at ucw dot cz

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88900

--- Comment #2 from Jan Hubicka  ---
> What a surprise, started with r267883. I'll carry on bisection with --param
> inline-unit-growth=40.

Well, I guess I can't claim that this is not a gcc bug but that it is the
benchmark that is broken :)

Honza

[Bug tree-optimization/88903] [7/8/9 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

--- Comment #5 from Richard Biener  ---
In the end a mistake of the PR48616 fix (r172638).

[Bug target/88799] [8/9 Regression] Arm -mcpu=PROCESSOR does not result in assembly directives for .arch and .arch_extension

2019-01-18 Thread rearnsha at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88799

--- Comment #2 from Richard Earnshaw  ---
Author: rearnsha
Date: Fri Jan 18 11:49:56 2019
New Revision: 268072

URL: https://gcc.gnu.org/viewcvs?rev=268072&root=gcc&view=rev
Log:
PR target/88799 Add +mp and +sec extensions to ARMv7-a

Most armv7-a implementations support a number of basic extensions to
the architecture which are not particularly important to the compiler,
but can matter if code contains inline assembly.  This patch adds
support for these extensions, based on the capabilities that GAS
already provides for the appropriate CPUs.  For the purposes of
multilib selection we ignore these extensions entirely and map the
extended architecture versions down to the base versions we already
have support for.

gcc:
PR target/88799
* config/arm/arm-cpus.in (mp): New feature.
(sec): New feature.
(fgroup ARMv7ve): Add mp and sec features.
(arch armv7-a): Add options to allow mp and sec extensions.
(cpu generic-armv7-a): Add options to allow mp and sec extensions.
(cpu cortex-a5, cpu cortex-7, cpu cortex-a9): Add mp and sec
extensions to the base architecture.
(cpu cortex-a8): Add sec extension to the base architecture.
(cpu marvell-pj4): Add mp and sec extensions to the base architecture.
* config/arm/t-aprofile (MULTILIB_MATCHES): Map all armv7-a arch
variants down to the base v7-a variant.
* config/arm/t-multilib (v7_a_arch_variants): New variable.
* doc/invoke.texi (ARM Options): Add +mp and +sec to the list
of permitted extensions for -march=armv7-a and for
-mcpu=generic-armv7-a.

testsuite:
* gcc.target/arm/multilib.exp (config "aprofile"): Add tests for
mp and sec extensions to armv7-a.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm-cpus.in
trunk/gcc/config/arm/t-aprofile
trunk/gcc/config/arm/t-multilib
trunk/gcc/doc/invoke.texi
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/arm/multilib.exp

[Bug fortran/88902] ICE: Segmentation fault (in DFS::DFS_write_tree_body)

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88902

--- Comment #4 from Jakub Jelinek  ---
The problem is that gfc_add_decl_to_parent_function is called multiple times on
 and thus for that VAR_DECL DECL_CHAIN
(decl) == decl (or in theory there could be longer loop, but any loop in
DECL_CHAIN is invalid).

[Bug fortran/88902] ICE: Segmentation fault (in DFS::DFS_write_tree_body)

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88902

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
--- gcc/fortran/trans-decl.c.jj 2019-01-16 09:35:08.0 +0100
+++ gcc/fortran/trans-decl.c2019-01-18 12:52:45.205618524 +0100
@@ -1572,13 +1572,17 @@ gfc_get_symbol_decl (gfc_symbol * sym)
  if (VAR_P (length) && DECL_FILE_SCOPE_P (length))
{
  /* Add the string length to the same context as the symbol.  */
- if (DECL_CONTEXT (sym->backend_decl) == current_function_decl)
-   gfc_add_decl_to_function (length);
- else
-   gfc_add_decl_to_parent_function (length);
+ if (DECL_CONTEXT (length) == NULL_TREE)
+   {
+ if (DECL_CONTEXT (sym->backend_decl)
+ == current_function_decl)
+   gfc_add_decl_to_function (length);
+ else
+   gfc_add_decl_to_parent_function (length);
+   }

- gcc_assert (DECL_CONTEXT (sym->backend_decl) ==
-   DECL_CONTEXT (length));
+ gcc_assert (DECL_CONTEXT (sym->backend_decl)
+ == DECL_CONTEXT (length));

  gfc_defer_symbol_init (sym);
}

fixes this.
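
The failure mode from comment #4 and the effect of the guard in the patch above can be illustrated outside GCC. What follows is a minimal sketch in plain C, assuming a singly linked chain to which declarations are prepended. `struct decl`, `struct scope`, `chain_add`, `chain_add_guarded` and `chain_has_loop` are hypothetical names for illustration, not GCC functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a declaration node.  */
struct decl {
  struct decl *chain;    /* plays the role of DECL_CHAIN */
  const void *context;   /* plays the role of DECL_CONTEXT, NULL if unset */
};

/* Hypothetical stand-in for a function scope holding a chain of decls.  */
struct scope {
  struct decl *decls;    /* head of the chain */
};

/* Unconditionally prepend D.  Calling this twice in a row with the same D
   leaves D->chain == D, i.e. a loop of length one.  */
void chain_add (struct scope *s, struct decl *d)
{
  assert (s != NULL && d != NULL);
  d->chain = s->decls;
  s->decls = d;
  d->context = s;
}

/* Prepend D only if it has not been given a context yet.  */
void chain_add_guarded (struct scope *s, struct decl *d)
{
  assert (s != NULL && d != NULL);
  if (d->context == NULL)
    chain_add (s, d);
}

/* Floyd's cycle detection: true if the chain starting at HEAD loops.  */
bool chain_has_loop (const struct decl *head)
{
  const struct decl *slow = head, *fast = head;
  while (fast != NULL && fast->chain != NULL)
    {
      slow = slow->chain;
      fast = fast->chain->chain;
      if (slow == fast)
        return true;
    }
  return false;
}
```

Adding the same node twice with `chain_add` produces the self-loop that a chain walker can never get past, while `chain_add_guarded` makes the second call a no-op.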

[Bug target/88905] New: [8/9 Regression] ICE: in decompose, at rtl.h:2253 with -mabm and __builtin_popcountll

2019-01-18 Thread zsojka at seznam dot cz

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88905

Bug ID: 88905
   Summary: [8/9 Regression] ICE: in decompose, at rtl.h:2253 with
-mabm and __builtin_popcountll
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: i686-pc-linux-gnu

Created attachment 45459
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45459&action=edit
reduced testcase

Compiler output:
$ i686-pc-linux-gnu-gcc -Og -fno-tree-ccp -mabm testcase.c
during RTL pass: cse1
testcase.c: In function 'foo':
testcase.c:16:1: internal compiler error: in decompose, at rtl.h:2266
   16 | }
  | ^
0xc79fa0 wi::int_traits >::decompose(long*,
unsigned int, std::pair const&)
/repo/gcc-trunk/gcc/rtl.h:2264
0xc79fa0 wide_int_ref_storage::wide_int_ref_storage
>(std::pair const&)
/repo/gcc-trunk/gcc/wide-int.h:1004
0xc79fa0 generic_wide_int
>::generic_wide_int >(std::pair const&)
/repo/gcc-trunk/gcc/wide-int.h:780
0xc79fa0 simplify_const_unary_operation(rtx_code, machine_mode, rtx_def*,
machine_mode)
/repo/gcc-trunk/gcc/simplify-rtx.c:1797
0xc74320 simplify_unary_operation(rtx_code, machine_mode, rtx_def*,
machine_mode)
/repo/gcc-trunk/gcc/simplify-rtx.c:873
0x16d8047 fold_rtx
/repo/gcc-trunk/gcc/cse.c:3347
0x16da79a canonicalize_insn
/repo/gcc-trunk/gcc/cse.c:4489
0x16da79a cse_insn
/repo/gcc-trunk/gcc/cse.c:4581
0x16e1838 cse_extended_basic_block
/repo/gcc-trunk/gcc/cse.c:6662
0x16e1838 cse_main
/repo/gcc-trunk/gcc/cse.c:6841
0x16e2806 rest_of_handle_cse
/repo/gcc-trunk/gcc/cse.c:7678
0x16e2806 execute
/repo/gcc-trunk/gcc/cse.c:7721
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


$ i686-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-i686/bin/i686-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-268059-checking-yes-rtl-df-extra-i686/bin/../libexec/gcc/i686-pc-linux-gnu/9.0.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --with-sysroot=/usr/i686-pc-linux-gnu
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=i686-pc-linux-gnu --with-ld=/usr/bin/i686-pc-linux-gnu-ld
--with-as=/usr/bin/i686-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-268059-checking-yes-rtl-df-extra-i686
Thread model: posix
gcc version 9.0.0 20190118 (experimental) (GCC)

[Bug fortran/88902] ICE: Segmentation fault (in DFS::DFS_write_tree_body)

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88902

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
Created attachment 45460
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45460&action=edit
gcc9-pr88902.patch

Full untested patch.

[Bug target/88906] New: wrong code with -march=k6 -minline-all-stringops -minline-stringops-dynamically -mmemcpy-strategy=libcall:-1:align and vector argument

2019-01-18 Thread zsojka at seznam dot cz

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88906

Bug ID: 88906
   Summary: wrong code with -march=k6 -minline-all-stringops
-minline-stringops-dynamically
-mmemcpy-strategy=libcall:-1:align and vector argument
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: i686-pc-linux-gnu

Created attachment 45461
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45461&action=edit
reduced testcase

Output:
$ i686-pc-linux-gnu-gcc -O -march=k6 -minline-all-stringops
-minline-stringops-dynamically -mmemcpy-strategy=libcall:-1:align testcase.c
-Wno-psabi
$ ./a.out 
Aborted

v[] contains garbage

$ i686-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-i686/bin/i686-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-268059-checking-yes-rtl-df-extra-i686/bin/../libexec/gcc/i686-pc-linux-gnu/9.0.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --with-sysroot=/usr/i686-pc-linux-gnu
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=i686-pc-linux-gnu --with-ld=/usr/bin/i686-pc-linux-gnu-ld
--with-as=/usr/bin/i686-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-268059-checking-yes-rtl-df-extra-i686
Thread model: posix
gcc version 9.0.0 20190118 (experimental) (GCC)

[Bug target/88905] [8/9 Regression] ICE: in decompose, at rtl.h:2253 with -mabm and __builtin_popcountll

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88905

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
   Target Milestone|--- |8.3

[Bug rtl-optimization/88904] [9 Regression] Basic block incorrectly skipped in jump threading.

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88904

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||arm-none-eabi
   Priority|P3  |P1
   Target Milestone|--- |9.0
Summary|Basic block incorrectly |[9 Regression] Basic block
   |skipped in jump threading.  |incorrectly skipped in jump
   ||threading.

[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
typedef __builtin_neon_qi int8x8_t __attribute__ ((__vector_size__ (8)));
void bar (int8x8_t *, int8x8_t, int8x8_t);

void
foo (int8x8_t z, int8x8_t x, int8x8_t v)
{
  bar (&v, z, x);
}

ICEs too with these options and so does:
typedef __builtin_neon_qi int8x8_t __attribute__ ((__vector_size__ (8)));
void bar (int8x8_t, int8x8_t);

void
foo (int8x8_t x, int8x8_t y)
{
  bar (y, x);
}

The last one works with x, y because the *neon_movv8qi instructions are
eliminated by fwprop1.

In any case, this looks like a target bug to me, there is nothing wrong in how
the expansion works here, saving hard registers in which the parameters are
passed into pseudos and then loading the hard registers in which callee expects
parameters from the pseudos is the usual thing.  If the *neon_movv8qi
instructions need -mfloat-abi=hard or whatever else, then guess
__builtin_neon_qi vectors can't be passed where gcc wants to pass them unless
that condition is met - if there are no instructions to move those data from/to
these registers or to/from memory, then those registers can't be really used
period.  Or just error if user does this?

[Bug c++/88907] New: Variadic template function deduction failure.

2019-01-18 Thread Peter.Georg at physik dot uni-regensburg.de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88907

Bug ID: 88907
   Summary: Variadic template function deduction failure.
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Peter.Georg at physik dot uni-regensburg.de
  Target Milestone: ---

Created attachment 45462
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45462&action=edit
Code to reproduce bug including possible work-arounds

GCC fails to compile the following code due to an allegedly ambiguous function call.

```
#include 
template
class Base
{
};

template
void error(Base const &... args)
{
}

template>...>{}>>
void error(Ts &&... args)
{
}

int main()
{
Base const a;
error(a);
}
```

The code is also attached or see: https://godbolt.org/z/GuW-YC

I.e. call to error(Base const &) is ambiguous for GCC.
In my opinion, and e.g. clang agrees with me, the first function is clearly
more specialized and thus should be chosen.

There are two possible work-arounds (both are included in the attached code):
#1: Add an extra function error that takes exactly one parameter of Type
Base const &
#2: Remove the default template parameter (SFINAE) from the second error
function

I started a discussion about this (I wasn't sure it is a bug at first) and a
related clang bug (https://bugs.llvm.org/show_bug.cgi?id=40305) here:
https://stackoverflow.com/questions/54236545/why-is-it-an-ambigious-function-call-using-gcc-template-deduction-failing

[Bug fortran/88908] New: [9 Regression] ICE in tree check: expected tree that contains ‘decl common’ structure, have ‘indirect_ref’ in gfc_conv_gfc_desc_to_cfi_desc, at fortran/trans-expr.c:4927

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88908

Bug ID: 88908
   Summary: [9 Regression] ICE in tree check: expected tree that
contains ‘decl common’ structure, have ‘indirect_ref’
in gfc_conv_gfc_desc_to_cfi_desc, at
fortran/trans-expr.c:4927
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 45463
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45463&action=edit
test-case

It's reduced from the mvapich2 package. Unfortunately one needs 2 Fortran modules:

$ gfortran accumulate_f08ts.f90 -I. -O3 -c
accumulate_f08ts.f90:39:0:

   39 | target_disp, target_count, target_datatype%MPI_VAL,
op%MPI_VAL, win%MPI_VAL)
  | 
internal compiler error: tree check: expected tree that contains ‘decl common’
structure, have ‘indirect_ref’ in gfc_conv_gfc_desc_to_cfi_desc, at
fortran/trans-expr.c:4927
0x6f1f4c tree_contains_struct_check_failed(tree_node const*,
tree_node_structure_enum, char const*, int, char const*)
/home/marxin/Programming/gcc/gcc/tree.c:9983
0x5e1447 contains_struct_check(tree_node*, tree_node_structure_enum, char
const*, int, char const*)
/home/marxin/Programming/gcc/gcc/tree.h:3290
0x5e1447 gfc_conv_gfc_desc_to_cfi_desc
/home/marxin/Programming/gcc/gcc/fortran/trans-expr.c:4927
0x8af782 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*,
gfc_expr*, vec*)
/home/marxin/Programming/gcc/gcc/fortran/trans-expr.c:5785
0x8b1f6a gfc_conv_expr(gfc_se*, gfc_expr*)
/home/marxin/Programming/gcc/gcc/fortran/trans-expr.c:8228
0x8bb3e3 gfc_trans_assignment_1
/home/marxin/Programming/gcc/gcc/fortran/trans-expr.c:10437
0x877577 trans_code
/home/marxin/Programming/gcc/gcc/fortran/trans.c:1822
0x8edb73 gfc_trans_if_1
/home/marxin/Programming/gcc/gcc/fortran/trans-stmt.c:1448
0x8f68aa gfc_trans_if(gfc_code*)
/home/marxin/Programming/gcc/gcc/fortran/trans-stmt.c:1479
0x877447 trans_code
/home/marxin/Programming/gcc/gcc/fortran/trans.c:1910
0x8a3f15 gfc_generate_function_code(gfc_namespace*)
/home/marxin/Programming/gcc/gcc/fortran/trans-decl.c:6526
0x82b58e translate_all_program_units
/home/marxin/Programming/gcc/gcc/fortran/parse.c:6134
0x82b58e gfc_parse_file()
/home/marxin/Programming/gcc/gcc/fortran/parse.c:6337
0x8745af gfc_be_parse_file
/home/marxin/Programming/gcc/gcc/fortran/f95-lang.c:204

Hope it will be possible to reproduce without source code of the modules.

[Bug fortran/88908] [9 Regression] ICE in tree check: expected tree that contains ‘decl common’ structure, have ‘indirect_ref’ in gfc_conv_gfc_desc_to_cfi_desc, at fortran/trans-expr.c:4927

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88908

Martin Liška  changed:

   What|Removed |Added

   Keywords||needs-bisection
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-18
  Known to work||8.2.0
 Ever confirmed|0   |1
  Known to fail||9.0

[Bug c++/88757] [9 Regression] GCC wrongly treats dependent name as a type when it should be treated as a value

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88757

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug c++/88334] Implement P0482R6, C++20 char8_t.

2019-01-18 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88334

--- Comment #3 from Jonathan Wakely  ---
The library patches aren't in yet.

[Bug c/51628] attribute((packed)) is unsafe in some cases (i.e. add -Waddress-of-packed-member, etc.)

2019-01-18 Thread hjl at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628

--- Comment #61 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Fri Jan 18 13:05:18 2019
New Revision: 268075

URL: https://gcc.gnu.org/viewcvs?rev=268075&root=gcc&view=rev
Log:
c-family: Update unaligned address of packed member check

Check unaligned pointer conversion and strip NOPS.

gcc/c-family/

PR c/51628
PR c/88664
* c-common.h (warn_for_address_or_pointer_of_packed_member):
Remove the boolean argument.
* c-warn.c (check_address_of_packed_member): Renamed to ...
(check_address_or_pointer_of_packed_member): This.  Also
warn pointer conversion.
(check_and_warn_address_of_packed_member): Renamed to ...
(check_and_warn_address_or_pointer_of_packed_member): This.
Also warn pointer conversion.
(warn_for_address_or_pointer_of_packed_member): Remove the
boolean argument.  Don't check pointer conversion here.

gcc/c

PR c/51628
PR c/88664
* c-typeck.c (convert_for_assignment): Update the
warn_for_address_or_pointer_of_packed_member call.

gcc/cp

PR c/51628
PR c/88664
* call.c (convert_for_arg_passing): Update the
warn_for_address_or_pointer_of_packed_member call.
* typeck.c (convert_for_assignment): Likewise.

gcc/testsuite/

PR c/51628
PR c/88664
* c-c++-common/pr51628-33.c: New test.
* c-c++-common/pr51628-35.c: New test.
* c-c++-common/pr88664-1.c: Likewise.
* c-c++-common/pr88664-2.c: Likewise.
* gcc.dg/pr51628-34.c: Likewise.

Added:
trunk/gcc/testsuite/c-c++-common/pr51628-33.c
trunk/gcc/testsuite/c-c++-common/pr51628-35.c
trunk/gcc/testsuite/c-c++-common/pr88664-1.c
trunk/gcc/testsuite/c-c++-common/pr88664-2.c
trunk/gcc/testsuite/gcc.dg/pr51628-34.c
Modified:
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-common.h
trunk/gcc/c-family/c-warn.c
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-typeck.c
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/call.c
trunk/gcc/cp/typeck.c
trunk/gcc/testsuite/ChangeLog
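
For readers unfamiliar with the warning adjusted by this commit, here is a minimal C sketch of the situation -Waddress-of-packed-member is about, next to an access that never forms a possibly misaligned `int *`. The struct and function names are hypothetical and are not taken from the new tests:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* With the packed attribute, 'value' sits at offset 1 and may therefore
   be misaligned for int.  */
struct __attribute__ ((packed)) record {
  char tag;
  int value;
};

/* Taking '&r->value' and handing it out as a plain 'int *' is the pattern
   the warning targets: a later dereference through that pointer may be
   misaligned.  Copying the bytes avoids creating such a pointer.  */
int record_value (const struct record *r)
{
  int v;
  assert (r != NULL);
  memcpy (&v, (const char *) r + offsetof (struct record, value), sizeof v);
  return v;
}
```

Reading `r->value` directly is also fine, since the compiler knows the member is packed; the trouble starts only once the address escapes as an ordinary pointer.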

[Bug c++/88664] [9 Regression] False positive -Waddress-of-packed-member

2019-01-18 Thread hjl at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88664

--- Comment #8 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Fri Jan 18 13:05:18 2019
New Revision: 268075

URL: https://gcc.gnu.org/viewcvs?rev=268075&root=gcc&view=rev
Log:
c-family: Update unaligned address of packed member check

Check unaligned pointer conversion and strip NOPS.

gcc/c-family/

PR c/51628
PR c/88664
* c-common.h (warn_for_address_or_pointer_of_packed_member):
Remove the boolean argument.
* c-warn.c (check_address_of_packed_member): Renamed to ...
(check_address_or_pointer_of_packed_member): This.  Also
warn pointer conversion.
(check_and_warn_address_of_packed_member): Renamed to ...
(check_and_warn_address_or_pointer_of_packed_member): This.
Also warn pointer conversion.
(warn_for_address_or_pointer_of_packed_member): Remove the
boolean argument.  Don't check pointer conversion here.

gcc/c

PR c/51628
PR c/88664
* c-typeck.c (convert_for_assignment): Update the
warn_for_address_or_pointer_of_packed_member call.

gcc/cp

PR c/51628
PR c/88664
* call.c (convert_for_arg_passing): Update the
warn_for_address_or_pointer_of_packed_member call.
* typeck.c (convert_for_assignment): Likewise.

gcc/testsuite/

PR c/51628
PR c/88664
* c-c++-common/pr51628-33.c: New test.
* c-c++-common/pr51628-35.c: New test.
* c-c++-common/pr88664-1.c: Likewise.
* c-c++-common/pr88664-2.c: Likewise.
* gcc.dg/pr51628-34.c: Likewise.

Added:
trunk/gcc/testsuite/c-c++-common/pr51628-33.c
trunk/gcc/testsuite/c-c++-common/pr51628-35.c
trunk/gcc/testsuite/c-c++-common/pr88664-1.c
trunk/gcc/testsuite/c-c++-common/pr88664-2.c
trunk/gcc/testsuite/gcc.dg/pr51628-34.c
Modified:
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-common.h
trunk/gcc/c-family/c-warn.c
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-typeck.c
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/call.c
trunk/gcc/cp/typeck.c
trunk/gcc/testsuite/ChangeLog

[Bug target/88906] wrong code with -march=k6 -minline-all-stringops -minline-stringops-dynamically -mmemcpy-strategy=libcall:-1:align and vector argument

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88906

Martin Liška  changed:

   What|Removed |Added

   Keywords||needs-bisection
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-18
 CC||marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Confirmed, let me bisect that.

[Bug target/88905] [8/9 Regression] ICE: in decompose, at rtl.h:2253 with -mabm and __builtin_popcountll

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88905

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-18
 CC||marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Confirmed, let me bisect that.

[Bug target/88906] wrong code with -march=k6 -minline-all-stringops -minline-stringops-dynamically -mmemcpy-strategy=libcall:-1:align and vector argument

2019-01-18 Thread zsojka at seznam dot cz

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88906

--- Comment #2 from Zdenek Sojka  ---
I was getting either wrong code with 5+, or ICE with 4.9, or unknown compiler
argument with 4.8 -> I didn't find any gcc version where this was working.

[Bug tree-optimization/88903] [7/8 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

--- Comment #7 from Richard Biener  ---
Author: rguenth
Date: Fri Jan 18 13:13:21 2019
New Revision: 268076

URL: https://gcc.gnu.org/viewcvs?rev=268076&root=gcc&view=rev
Log:
2019-01-18  Richard Biener  

PR tree-optimization/88903
* tree-vect-stmts.c (vectorizable_shift): Verify we see all
scalar stmts a SLP shift amount is composed of when detecting
shifts by scalars.

* gcc.dg/vect/pr88903-1.c: New testcase.
* gcc.dg/vect/pr88903-2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/vect/pr88903-1.c
trunk/gcc/testsuite/gcc.dg/vect/pr88903-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-stmts.c
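
The distinction the ChangeLog entry relies on, a shift of all lanes by one scalar amount versus a shift with per-lane amounts, can be sketched with scalar reference loops. This is an illustrative C sketch only. It is not the pr88903 testcase, and the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All lanes shifted by the same amount: the shape that can use a
   vector-by-scalar shift.  */
void shift_by_scalar (uint32_t *a, size_t n, unsigned amount)
{
  assert (amount < 32);
  for (size_t i = 0; i < n; i++)
    a[i] <<= amount;
}

/* Each lane has its own amount: this needs a vector-by-vector shift.  */
void shift_by_lanes (uint32_t *a, const unsigned *amounts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (amounts[i] < 32);
      a[i] <<= amounts[i];
    }
}
```

Treating per-lane amounts as if they were a single scalar gives a wrong result in every lane whose amount differs from the one chosen, which is why all of the scalar statements feeding the shift amount have to be checked.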

[Bug tree-optimization/88903] [7/8 Regression] wrong-code with SLP vectorized shift

2019-01-18 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88903

Richard Biener  changed:

   What|Removed |Added

  Known to work||9.0
Summary|[7/8/9 Regression]  |[7/8 Regression] wrong-code
   |wrong-code with SLP |with SLP vectorized shift
   |vectorized shift|
  Known to fail||7.4.0, 8.2.0

--- Comment #6 from Richard Biener  ---
Fixed on trunk so far.

[Bug c++/88664] [9 Regression] False positive -Waddress-of-packed-member

2019-01-18 Thread hjl.tools at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88664

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from H.J. Lu  ---
Fixed.

[Bug rtl-optimization/88846] [9 Regression] pr69776-2.c failure on 32 bit AIX

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88846

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
 CC||jakub at gcc dot gnu.org,
   ||vmakarov at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Seems entirely like an RA bug to me.  The MEM_ALIAS_SET on the REG_EQUIV is
correct, it is the mode of the memory the SET_DEST of the insn is going to be
stored into.  What is weird is that it is added to an instruction before it is
stored there, and if that is reasonable, it should actually only use it if it
moves the store from where it has actually been done to where it wants with no
other memory accesses that would prevent that (in this case there are ones that
conflict with it, so it must not do that).  Probably latent before though.

[Bug target/88799] [8/9 Regression] Arm -mcpu=PROCESSOR does not result in assembly directives for .arch and .arch_extension

2019-01-18 Thread rearnsha at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88799

--- Comment #3 from Richard Earnshaw  ---
Author: rearnsha
Date: Fri Jan 18 13:25:37 2019
New Revision: 268077

URL: https://gcc.gnu.org/viewcvs?rev=268077&root=gcc&view=rev
Log:
[arm] PR target/88799 Add +mp and +sec extensions to ARMv7-a (gcc-8 backport)

Most armv7-a implementations support a number of basic extensions to
the architecture which are not particularly important to the compiler,
but can matter if code contains inline assembly.  This patch adds
support for these extensions, based on the capabilities that GAS
already provides for the appropriate CPUs.  For the purposes of
multilib selection we ignore these extensions entirely and map the
extended architecture versions down to the base versions we already
have support for.

gcc:
PR target/88799
* config/arm/arm-cpus.in (mp): New feature.
(sec): New feature.
(fgroup ARMv7ve): Add mp and sec features.
(arch armv7-a): Add options to allow mp and sec extensions.
(cpu generic-armv7-a): Add options to allow mp and sec extensions.
(cpu cortex-a5, cpu cortex-a7, cpu cortex-a9): Add mp and sec
extensions to the base architecture.
(cpu cortex-a8): Add sec extension to the base architecture.
(cpu marvell-pj4): Add mp and sec extensions to the base architecture.
* config/arm/t-aprofile (MULTILIB_MATCHES): Map all armv7-a arch
variants down to the base v7-a variant.
* config/arm/t-multilib (v7_a_arch_variants): New variable.
* doc/invoke.texi (ARM Options): Add +mp and +sec to the list
of permitted extensions for -march=armv7-a and for
-mcpu=generic-armv7-a.

testsuite:
* gcc.target/arm/multilib.exp (config "aprofile"): Add tests for
mp and sec extensions to armv7-a.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/arm/arm-cpus.in
branches/gcc-8-branch/gcc/config/arm/t-aprofile
branches/gcc-8-branch/gcc/config/arm/t-multilib
branches/gcc-8-branch/gcc/doc/invoke.texi
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/testsuite/gcc.target/arm/multilib.exp

[Bug target/88799] [8/9 Regression] Arm -mcpu=PROCESSOR does not result in assembly directives for .arch and .arch_extension

2019-01-18 Thread rearnsha at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88799

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Richard Earnshaw  ---
Fixed on trunk and gcc-8 branch.

[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-18
 CC||vmakarov at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #4 from Jakub Jelinek  ---
Started with r266385, which changed the cost:
-r113: preferred VFP_REGS, alternative NO_REGS, allocno VFP_REGS
-r112: preferred VFP_REGS, alternative NO_REGS, allocno VFP_REGS
+r113: preferred GENERAL_REGS, alternative ALL_REGS, allocno ALL_REGS
+r112: preferred GENERAL_REGS, alternative ALL_REGS, allocno ALL_REGS

-  a0(r113,l0) costs: LO_REGS:34000,34000 HI_REGS:34000,34000
CALLER_SAVE_REGS:34000,34000 GENERAL_REGS:34000,34000 VFP_D0_D7_REGS:4000,4000
VFP_LO_REGS:4000,4000 VFP_HI_REGS:4000,4000 VFP_REGS:4000,4000
ALL_REGS:34000,34000 MEM:24000,24000
-  a1(r112,l0) costs: LO_REGS:34000,34000 HI_REGS:34000,34000
CALLER_SAVE_REGS:34000,34000 GENERAL_REGS:34000,34000 VFP_D0_D7_REGS:4000,4000
VFP_LO_REGS:4000,4000 VFP_HI_REGS:4000,4000 VFP_REGS:4000,4000
ALL_REGS:34000,34000 MEM:24000,24000
+  a0(r113,l0) costs: GENERAL_REGS:4000,4000 VFP_D0_D7_REGS:6,6
VFP_LO_REGS:6,6 VFP_HI_REGS:6,6 VFP_REGS:6,6
ALL_REGS:3,3 MEM:4,4
+  a1(r112,l0) costs: GENERAL_REGS:4000,4000 VFP_D0_D7_REGS:6,6
VFP_LO_REGS:6,6 VFP_HI_REGS:6,6 VFP_REGS:6,6
ALL_REGS:3,3 MEM:4,4

We have:
(insn 13 4 14 2 (set (reg:V8QI 112)
(reg:V8QI 0 r0 [ x ])) "pr88850-2.c":6:1 936 {*neon_movv8qi}
 (expr_list:REG_DEAD (reg:V8QI 0 r0 [ x ])
(nil)))
(insn 14 13 7 2 (set (reg:V8QI 113)
(reg:V8QI 2 r2 [ y ])) "pr88850-2.c":6:1 936 {*neon_movv8qi}
 (expr_list:REG_DEAD (reg:V8QI 2 r2 [ y ])
(nil)))
(insn 7 14 8 2 (set (reg:V8QI 2 r2)
(reg:V8QI 112)) "pr88850-2.c":7:3 936 {*neon_movv8qi}
 (expr_list:REG_DEAD (reg:V8QI 112)
(nil)))
(insn 8 7 9 2 (set (reg:V8QI 0 r0)
(reg:V8QI 113)) "pr88850-2.c":7:3 936 {*neon_movv8qi}
 (expr_list:REG_DEAD (reg:V8QI 113)
(nil)))
and the move instructions don't have alternatives for GPR to GPR move, that can
be done only through a VFP_REGS register.

[Bug target/88909] New: struct builtin_description doesn't support ix86_isa_flags2

2019-01-18 Thread hjl.tools at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88909

Bug ID: 88909
   Summary: struct builtin_description doesn't support
ix86_isa_flags2
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, ubizjak at gmail dot com
  Target Milestone: ---
Target: i386,x86-64

There is

struct builtin_description
{
  const HOST_WIDE_INT mask; 
  const enum insn_code icode;
  const char *const name; 
  const enum ix86_builtins code; 
  const enum rtx_code comparison;
  const int flag; 
};

Since "mask" is used for both ix86_isa_flags and ix86_isa_flags2, we wind up
with

BDESC (OPTION_MASK_ISA_PTWRITE, CODE_FOR_ptwritedi, "__builtin_ia32_ptwrite64",
IX86_BUILTIN_PTWRITE64, UNKNOWN, (int) VOID_FTYPE_UINT64)

and

static inline tree
def_builtin2 (HOST_WIDE_INT mask, const char *name,
  enum ix86_builtin_func_type tcode,
  enum ix86_builtins code)
{
  tree decl = NULL_TREE;

  if (tcode == VOID_FTYPE_UINT64)
{
  if (!TARGET_64BIT)
return decl;
  ix86_builtins_isa[(int) code].isa = OPTION_MASK_ISA_64BIT;
}
  ix86_builtins_isa[(int) code].isa2 = mask;

We should add "const HOST_WIDE_INT mask2;" to struct builtin_description
to handle it properly.
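
A minimal sketch of that proposal in self-contained C, assuming one mask per flag word. The typedef, the flag bit values and the helper `builtin_enabled_p` are hypothetical stand-ins and not the i386 back end's definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t host_wide_int;   /* stand-in for HOST_WIDE_INT */

/* Hypothetical bit values.  The two bits live in different flag words,
   so they may share the same numeric value, which is why a single
   'mask' field cannot tell them apart.  */
#define MASK_ISA_64BIT     ((host_wide_int) 1 << 0)   /* first word */
#define MASK_ISA2_PTWRITE  ((host_wide_int) 1 << 0)   /* second word */

struct builtin_description_sketch {
  host_wide_int mask;    /* bits required in ix86_isa_flags */
  host_wide_int mask2;   /* bits required in ix86_isa_flags2 */
  const char *name;
};

/* A builtin is usable when every required bit in both words is set.  */
bool builtin_enabled_p (const struct builtin_description_sketch *d,
                        host_wide_int isa_flags, host_wide_int isa_flags2)
{
  assert (d != NULL);
  return (isa_flags & d->mask) == d->mask
         && (isa_flags2 & d->mask2) == d->mask2;
}
```

With both fields present, an entry such as the ptwrite64 one could state its 64-bit requirement and its second-word requirement directly, instead of inferring the former from the function type.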

[Bug ipa/88900] [9 Regression] 502.gcc_r SPEC benchmark miscompiles with LTO and PGO

2019-01-18 Thread hjl.tools at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88900

H.J. Lu  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #3 from H.J. Lu  ---
Is this the same as PR 87214?

[Bug target/88877] rs6000 emits signed extension for unsigned int type(__floatunsidf).

2019-01-18 Thread amodra at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88877

--- Comment #12 from Alan Modra  ---
I suspect that the patch in comment #1 will break libcalls in other situations,
eg.

void f1 (int y)
{
  extern double d;
  d = y;
}
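
Why the choice of extension matters for both kinds of conversion can be shown in portable C. This is a minimal sketch with hypothetical helper names. Nothing here reproduces the rs6000 code generation itself:

```c
#include <assert.h>
#include <stdint.h>

/* Signed conversion, the shape of f1 above.  */
double from_signed (int32_t y) { return (double) y; }

/* Unsigned conversion, the case named in the bug summary.  */
double from_unsigned (uint32_t y) { return (double) y; }

/* The two ways to widen a 32-bit pattern to 64 bits.  The cast to int32_t
   relies on GCC's modulo behavior for out-of-range values.  */
int64_t sign_extend (uint32_t bits) { return (int64_t) (int32_t) bits; }
int64_t zero_extend (uint32_t bits) { return (int64_t) bits; }

/* Usage: the same bit pattern gives different doubles, so the extension
   must match the signedness of the source type.  */
void extension_demo (void)
{
  assert (from_signed (-1) == -1.0);
  assert (from_unsigned (0xffffffffu) == 4294967295.0);
  assert (sign_extend (0xffffffffu) != zero_extend (0xffffffffu));
}
```

A change that forces zero extension for every integer-to-double libcall would fix the unsigned case but would turn -1 into 4294967295 in the signed one.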

[Bug ipa/88900] [9 Regression] 502.gcc_r SPEC benchmark miscompiles with LTO and PGO

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88900

--- Comment #4 from Martin Liška  ---
(In reply to H.J. Lu from comment #3)
> Is this the same as PR 87214?

No, this one is probably related to RPO VN, I'm not finished with bisection.
And it also happens on non-avx512 targets.

[Bug target/88850] [9 Regression] Hard register coming out of expand causing reload to fail.

2019-01-18 Thread tnfchris at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88850

--- Comment #5 from Tamar Christina  ---
So yeah it seems that there are three issues here:

1) We should probably have an r -> r alternative for *neon_mov.
2) The costs are now flipped from what they were before, for some reason the
VFP regs are now way more expensive.
3) reload shouldn't have ICEd since it says

r113: preferred GENERAL_REGS, alternative ALL_REGS, allocno ALL_REGS

so it hasn't excluded ALL_REGS as an alternative, which should have either
a) used the VFP register again or
b) spilled the register since we have a m -> r and r -> m pattern.

[Bug middle-end/88587] ICE in expand_debug_locations, at cfgexpand.c:5450

2019-01-18 Thread hjl at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88587

--- Comment #14 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Fri Jan 18 14:33:46 2019
New Revision: 268079

URL: https://gcc.gnu.org/viewcvs?rev=268079&root=gcc&view=rev
Log:
Update PR middle-end/88587 tests

It is wrong to use -m32 in dg-options.  { target ia32 } should be used
instead.  Also add -fno-pic to g++.target/i386/pr88587.C since it is
invalid with PIC.

PR middle-end/88587
* g++.target/i386/pr88587.C (dg-do): Add { target ia32 }.
(dg-options): Replace -m32 with -fno-pic.
* gcc.target/i386/mvc13.c (dg-do): Add { target ia32 }.
(dg-options): Remove -m32.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.target/i386/pr88587.C
trunk/gcc/testsuite/gcc.target/i386/mvc13.c

[Bug rtl-optimization/88910] New: FAIL: gcc.target/i386/pr88414.c 1 blank line(s) in output

2019-01-18 Thread hjl.tools at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88910

Bug ID: 88910
   Summary: FAIL: gcc.target/i386/pr88414.c 1 blank line(s) in
output
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
  Target Milestone: ---

Executing on host: /export/gnu/import/git/gcc-test-intel64/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-test-intel64/bld/gcc/
/export/gnu/import/git/gcc-test-intel64/src-trunk/gcc/testsuite/gcc.target/i386/pr88414.c
 -fpic -mcmodel=medium   -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never   -O1 -ftrapv -S
-o pr88414.s(timeout = 300)
spawn -ignore SIGHUP /export/gnu/import/git/gcc-test-intel64/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-test-intel64/bld/gcc/
/export/gnu/import/git/gcc-test-intel64/src-trunk/gcc/testsuite/gcc.target/i386/pr88414.c
-fpic -mcmodel=medium -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O1 -ftrapv -S -o
pr88414.s
/export/gnu/import/git/gcc-test-intel64/src-trunk/gcc/testsuite/gcc.target/i386/pr88414.c:
In function 'foo':
/export/gnu/import/git/gcc-test-intel64/src-trunk/gcc/testsuite/gcc.target/i386/pr88414.c:15:7:
error: 'asm' operand has impossible constraints
during RTL pass: reload
/export/gnu/import/git/gcc-test-intel64/src-trunk/gcc/testsuite/gcc.target/i386/pr88414.c:25:1:
internal compiler error: Maximum number of LRA assignment passes is achieved
(30)

0xbf590d lra_assign(bool&)
../../src-trunk/gcc/lra-assigns.c:1695
0xbf02cd lra(_IO_FILE*)
../../src-trunk/gcc/lra.c:2521
0xba7e71 do_reload
../../src-trunk/gcc/ira.c:5475
0xba7e71 execute
../../src-trunk/gcc/ira.c:5659
Please submit a full bug report,
with preprocessed source if appropriate.

[Bug fortran/88899] Derived type IO in conjunction with openmp fails with invalid memory read

2019-01-18 Thread dominiq at lps dot ens.fr

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88899

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-18
 Ever confirmed|0   |1

--- Comment #3 from Dominique d'Humieres  ---
Confirmed from 7.4 up to trunk (9.0).

[Bug driver/88911] New: No "did you mean" for incorrect -dumpspecs option

2019-01-18 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88911

Bug ID: 88911
   Summary: No "did you mean" for incorrect -dumpspecs option
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: enhancement
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

tmp$ g++ q.cc -dump-spec
cc1plus: warning: unrecognized gcc debugging option: u
cc1plus: warning: unrecognized gcc debugging option: m
cc1plus: warning: unrecognized gcc debugging option: -
cc1plus: warning: unrecognized gcc debugging option: s
cc1plus: warning: unrecognized gcc debugging option: e
cc1plus: warning: unrecognized gcc debugging option: c
tmp$ g++ q.cc -dump-specs
cc1plus: warning: unrecognized gcc debugging option: u
cc1plus: warning: unrecognized gcc debugging option: m
cc1plus: warning: unrecognized gcc debugging option: -
cc1plus: warning: unrecognized gcc debugging option: s
cc1plus: warning: unrecognized gcc debugging option: e
cc1plus: warning: unrecognized gcc debugging option: c
cc1plus: warning: unrecognized gcc debugging option: s
tmp$ g++ q.cc -fdump-spec
cc1plus: error: unrecognized command line option '-fdump-spec'
tmp$ g++ q.cc -fdump-specs
cc1plus: error: unrecognized command line option '-fdump-specs'
tmp$ g++14 q.cc -fdumpspecs
g++: error: unrecognized command line option '-fdumpspecs'; did you mean
'-dumpspecs'?

Aha!

[Bug driver/88911] No "did you mean" for incorrect -dumpspecs option

2019-01-18 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88911

--- Comment #1 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #0)
> tmp$ g++14 q.cc -fdumpspecs

Oops, ignore the "14" there, it's just a shell alias I use, but I missed one
instance that I meant to change to "g++"

[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064

Jakub Jelinek  changed:

   What|Removed |Added

 CC||dje at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
  Component|libgomp |target

--- Comment #11 from Jakub Jelinek  ---
Seems to be a powerpc64le backend bug or RA bug.
Reduced testcase for -fopenacc -O1:
program reduction_3
  implicit none
  integer, parameter:: n = 10, vl = 32
  integer   :: i
  double precision  :: vresult, rv
  double precision, parameter :: e = 0.001
  double precision, dimension (n) :: array
  do i = 1, n
 array(i) = i
  end do
  rv = 0
  vresult = 0
  !$acc parallel vector_length(vl) copy(rv)
  !$acc loop reduction(max:rv) vector
  do i = 1, n
 rv = max (rv, array(i))
  end do
  !$acc end parallel
  do i = 1, n
 vresult = max (vresult, array(i))
  end do
  if (abs (rv - vresult) .ge. e) STOP 11
end program reduction_3

In *.optimized it looks all correct:
   [local count: 437450368]:
  # vect_M.23_45 = PHI 
  # ivtmp.34_3 = PHI 
  _2 = (void *) ivtmp.34_3;
  vect__28.26_44 = MEM[base: _2, offset: 0B];
  vect_M.27_34 = MAX_EXPR ;
  ivtmp.34_4 = ivtmp.34_3 + 16;
  if (ivtmp.34_4 != _25)
goto ; [80.00%]
  else
goto ; [20.00%]

   [local count: 437450371]:
  stmp_M.28_8 = .REDUC_MAX (vect_M.27_34);
  *_10 = stmp_M.28_8;
and the loop indeed iterates properly and we end up with { 10.0, 9.0 } vector
which REDUC_MAX ifn should reduce to 10.0.
During early RTL opts it also looks correct:
(insn 20 19 21 4 (parallel [
(set (reg:V2DF 134)
(smax:V2DF (vec_concat:V2DF (vec_select:DF (reg:V2DF 128 [
vect_M.23 ])
(parallel [
(const_int 1 [0x1])
]))
(vec_select:DF (reg:V2DF 128 [ vect_M.23 ])
(parallel [
(const_int 0 [0])
])))
(reg:V2DF 128 [ vect_M.23 ])))
(clobber (scratch:V2DF))
]) 1330 {vsx_reduc_smax_v2df}
 (nil))
(insn 21 20 22 4 (set (reg:DF 123 [ stmp_M.28 ])
(vec_select:DF (reg:V2DF 134)
(parallel [
(const_int 0 [0])
]))) 1219 {vsx_extract_v2df}
 (nil))
Then combine turns that into:
(insn 21 20 22 4 (parallel [
(set (reg:DF 123 [ stmp_M.28 ])
(vec_select:DF (smax:V2DF (vec_concat:V2DF (vec_select:DF
(reg:V2DF 128 [ vect_M.23 ])
(parallel [
(const_int 1 [0x1])
]))
(vec_select:DF (reg:V2DF 128 [ vect_M.23 ])
(parallel [
(const_int 0 [0])
])))
(reg:V2DF 128 [ vect_M.23 ]))
(parallel [
(const_int 1 [0x1])
])))
(clobber (scratch:DF))
]) 1336 {*vsx_reduc_smax_v2df_scalar}
 (expr_list:REG_DEAD (reg:V2DF 128 [ vect_M.23 ])
(nil)))
That is then split into:
(insn 34 20 35 4 (set (reg:DF 137)
(vec_select:DF (reg:V2DF 128 [ vect_M.23 ])
(parallel [
(const_int 1 [0x1])
]))) -1
 (nil))
(insn 35 34 22 4 (set (reg:DF 123 [ stmp_M.28 ])
(smax:DF (subreg:DF (reg:V2DF 128 [ vect_M.23 ]) 8)
(reg:DF 137))) -1
 (nil))
at which point I'm no longer sure whether it is correct.  As I said, at least
in the debugger the input to this .REDUC_MAX contains the value { 10, 9 }.
Is the vec_select extracting the second elt (i.e. 9.0), and is (subreg 8) also
the second one?
In the end, that is what happens, the resulting assembly is:
   0x186c <+32>: lxvd2x  vs0,0,r9
   0x1870 <+36>: addi    r8,r1,-16
   0x1874 <+40>: lxvd2x  vs12,0,r8
   0x1878 <+44>: xxswapd vs12,vs12
   0x187c <+48>: xvmaxdp vs0,vs12,vs0
   0x1880 <+52>: xxswapd vs0,vs0
   0x1884 <+56>: stxvd2x vs0,0,r8
   0x1888 <+60>: xxswapd vs0,vs0
   0x188c <+64>: addi    r9,r9,16
   0x1890 <+68>: bdnz    0x186c
=> 0x1894 <+72>: lfd     f12,-8(r1)
   0x1898 <+76>: xsmaxdp vs0,vs12,vs0
   0x189c <+80>: stfd    f0,0(r10)
   0x18a0 <+84>: blr
and at that point
x/2fg $r1-16
0x3fffed90: 10  9
p $vs0.v2_double
$6 = {10, 9}
p $vs12.v2_double
$7 = {8, 7}
Now, the lfd loads into f12 th

[Bug fortran/88912] New: Fortran compiler segfaults when pre-include file is not found

2019-01-18 Thread sje at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88912

Bug ID: 88912
   Summary: Fortran compiler segfaults when pre-include file is
not found
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sje at gcc dot gnu.org
  Target Milestone: ---

I am using the new -fpre-include= option with Fortran, and when the file I
am trying to pre-include does not exist, the compiler segfaults.

% install/usr/bin/gfortran -fpre-include=/tmp/foo.h -Ofast -S  x.f90
: internal compiler error: Segmentation fault
0xcb1f37 crash_signal
/home/sellcey/gcc-vect-fortran/src/gcc/gcc/toplev.c:326
0x6ca5dc load_file
/home/sellcey/gcc-vect-fortran/src/gcc/gcc/fortran/scanner.c:2481
0x6ca76b gfc_new_file()
/home/sellcey/gcc-vect-fortran/src/gcc/gcc/fortran/scanner.c:2681
0x6f0b97 gfc_init
/home/sellcey/gcc-vect-fortran/src/gcc/gcc/fortran/f95-lang.c:250
0x6164b3 lang_dependent_init
/home/sellcey/gcc-vect-fortran/src/gcc/gcc/toplev.c:1929
0x6164b3 do_compile
/home/sellcey/gcc-vect-fortran/src/gcc/gcc/toplev.c:2161
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.



% install/usr/bin/gfortran -v
Using built-in specs.
COLLECT_GCC=install/usr/bin/gfortran
COLLECT_LTO_WRAPPER=/home/sellcey/gcc-vect-fortran/install/usr/libexec/gcc/aarch64-linux-gnu/9.0.0/lto-wrapper
Target: aarch64-linux-gnu
Configured with: /home/sellcey/gcc-vect-fortran/src/gcc/configure
--prefix=/home/sellcey/gcc-vect-fortran/install/usr --target=aarch64-linux-gnu
--host=aarch64-linux-gnu --build=aarch64-linux-gnu
--enable-gnu-indirect-function
--with-sysroot=/home/sellcey/gcc-vect-fortran/install
--enable-languages=c,c++,fortran --disable-libsanitizer --disable-bootstrap
--enable-threads --enable-shared
Thread model: posix
gcc version 9.0.0 20190117 (experimental) (GCC)

[Bug fortran/88912] Fortran compiler segfaults when pre-include file is not found

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88912

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-01-18
 CC||marxin at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Mine.

[Bug c++/86926] [8/9 Regression] ICE for a recursive generic lambda

2019-01-18 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86926

Marek Polacek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot gnu.org

--- Comment #4 from Marek Polacek  ---
Fixed by r267859.  I'll add the testcase.

[Bug ipa/88900] [9 Regression] 502.gcc_r SPEC benchmark miscompiles with LTO and PGO

2019-01-18 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88900

Martin Liška  changed:

   What|Removed |Added

   Keywords|needs-bisection |
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org

--- Comment #5 from Martin Liška  ---
Started with r263875:

Setting Up Run Directories
  Setting up 502.gcc_r refrate (ref) peak gcc7-m64 (1 copy):
run_peak_refrate_gcc7-m64.0002
Running Benchmarks
  Running 502.gcc_r refrate (ref) peak gcc7-m64 (1 copy) [2019-01-18 15:34:36]
Error with '/home/marxin/Programming/cpu2017/bin/specinvoke -d
/home/marxin/Programming/cpu2017/benchspec/CPU/502.gcc_r/run/run_peak_refrate_gcc7-m64.0002
-f compare.cmd -E -e compare.err -o compare.stdout'; no non-empty output files
exist
  Command returned exit code 1

Contents of gcc-pp.opts-O2_-finline-limit_36000_-fpic.err

gcc-pp.c: In function 'fibheap_delete_node':
gcc-pp.c:19958:49: warning: overflow in implicit constant conversion
gcc-pp.c: In function 'htab_mod_1':
gcc-pp.c:25469:7: warning: right shift count >= width of type
gcc-pp.c: At top level:
gcc-pp.c:463503:13: warning: 'compute_transp' used but never defined
gcc-pp.c:463518:12: warning: 'one_cprop_pass' used but never defined
gcc-pp.c:463534:12: warning: 'one_pre_gcse_pass' used but never defined
gcc-pp.c:463542:12: warning: 'one_code_hoisting_pass' used but never defined
gcc-pp.c:463551:25: warning: 'find_rtx_in_ldst' used but never defined
gcc-pp.c:463579:13: warning: 'store_motion' used but never defined
gcc-pp.c:463582:13: warning: 'free_modify_mem_tables' used but never defined
gcc-pp.c:463588:22: warning: 'is_too_expensive' used but never defined



Contents of
gcc-pp.opts-O3_-finline-limit_0_-fif-conversion_-fif-conversion2.err

gcc-pp.c: In function 'fibheap_delete_node':
gcc-pp.c:19958:49: warning: overflow in implicit constant conversion
gcc-pp.c: In function 'htab_mod_1':
gcc-pp.c:25469:7: warning: right shift count >= width of type
gcc-pp.c: At top level:
gcc-pp.c:463503:13: warning: 'compute_transp' used but never defined
gcc-pp.c:463518:12: warning: 'one_cprop_pass' used but never defined
gcc-pp.c:463534:12: warning: 'one_pre_gcse_pass' used but never defined
gcc-pp.c:463542:12: warning: 'one_code_hoisting_pass' used but never defined
gcc-pp.c:463551:25: warning: 'find_rtx_in_ldst' used but never defined
gcc-pp.c:463579:13: warning: 'store_motion' used but never defined
gcc-pp.c:463582:13: warning: 'free_modify_mem_tables' used but never defined
gcc-pp.c:463588:22: warning: 'is_too_expensive' used but never defined



Contents of gcc-smaller.opts-O3_-fipa-pta.err

gcc-smaller.c: In function 'fibheap_delete_node':
gcc-smaller.c:19958:49: warning: overflow in implicit constant conversion
gcc-smaller.c: In function 'htab_mod_1':
gcc-smaller.c:25469:7: warning: right shift count >= width of type



Contents of ref32.opts-O3_-fselective-scheduling_-fselective-scheduling2.err

ref32.c:6213:17: warning: conflicting types for built-in function 'imaxabs'



Contents of ref32.opts-O5.err

ref32.c:6213:17: warning: conflicting types for built-in function 'imaxabs'


*** Miscompare of gcc-smaller.opts-O3_-fipa-pta.s; for details see
   
/home/marxin/Programming/cpu2017/benchspec/CPU/502.gcc_r/run/run_peak_refrate_gcc7-m64.0002/gcc-smaller.opts-O3_-fipa-pta.s.mis
*** Miscompare of gcc-pp.opts-O2_-finline-limit_36000_-fpic.s; for details see
   
/home/marxin/Programming/cpu2017/benchspec/CPU/502.gcc_r/run/run_peak_refrate_gcc7-m64.0002/gcc-pp.opts-O2_-finline-limit_36000_-fpic.s.mis
*** Miscompare of
gcc-pp.opts-O3_-finline-limit_0_-fif-conversion_-fif-conversion2.s; for details
see
   
/home/marxin/Programming/cpu2017/benchspec/CPU/502.gcc_r/run/run_peak_refrate_gcc7-m64.0002/gcc-pp.opts-O3_-finline-limit_0_-fif-conversion_-fif-conversion2.s.mis

I'll work on that next week and try to find the problem.

[Bug fortran/88912] Fortran compiler segfaults when pre-include file is not found

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88912

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Note that f951, I think, supports only a single one of these options, so it is
certainly not recommended for users to pass it themselves: on targets where it
is supported, the driver adds its own option too and overrides whatever the
user wrote.

[Bug fortran/88912] Fortran compiler segfaults when pre-include file is not found

2019-01-18 Thread sje at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88912

--- Comment #3 from Steve Ellcey  ---
It is quite possible I am using the option incorrectly (though that should not
result in a segfault of course).  Should some other flag be adding this to the
command line for me?

[Bug c++/86926] [8/9 Regression] ICE for a recursive generic lambda

2019-01-18 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86926

--- Comment #5 from Marek Polacek  ---
But 8 still ICEs.

[Bug fortran/88912] Fortran compiler segfaults when pre-include file is not found

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88912

--- Comment #4 from Jakub Jelinek  ---
#define TARGET_F951_OPTIONS "%{!nostdinc:\
  %:fortran-preinclude-file(-fpre-include= math-vector-fortran.h finclude%s/)}"
in config/gnu-user.h adds that if the file is found.

[Bug target/37845] gcc ignores FP_CONTRACT pragma set to OFF

2019-01-18 Thread yangyibiao at nju dot edu.cn

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37845

Yibiao Yang  changed:

   What|Removed |Added

 CC||yangyibiao at nju dot edu.cn

--- Comment #7 from Yibiao Yang  ---
Created attachment 45464
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45464&action=edit
[GCOV] Wrong frequencies when a global variable is in a while expression in
gcov

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
7.3.0-27ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr
--with-gcc-major-version-only --program-suffix=-7
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie
--with-system-zlib --with-target-system-zlib --enable-objc-gc=auto
--enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)

$ cat small.c
int b;

void main()
{
int c = 0;
while (b) {
c = 1;
}
}

$ gcc small.c --coverage; ./a.out; gcov small.c; cat small.c.gcov
File 'small.c'
Lines executed:80.00% of 5
Creating 'small.c.gcov'

-:0:Source:small.c
-:0:Graph:small.gcno
-:0:Data:small.gcda
-:0:Runs:1
-:0:Programs:1
-:1:int b;
-:2:
1:3:void main()
-:4:{
1:5:int c = 0;
2:6:while (b) {
#:7:c = 1;
-:8:}
1:9:}


We can find that Line #6 is wrongly marked as executed twice.

[Bug gcov-profile/88913] New: [GCOV] Wrong frequencies when a global variable is in a while expression in gcov

2019-01-18 Thread yangyibiao at nju dot edu.cn

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88913

Bug ID: 88913
   Summary: [GCOV] Wrong frequencies when a global variable is in
a while expression in gcov
   Product: gcc
   Version: 7.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yangyibiao at nju dot edu.cn
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
7.3.0-27ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr
--with-gcc-major-version-only --program-suffix=-7
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie
--with-system-zlib --with-target-system-zlib --enable-objc-gc=auto
--enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)

$ cat small.c
int b;

void main()
{
int c = 0;
while (b) {
c = 1;
}
}

$ gcc small.c --coverage; ./a.out; gcov small.c; cat small.c.gcov
File 'small.c'
Lines executed:80.00% of 5
Creating 'small.c.gcov'

-:0:Source:small.c
-:0:Graph:small.gcno
-:0:Data:small.gcda
-:0:Runs:1
-:0:Programs:1
-:1:int b;
-:2:
1:3:void main()
-:4:{
1:5:int c = 0;
2:6:while (b) {
#:7:c = 1;
-:8:}
1:9:}


We can find that Line #6 is wrongly marked as executed twice.

[Bug gcov-profile/88914] New: [GCOV] Wrong frequencies when unreachable statements within the body of the for loop in gcov

2019-01-18 Thread yangyibiao at nju dot edu.cn

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88914

Bug ID: 88914
   Summary: [GCOV] Wrong frequencies when unreachable statements
within the body of the for loop in gcov
   Product: gcc
   Version: 7.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yangyibiao at nju dot edu.cn
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
7.3.0-27ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr
--with-gcc-major-version-only --program-suffix=-7
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie
--with-system-zlib --with-target-system-zlib --enable-objc-gc=auto
--enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)

$ cat small.c
int bar(a) { return a*a + 1; }

int main()
{
for (int i = 0; i < 10; i++) {
if (bar(i)) {
continue;
}

if (i == 0) {
abort();
} else {
exit(0);
}
}
}

$ gcc small.c -w --coverage; ./a.out; gcov small.c; cat small.c.gcov
File 'small.c'
Lines executed:62.50% of 8
Creating 'small.c.gcov'

-:0:Source:small.c
-:0:Graph:small.gcno
-:0:Data:small.gcda
-:0:Runs:1
-:0:Programs:1
   10:1:int bar(a) { return a*a + 1; }
-:2:
1:3:int main()
-:4:{
   22:5:for (int i = 0; i < 10; i++) {
   10:6:if (bar(i)) {
   10:7:continue;
-:8:}
-:9:
#:   10:if (i == 0) {
#:   11:abort();
-:   12:} else {
#:   13:exit(0);
-:   14:}
-:   15:}
-:   16:}



We can see that line #5 is wrongly marked as executed many more times than
expected.

When lines #10 to #14 are removed, the result is correct.

[Bug tree-optimization/88915] New: Try smaller vectorisation factors in scalar fallback

2019-01-18 Thread ktkachov at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88915

Bug ID: 88915
   Summary: Try smaller vectorisation factors in scalar fallback
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ktkachov at gcc dot gnu.org
Blocks: 53947
  Target Milestone: ---

The get_ref hot function in 525.x264_r inlines a hot helper that performs a
vector average:
void pixel_avg( unsigned char *dst, int i_dst_stride,
   unsigned char *src1, int i_src1_stride,
   unsigned char *src2, int i_src2_stride,
   int i_width, int i_height )
 {
 for( int y = 0; y < i_height; y++ )
 {
 for( int x = 0; x < i_width; x++ )
 dst[x] = ( src1[x] + src2[x] + 1 ) >> 1;
 dst += i_dst_stride;
 src1 += i_src1_stride;
 src2 += i_src2_stride;
 }
 }

GCC 9 already knows how to generate vector average instructions (PR 85694).
For aarch64 it generates a 16x vectorised loop.
Runtime profiling of the arguments to this function, however, shows that more
than 50% of the time i_width has the value 8, and therefore the vector loop is
skipped in favour of a scalar fallback:
32.07%  40ed2c  ldrb w3, [x0,x5]
11.41%  40ed30  ldrb w11, [x4,x5]
        40ed34  add  w3, w3, w11
        40ed38  add  w3, w3, #0x1
        40ed3c  asr  w3, w3, #1
0.71%   40ed40  strb w3, [x2,x5]
        40ed44  add  x5, x5, #0x1
        40ed48  cmp  w6, w5
        40ed4c  b.gt

The most frequent runtime combinations of inputs to this function are:
29240545 i_height: 8, i_width: 8, i_dst_stride: 16, i_src1_stride: 1344,
i_src2_stride: 1344
22714355 i_height: 16, i_width: 16, i_dst_stride: 16, i_src1_stride: 1344,
i_src2_stride: 1344
19669512 i_height: 8, i_width: 8, i_dst_stride: 16, i_src1_stride: 704,
i_src2_stride: 704
3689216 i_height: 16, i_width: 8, i_dst_stride: 16, i_src1_stride: 1344,
i_src2_stride: 1344
3670639 i_height: 8, i_width: 16, i_dst_stride: 16, i_src1_stride: 1344,
i_src2_stride: 1344

That's a shame. AArch64 supports the V8QI form of the vector average
instruction (and advertises it through optabs).
With --param vect-epilogues-nomask=1 we already generate something like:
if (bytes_left > 16)
{
  while (bytes_left > 16)
16x_vectorised;
  if (bytes_left > 8)
8x_vectorised;
  unrolled_scalar_epilogue;
}
else
  scalar_loop;

Could we perhaps generate:
  while (bytes_left > 16)
16x_vectorised;
  if (bytes_left > 8)
8x_vectorised;
  unrolled_scalar_epilogue; // or keep it as a rolled scalar_loop to save on
codesize?

Basically I'm looking for a way to take advantage of the 8x vectorised form.
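
The shape being asked for can be sketched in plain C for a single row. This is only an illustration under stated assumptions: avg_block, avg_row_tiered and avg_row_scalar are made-up names, the fixed-trip-count helper merely stands in for whatever the vectoriser would emit, and the thresholds use >= (so that a width of exactly 8 or 16 takes the wide path) where the pseudocode above writes >.

```c
#include <stdint.h>

/* Stand-in for one n-wide vector step: rounded average of n bytes.
   With a constant n of 8 or 16 this is the kind of loop a vectoriser
   could map onto a single vector average instruction.  */
static void avg_block(uint8_t *dst, const uint8_t *a, const uint8_t *b, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}

/* One row in the proposed shape: a 16-wide loop, then at most one
   8-wide step, then a scalar tail.  There is no outer guard on the
   width, so an 8-byte row still reaches the 8-wide step.  */
void avg_row_tiered(uint8_t *dst, const uint8_t *src1,
                    const uint8_t *src2, int width)
{
    int x = 0;
    while (width - x >= 16) {
        avg_block(dst + x, src1 + x, src2 + x, 16);
        x += 16;
    }
    if (width - x >= 8) {
        avg_block(dst + x, src1 + x, src2 + x, 8);
        x += 8;
    }
    for (; x < width; x++)          /* scalar epilogue, at most 7 bytes */
        dst[x] = (uint8_t)((src1[x] + src2[x] + 1) >> 1);
}

/* Scalar reference with the same arithmetic as the original inner loop.  */
void avg_row_scalar(uint8_t *dst, const uint8_t *src1,
                    const uint8_t *src2, int width)
{
    for (int x = 0; x < width; x++)
        dst[x] = (uint8_t)((src1[x] + src2[x] + 1) >> 1);
}
```

Calling avg_row_tiered with width 8 runs exactly one 8-wide step and no scalar iterations, which is the case the profile above says dominates.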


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug target/84481] [8/9 Regression] 429.mcf with -O2 regresses by ~6% and ~4%, depending on tuning, on Zen compared to GCC 7.2

2019-01-18 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84481

--- Comment #8 from Martin Jambor  ---
And even my own measurements show 6% slowdown at both -O2 and -Ofast with
generic march/tuning against GCC 7 and now also 5% slowdown at -Ofast and
native march/tuning against GCC 8.

[Bug c++/86926] [8/9 Regression] ICE for a recursive generic lambda

2019-01-18 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86926

--- Comment #6 from Marek Polacek  ---
Author: mpolacek
Date: Fri Jan 18 16:42:57 2019
New Revision: 268080

URL: https://gcc.gnu.org/viewcvs?rev=268080&root=gcc&view=rev
Log:
PR c++/86926
* g++.dg/cpp1z/constexpr-lambda23.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda23.C
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug c++/86926] [8 Regression] ICE for a recursive generic lambda

2019-01-18 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86926

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|mpolacek at gcc dot gnu.org|unassigned at gcc dot gnu.org
Summary|[8/9 Regression] ICE for a recursive generic lambda|[8 Regression] ICE for a recursive generic lambda

[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751

2019-01-18 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064

--- Comment #12 from Segher Boessenkool  ---
Yes, I think so (just the vec_select arg?)

[Bug rtl-optimization/88423] [9 Regression] ICE in begin_move_insn, at sched-ebb.c:175

2019-01-18 Thread dmalcolm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88423

David Malcolm  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |dmalcolm at gcc dot gnu.org

--- Comment #5 from David Malcolm  ---
Candidate patch: https://gcc.gnu.org/ml/gcc-patches/2019-01/msg01084.html

[Bug tree-optimization/88044] [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88044

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #13 from Jakub Jelinek  ---
Between r266170 and r266171 the difference was in veclower21 dump with -O3:
--- transfer_intrinsic_3.f90.168t.veclower21_   2019-01-18 16:34:47.478873237 +0100
+++ transfer_intrinsic_3.f90.168t.veclower21    2019-01-18 16:35:09.503515118 +0100
@@ -370,14 +370,7 @@ main (integer(kind=4) argc, character(ki
   __builtin_free (_27);
   parm.10 ={v} {CLOBBER};
   ivtmp.52_75 = ivtmp.52_82 + 1;
-  if (ivtmp.52_75 == 3)
-goto ; [12.36%]
-  else
-goto ; [87.64%]
-
-   [local count: 1069422300]:
-  __builtin_free (_7);
-  return 0;
+  goto ; [100.00%]

 }

and, if I revert the r266171 change on current trunk, the difference between
f951 with the patch reverted and vanilla trunk is (again -O3,
powerpc64le-linux):
--- transfer_intrinsic_3.f90.161t.cunroll_  2019-01-18 17:14:06.625536698 +0100
+++ transfer_intrinsic_3.f90.161t.cunroll   2019-01-18 17:14:24.992238353 +0100
@@ -55,8 +55,10 @@ Number of blocks in CFG: 46
 Number of blocks to update: 11 ( 24%)


+Removing basic block 21
 Removing basic block 32
 Removing basic block 41
+Merging blocks 20 and 23
 Merging blocks 30 and 33
 Removing basic block 34
 Removing basic block 36
@@ -178,8 +180,8 @@ main (integer(kind=4) argc, character(ki
   pretmp_65 = &MEM[(character(kind=1)[0:][1:1] *)_7][0];

[local count: 8656061039]:
-  # n_63 = PHI <0(30), _28(23)>
-  # ivtmp_13 = PHI <4(30), ivtmp_31(23)>
+  # n_63 = PHI <0(30), _28(20)>
+  # ivtmp_13 = PHI <4(30), ivtmp_31(20)>
   _19 = n_63 + -1;
   _20 = (integer(kind=8)) _19;
   _22 = MAX_EXPR <_20, 0>;
@@ -264,18 +266,8 @@ main (integer(kind=4) argc, character(ki
   parm.10 ={v} {CLOBBER};
   _28 = n_63 + 1;
   ivtmp_31 = ivtmp_13 - 1;
-  if (ivtmp_31 == 0)
-goto ; [12.36%]
-  else
-goto ; [87.64%]
-
-   [local count: 7582748748]:
   goto ; [100.00%]

-   [local count: 1069422300]:
-  __builtin_free (_7);
-  return 0;
-
 }


so in both cases, the loop condition is optimized out.

[Bug target/88916] New: [x86] suboptimal code generated for integer comparisons joined with boolean operators

2019-01-18 Thread wojciech_mula at poczta dot onet.pl

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88916

Bug ID: 88916
   Summary: [x86] suboptimal code generated for integer
comparisons joined with boolean operators
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wojciech_mula at poczta dot onet.pl
  Target Milestone: ---

Let's consider these two simple, yet pretty useful functions:

--test.c---
int both_nonnegative(long a, long b) {
return (a >= 0) && (b >= 0);
}

int both_nonzero(unsigned long a, unsigned long b) {
return (a > 0) && (b > 0);
}
---eof--

$ gcc --version
gcc (Debian 8.2.0-13) 8.2.0

$ gcc -O3 test.c -march=skylake -S
$ cat test.s
both_nonnegative:
        notq    %rdi
        movq    %rdi, %rax
        notq    %rsi
        shrq    $63, %rax
        shrq    $63, %rsi
        andl    %esi, %eax
        ret

both_nonzero:
        testq   %rdi, %rdi
        setne   %al
        xorl    %edx, %edx
        testq   %rsi, %rsi
        setne   %dl
        andl    %edx, %eax
        ret

I checked different target machines (haswell, broadwell and cannonlake),
however the result remained the same. Also GCC trunk on godbolt.org 
produces the same assembly code.

The first function, `both_nonnegative`, can be rewritten as:

(((unsigned long)(a) | (unsigned long)(b)) >> 63) ^ 1

yielding something like this:

both_nonnegative:
        orq     %rsi, %rdi
        movq    %rdi, %rax
        shrq    $63, %rax
        xorl    $1, %eax
        ret

It's also possible to use this expression:
(long)((unsigned long)a | (unsigned long)b) < 0,
but the assembly output is almost the same.
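
The first rewrite (OR the operands, shift out the sign bit, flip it) can be checked against the original with a small sketch. The name both_nonnegative_or is invented for illustration, and the shift count is derived from the width of long instead of hard-coding 63, on the assumption that long has no padding bits.

```c
#include <limits.h>

/* Original form from test.c.  */
int both_nonnegative(long a, long b)
{
    return (a >= 0) && (b >= 0);
}

/* Rewritten form: the OR of the two operands has its top bit set
   exactly when at least one of them is negative, so shifting that
   bit down and flipping it gives "both are non-negative".  */
int both_nonnegative_or(long a, long b)
{
    unsigned long m = (unsigned long)a | (unsigned long)b;
    return (int)((m >> (sizeof(long) * CHAR_BIT - 1)) ^ 1UL);
}
```

Both functions return 1 only when neither argument is negative. The second needs a single OR instead of two NOTs and an AND.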

The condition from `both_nonzero` can be expressed as:

((unsigned long)a | (unsigned long)b) != 0

GCC compiles it to:

both_nonzero:
        xorl    %eax, %eax
        orq     %rsi, %rdi
        setne   %al
        retq
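
This second rewrite may deserve a check before anyone relies on it. OR-ing the operands gives a nonzero value as soon as either one is nonzero, so the OR form appears to correspond to (a > 0) || (b > 0), not to the && in test.c. A minimal sketch of the comparison, where or_form is a made-up name for the rewritten expression:

```c
/* Original form from test.c.  */
int both_nonzero(unsigned long a, unsigned long b)
{
    return (a > 0) && (b > 0);
}

/* The OR-based expression proposed above.  */
int or_form(unsigned long a, unsigned long b)
{
    return (a | b) != 0;
}
```

The two agree when both arguments are zero or both are nonzero, but for a = 1, b = 0 both_nonzero returns 0 while or_form returns 1.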

[Bug libbacktrace/88890] libbacktrace on 32-bit system with _FILE_OFFSET_BITS == 64

2019-01-18 Thread ian at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88890

--- Comment #1 from ian at gcc dot gnu.org  ---
Author: ian
Date: Fri Jan 18 17:13:59 2019
New Revision: 268082

URL: https://gcc.gnu.org/viewcvs?rev=268082&root=gcc&view=rev
Log:
PR libbacktrace/88890
* mmapio.c (backtrace_get_view): Change size parameter to
uint64_t.  Check that value fits in size_t.
* read.c (backtrace_get_view): Likewise.
* internal.h (backtrace_get_view): Update declaration.
* elf.c (elf_add): Pass shstrhdr->sh_size to backtrace_get_view.

Modified:
trunk/libbacktrace/ChangeLog
trunk/libbacktrace/elf.c
trunk/libbacktrace/internal.h
trunk/libbacktrace/mmapio.c
trunk/libbacktrace/read.c
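
The "check that value fits in size_t" step in the log can be illustrated with the usual round-trip cast idiom. This is a sketch only: size_fits is a hypothetical helper, not the code committed to libbacktrace.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 and stores the narrowed value when
   SIZE is representable as size_t; returns 0 and leaves *OUT alone
   otherwise.  On a 32-bit host with 64-bit file offsets, a section
   size above SIZE_MAX would fail the round trip.  */
static int size_fits(uint64_t size, size_t *out)
{
    if ((uint64_t)(size_t)size != size)
        return 0;
    *out = (size_t)size;
    return 1;
}
```

A caller would report an error when size_fits returns 0, so that it never maps or reads a silently truncated length.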

[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064

--- Comment #13 from Jakub Jelinek  ---
So, both the following patches should fix it IMHO, but no idea which one if any
is right.
With
--- gcc/config/rs6000/vsx.md.jj 2019-01-01 12:37:44.305529527 +0100
+++ gcc/config/rs6000/vsx.md    2019-01-18 18:07:37.194899062 +0100
@@ -4356,7 +4356,9 @@
   ""
   [(const_int 0)]
 {
-  rtx hi = gen_highpart (DFmode, operands[1]);
+  rtx hi = (BYTES_BIG_ENDIAN
+   ? gen_highpart (DFmode, operands[1])
+   : gen_lowpart (DFmode, operands[1]));
   rtx lo = (GET_CODE (operands[2]) == SCRATCH)
? gen_reg_rtx (DFmode)
: operands[2];

the assembly changes:
--- reduction-3.s1  2019-01-18 18:05:14.313229730 +0100
+++ reduction-3.s2  2019-01-18 18:10:20.617233358 +0100
@@ -27,7 +27,7 @@ MAIN__._omp_fn.0:
addi 9,9,16
bdnz .L2
 # vec_extract to same register
-   lfd 12,-8(1)
+   lfd 12,-16(1)
xsmaxdp 0,12,0
stfd 0,0(10)
blr
with:
--- gcc/config/rs6000/vsx.md.jj 2019-01-01 12:37:44.305529527 +0100
+++ gcc/config/rs6000/vsx.md    2019-01-18 18:16:30.680186709 +0100
@@ -4361,7 +4361,9 @@
? gen_reg_rtx (DFmode)
: operands[2];

-  emit_insn (gen_vsx_extract_v2df (lo, operands[1], const1_rtx));
+  emit_insn (gen_vsx_extract_v2df (lo, operands[1],
+  BYTES_BIG_ENDIAN
+  ? const1_rtx : const0_rtx));
   emit_insn (gen_df3 (operands[0], hi, lo));
   DONE;
 }
the assembly changes:
--- reduction-3.s1  2019-01-18 18:05:14.313229730 +0100
+++ reduction-3.s3  2019-01-18 18:17:18.977397458 +0100
@@ -26,7 +26,7 @@ MAIN__._omp_fn.0:
xxpermdi 0,0,0,2
addi 9,9,16
bdnz .L2
-# vec_extract to same register
+   xxpermdi 0,0,0,3
lfd 12,-8(1)
xsmaxdp 0,12,0
stfd 0,0(10)

Judging from this testcase alone, the first patch seems to be more
efficient, though I'm still unsure about that. Since it goes through memory in
either case, wouldn't it be better to emit an xxpermdi from 0 to 12 that swaps
the two elements instead of loading it from memory?

[Bug target/87064] [9 regression] libgomp.oacc-fortran/reduction-3.f90 fails starting with r263751

2019-01-18 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87064

--- Comment #14 from Jakub Jelinek  ---
And, if I disable that define_insn_and_split altogether (add 0 && to the
condition), the assembly change is:
--- reduction-3.s2  2019-01-18 18:19:42.184057246 +0100
+++ reduction-3.s4  2019-01-18 18:26:23.079506011 +0100
@@ -9,26 +9,16 @@ MAIN__._omp_fn.0:
.cfi_startproc
ld 10,0(3)
lxvdsx 0,0,10
-   addi 9,1,-16
-   xxpermdi 0,0,0,2
-   stxvd2x 0,0,9
ld 9,8(3)
li 8,5
mtctr 8
 .L2:
-   lxvd2x 0,0,9
-   addi 8,1,-16
-   lxvd2x 12,0,8
-   xxpermdi 12,12,12,2
-   xvmaxdp 0,12,0
-   xxpermdi 0,0,0,2
-   stxvd2x 0,0,8
-   xxpermdi 0,0,0,2
+   lxvd2x 12,0,9
+   xvmaxdp 0,0,12
addi 9,9,16
bdnz .L2
-# vec_extract to same register
-   lfd 12,-16(1)
-   xsmaxdp 0,12,0
+   xxsldwi 12,0,0,2
+   xvmaxdp 0,12,0
stfd 0,0(10)
blr
.long 0

which looks much better.  So, what is the reason for this
define_insn_and_split?  Is it useful for BYTES_BIG_ENDIAN only perhaps?

[Bug libbacktrace/88890] libbacktrace on 32-bit system with _FILE_OFFSET_BITS == 64

2019-01-18 Thread ian at airs dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88890

Ian Lance Taylor  changed:

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |ian at airs dot com
         Resolution|---                         |FIXED

--- Comment #2 from Ian Lance Taylor  ---
Fixed.

[Bug driver/80836] final binaries missing rpath despite all attempts to use the appropriate -Wl,-rpath= statement

2019-01-18 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80836

--- Comment #5 from Jonathan Wakely  ---
I'm unsure what exactly this report is about. The first half seems to be about
building GCC itself, and ensuring it can find the libs that GCC relies on,
right?

That's simple, just build the support libs in-tree so there are no runtime
dependencies on them. That's what the stackoverflow answer suggests, and is
what happens if you follow https://stackoverflow.com/a/10662297/981959 which
refers to https://gcc.gnu.org/wiki/InstallingGCC

In short, stop installing libgmp.so, libmpfr.so and libmpc.so in non-standard
locations and then trying to make GCC find them later. Just don't do that.
Everything's much easier if you don't do that.

The second half of this report seems to be about building your own programs
using GCC, and ensuring they can find GCC's runtime libs. I agree that an
option to add rpaths for GCC's own libs would be nice, but you can make it
happen automatically by installing a custom specs file that adds -rpath to the
*lib: spec.
