[Bug target/94145] New: Longcalls mis-optimize loading the function address

2020-03-11 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94145

Bug ID: 94145
   Summary: Longcalls mis-optimize loading the function address
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I'm working on a feature where we convert some/all built-in function calls to
use the longcall sequence.  I discovered that the compiler is mis-optimizing
loading up the function address.  This showed up in the Spec 2017 wrf_r
benchmark where I replaced some 60,000 direct calls to longcalls.

In particular, the PowerPC backend is not marking the load of the function
address as being volatile.  This allows the compiler to move the load out of a
loop.

However, with the current ELF semantics, you don't want to do this because the
function address changes.  On the first call to the function, the address is
that of the PLT stub, but on subsequent calls it is the address of the function
itself, after the shared library is loaded.
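
As a rough illustration of why hoisting the address load matters, the following is a simulation in portable C, not PowerPC code.  The names slot, plt_stub and real_func are hypothetical stand-ins for the address slot, the PLT stub and the resolved function; the sketch only assumes that the first call patches the slot, as described above.

```c
#include <assert.h>

typedef int (*fn_t)(int);

static int stub_hits;   /* how often the simulated PLT stub was entered */

static int real_func(int x) { return x + 1; }
static int plt_stub(int x);

/* Hypothetical stand-in for the slot that holds the callee's address.  */
static fn_t slot = plt_stub;

/* The first call lands here; the simulated resolver then patches the slot.  */
static int plt_stub(int x)
{
    stub_hits++;
    slot = real_func;
    return real_func(x);
}

/* Put the simulation back into its unresolved state.  */
static void reset_slot(void)
{
    stub_hits = 0;
    slot = plt_stub;
}

/* The address is reloaded on every iteration, so the stub runs only once.  */
static int call_reloading(int n)
{
    int sum = 0;
    assert(n >= 0);
    reset_slot();
    for (int i = 0; i < n; i++) {
        fn_t f = *(fn_t volatile *)&slot;   /* fresh load each time */
        sum += f(i);
    }
    return sum;
}

/* The address is loaded once before the loop (the hoisted form), so the
   stale stub address is reused on every iteration.  */
static int call_hoisted(int n)
{
    int sum = 0;
    assert(n >= 0);
    reset_slot();
    fn_t f = slot;                          /* single load, moved out of the loop */
    for (int i = 0; i < n; i++)
        sum += f(i);
    return sum;
}
```

Both functions return the same sum; the difference is in stub_hits afterwards, which stays at 1 for call_reloading but grows with the trip count for call_hoisted.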

In addition, because UNSPECs are used, the compiler is likely to store the
function address on the stack and reload it.  Given that the UNSPEC is just a
load, it would be better to avoid the extra store and reload.

In fixing the linker bug that this feature uncovered, Alan Modra has a simple
patch to fix it.

[Bug target/94145] Longcalls mis-optimize loading the function address

2020-03-11 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94145

--- Comment #1 from Michael Meissner  ---
Created attachment 48021
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48021&action=edit
Example code

Compile with -mcpu=future -mpcrel -O3 to see the load of the address being
moved out of the loop.

[Bug target/81594] Optimize PowerPC vector set and store

2020-03-18 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

Michael Meissner  changed:

   What|Removed |Added

  Attachment #41854|0   |1
is obsolete||

--- Comment #4 from Michael Meissner  ---
Created attachment 48057
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48057&action=edit
Update proposed patch to fix the problem

[Bug target/93937] Variable vector extract & zero extend insn can never match

2020-03-23 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93937

Michael Meissner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #2 from Michael Meissner  ---
Fixed on Feb. 28, 2020

[Bug target/81594] Optimize PowerPC vector set and store

2020-03-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

--- Comment #6 from Michael Meissner  ---
If you look at the original patch, it did try to do this optimization.  When I
looked at it some time later, the combiner no longer generated the sequence
because it thought it was slower (due to length, etc.).

You could spend a lot of time tuning the code so eventually the combiner will
generate it again, but it was simpler to just put the peephole in to catch the
cases that show up.  If you want to take on the bug and do it earlier, go
ahead.

A peephole2 might not catch all uses, but it prevents whack-a-mole, where a
change causes other code generation changes down the pike.

Note, the original patch was written in the power8 time frame, and it would
need to be adjusted for power9 and future systems now (i.e. the patch only does
the splitting if the value is in an FPR or GPR, while on power9 it could be in
a traditional Altivec register).

However, the splitter uses reload_completed that you always seem to object to. 
It could be done before register allocation, but then you would need to make
sure that no other pass recombines the two separate items back into a vector
once again.

[Bug target/94451] New: April 1st 2020 GCC does not compile spec 2017 gcc_r benchmark with -O3

2020-04-01 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94451

Bug ID: 94451
   Summary: April 1st 2020 GCC does not compile spec 2017 gcc_r
benchmark with -O3
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

Created attachment 48166
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48166&action=edit
decimal64.i file that shows the bug.

I was building Spec 2017 with the current master compiler branch, and it failed
in 3 benchmarks.

I looked at the failure of the gcc_r benchmark, and I discovered that the
decimal64.c function gets a compiler error when I build a compiler with default
checks enabled.  I narrowed it down so that it fails with -O2 -fsplit-loops
-ftree-vectorize and -fgnu89-inline (the -fgnu89-inline is not needed for the
failure, but it is generally needed to compile Spec 2017).

-perch-> /opt/at13.0/bin/gdb cc1
GNU gdb (GDB) 8.3.1.20191211-git
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "powerpc64le-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from cc1...
Breakpoint 1 at 0x101dba40: file /home/meissner/fsf-src/trunk/gcc/diagnostic.c,
line 1777.
Breakpoint 2 at 0x1193e368: file /home/meissner/fsf-src/trunk/gcc/diagnostic.c,
line 1706.
Breakpoint 3 at 0x11a15fe8
Breakpoint 4 at 0x11a15fc4
File tree.h will be skipped when stepping.
File is-a.h will be skipped when stepping.
File line-map.h will be skipped when stepping.
File timevar.h will be skipped when stepping.
Function rtx_expr_list::next will be skipped when stepping.
Function rtx_expr_list::element will be skipped when stepping.
Function rtx_insn_list::next will be skipped when stepping.
Function rtx_insn_list::insn will be skipped when stepping.
Function rtx_sequence::len will be skipped when stepping.
Function rtx_sequence::element will be skipped when stepping.
Function rtx_sequence::insn will be skipped when stepping.
Function INSN_UID will be skipped when stepping.
Function PREV_INSN will be skipped when stepping.
Function SET_PREV_INSN will be skipped when stepping.
Function NEXT_INSN will be skipped when stepping.
Function SET_NEXT_INSN will be skipped when stepping.
Function BLOCK_FOR_INSN will be skipped when stepping.
Function PATTERN will be skipped when stepping.
Function INSN_LOCATION will be skipped when stepping.
Function INSN_HAS_LOCATION will be skipped when stepping.
Function JUMP_LABEL_AS_INSN will be skipped when stepping.
Successfully loaded GDB hooks for GCC
(gdb) r -O2 -fsplit-loops -ftree-vectorize -fgnu89-inline -quiet
foo-decimal64.i
Starting program: /home/meissner/fsf-build-ppc64le/trunk/gcc/cc1 -O2
-fsplit-loops -ftree-vectorize -fgnu89-inline -quiet foo-decimal64.i
decimal64.c: In function ‘decDigitsToDPD’:
decimal64.c:662:6: error: missing definition
for SSA_NAME: _292 in statement:
target_205 = _292;

Breakpoint 2, internal_error (gmsgid=0x11acd0d0 "verify_ssa failed") at
/home/meissner/fsf-src/trunk/gcc/diagnostic.c:1787
1787  global_dc->diagnostic_group_nesting_depth++;
(gdb) where
#0  internal_error (gmsgid=0x11acd0d0 "verify_ssa failed") at
/home/meissner/fsf-src/trunk/gcc/diagnostic.c:1787
#1  0x10e2efac in verify_ssa (check_modified_stmt=,
check_ssa_operands=) at
/home/meissner/fsf-src/trunk/gcc/tree-ssa.c:1208
#2  0x109b6ea0 in execute_function_todo (fn=0x75a41550,
data=) at /home/meissner/fsf-src/trunk/gcc/passes.c:1992
#3  0x109b80d4 in do_per_function (callback=,
data=) at /home/meissner/fsf-src/trunk/gcc/passes.c:1640
#4  0x109b82fc in execute_todo (flags=) at
/home/meissner/fsf-src/trunk/gcc/passes.c:2039
#5  0x109bbcc4 in execute_one_pass (pass=pass@entry=) at /home/meissner/fsf-src/trunk/gcc/passes.c:2539
#6  0x109bca64 in execute_pass_list_1 (pass=) at /home/meissner/fsf-src/trunk/gcc/passes.c:2590
#7  0x109bca7c in execute_pass_list_1 (pass=) at /home/meissner/fsf-src/trunk/gcc/passes.c:2591
#8  0x109bca7c in execute_pass_list_1 (pass=) at /home/meissner/fsf-src/trunk/gcc/passes.c:2591
#9  0x109bcb08 in execute_pass_list (fn=

[Bug target/94451] April 1st 2020 GCC does not compile spec 2017 gcc_r benchmark with -O3

2020-04-01 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94451

Michael Meissner  changed:

   What|Removed |Added

 CC||amodra at gcc dot gnu.org,
   ||bergner at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org,
   ||wschmidt at gcc dot gnu.org
   Severity|normal  |critical
   Host||powerpc64le-gnu-linux
  Build||powerpc64le-gnu-linux
 Target||powerpc64le-gnu-linux
   Priority|P3  |P2

--- Comment #1 from Michael Meissner  ---
I built the compiler on Ubuntu 18.04 on a little endian power9 system using
--with-cpu=power9.  I used the Advance Toolchain AT13 compiler to build the
compiler.  I did not bootstrap the compiler.

[Bug target/94557] [9 regression] r9-8486 causes several builtin instruction test case execution failures on power 9

2020-04-13 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94557

--- Comment #1 from Michael Meissner  ---
Created attachment 48263
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48263&action=edit
Proposed patch to fix the problem.

This patch backports a necessary fix from the trunk to fix the problem.

[Bug target/94557] [9 regression] r9-8486 causes several builtin instruction test case execution failures on power 9

2020-04-13 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94557

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2020-04-13

--- Comment #2 from Michael Meissner  ---
The issue is that with the backport patch for PR target/93932, GCC is more
likely to optimize variable extracts from a vector that is in memory to be a
simple load, instead of loading the vector into a vector register, and doing a
vector extract on power9.

The test cases rely on having indexes outside of the range of valid indexes. 
If the vector was loaded into a register, we would automatically mask the index
as part of the extract.

However, if we converted the operation to a single load, we did not do the
masking, and the load would load some random value outside of the vector
boundary.

The trunk had previously had other changes that did this masking as part of the
changes for -mcpu=future and PC-relative support.  The proposed patch just
makes sure the index is properly masked before it is used.
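
A minimal sketch of the masking idea in portable C, assuming a vector of four 32-bit elements.  The function name and the V4SI_ELEMS macro are made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define V4SI_ELEMS 4   /* elements in a 128-bit vector of 32-bit ints */

/* Extract element n from a 4-element vector held in memory.  The index is
   masked first, so an out-of-range n still selects an element inside the
   vector instead of reading past its end.  */
static int extract_masked(const int vec[V4SI_ELEMS], size_t n)
{
    assert(vec != NULL);
    return vec[n & (V4SI_ELEMS - 1)];
}
```

With this form, an index of 5 selects the same element as an index of 1, which matches what the in-register extract does.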

[Bug target/94557] [9 regression] r9-8486 causes several builtin instruction test case execution failures on power 9

2020-04-15 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94557

--- Comment #3 from Michael Meissner  ---
Just to be clear, this bug exists only on the GCC 9 branch, and it came about
due to the backport of the patch for PR target/93932 to the GCC 9 branch.  The
master branch generates correct code.  So, I'm not sure this warrants being a
P1 blocker for the GCC 10 release.

[Bug target/94557] [9 regression] r9-8486 causes several builtin instruction test case execution failures on power 9

2020-04-16 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94557

Michael Meissner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Michael Meissner  ---
Fixed by a change to GCC 9 on April 16th, 2020.

[Bug target/93932] PowerPC vec_extract with variable element number has code regressions for V2DI/V2DF vectors

2020-04-16 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93932

Michael Meissner  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #8 from Michael Meissner  ---
With the commit of the fix for PR target/94557 (a regression caused on the
GCC 9 branch by the PR target/93932 patch), this bug can now be closed.

[Bug target/94630] New: General bug for changes needed to switch the PowerPC long double default

2020-04-16 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630

Bug ID: 94630
   Summary: General bug for changes needed to switch the PowerPC
long double default
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

This is a bug to hold patches and observations about the changes needed to
switch the compiler default with a configuration switch in the GCC 11 time
frame (with a backport to GCC 10.2).

[Bug target/94630] General bug for changes needed to switch the PowerPC long double default

2020-04-16 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630

Michael Meissner  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |meissner at gcc dot 
gnu.org
 Status|UNCONFIRMED |ASSIGNED
Version|10.0|unknown
 Ever confirmed|0   |1
   Priority|P3  |P4
   Severity|normal  |enhancement
   Last reconfirmed||2020-04-17

[Bug target/94630] General bug for changes needed to switch the PowerPC long double default

2020-04-16 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630

--- Comment #1 from Michael Meissner  ---
Created attachment 48296
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48296&action=edit
Patch to do the correct mapping for built-in math functions when the long
double default is IEEE.

[Bug target/94630] General bug for changes needed to switch the PowerPC long double default

2020-04-16 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630

--- Comment #2 from Michael Meissner  ---
When the default is changed, we will need to map __builtin_sprintf and company
just like GLIBC will do it if the user includes stdio.h.

Otherwise the gcc.dg/tree-ssa/builtin-sprintf.c test fails because it calls the
wrong sprintf for long double arguments.
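
Which sprintf variant is the right one depends on the long double format in effect.  As a hedged sketch, portable C can tell the formats apart from LDBL_MANT_DIG in <float.h>; the enum and function names below are invented for the example, and this says nothing about how the compiler or GLIBC actually perform the mapping.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical labels for the long double formats of interest.  */
enum ldbl_format {
    LDBL_FMT_DOUBLE,    /* same as double, 53-bit significand */
    LDBL_FMT_X87,       /* x87 extended, 64-bit significand */
    LDBL_FMT_IBM128,    /* IBM double-double, 106-bit significand */
    LDBL_FMT_IEEE128,   /* IEEE binary128, 113-bit significand */
    LDBL_FMT_OTHER
};

/* Classify the long double format this translation unit was built with.  */
static enum ldbl_format long_double_format(void)
{
    assert(LDBL_MANT_DIG >= DBL_MANT_DIG);
    switch (LDBL_MANT_DIG) {
    case 113: return LDBL_FMT_IEEE128;
    case 106: return LDBL_FMT_IBM128;
    case 64:  return LDBL_FMT_X87;
    case 53:  return LDBL_FMT_DOUBLE;
    default:  return LDBL_FMT_OTHER;
    }
}
```

A call that passes a long double to a printf-family function has to reach an implementation that expects the same format, which is why the built-in needs the same redirection as the header-declared function.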

[Bug target/94630] General bug for changes needed to switch the PowerPC long double default

2020-04-16 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630

--- Comment #3 from Michael Meissner  ---
Created attachment 48297
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48297&action=edit
Patch to mangle *printf and *scanf built-ins if long double is IEEE-128

[Bug target/94630] General bug for changes needed to switch the PowerPC long double default

2020-04-17 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630

--- Comment #5 from Michael Meissner  ---
Note, at the moment, the patches are to make the existing configure switch
(--with-long-double=ieee) work correctly.

However, we need all of the pieces in place (gcc, glibc, libstdc++, etc.)
before we can contemplate changing the ABI.

[Bug middle-end/91512] [10 Regression] Fortran compile time regression.

2020-04-23 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91512

--- Comment #31 from Michael Meissner  ---
For the Spec 2017 521.wrf_r benchmark on little endian PowerPC power9 systems,
there was no difference in runtime between a normal run using -Ofast
-mcpu=power9 and one with -Ofast -mcpu=power9 -fno-inline-arg-packing.

Of the seven rate benchmarks in Spec 2017 that use Fortran (548.exchange2_r,
503.bwaves_r, 507.cactuBSSN_r, 521.wrf_r, 527.cam4_r, 549.fotonik3d_r, and
554.roms_r) none of them vary by more than 0.7% depending on whether the switch
is used or not.

I used the compiler checked out from the master branch on March 27, 2020 to
build and run the benchmarks.

As others have said, using -fno-inline-arg-packing does dramatically reduce the
time it takes to compile 521.wrf_r.

[Bug target/94630] General bug for changes needed to switch the PowerPC long double default

2020-04-23 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630

--- Comment #7 from Michael Meissner  ---
Created attachment 48364
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48364&action=edit
Proposed patch to build ibm-ldouble.c with -mno-gnu-attributes

ibm-ldouble.c in libgcc must be compiled without GNU attributes, so that the
__ibm128 functions can be called if long double is IEEE 128-bit.

[Bug target/92218] New: PowerPC indexed insn attribute misses some insns (bswap, atomic, small int float memory)

2019-10-24 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92218

Bug ID: 92218
   Summary: PowerPC indexed insn attribute misses some insns
(bswap, atomic, small int float memory)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

In working on the PowerPC 'future' processor, I was using the 'indexed' insn
attribute to know when a certain insn used indexed addressing instead of offset
addressing.

However it fails in one crucial case.  If the address is a single register
(i.e. indirect addressing) and the insn form requires indexed addressing, the
indexed_address_mem predicate function will fail.

Off the top of my head, the places where this happens are:
1) Load/store of 8/16/32-bit integers to/from vector/FPR registers;
2) Byte swap to/from memory; or
3) Atomic memory operations.

The simplest approach is to go into each of the problematical insns, and
explicitly set 'indexed' to 'yes' for the alternatives that require indexed
addressing.
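
The failure mode can be modelled in a few lines of C.  This is only a toy model of the classification problem, not code from the rs6000 backend; the enum and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the address shapes discussed above.  */
enum addr_form {
    ADDR_REG,          /* single register (indirect) */
    ADDR_REG_OFFSET,   /* register + constant offset */
    ADDR_REG_REG       /* register + register (indexed) */
};

/* Shape-only test: a single-register address never looks indexed.  */
static bool looks_indexed(enum addr_form form)
{
    return form == ADDR_REG_REG;
}

/* Test that also honours the insn: an insn that only exists in indexed
   form is indexed even when its address is a single register.  */
static bool uses_indexed_form(enum addr_form form, bool xform_only)
{
    /* An indexed-only insn cannot take a register + offset address.  */
    assert(!(xform_only && form == ADDR_REG_OFFSET));
    return xform_only || form == ADDR_REG_REG;
}
```

For a single-register address on an indexed-only insn the two tests disagree, which is the case the attribute currently gets wrong; setting 'indexed' explicitly on those alternatives corresponds to passing xform_only as true.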

[Bug target/92218] PowerPC indexed insn attribute misses some insns (bswap, atomic, small int float/vector load/store)

2019-10-24 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92218

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-10-25
   Assignee|unassigned at gcc dot gnu.org  |meissner at gcc dot 
gnu.org
Summary|PowerPC indexed insn|PowerPC indexed insn
   |attribute misses some insns |attribute misses some insns
   |(bswap, atomic, small int   |(bswap, atomic, small int
   |float memory)   |float/vector load/store)
 Ever confirmed|0   |1

[Bug target/92218] PowerPC indexed insn attribute misses some insns (bswap, atomic, small int float/vector load/store)

2019-10-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92218

--- Comment #1 from Michael Meissner  ---
The VSX instructions that load a scalar from memory and splat it into the
register are another class of x-form only memory instructions that would need
the indexed insn attribute set.

[Bug target/93011] New: PowerPC GCC has warning that aggregate alignment changed in GCC 5

2019-12-19 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93011

Bug ID: 93011
   Summary: PowerPC GCC has warning that aggregate alignment
changed in GCC 5
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I've been doing Spec 2017 builds, and I notice that some of the benchmarks get
notes of the form:

note: the layout of aggregates containing vectors with 8-byte alignment has
changed in GCC 5

(this was from coverage.c:268 in the gcc_r benchmark).

When GCC 10 comes out, it will be 5 releases since the change was made.  I
doubt many people are just now porting code from back then.  Perhaps it is time
to retire the message.

[Bug target/93230] New: PowerPC GCC vec_extract of a vector in memory does not fold sign/zero extension into load

2020-01-10 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

Bug ID: 93230
   Summary: PowerPC GCC vec_extract of a vector in memory does not
fold sign/zero extension into load
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

Created attachment 47634
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47634&action=edit
Example code

In working on some bugs and extensions for -mcpu=future, I noticed that the
code for vec_extract is not optimal when you are extracting an 8/16/32-bit
integer from a vector in memory.  In this case, we convert the vec_extract to
be a load of the scalar value, but we don't have the proper combine insns to
fold the sign extend or zero extend into the load, which means we have to issue
a separate conversion instruction.

For example, consider:

#include <altivec.h>

unsigned long
v8hi_uns_1 (vector unsigned short *p)
{
  return (unsigned long) vec_extract (*p, 1);
}

long
v8hi_sign_1 (vector short *p)
{
  return (long) vec_extract (*p, 1);
}

It generates:

v8hi_uns_1:
lhz 3,2(3)
rlwinm 3,3,0,0x
blr

v8hi_sign_1:
lhz 3,2(3)
extsh 3,3
blr

It should generate:

v8hi_uns_1:
lhz 3,2(3)
blr

v8hi_sign_1:
lha 3,2(3)
blr

[Bug target/93230] PowerPC GCC vec_extract of a vector in memory does not fold sign/zero extension into load

2020-01-10 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-01-10
 Ever confirmed|0   |1

[Bug target/93230] PowerPC GCC vec_extract of a vector in memory does not fold sign/zero extension into load

2020-01-10 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

--- Comment #1 from Michael Meissner  ---
Created attachment 47635
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47635&action=edit
Example assembler generated for -mcpu=power9

[Bug target/93230] PowerPC GCC vec_extract of a vector in memory does not fold sign/zero extension into load

2020-01-10 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

Michael Meissner  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug target/93230] PowerPC GCC vec_extract of a vector in memory does not fold sign/zero extension into load

2020-01-10 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

--- Comment #2 from Michael Meissner  ---
There is this code in rs6000.md that thinks it is combining the conversion with
the load, but the insn is using the wrong types:

;; Optimize extracting a single scalar element from memory.
(define_insn_and_split "*vsx_extract__load"
  [(set (match_operand: 0 "register_operand" "=r")
(vec_select:
 (match_operand:VSX_EXTRACT_I 1 "memory_operand" "m")
 (parallel [(match_operand:QI 2 "" "n")])))
   (clobber (match_scratch:DI 3 "=&b"))]
  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
  "#"
  "&& reload_completed"
  [(set (match_dup 0) (match_dup 4))]
{
  operands[4] = rs6000_adjust_vec_address (operands[0], operands[1],
operands[2],
   operands[3], mode);
}
  [(set_attr "type" "load")
   (set_attr "length" "8")])

In addition, the code should also handle sign extension, and loading up the
value into a vector register.

[Bug target/93230] PowerPC GCC vec_extract of a vector in memory does not fold sign/zero extension into load

2020-01-10 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

Michael Meissner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug target/93230] PowerPC GCC vec_extract of a vector in memory does not fold sign/zero extension into load

2020-01-10 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

--- Comment #3 from Michael Meissner  ---
Also this code if the element number is variable:

;; Optimize extracting a single scalar element from memory.
(define_insn_and_split "*vsx_extract__load"
  [(set (match_operand: 0 "register_operand" "=r")
(vec_select:
 (match_operand:VSX_EXTRACT_I 1 "memory_operand" "m")
 (parallel [(match_operand:QI 2 "" "n")])))
   (clobber (match_scratch:DI 3 "=&b"))]
  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
  "#"
  "&& reload_completed"
  [(set (match_dup 0) (match_dup 4))]
{
  operands[4] = rs6000_adjust_vec_address (operands[0], operands[1],
operands[2],
   operands[3], mode);
}
  [(set_attr "type" "load")
   (set_attr "length" "8")])

[Bug target/93568] [10 regression] r10-6418 causes many ICEs

2020-02-05 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93568

Michael Meissner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |meissner at gcc dot 
gnu.org

[Bug target/93568] [10 regression] r10-6418 causes many ICEs

2020-02-05 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93568

Michael Meissner  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Michael Meissner  ---
Fixed.

[Bug target/93569] [10 regression] r10-6419 causes ICE in gcc.target/powerpc/vsx-builtin-15d.c

2020-02-05 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93569

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2020-02-05
   Assignee|unassigned at gcc dot gnu.org  |meissner at gcc dot 
gnu.org
 Ever confirmed|0   |1

[Bug target/93569] [10 regression] r10-6419 causes ICE in gcc.target/powerpc/vsx-builtin-15d.c

2020-02-06 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93569

Michael Meissner  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Michael Meissner  ---
Fixed on February 6th, 2020.
commit r10-6494-ga66219dce7fcba068a0998dd926e2ffc6857f149

[Bug target/81594] Optimize PowerPC vector set and store

2020-02-20 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81594

--- Comment #3 from Michael Meissner  ---
I looked at this a little.  The proposed patch doesn't generate the expected
code any more (due to setting the length attribute, which makes it look like
the fix generates slower code).

I re-implemented it as a peephole2 for ISA 3.0 (power9) and above.  The
peephole2 does find several places in the Spec 2017 INT benchmarks, where it
replaces:

MTVSRDD
XXPERMDI
STV

with:

STD
STD
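
In source terms, the replacement swaps "build a vector from two 64-bit values, then store it" for "store the two 64-bit values directly".  The sketch below shows that the two forms leave the same bytes in memory; it is plain C with a made-up v2di struct, and it ignores PowerPC element ordering.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a vector of two 64-bit elements.  */
typedef struct { uint64_t elt[2]; } v2di;

/* Assemble the pair into one 16-byte value, then store that value.  */
static void store_via_vector(void *dst, uint64_t a, uint64_t b)
{
    v2di tmp = { { a, b } };
    assert(dst != NULL);
    memcpy(dst, &tmp, sizeof tmp);
}

/* Store each 64-bit half on its own, skipping the vector temporary.  */
static void store_as_scalars(void *dst, uint64_t a, uint64_t b)
{
    assert(dst != NULL);
    memcpy(dst, &a, sizeof a);
    memcpy((unsigned char *)dst + sizeof a, &b, sizeof b);
}
```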

[Bug target/93932] New: PowerPC vec_extract with variable element number has code regressions for V2DI/V2DF vectors

2020-02-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93932

Bug ID: 93932
   Summary: PowerPC vec_extract with variable element number has
code regressions for V2DI/V2DF vectors
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I've been looking at vec_extract recently, both in terms of support for the
-mcpu=future and to look at supporting PR target/93230 in GCC 11.

In some cases, if you have a vec_extract built-in function where the vector is
in a register, and the element is variable, the compiler decides to store this
vector to memory, and then do the variable extract using a scalar load.
Unfortunately, this leads to a STORE-HIT-LOAD slowdown, as the scalar load will
likely have to wait for the vector store to finish.
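
The two strategies can be contrasted in portable C.  This is a sketch only: v2di is a made-up stand-in for a V2DI vector, the volatile array just forces the store and load to happen, and no timing behaviour is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a vector of two 64-bit elements.  */
typedef struct { uint64_t elt[2]; } v2di;

/* Memory path: spill the vector, then do an indexed scalar load.  On real
   hardware the load may have to wait for the store that precedes it.  */
static uint64_t extract_via_memory(v2di v, unsigned n)
{
    volatile uint64_t spill[2];
    spill[0] = v.elt[0];
    spill[1] = v.elt[1];
    return spill[n & 1];
}

/* Register path: select the element without going through memory.  */
static uint64_t extract_in_register(v2di v, unsigned n)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)(n & 1);   /* all ones if n is odd */
    return (v.elt[0] & ~mask) | (v.elt[1] & mask);
}

/* Both paths must agree for every index, including out-of-range ones.  */
static void check_paths_agree(v2di v)
{
    for (unsigned n = 0; n < 4; n++)
        assert(extract_via_memory(v, n) == extract_in_register(v, n));
}
```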

The test cases are the fold-vect-extract-<type>.p{7,8,9}.c files in the
gcc.target/powerpc directory, where <type> is 'char', 'short', 'int',
'longlong', 'float' or 'double', and the p7/p8/p9 indicates whether the test
is for -mcpu=power7, -mcpu=power8, or -mcpu=power9.

For -mcpu=power8, the regressions are:
fold-vect-extract-double.p8.c: GCC 9.x and current trunk
fold-vect-extract-longlong.p8.c: GCC 9.x and current trunk

For -mcpu=power9, the regressions are:
fold-vect-extract-double.p9.c: GCC 9.x (current trunk is ok)
fold-vect-extract-longlong.p9.c: GCC 9.x (current trunk is ok)

[Bug target/93932] PowerPC vec_extract with variable element number has code regressions for V2DI/V2DF vectors

2020-02-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93932

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-02-25
 CC||dje at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org,
   ||wschmidt at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/93932] PowerPC vec_extract with variable element number has code regressions for V2DI/V2DF vectors

2020-02-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93932

Michael Meissner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |meissner at gcc dot 
gnu.org

--- Comment #1 from Michael Meissner  ---
I've discovered that the issue is the combined insn that does variable extract
where it handles both the register case and the memory case:

(define_insn_and_split "vsx_extract__var"
  [(set (match_operand: 0 "gpc_reg_operand" "=v,wa,r")
(unspec: [(match_operand:VSX_D 1 "input_operand" "v,Q,Q")
 (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
UNSPEC_VSX_EXTRACT))
   (clobber (match_scratch:DI 3 "=r,&b,&b"))
   (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
  "#"
  "&& reload_completed"
  [(const_int 0)]
{
  rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
operands[3], operands[4]);
  DONE;
})

If I split the insn into two separate patterns, one that handles only the
register case and the other that handles only memory accesses, the compiler
doesn't create the store and does the variable extract in the register.

[Bug target/93932] PowerPC vec_extract with variable element number has code regressions for V2DI/V2DF vectors

2020-02-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93932

--- Comment #3 from Michael Meissner  ---
While I agree that in general, we should only use input_operand for moves and
define_expands, I tend to think in the short term (GCC 10) we should just fix
the case we know about.  As you point out, this is used in every single place
where we fold sign/zero/float extension into a load.

In looking at gcc 8 and gcc 9, the variable extract patterns are mostly the
same, except gcc 8 uses 'ww', etc. constraints, while gcc 9/10 uses the 'isa'
attribute to eliminate the cases using power9 instructions on power8.

I don't know why only V2DI/V2DF shows it up, when V4SF/V4SI/V8HI/V16QI use the
same construct, and why -mcpu=power9 compiles it ok on trunk, but not gcc 9.

[Bug target/93937] New: Variable vector extract & zero extend insn can never match

2020-02-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93937

Bug ID: 93937
   Summary: Variable vector extract & zero extend insn can never
match
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

In looking at the variable vector extract code, the insns that attempt to merge
a zero extend with a variable extract of a vector element will never match:

(define_insn_and_split "*vsx_extract__mode_var"
  [(set (match_operand: 0 "gpc_reg_operand" "=r,r,r")
(zero_extend:
 (unspec:
  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,v,Q")
   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
  UNSPEC_VSX_EXTRACT)))
   (clobber (match_scratch:DI 3 "=r,r,&b"))
   (clobber (match_scratch:V2DI 4 "=X,&v,X"))]
  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
  "#"
  "&& reload_completed"
  [(const_int 0)]
{
  machine_mode smode = mode;
  rs6000_split_vec_extract_var (gen_rtx_REG (smode, REGNO (operands[0])),
operands[1], operands[2],
operands[3], operands[4]);
  DONE;
}
  [(set_attr "isa" "p9v,*,*")])

It will never match, because the compiler will never generate code of the form:

(set (reg:SI)
 (zero_extend:SI
  (unspec:SI [(reg:V4SI)
  (reg:DI)] UNSPEC_VSX_EXTRACT)))

I.e. the zero_extend type should be DImode.  Obviously the issue with PR
target/93932 (using input_operand) will also apply to this insn, once the modes
are fixed.
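
To illustrate the mode mismatch outside of RTL, here is a minimal portable C
model.  It is only a sketch, not GCC code: the type v4si_model and the function
extract_zext are made-up names, and reducing the index modulo 4 is an
assumption of the model.  The point is that the element extract itself yields a
32-bit value, and the only place a 64-bit type appears is the surrounding
extension.

```c
#include <stdint.h>

/* Portable stand-in for a vector of four 32-bit elements (V4SI).
   Hypothetical type, used only for this illustration.  */
typedef struct { uint32_t elt[4]; } v4si_model;

/* Extract element N and zero-extend it to 64 bits.
   The index is reduced modulo 4, which is an assumption of this model.  */
uint64_t extract_zext(const v4si_model *v, uint64_t n)
{
    uint32_t narrow = v->elt[n & 3];  /* the extract: a 32-bit (SImode-like) value */
    return (uint64_t)narrow;          /* the zero_extend: a 64-bit (DImode-like) value */
}
```

In this model the extension is the only step with a 64-bit result, which is
the shape the report says the insn should be written to match.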

[Bug target/93937] Variable vector extract & zero extend insn can never match

2020-02-25 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93937

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2020-02-26
   Assignee|unassigned at gcc dot gnu.org  |meissner at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/56043] ICE in rs6000_builtin_vectorized_libmass for vsx-mass-1.c

2013-02-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56043

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2013-02-07
 AssignedTo|unassigned at gcc dot gnu.org  |meissner at gcc dot gnu.org
 Ever Confirmed|0   |1


[Bug target/56043] ICE in rs6000_builtin_vectorized_libmass for vsx-mass-1.c

2013-02-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56043

--- Comment #1 from Michael Meissner  2013-02-07
20:27:19 UTC ---
Created attachment 29390
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29390
Patch to fix the problem

There are two problems here.

The first problem is the segmentation fault if the builtin function does not
have an implicit function.  The patch adds code to return NULL_TREE in this
case, rather than cause a segmentation violation due to a NULL pointer.

However, in the case of powerpc-none-eabi, the vsx-mass-1.c test would still
fail, since some of the builtin functions are not treated as builtin (such as
atan2, which is what caused the fault).  Since the MASS library is only
available for powerpc Linux, I have restricted the test to only run on
powerpc*-*-linux*.
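
The NULL guard described for the first problem can be sketched in plain C.
This is a toy model and not the code in the patch: struct decl, the
implicit_decls table, and vectorized_base_name are all hypothetical names, and
a NULL table slot merely stands in for a function that the target does not
treat as a builtin.

```c
#include <stddef.h>

/* Hypothetical stand-in for a builtin declaration (not GCC's tree type).  */
struct decl { const char *name; };

enum fn_code { FN_SIN = 0, FN_ATAN2 = 1, FN_COUNT };

static const struct decl sin_decl = { "sin" };

/* Hypothetical table of implicit declarations.  The NULL slot models a
   target where atan2 has no implicit builtin declaration.  */
static const struct decl *const implicit_decls[FN_COUNT] = {
    &sin_decl,
    NULL
};

/* Return the base function name, or NULL when there is no implicit
   declaration, instead of dereferencing a NULL pointer.  */
const char *vectorized_base_name(enum fn_code fn)
{
    if ((unsigned)fn >= FN_COUNT)
        return NULL;

    const struct decl *d = implicit_decls[fn];
    if (d == NULL)
        return NULL;   /* the guard, in the spirit of returning NULL_TREE */

    return d->name;
}
```

A caller that receives NULL simply declines to vectorize the call, which is
the behaviour the comment describes for the missing-declaration case.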


[Bug debug/55586] Incorrect .debug_line section for function with variable number of arguments in PowerPC

2013-02-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55586

--- Comment #2 from Michael Meissner  2013-02-07
23:49:34 UTC ---
As far as I can tell, it is a bug in earlier versions of GDB, and not in the
compiler.

Due to the ABIs, it will only show up in 32-bit powerpc with an older GDB.
The 64-bit powerpc has a completely different ABI, and for stdarg functions, it
does not pass the values in floating point registers, and it doesn't use CR6 to
indicate that the floating point values were passed.  So there isn't a jump,
etc.

I tested GCC 4.8 and 4.7 and found that they generated essentially the same
code for the debugging information.  On my SLES 10 system, I even used the
system compiler, which is 4.1.2 based, and it generated the same debug code.
If I used a GDB that was 7.3 or newer (SLES 11 SP2, IBM Advance Toolchain 5.0,
etc.) and put a breakpoint on my_function, the debugger puts the breakpoint on
the STWU instruction, and it hits the breakpoint.

If I use the system debugger on SLES 10, which is version 7.1, the debugger
skips the function start, and puts the breakpoint on the first STFD instruction
as you mention, and it won't hit the breakpoint unless you pass floating point
values in the floating point registers.

Here is the assembler output from one of the compilers for -O -m32.  Note,
there is a .loc before the first instruction at line 9 (the beginning of the
function).



my_function:
.LFB12:
        .file 1 "bug-55586.c"
        .loc 1 9 0
.LVL0:
        stwu 1,-128(1)
.LCFI0:
        mflr 0
.LCFI1:
        stw 31,124(1)
.LCFI2:
        stw 0,132(1)
.LCFI3:
        stw 4,28(1)
        stw 5,32(1)
        stw 6,36(1)
        stw 7,40(1)
        stw 8,44(1)
        stw 9,48(1)
        stw 10,52(1)
        bne 1,.L2
        .loc 1 9 0
        stfd 1,56(1)
        stfd 2,64(1)
        stfd 3,72(1)
        stfd 4,80(1)
        stfd 5,88(1)
        stfd 6,96(1)
        stfd 7,104(1)
        stfd 8,112(1)
.L2:


[Bug target/56043] ICE in rs6000_builtin_vectorized_libmass for vsx-mass-1.c

2013-02-08 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56043

--- Comment #2 from Michael Meissner  2013-02-08
19:36:12 UTC ---
Author: meissner
Date: Fri Feb  8 19:36:04 2013
New Revision: 195898

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195898
Log:
[gcc]
2013-02-07  Michael Meissner

PR target/56043
* config/rs6000/rs6000.c (rs6000_builtin_vectorized_libmass):
If there is no implicit builtin declaration, just return NULL.

[gcc/testsuite]
2013-02-07  Michael Meissner

PR target/56043
* gcc.target/powerpc/vsx-mass-1.c: Only run this test on
powerpc*-*-linux*.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/powerpc/vsx-mass-1.c


[Bug target/56043] ICE in rs6000_builtin_vectorized_libmass for vsx-mass-1.c

2013-02-08 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56043

--- Comment #3 from Michael Meissner  2013-02-08
19:47:07 UTC ---
Author: meissner
Date: Fri Feb  8 19:46:52 2013
New Revision: 195899

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195899
Log:
[gcc]
2013-02-08  Michael Meissner

PR target/56043
* config/rs6000/rs6000.c (rs6000_builtin_vectorized_libmass):
If there is no implicit builtin declaration, just return NULL.

[gcc/testsuite]
2013-02-08  Michael Meissner

PR target/56043
* gcc.target/powerpc/vsx-mass-1.c: Only run this test on
powerpc*-*-linux*.

Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/config/rs6000/rs6000.c
branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
branches/gcc-4_7-branch/gcc/testsuite/gcc.target/powerpc/vsx-mass-1.c


[Bug target/56043] ICE in rs6000_builtin_vectorized_libmass for vsx-mass-1.c

2013-02-08 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56043

Michael Meissner  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #4 from Michael Meissner  2013-02-08
19:50:26 UTC ---
Fixed in the mainline with subversion id 195898.
Fixed in the 4.7 branch with subversion id 195899.


[Bug target/50494] gcc.dg/vect/vect-reduc-2char.c fails spuriously on ppc with -flto

2013-02-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50494

--- Comment #5 from Michael Meissner  2013-02-12
18:13:45 UTC ---
Created attachment 29426
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29426
Assembly file of slp-perm-1.c after lto with -mcpu=power6 -O3 -maltivec


[Bug target/50494] gcc.dg/vect/vect-reduc-2char.c fails spuriously on ppc with -flto

2013-02-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50494

--- Comment #6 from Michael Meissner  2013-02-12
18:16:28 UTC ---
Created attachment 29427
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29427
slp-perm-1.c assembly file before LTO is run with -mcpu=power6 -O3 -maltivec


[Bug lto/50494] gcc.dg/vect/vect-reduc-2char.c fails spuriously on ppc with -flto

2013-02-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50494

Michael Meissner  changed:

   What|Removed |Added

  Component|target  |lto

--- Comment #7 from Michael Meissner  2013-02-12
18:25:38 UTC ---
I am switching this to LTO instead of target, as it appears to be an LTO bug.
Before LTO is run, the alignment of the .rodata section is 16 byte alignment
since the array used to initialize the auto array is copied with altivec
instructions.  After LTO, the alignment of the .rodata section is 4 bytes.  The
powerpc Altivec instructions ignore the bottom 4 bits of the address, and so
depending on what else is linked, the test will randomly fail or succeed.

I added attachments from compiling slp-perm-1.c with -O3 -mcpu=power6 -maltivec
-save-temps to give the asm files.
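
The failure mode can be modelled in portable C.  This is a sketch under stated
assumptions, not the test case itself: load16_truncating and load_is_correct
are hypothetical helpers, memory is modelled as a byte array, and the "address"
is just an offset into that array.  It shows why a 16-byte copy works when the
data happens to land on a 16-byte boundary and silently reads the wrong bytes
otherwise.

```c
#include <stdint.h>
#include <string.h>

/* Model of an Altivec-style 16-byte load: the bottom 4 bits of the address
   are ignored, so the load starts at the enclosing 16-byte boundary.
   MEM stands in for memory and ADDR is an offset into it.  */
void load16_truncating(const uint8_t *mem, size_t addr, uint8_t out[16])
{
    size_t ea = addr & ~(size_t)15;   /* drop the bottom 4 bits */
    memcpy(out, mem + ea, 16);
}

/* Return 1 if the truncating load at ADDR yields the 16 bytes that really
   start at ADDR, and 0 if it picked up different data.  */
int load_is_correct(const uint8_t *mem, size_t addr)
{
    uint8_t got[16];
    load16_truncating(mem, addr, got);
    return memcmp(got, mem + addr, 16) == 0;
}
```

With only 4-byte alignment guaranteed, whether the table lands on a multiple of
16 depends on what precedes it in the section, which matches the "randomly fail
or succeed" behaviour described above.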


[Bug lto/50494] gcc.dg/vect/vect-reduc-2char.c fails spuriously on ppc with -flto

2013-02-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50494

--- Comment #9 from Michael Meissner  2013-02-12
19:07:18 UTC ---
The -fsection-anchors option appears to be important.  If I use
-fsection-anchors (which is default for powerpc64-linux), LTO does not align
the .rodata section, but uses Altivec memory instructions.  If I use
-fno-section-anchors, the .rodata section is not aligned, but it doesn't use
Altivec memory instructions, so the test passes.


[Bug lto/50494] gcc.dg/vect/vect-reduc-2char.c fails spuriously on ppc with -flto

2013-02-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50494

--- Comment #10 from Michael Meissner  2013-02-12
19:16:56 UTC ---
If -fno-merge-constants (and the default -fsection-anchors) is used, then the
correct alignment for the table is set (and Altivec memory instructions are
used).

At a guess, the bug is likely to be in the gimplify_init_constructor function
in gimplify.c.


[Bug lto/50494] gcc.dg/vect/vect-reduc-2char.c fails spuriously on ppc with -flto

2013-02-13 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50494

--- Comment #15 from Michael Meissner  2013-02-13
22:38:12 UTC ---
The patch does align the .rodata section to 16 byte alignment, but the code to
load up the auto vector from constant memory does not do vectorization.

If I use -fno-section-anchors, it aligns .rodata to 4 byte alignment, and does
not vectorize the code.

If I use -fno-merge-constants, it aligns .rodata to 16 byte alignment, and does
vectorize the code.

If I use -fno-merge-constants without -flto, it aligns .rodata to 16 byte
alignment, but it uses unaligned vector loads/stores.

So the patch does help in that the tests now pass that were randomly failing.

While it would be nice if we could get the initialization to be vectorized, I'm
not sure how performance critical this is.

Eric: if the alignment of the constant data that is used to initialize the auto
array is a mismatch, and you use Altivec instructions, when the compiler
auto-vectorizes the copy, the wrong data gets used.


[Bug target/57150] New: GCC when targeting power7 spills long double using VSX instructions.

2013-05-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57150

 Bug #: 57150
   Summary: GCC when targeting power7 spills long double using VSX
instructions.
Classification: Unclassified
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org


Created attachment 30008
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30008
Cut down example to show the problem, using -mcpu=power7 -m64

In the glibc file e_scalbl.c, the compiler is using VSX stxvd2x and lxvd2x
instructions to spill long double, even though only 1/2 of the register is
used.  The compiler should use scalar load/store instructions.


[Bug target/57150] GCC when targeting power7 spills long double using VSX instructions.

2013-05-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57150

--- Comment #1 from Michael Meissner  2013-05-02
19:37:21 UTC ---
Created attachment 30009
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30009
Assembler file


[Bug target/57150] GCC when targeting power7 spills long double using VSX instructions.

2013-05-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57150

Michael Meissner  changed:

   What|Removed |Added

 Target||powerpc64-gnu-linux
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2013-05-02
   Host||powerp64-gnu-linux
 Ever Confirmed|0   |1
  Known to fail||4.5.0
  Build||powerpc64-gnu-linux

--- Comment #2 from Michael Meissner  2013-05-02
19:42:51 UTC ---
This goes back to the original VSX submission for GCC 4.5.

While the code is slow, it does appear to be correct.


[Bug target/57150] GCC when targeting power7 spills long double using VSX instructions.

2013-05-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57150

--- Comment #3 from Michael Meissner  2013-05-02
21:03:08 UTC ---
It shows up due to -fcaller-saves, which creates a V2DF save area.


[Bug target/57150] GCC when targeting power7 spills long double using VSX instructions.

2013-05-03 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57150

--- Comment #4 from Michael Meissner  2013-05-03
19:18:21 UTC ---
Created attachment 30028
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30028
Patch to use scalar modes for TF/TD caller saves.


[Bug target/57150] GCC when targeting power7 spills long double using VSX instructions.

2013-05-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57150

Michael Meissner  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #5 from Michael Meissner  2013-05-07
16:26:02 UTC ---
Fixed in subversion id 198593.


[Bug target/52775] Change default for using FCFID instruction

2012-08-16 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52775

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #4 from Michael Meissner  2012-08-16 
22:58:02 UTC ---
Fixed in April, 2012.


[Bug target/53487] [4.8 Regression] Unrecognizable insn for conditional move

2012-08-16 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53487

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #5 from Michael Meissner  2012-08-16 
22:59:35 UTC ---
Fixed on June 5, 2012.


[Bug target/52495] rs6000.c fails to (cross-) build: "implicit declaration of function ‘ASM_WEAKEN_DECL’"

2012-08-16 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52495

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-08-16
 CC||meissner at gcc dot gnu.org
 Ever Confirmed|0   |1

--- Comment #1 from Michael Meissner  2012-08-16 
23:14:17 UTC ---
If the configure scripts think the cross assembler does not support .weak
symbols, the compiler will fail because it does not define ASM_WEAKEN_DECL. 
Note, when I tried this on August 16th, 2012, the current head of binutils
seems broken (the archiver segfaults), but the 2_21 branch builds it fine on my
Linux system with a target of powerpc64-linux and additional targets of
powerpc-linux.  Obviously the compiler should do something more appropriate if
the assembler does not support .weak symbols.


[Bug target/47251] New: Powerpc doesn't like -m32 -msoft-float -mcpu=power7

2011-01-10 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47251

   Summary: Powerpc doesn't like -m32 -msoft-float -mcpu=power7
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: meiss...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux


Created attachment 22941
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22941
Function from libgcc.a that fails with -m32 -mcpu=power7 -msoft-float

If you build GCC with --with-cpu=power7, it fails in building libgcc for -m32
-msoft-float.  This is due to floatunsdidf/floatunsdfdi_mem not having checks
for TARGET_HARD_FLOAT.

The error is:
/home/meissner/fsf-src/trunk-p7/libgcc/../gcc/libgcc2.c: In function
‘__fixunssfdi’:
/home/meissner/fsf-src/trunk-p7/libgcc/../gcc/libgcc2.c:1340:1: error: unable
to generate reloads for:
(insn 24 22 25 2 (set (reg:DF 3 3)
(unsigned_float:DF (reg:DI 10 10 [orig:138 hi+-4 ] [138])))
/home/meissner/fsf-src/trunk-p7/libgcc/../gcc/libgcc2.c:1297 314
{*floatunsdidf2_fcfidu}
 (expr_list:REG_DEAD (reg:DI 10 10 [orig:138 hi+-4 ] [138])
(nil)))
/home/meissner/fsf-src/trunk-p7/libgcc/../gcc/libgcc2.c:1340:1: internal
compiler error: in find_reloads, at reload.c:3805
Please submit a full bug report,
with preprocessed source if appropriate.


[Bug target/47251] Powerpc doesn't like -m32 -msoft-float -mcpu=power7

2011-01-10 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47251

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2011.01.10 21:27:14
 Ever Confirmed|0   |1


[Bug target/47251] Powerpc doesn't like -m32 -msoft-float -mcpu=power7

2011-01-10 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47251

Michael Meissner  changed:

   What|Removed |Added

   Target Milestone|--- |4.6.0


[Bug target/47272] New: In addition to the bug uncovered in 42751, gcc can't bootstrap using --with-cpu=power7

2011-01-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47272

   Summary: In addition to the bug uncovered in 42751, gcc can't
bootstrap using --with-cpu=power7
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: target
AssignedTo: meiss...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org
CC: berg...@vnet.ibm.com
Depends on: 42751
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux


The VSX support changed to use the VSX form of the instruction if both VSX and
Altivec forms existed.  Unfortunately, there are differences between the
Altivec memory references instructions (LVX/STVX) and the VSX memory reference
instructions (LXVW4X/STXVW4X).  In particular, the Altivec memory instructions
ignore the bottom 4 bits of the address field, and the VSX instructions do not.
The Altivec code in libcpp/lex.c was written to rely on the bottom 4 bits of
the load address being ignored.
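
The difference between the two instruction families comes down to how the
effective address is formed.  The following C sketch is illustrative only:
altivec_ea, vsx_ea, and ea_skew are made-up helper names.  LVX/STVX clear the
low four bits of the effective address (16-byte alignment), while the model of
LXVW4X/STXVW4X assumes the address is used exactly as given.

```c
#include <stdint.h>

/* Altivec LVX/STVX: the low 4 bits of the effective address are ignored,
   so the access always starts on a 16-byte boundary.  */
uint64_t altivec_ea(uint64_t addr)
{
    return addr & ~UINT64_C(15);
}

/* VSX LXVW4X/STXVW4X, as modelled here: the address is used unchanged.  */
uint64_t vsx_ea(uint64_t addr)
{
    return addr;
}

/* Number of bytes by which the two forms disagree for a given address.
   Zero means code written for one form behaves the same with the other.  */
uint64_t ea_skew(uint64_t addr)
{
    return vsx_ea(addr) - altivec_ea(addr);
}
```

Code that passes an unaligned pointer and counts on the truncation sees a
nonzero skew once the VSX form is substituted, so it reads from a different
location than it was written to expect.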

Thus we should modify __builtin_vec_ld and __builtin_vec_st to use the Altivec
versions of the instructions, and provide other builtins that can use either
the altivec or VSX memory instructions, depending on the switches used.

In addition, during testing, I discovered that __builtin_vec_ld and
__builtin_vec_st don't support the vector double and vector long long types
added with VSX.


[Bug target/47272] GCC can't bootstrap on powerpc64-linx using --with-cpu=power7

2011-01-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47272

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2011.01.12 21:53:09
 Ever Confirmed|0   |1


[Bug target/47272] GCC can't bootstrap on powerpc64-linx using --with-cpu=power7

2011-01-12 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47272

--- Comment #1 from Michael Meissner  2011-01-12 
21:54:25 UTC ---
Note, the fixes for 47251 will be needed in addition to changes for this bug in
order to do a full bootstrap on a power7 system using the --with-cpu=power7
configure option.


[Bug regression/47385] New: Test gcc.target/powerpc/pr37168.c fails if compiled using a compiled configured with --with-cpu=power7

2011-01-20 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47385

   Summary: Test gcc.target/powerpc/pr37168.c fails if compiled
using a compiled configured with --with-cpu=power7
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: regression
AssignedTo: meiss...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux


The test case pr37168 fails if VSX instructions are enabled.  This is due to
the fact that the vector constant used has 4 single precision floating point
values, and the compiler thinks it can create this via Altivec integer
instructions (the bit value is 26, and the compiler wants to load 13 into each
word and then double it to get 26).  The test fails because, when VSX is
enabled, V4SF uses the VSX vector unit and not the Altivec vector unit.  The
fix is to allow either VSX or Altivec vector units.
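
The "load 13 and double it" trick can be sketched as a small classifier.  This
is a simplified illustration and not the compiler's actual constant-recognition
code, which handles more cases: classify_splat and the enum names are
hypothetical, and the only hardware fact relied on is that the Altivec splat
immediate instructions (such as vspltisw) take a 5-bit signed immediate, giving
a range of -16 to 15.

```c
/* Range of the 5-bit signed immediate of the Altivec splat-immediate
   instructions (for example vspltisw).  */
enum { SPLAT_MIN = -16, SPLAT_MAX = 15 };

/* How a per-word integer bit pattern could be materialized in this model.  */
enum splat_kind {
    SPLAT_NONE,          /* needs some other sequence, e.g. a memory load */
    SPLAT_DIRECT,        /* fits the immediate directly */
    SPLAT_THEN_DOUBLE    /* splat half the value, then add it to itself */
};

enum splat_kind classify_splat(int value)
{
    if (value >= SPLAT_MIN && value <= SPLAT_MAX)
        return SPLAT_DIRECT;

    /* An even value whose half fits the immediate can be built by
       splatting the half and doubling it (26 = 13 + 13).  */
    if (value % 2 == 0
        && value / 2 >= SPLAT_MIN && value / 2 <= SPLAT_MAX)
        return SPLAT_THEN_DOUBLE;

    return SPLAT_NONE;
}
```

For the bit value 26 from the test, the model picks the splat-then-double
route, which is the sequence the report says the compiler wanted to use.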


[Bug regression/47385] Test gcc.target/powerpc/pr37168.c fails if compiled using a compiled configured with --with-cpu=power7

2011-01-20 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47385

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2011.01.20 20:43:06
 Ever Confirmed|0   |1


[Bug regression/47385] Test gcc.target/powerpc/pr37168.c fails if compiled using a compiled configured with --with-cpu=power7

2011-01-20 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47385

--- Comment #1 from Michael Meissner  2011-01-20 
20:44:09 UTC ---
Created attachment 23051
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23051
Patch to fix the problem


[Bug target/47251] Powerpc doesn't like -m32 -msoft-float -mcpu=power7

2011-01-20 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47251

Michael Meissner  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #1 from Michael Meissner  2011-01-20 
20:50:49 UTC ---
Fixed on January 13th, 2011.


[Bug target/47272] GCC can't bootstrap on powerpc64-linx using --with-cpu=power7

2011-01-20 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47272

--- Comment #2 from Michael Meissner  2011-01-20 
20:57:54 UTC ---
Created attachment 23052
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23052
Preliminary patch to allow --with-cpu=power7 to work

The root problem is under VSX, the vec_ld/vec_st builtins use VSX memory
instructions which have different semantics than Altivec when the memory is not
aligned.  The Altivec speedup in libcpp/lex.c specifically knows about the
Altivec behaviour and accesses the wrong memory location if the compiler is
built with VSX instructions.

This patch changes vec_ld/vec_st to go back to using Altivec instructions.  It
also adds vector double/vector long long support to the Altivec builtin whole
vector memory operations.  However, in doing so, it may affect users who have
been using GCC 4.5 for VSX and who expect VSX instructions to be used.  I
anticipate this is not the final patch for the problem.


[Bug target/47408] New: Several of the Altivec tests fail if run with a compiler built with --with-cpu=power7

2011-01-21 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47408

   Summary: Several of the Altivec tests fail if run with a
compiler built with --with-cpu=power7
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: meiss...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux


Some of the Altivec tests fail if the default cpu for the powerpc compiler is
power7, because these tests are looking for specific code sequences and/or
errors.  The fix is to add -mno-vsx to the options.


[Bug target/47408] Several of the Altivec tests fail if run with a compiler built with --with-cpu=power7

2011-01-21 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47408

--- Comment #1 from Michael Meissner  2011-01-21 
20:00:09 UTC ---
Created attachment 23072
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23072
Patch that adds -mno-vsx to altivec tests


[Bug target/47408] Several of the Altivec tests fail if run with a compiler built with --with-cpu=power7

2011-01-21 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47408

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2011.01.21 20:03:18
 Ever Confirmed|0   |1


[Bug target/47408] Several of the Altivec tests fail if run with a compiler built with --with-cpu=power7

2011-01-24 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47408

--- Comment #2 from Michael Meissner  2011-01-24 
16:47:20 UTC ---
Author: meissner
Date: Mon Jan 24 16:47:16 2011
New Revision: 169167

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169167
Log:
Fix PR 47408 and 47385

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/altivec.md
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/ext/altivec-15.C
trunk/gcc/testsuite/g++.dg/ext/altivec-types-1.C
trunk/gcc/testsuite/g++.dg/ext/altivec-types-2.C
trunk/gcc/testsuite/g++.dg/ext/altivec-types-3.C
trunk/gcc/testsuite/g++.dg/ext/altivec-types-4.C
trunk/gcc/testsuite/gcc.target/powerpc/altivec-11.c
trunk/gcc/testsuite/gcc.target/powerpc/altivec-14.c
trunk/gcc/testsuite/gcc.target/powerpc/altivec-33.c
trunk/gcc/testsuite/gcc.target/powerpc/altivec-types-1.c
trunk/gcc/testsuite/gcc.target/powerpc/altivec-types-2.c
trunk/gcc/testsuite/gcc.target/powerpc/altivec-types-3.c
trunk/gcc/testsuite/gcc.target/powerpc/altivec-types-4.c
trunk/gcc/testsuite/gcc.target/powerpc/ppc-vector-memcpy.c
trunk/gcc/testsuite/gcc.target/powerpc/ppc-vector-memset.c


[Bug target/47408] Several of the Altivec tests fail if run with a compiler built with --with-cpu=power7

2011-01-24 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47408

--- Comment #3 from Michael Meissner  2011-01-24 
16:57:07 UTC ---
Author: meissner
Date: Mon Jan 24 16:57:04 2011
New Revision: 169168

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169168
Log:
Fix PR 47408 and 47385

Modified:
branches/ibm/gcc-4_5-branch/gcc/ChangeLog.ibm
branches/ibm/gcc-4_5-branch/gcc/config/rs6000/altivec.md
branches/ibm/gcc-4_5-branch/gcc/testsuite/ChangeLog.ibm
branches/ibm/gcc-4_5-branch/gcc/testsuite/g++.dg/ext/altivec-15.C
branches/ibm/gcc-4_5-branch/gcc/testsuite/g++.dg/ext/altivec-types-1.C
branches/ibm/gcc-4_5-branch/gcc/testsuite/g++.dg/ext/altivec-types-2.C
branches/ibm/gcc-4_5-branch/gcc/testsuite/g++.dg/ext/altivec-types-3.C
branches/ibm/gcc-4_5-branch/gcc/testsuite/g++.dg/ext/altivec-types-4.C
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/altivec-11.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/altivec-14.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/altivec-33.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/altivec-types-1.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/altivec-types-2.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/altivec-types-3.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/altivec-types-4.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/ppc-vector-memcpy.c
branches/ibm/gcc-4_5-branch/gcc/testsuite/gcc.target/powerpc/ppc-vector-memset.c


[Bug target/43154] vec_mergel and vec_mergeh should support V2DF/V2DI

2011-01-24 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43154

Michael Meissner  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #4 from Michael Meissner  2011-01-24 
19:52:28 UTC ---
Fixed on February 2, 2010.


[Bug target/47580] New: Powerpc GCC fails test gcc.dg/pr41551.c if built with --with-cpu=power7

2011-02-01 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47580

   Summary: Powerpc GCC fails test gcc.dg/pr41551.c if built with
--with-cpu=power7
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: meiss...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux


Test gcc.dg/pr41551.c fails for powerpc if the default target is power7.

This is due to the fact that the expander for floatunsdidf (and others) uses
gpc_reg_operand:

(define_expand "floatunsdidf2"
  [(set (match_operand:DF 0 "gpc_reg_operand" "")
(unsigned_float:DF
 (match_operand:DI 1 "gpc_reg_operand" "")))]
  "TARGET_HARD_FLOAT && (TARGET_FCFIDU || VECTOR_UNIT_VSX_P (DFmode))"
  "")

However, the corresponding VSX matcher uses vsx_register_operand:
(define_insn "vsx_floatuns2"
  [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,?wa")
(unsigned_float:VSX_B (match_operand: 1 "vsx_register_operand"
",")))]
  "VECTOR_UNIT_VSX_P (mode)"
  "xcvux %x0,%x1"
  [(set_attr "type" "")
   (set_attr "fp_type" "")])

Gpc_reg_operand allows the virtual stack registers while vsx_register_operand
does not.  Since the test is:

__extension__ typedef __SIZE_TYPE__ size_t;

int main(void)
{
 int var, *p = &var;
 return (double)(size_t)(p);
}

It means the expander creates:

(insn 5 4 6 3 (set (reg:DF 125)
(unsigned_float:DF (reg/f:DI 115 virtual-stack-vars))) pr41551.c:11 -1
 (nil))

This insn then fails to match when the target is VSX.  There are several
different ways this can be solved:
  1) Allow virtual stack registers to be used in the vsx register operands;
  2) Add a new predicate that doesn't allow virtual stack registers in the
expander;
  3) Add code in the expander to copy the operand if it is in a virtual
register.


[Bug target/47580] Powerpc GCC fails test gcc.dg/pr41551.c if built with --with-cpu=power7

2011-02-01 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47580

--- Comment #1 from Michael Meissner  2011-02-01 
19:02:58 UTC ---
Author: meissner
Date: Tue Feb  1 19:02:55 2011
New Revision: 169499

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169499
Log:
Fix PR 47580

Modified:
branches/ibm/power7-meissner/gcc/ChangeLog.power7
branches/ibm/power7-meissner/gcc/config/rs6000/predicates.md


[Bug target/47580] Powerpc GCC fails test gcc.dg/pr41551.c if built with --with-cpu=power7

2011-02-01 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47580

--- Comment #2 from Michael Meissner  2011-02-01 
19:09:53 UTC ---
Created attachment 23203
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23203
Patch that allows virtual registers in vsx register predicates.


[Bug target/47580] Powerpc GCC fails test gcc.dg/pr41551.c if built with --with-cpu=power7

2011-02-01 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47580

Michael Meissner  changed:

   What|Removed |Added

  Attachment #23203|0   |1
is obsolete||

--- Comment #3 from Michael Meissner  2011-02-01 
19:17:48 UTC ---
Created attachment 23204
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23204
Replacement patch


[Bug target/47580] Powerpc GCC fails test gcc.dg/pr41551.c if built with --with-cpu=power7

2011-02-01 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47580

Michael Meissner  changed:

   What|Removed |Added

  Attachment #23204|0   |1
is obsolete||

--- Comment #4 from Michael Meissner  2011-02-02 
01:16:01 UTC ---
Created attachment 23207
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23207
Replacement patch #2


[Bug target/47580] Powerpc GCC fails test gcc.dg/pr41551.c if built with --with-cpu=power7

2011-02-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47580

--- Comment #5 from Michael Meissner 2011-02-03 00:41:21 UTC ---
Author: meissner
Date: Thu Feb  3 00:41:16 2011
New Revision: 169776

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169776
Log:
Fix PR target/47580

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/vsx.md


[Bug target/47272] GCC can't bootstrap on powerpc64-linux using --with-cpu=power7

2011-02-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47272

--- Comment #3 from Michael Meissner 2011-02-03 05:42:23 UTC ---
Author: meissner
Date: Thu Feb  3 05:42:19 2011
New Revision: 169780

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169780
Log:
Fix PR target/47272

Added:
trunk/gcc/testsuite/gcc.target/powerpc/vsx-builtin-8.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/altivec.h
trunk/gcc/config/rs6000/altivec.md
trunk/gcc/config/rs6000/rs6000-builtin.def
trunk/gcc/config/rs6000/rs6000-c.c
trunk/gcc/config/rs6000/rs6000-protos.h
trunk/gcc/config/rs6000/rs6000.c
trunk/gcc/config/rs6000/rs6000.h
trunk/gcc/config/rs6000/vector.md
trunk/gcc/config/rs6000/vsx.md
trunk/gcc/doc/extend.texi
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/powerpc/avoid-indexed-addresses.c
trunk/gcc/testsuite/gcc.target/powerpc/ppc32-abi-dfp-1.c
trunk/gcc/testsuite/gcc.target/powerpc/ppc64-abi-dfp-1.c


[Bug target/47272] GCC can't bootstrap on powerpc64-linux using --with-cpu=power7

2011-02-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47272

Michael Meissner changed:

           What    |Removed     |Added
----------------------------------------
             Status|ASSIGNED    |RESOLVED
         Resolution|            |FIXED

--- Comment #4 from Michael Meissner 2011-02-03 05:43:34 UTC ---
Patch committed Feb. 3, 2011, subversion id 169780.


[Bug target/47580] Powerpc GCC fails test gcc.dg/pr41551.c if built with --with-cpu=power7

2011-02-02 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47580

Michael Meissner changed:

           What    |Removed     |Added
----------------------------------------
             Status|UNCONFIRMED |RESOLVED
         Resolution|            |FIXED

--- Comment #6 from Michael Meissner 2011-02-03 05:51:03 UTC ---
Patch checked in on Feb. 2nd, 2011, subversion id 169776.


[Bug tree-optimization/46728] [4.6 Regression] GCC no longer generates fmadd for pow (x, 0.75)+y on powerpc

2011-02-04 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46728

--- Comment #2 from Michael Meissner 2011-02-04 18:11:09 UTC ---
When the initial changes for bug 42694 were added to optimize pow (x, 0.75)
into sqrt(sqrt(x))*sqrt(x) under fast math, there was a desire to move this RTL
optimization to the tree level.  Ideally it should run before the passes that
vectorize math functions and form FMA (fused multiply-add) operations.

Here is the discussion about the changes in April 2010:
http://gcc.gnu.org/ml/gcc-patches/2010-04/msg00788.html

Presumably most of the optimizations done in expand_builtin_pow,
expand_builtin_powi and expand_builtin_pow_root in the builtins.c file should
be moved to a tree pass.


[Bug target/47636] New: Powerpc rs6000.md uses RS6000_RECIP_HAVE_RSQRT_P

2011-02-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47636

   Summary: Powerpc rs6000.md uses RS6000_RECIP_HAVE_RSQRT_P
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: meiss...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux


There is a typo in rs6000.md in the rsqrt generator functions.  It refers
to the RS6000_RECIP_HAVE_RSQRT_P macro, but the actual macro is
RS6000_RECIP_HAVE_RSQRTE_P.  You get a warning during the build that the
function is unknown, but it doesn't stop the build, since the compiler just
puts out a relocation for an RS6000_RECIP_HAVE_RSQRT_P function to be resolved
later.  However, since the rsqrt generators are never called, you never get an
error.


[Bug target/47636] Powerpc rs6000.md uses RS6000_RECIP_HAVE_RSQRT_P

2011-02-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47636

Michael Meissner changed:

           What    |Removed     |Added
----------------------------------------
             Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed|            |2011.02.07 19:43:31
     Ever Confirmed|0           |1


[Bug target/47636] Powerpc rs6000.md uses RS6000_RECIP_HAVE_RSQRT_P

2011-02-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47636

--- Comment #1 from Michael Meissner 2011-02-07 20:32:51 UTC ---
Author: meissner
Date: Mon Feb  7 20:32:45 2011
New Revision: 169901

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169901
Log:
Fix PR target/47636

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.md


[Bug target/47636] Powerpc rs6000.md uses RS6000_RECIP_HAVE_RSQRT_P

2011-02-07 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47636

--- Comment #2 from Michael Meissner 2011-02-07 20:35:02 UTC ---
Created attachment 23269
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23269
Patch that fixes the problem

Spell RS6000_RECIP_HAVE_RSQRTE_P correctly.


[Bug target/47755] New: VSX code generates a TOC reference to clear memory

2011-02-15 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47755

   Summary: VSX code generates a TOC reference to clear memory
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: target
AssignedTo: meiss...@gcc.gnu.org
ReportedBy: meiss...@gcc.gnu.org
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux


Suppose you have an array of pointers or longs in 64-bit mode that you clear
with a loop like the following, with automatic vectorization enabled:

for (i = 0; i < sizeof (array) / sizeof (array[0]); i++)
  array[i] = 0;

The compiler generates the 128-bit zero constant, puts it into the constant
pool, and loads it from memory in order to set the array with vector
instructions.

Normally this would be a missed optimization, but we discovered it in compiling
_dl_start with -O3 -mcpu=power7, and at the time _dl_start is run, the TOC
registers are not yet set up, so the program crashes before it starts.
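
A self-contained version of the loop, as a sketch for reproducing the
behavior; the array size and function names are assumptions, and this is not
the pr47755.c test that was committed.  Compiling it with the -O3 -mcpu=power7
options mentioned above and reading the generated assembly should show whether
the zero vector is materialized in a register or loaded from memory.

```c
#include <stddef.h>

#define N 64                    /* hypothetical array size */

long array[N];                  /* 64-bit elements on powerpc64-linux */

/* Clear the array with the loop from the report.  With vectorization
   enabled this is the loop that becomes V2DI vector stores of zero.  */
void clear_array(void)
{
    size_t i;

    for (i = 0; i < sizeof (array) / sizeof (array[0]); i++)
        array[i] = 0;
}

/* Hypothetical helper for checking the result: count nonzero elements.  */
size_t count_nonzero(void)
{
    size_t i, n = 0;

    for (i = 0; i < sizeof (array) / sizeof (array[0]); i++)
        if (array[i] != 0)
            n++;
    return n;
}
```

The helper only checks that the array ends up cleared; the bug itself is
about where the zero constant comes from, which is visible only in the
assembly output.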

The cause of the bug is that neither VSX_VECTOR_MODE nor ALTIVEC_VECTOR_MODE
is true for V2DI mode, since there are no native 64-bit integer operations in
the VSX or Altivec vector instructions.  This means that easy_vector_constant
fails, which in turn makes LEGITIMATE_CONSTANT_P fail.

The solution is to use macros that test whether Altivec/VSX memory references
can be done, instead of macros that say we have native arithmetic support for
those modes.
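
The difference between the two kinds of test can be shown with a toy model in
C.  Every name below is hypothetical and none of this is GCC source; it only
illustrates why a check keyed on native arithmetic rejects V2DI while a check
keyed on vector memory support accepts it.

```c
#include <stdbool.h>

/* Toy model only: these names are made up and are not GCC's.  */
enum vec_mode { MODE_V4SI, MODE_V4SF, MODE_V2DF, MODE_V2DI };

/* True when the mode has native vector arithmetic.  In this model
   V2DI does not, mirroring the situation described in the report.  */
bool mode_has_native_arith(enum vec_mode m)
{
    return m != MODE_V2DI;
}

/* True when the mode can be moved with vector loads and stores.
   All four 128-bit modes qualify in this model.  */
bool mode_has_vector_mem(enum vec_mode m)
{
    (void) m;
    return true;
}

/* Before: a zero vector counts as an easy constant only if the mode
   has native arithmetic, so V2DI zero falls back to the constant pool.  */
bool easy_zero_before(enum vec_mode m)
{
    return mode_has_native_arith(m);
}

/* After: test memory support instead, so V2DI zero is accepted.  */
bool easy_zero_after(enum vec_mode m)
{
    return mode_has_vector_mem(m);
}
```

In this model easy_zero_before (MODE_V2DI) is false and easy_zero_after
(MODE_V2DI) is true, while the other modes pass both checks.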


[Bug target/47755] VSX code generates a TOC reference to clear memory

2011-02-15 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47755

--- Comment #1 from Michael Meissner 2011-02-15 15:41:49 UTC ---
Created attachment 23352
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23352
Patch to allow V2DI easy vector constants


[Bug target/44218] Improve powerpc -mrecip support

2011-02-15 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44218

Michael Meissner changed:

           What    |Removed     |Added
----------------------------------------
             Status|UNCONFIRMED |RESOLVED
         Resolution|            |FIXED

--- Comment #3 from Michael Meissner 2011-02-15 17:56:50 UTC ---
Fixed with checkin on June 3rd, 2010.


[Bug target/47636] Powerpc rs6000.md uses RS6000_RECIP_HAVE_RSQRT_P

2011-02-15 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47636

Michael Meissner changed:

           What    |Removed     |Added
----------------------------------------
             Status|ASSIGNED    |RESOLVED
         Resolution|            |FIXED

--- Comment #3 from Michael Meissner 2011-02-15 17:58:14 UTC ---
Fixed on February 7th, 2011.


[Bug target/47408] Several of the Altivec tests fail if run with a compiler built with --with-cpu=power7

2011-02-15 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47408

Michael Meissner changed:

           What    |Removed     |Added
----------------------------------------
             Status|ASSIGNED    |RESOLVED
         Resolution|            |FIXED

--- Comment #4 from Michael Meissner 2011-02-15 17:59:16 UTC ---
Fixed with checkin on January 24th, 2011.


[Bug target/47755] VSX code generates a TOC reference to clear memory

2011-02-15 Thread meissner at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47755

--- Comment #2 from Michael Meissner 2011-02-15 18:43:01 UTC ---
Author: meissner
Date: Tue Feb 15 18:42:59 2011
New Revision: 170189

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170189
Log:
Fix PR 47755

Added:
trunk/gcc/testsuite/gcc.target/powerpc/pr47755.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/predicates.md
trunk/gcc/testsuite/ChangeLog

