[Bug web/94581] New: Error in upcoming release notes

2020-04-13 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94581

Bug ID: 94581
   Summary: Error in upcoming release notes
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

The web release notes for the upcoming 10.0 release have an error in the
"Fortran" section. The statement "If that is the case, -finline-arg-packing can
be used to disable inline argument packing." is wrong, as this option ENABLES
packing.

[Bug web/96547] New: Two errors on https://gcc.gnu.org/gcc-11/changes.html

2020-08-09 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96547

Bug ID: 96547
   Summary: Two errors on https://gcc.gnu.org/gcc-11/changes.html
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

There are currently two misspellings on the "changes" page for GCC 11:

"limitted" -> "limited"
"reinterpreter_casts" -> "reinterpret_casts"

-erik

[Bug libgcc/57058] New: Bootstrap problems on AIX (libgcc configure, 64-bit)

2013-04-24 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57058

 Bug #: 57058
   Summary: Bootstrap problems on AIX (libgcc configure, 64-bit)
Classification: Unclassified
   Product: gcc
   Version: 4.7.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgcc
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com

I am trying to bootstrap gcc 4.7.3 on AIX. Both xlc and gcc 4.2.0 are
pre-installed. The main issue I encountered is that I need to use the option
"-maix64" when building with gcc, as I otherwise encounter assembler errors
about ".llong" (but that's fine since I want to use 64 bits anyway).

I encountered a problem when the stage 1 compiler tries to build libgcc. The
stage 1 compiler still needs to use -maix64. However, libgcc's configure script
seems to ignore all ways in which I could add this to CFLAGS (BOOT_CFLAGS,
CFLAGS_FOR_TARGET). The symptoms are errors about ".llong", since the stage 1
compiler doesn't use -maix64.

I believe this is due to an error in the following lines from libgcc's
configure.ac:

{{{
AC_CACHE_CHECK([whether to use setjmp/longjmp exceptions],
[libgcc_cv_lib_sjlj_exceptions],
[AC_LANG_CONFTEST(
  [AC_LANG_SOURCE([
void bar ();
void clean (int *);
void foo ()
{
  int i __attribute__ ((cleanup (clean)));
  bar();
}
])])
CFLAGS_hold=$CFLAGS
CFLAGS="--save-temps -fexceptions"
libgcc_cv_lib_sjlj_exceptions=unknown
AS_IF([ac_fn_c_try_compile],
  [if grep _Unwind_SjLj_Resume conftest.s >/dev/null 2>&1; then
    libgcc_cv_lib_sjlj_exceptions=yes
  elif grep _Unwind_Resume conftest.s >/dev/null 2>&1; then
    libgcc_cv_lib_sjlj_exceptions=no
  fi])
CFLAGS=$CFLAGS_hold
rm -f conftest*
])
}}}



Note that these lines unconditionally set CFLAGS before compiling a test
program. Instead, they should presumably be adding to CFLAGS. When changing the
offending line to

CFLAGS="$CFLAGS --save-temps -fexceptions"

the bootstrap went past this problem.


[Bug rtl-optimization/47010] New: Missed optimization: x86-64 prologue not deleted

2010-12-18 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47010

   Summary: Missed optimization: x86-64 prologue not deleted
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


Created attachment 22818
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22818
pre-processed bzipped source code

The following code is generated by g++ 4.5.1 on an x86-64 architecture (Mac OS
10.6). This is a static function where g++ may even have modified the argument
list. I believe the three instructions "pushq", "movq", and "leave" are not
necessary. This routine is called in a compute-intensive inner loop that has
problems fitting into the level 1 instruction cache.

The disassembled routine is:

__ZL20PDstandardNth11_implPKdll.clone.1:
0140    pushq   %rbp
0141    movupd  0x10(%rdi),%xmm3
0146    movupd  0xf0(%rdi),%xmm0
014b    movupd  0x08(%rdi),%xmm2
0150    addpd   %xmm3,%xmm0
0154    movupd  0xf8(%rdi),%xmm1
0159    movq    %rsp,%rbp
015c    addpd   %xmm2,%xmm1
0160    mulpd   0x000a0578(%rip),%xmm1
0168    addpd   %xmm0,%xmm1
016c    movupd  (%rdi),%xmm0
0170    mulpd   0x000a0578(%rip),%xmm0
0178    leave
0179    addpd   %xmm1,%xmm0
017d    ret

The original function is defined as:

static CCTK_REAL_VEC PDstandardNth11_impl(CCTK_REAL const* restrict const u,
ptrdiff_t const dj, ptrdiff_t const dk) __attribute__((pure))
__attribute__((noinline)) __attribute__((unused));

static CCTK_REAL_VEC PDstandardNth11_impl(CCTK_REAL const* restrict const u,
ptrdiff_t const dj, ptrdiff_t const dk)
{ return
kmadd(ToReal(30),vec_loadu_maybe3(0,0,0,(u)[(0)+dj*(0)+dk*(0)]),kmadd(ToReal(-16),kadd(vec_loadu_maybe3(-1,0,0,(u)[(-1)+dj*(0)+dk*(0)]),vec_loadu_maybe3(1,0,0,(u)[(1)+dj*(0)+dk*(0)])),kadd(vec_loadu_maybe3(-2,0,0,(u)[(-2)+dj*(0)+dk*(0)]),vec_loadu_maybe3(2,0,0,(u)[(2)+dj*(0)+dk*(0)]))));
}

where CCTK_REAL is double, and CCTK_REAL_VEC is __m128d, the SSE2 vector of
doubles. The function body contains macros that translate directly to Intel
SSE2 vector instructions.

The code was compiled with gcc 4.5.1 with the options

g++-mp-4.5 -g3 -m128bit-long-double -march=native -std=gnu++0x -O3
-funsafe-loop-optimizations -fsee -ftree-loop-linear -ftree-loop-im -fivopts
-fvect-cost-model -funroll-loops -funroll-all-loops
-fvariable-expansion-in-unroller -fprefetch-loop-arrays -ffast-math
-fassociative-math -freciprocal-math -fno-trapping-math -fexcess-precision=fast
-fopenmp -Wall -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align
-Woverloaded-virtual 

I attach the complete pre-processed and bzipped source code. The source code
itself is auto-generated.


[Bug inline-asm/47318] New: _mm256_maskstore_pd has wrong prototype

2011-01-16 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47318

   Summary: _mm256_maskstore_pd has wrong prototype
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: inline-asm
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


The AVX functions _mm256_maskstore_pd and _mm_maskstore_pd have the wrong
prototype in gcc 4.5.2 and gcc 4.6.0, as declared in avxintrin.h. According to
Intel's documentation, the mask argument should have type __m256i, whereas gcc
declares it as __m256d.


[Bug tree-optimization/47561] New: Error message does not say to which option it refers

2011-01-31 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47561

   Summary: Error message does not say to which option it refers
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


When trying gcc 4.6.0, I received the error message:

In file included from
/Users/eschnett/EinsteinToolkit-hg/arrangements/McLachlan/ML_BSSN_O8/src/ML_BSSN_O8_RHS1.cc:13:0:
/Users/eschnett/EinsteinToolkit-hg/arrangements/McLachlan/ML_BSSN_O8/src/Differencing.h:
In function '__m128d PDstandardNth23_impl(const double*, ptrdiff_t,
ptrdiff_t)':
/Users/eschnett/EinsteinToolkit-hg/arrangements/McLachlan/ML_BSSN_O8/src/Differencing.h:124:22:
sorry, unimplemented: Graphite loop optimizations cannot be used

Apparently I am asking gcc to apply some optimizations that are not enabled in
this particular version. I think gcc should tell me which particular
optimization option (or set of options) leads to this problem. Since the error
message speaks only about "graphite optimizations", I cannot tell which options
I have to avoid.

I used the following optimization options:

-O3 -funsafe-loop-optimizations -fsee -ftree-loop-linear -ftree-loop-im
-fivopts -fvect-cost-model -funroll-loops -funroll-all-loops
-fvariable-expansion-in-unroller -fprefetch-loop-arrays -ffast-math
-fassociative-math -freciprocal-math -fno-trapping-math -fexcess-precision=fast


[Bug c++/47808] New: internal compiler error: in tsubst_copy_and_build, at cp/pt.c:13326

2011-02-18 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47808

   Summary: internal compiler error: in tsubst_copy_and_build, at
cp/pt.c:13326
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


Created attachment 23398
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23398
failing source code

I receive an internal compiler error with gcc version (on Mac OS X, the current
gcc46 version from MacPorts)

g++-mp-4.6 (GCC) 4.6.0 20110212 (experimental)

I execute the command

g++-mp-4.6 -fopenmp -Wall -g3 -m128bit-long-double -march=native -std=gnu++0x
-fbounds-check -fstack-protector-all -ftrapv -O0 -fopenmp -Wall -Wshadow
-Wpointer-arith -Wcast-qual -Wcast-align -Woverloaded-virtual -c iobasic.ii

and receive the output

/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetIOBasic/src/iobasic.cc:
In function 'bool CarpetIOBasic::UseScientificNotation(const T&) [with T =
int]':
/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetLib/src/typecase.hh:149:118:
  instantiated from here
/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetIOBasic/src/iobasic.cc:703:22:
internal compiler error: in tsubst_copy_and_build, at cp/pt.c:13326
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

I attach the preprocessed source code.


[Bug rtl-optimization/50440] New: 128 bit unsigned int subtraction generates too many register moves

2011-09-16 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50440

 Bug #: 50440
   Summary: 128 bit unsigned int subtraction generates too many
register moves
Classification: Unclassified
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


I want to perform 128 bit integer arithmetic, and I am declaring my type like
this:

{{{
typedef unsigned int uint128_t __attribute__((mode(TI)));
uint128_t add (uint128_t x, uint128_t y) { return x+y; }
uint128_t sub (uint128_t x, uint128_t y) { return x-y; }
}}}

This is on an Intel Xeon processor in x86_64 mode. I build with the command

gcc-4.6.1 -O3 -march=native -S sub128.c

and I find that, while the "add" routine looks optimal, the "sub" routine has
several unnecessary register moves:

{{{
add:
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        ret
sub:
        movq    %rsi, %r10
        movq    %rdi, %rsi
        subq    %rdx, %rsi
        movq    %r10, %rdi
        sbbq    %rcx, %rdi
        movq    %rsi, %rax
        movq    %rdi, %rdx
        ret
}}}
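
For reference, the arithmetic behind both routines can be written out by hand: the low 64-bit halves are combined first, and the carry (for add) or borrow (for sub) out of the low half is folded into the high half. That is what the addq/adcq and subq/sbbq pairs do, so the subtraction should in principle need no more register traffic than the addition. The following C++ sketch only illustrates this; the `U128` struct and the `add128`/`sub128` helpers are hypothetical names and not part of the test case above.

```cpp
#include <cstdint>

// Hypothetical two-word representation of a 128-bit unsigned integer.
struct U128 {
  std::uint64_t lo;  // low 64 bits
  std::uint64_t hi;  // high 64 bits
};

// Add the low halves; a wrapped sum (sum < x.lo) means a carry into the
// high half. This mirrors the addq/adcq instruction pair.
constexpr U128 add128(U128 x, U128 y) {
  return U128{x.lo + y.lo, x.hi + y.hi + (x.lo + y.lo < x.lo ? 1u : 0u)};
}

// Subtract the low halves; x.lo < y.lo means a borrow from the high half.
// This mirrors the subq/sbbq instruction pair.
constexpr U128 sub128(U128 x, U128 y) {
  return U128{x.lo - y.lo, x.hi - y.hi - (x.lo < y.lo ? 1u : 0u)};
}
```

For example, subtracting 1 from 2^64 (`U128{0, 1}`) borrows from the high word and yields `lo == 0xFFFFFFFFFFFFFFFF`, `hi == 0`.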


[Bug other/53918] New: Incorrect version for cloog-ppl listed in prerequisites.html

2012-07-10 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53918

 Bug #: 53918
   Summary: Incorrect version for cloog-ppl listed in
prerequisites.html
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


The file INSTALL/prerequisites.html lists the file "cloog-ppl-0.15.tar.gz" as
a prerequisite. This is incorrect:
(1) a file with this exact name does not exist at the location

(2) e.g. the file "cloog-ppl-0.15.9.tar.gz" does not work, as it requires
ppl-0.10, but the prerequisites explicitly require ppl-0.11

I believe that pointing to the file "cloog-ppl-0.15.11.tar.gz" instead would be
correct.


[Bug web/53919] New: Version-specific install instructions not available

2012-07-10 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53919

 Bug #: 53919
   Summary: Version-specific install instructions not available
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


The gcc web site does not seem to list version-specific install instructions.
This is inconvenient, since e.g. prerequisites or other details may differ
significantly between different versions.

The online install page shows install instructions, but these seem to
pertain to the (unreleased) trunk. Unfortunately, this fact is not even
mentioned.

The online documentation page only has the manual, not the install
instructions.

The online install instructions should prominently mention that they are not
valid for any released version, and that one should download the release and
read those install instructions instead. Alternatively, the version-specific
install instructions should be added next to the online manual.


[Bug web/53919] Version-specific install instructions not available

2012-07-10 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53919

--- Comment #2 from Erik Schnetter  2012-07-10 
18:26:53 UTC ---
I am currently installing gcc 4.7.1, and am dealing with the prerequisites.
gmp 5.x was not recognised as valid version; I had to install gmp 4.x instead.
Finding a valid combination of ppl/cloog/isl versions was another issue: after
downloading and installing isl, I found that gcc 4.7.1 doesn't even offer the
respective configuration option.

These may just be "details" in the prerequisites, but it is just these details
that I am looking for in the install instructions.

I appreciate that the instructions generally don't change that much between
versions, but the same can probably be said about the manual -- new options,
new optimisations, and new architectures tend to be few among a large bulk of
things that remain the same. Yet, the web site makes a clear distinction
between manuals for different versions, and doesn't even seem to offer an
online manual for the trunk.


[Bug web/53919] Version-specific install instructions not available

2012-07-10 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53919

--- Comment #5 from Erik Schnetter  2012-07-10 
19:05:01 UTC ---
Yes, the isl changes are part of what I mean. Since the version-specific
install instructions are already there, they may as well be available on the
web, and/or the web could warn about such possible differences.


[Bug web/53919] Version-specific install instructions not available

2012-07-12 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53919

Erik Schnetter  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|INVALID |

--- Comment #8 from Erik Schnetter  2012-07-12 
14:43:59 UTC ---
I disagree with the assessment that this report is invalid. This is not a
complaint that I couldn't find the right instructions, this is a suggestion for
improving the web site. Since I mentioned several possibilities, I would
appreciate some guidance as to which way you want things to look, and may then
even come up with a patch.

The instructions on the web speak of "Installing GCC", and do not even mention
that they are not version-specific nor for the current release branch. They
also do not point to the install instructions in the tarball. This is
misleading at best.

In particular, the web instructions speak of "old instructions", but refer only
to "really old instructions", not to instructions for previous releases.

Another issue is that the web instructions refer to released versions when
downloading (suggesting that one should install released versions), and at the
same time list prerequisites that are not applicable to any released version.


[Bug target/83531] Build broken on macOS 10.13.2

2019-03-28 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83531

Erik Schnetter  changed:

   What|Removed |Added

 CC||schnetter at gmail dot com

--- Comment #7 from Erik Schnetter  ---
I don't think that people didn't notice. I rather think that they gave up
building the sanitizer. See also
https://github.com/spack/spack/tree/develop/var/spack/repos/builtin/packages/gcc
and
https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/gcc/darwin/headers-10.13-fix.patch
, which includes this fix automatically when GCC is built via Spack.

[Bug bootstrap/89879] New: GCC fails to build on macOS 10.14.4

2019-03-28 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89879

Bug ID: 89879
   Summary: GCC fails to build on macOS 10.14.4
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

Created attachment 46053
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46053&action=edit
patch

After upgrading to macOS 10.14.4, GCC 8.3.0 does not build any more. The issue
is unrelated to Spack; even a vanilla GCC fails to install.

This StackExchange issue
<https://apple.stackexchange.com/questions/355049/compilation-error-with-mojave-error-atomic-does-not-name-a-type/355103#355103>
is a description of the problem including the actual error message. The
underlying problem is that a macOS header file uses the _Atomic keyword for C++
code, although this is only a C keyword. I assume that Clang defines _Atomic
even for C++ code as an extension to the C++ standard.

The proper solution is probably adding a fixinclude for GCC.

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-03-28 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #7 from Erik Schnetter  ---
I tried adding a fixinclude that #defines _Atomic to volatile if the system
header is included from C++, and this resolved the issue for me.

A possible implementation is described here
. I plan to submit a proper patch
to GCC next week.
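
A minimal sketch of this workaround, in C++, could look as follows. The struct `demo_ucred` and its member `cr_ref` are hypothetical stand-ins for the affected system declaration; the real header is not reproduced here. The macro is defined only around the declaration and removed again afterwards, so that it does not leak into other code. Note that this only makes the declaration parse in C++; it does not provide atomic semantics.

```cpp
#include <type_traits>

// Before the (stand-in) system declaration: when compiling as C++11 or
// later, map the C11 keyword to 'volatile'.
#if defined(__cplusplus) && __cplusplus >= 201103L
#  define _Atomic volatile
#endif

// Hypothetical stand-in for a system struct declared with the C11 keyword.
struct demo_ucred {
  _Atomic unsigned long cr_ref;
};

// After the declaration: remove the macro again.
#if defined(__cplusplus) && __cplusplus >= 201103L
#  undef _Atomic
#endif
```

With the macro in place, `cr_ref` is declared as a `volatile unsigned long`, which can be confirmed with `std::is_volatile`.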

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-03-29 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #16 from Erik Schnetter  ---
The proper way to fix this via fixinclude is to replace declarations such as

_Atomic u_long

with

_Atomic(u_long)

which is still legal in C. In C++, one can then add

#include <atomic>
#ifndef _Atomic
#define _Atomic(T) std::atomic< T >
#endif

to create proper C++ code.
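
Put together, the idea can be sketched as the following self-contained C++ fragment. The struct `demo_counter` is a hypothetical stand-in for a system declaration that has already been rewritten to the `_Atomic(type)` spelling; it is not taken from any real header.

```cpp
#include <atomic>
#include <type_traits>

// In C++, provide a function-like macro so that _Atomic(T) names the
// corresponding std::atomic specialization.
#ifndef _Atomic
#define _Atomic(T) std::atomic< T >
#endif

// Hypothetical stand-in for a fixed-up declaration. In C11 the same line
// would use the built-in _Atomic(type) specifier instead of the macro.
struct demo_counter {
  _Atomic(unsigned long) value;
};
```

The member then has type `std::atomic<unsigned long>` when the fragment is compiled as C++. Whether that type has the same layout as the C11 `_Atomic(unsigned long)` is an ABI matter that this sketch does not settle.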

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-03 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #20 from Erik Schnetter  ---
I have a patch that works for 8.3.0. It doesn't work for 9.0.0 (i.e. an svn
checkout); I'm working on this.

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-03 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #21 from Erik Schnetter  ---
https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00162.html

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-04 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #24 from Erik Schnetter  ---
On Thu, Apr 4, 2019 at 5:43 AM iains at gcc dot gnu.org <
gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864
>
> --- Comment #22 from Iain Sandoe  ---
> (In reply to Erik Schnetter from comment #21)
> > https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00162.html
>
> Additional to the comments on list.
>
> Perhaps this is just unfixable :(
>
> I suspect that Apple will want to re-release the SDK, and the best real fix
> is to use the SDK from the previous Xcode command line tools (you can still
> use the latest tools from XC10.2 - just install the older version somewhere
> and then use --with-sysroot= and/or --sysroot=)
>
> (a) there's no guarantee that _Atomic u_long has the same size or alignment
> as volatile u_long.
>
> C11: 6.2.5 Types
> ...
> 27 ... The size, representation, and alignment of an atomic type need not be
> the same as those of the corresponding unqualified type.
> ...
>
> .. although it *probably* is for simple types for which there are direct
> atomic ops.
>

This is for Apple systems, where they presumably control the ABI, or are at
least aware of the ABI when writing header files.

> (b) If we hack around it with "volatile" (assuming that the type happens to
> have the same size and alignment), this will silently fail in any case it's
> used.
>

_Atomic is used only in a single struct, which is marked "this structure
should not be used outside the kernel", and protected by a "#ifdef
__APPLE_API_UNSTABLE" (which unfortunately defaults to being defined). To
my knowledge, no part of GCC will use this structure.

> (c) the  header is only available from C++11, AFAIR, and GCC is supposed to
> be boot-strappable with C++98.  Iff Apple were to elect to declare that the
> OS *requires* C++11 to operate, then we should fix the configuration for
> Darwin to ensure that this is enforced.
>

The header file works for all versions of C and C++. It uses _Atomic for
C11, _Atomic for C++11 (the bug we're seeing), and volatile in all other
cases. There is no requirement for C++11.

> (d) In any case, is there any guarantee that the representation of the
> u_long as a C++ atomic is the same size and align as its C11 counterpart?
> (I've not checked this).
>

This seems to be an ABI question, and I assume Apple checked this on their
ABIs (probably only Intel and ARM). I assume that this is generally the
case as C11 and C++11 atomics were designed at the same time, so the ABI
designers will want to ensure this.

-erik
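
The size and alignment question on the C++ side can be checked at compile time rather than assumed. The sketch below uses a hypothetical helper, `atomic_layout_matches`, which is not part of GCC or the SDK; it compares `std::atomic<T>` with plain `T`, which has the same size and alignment as `volatile T`. It says nothing about the C11 `_Atomic` side, which would need an equivalent check in C.

```cpp
#include <atomic>

// Compile-time check that wrapping T in std::atomic changes neither the
// size nor the alignment on the ABI being compiled for.
template <typename T>
struct atomic_layout_matches {
  static constexpr bool value =
      sizeof(std::atomic<T>) == sizeof(T) &&
      alignof(std::atomic<T>) == alignof(T);
};
```

A line such as `static_assert(atomic_layout_matches<unsigned long>::value, "layout differs");` would then turn a mismatch into a build error instead of a silent ABI difference.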

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-04 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #28 from Erik Schnetter  ---
On Thu, Apr 4, 2019 at 8:11 AM iains at gcc dot gnu.org <
gcc-bugzi...@gcc.gnu.org> wrote:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864
>
> --- Comment #26 from Iain Sandoe  ---
> (In reply to Erik Schnetter from comment #25)
> > > On Thu, Apr 4, 2019 at 5:43 AM iains at gcc dot gnu.org <
> > > gcc-bugzi...@gcc.gnu.org> wrote:
>
> > > _Atomic is used only in a single struct, which is marked "this structure
> > > should not be used outside the kernel", and protected by a "#ifdef
> > > __APPLE_API_UNSTABLE" (which unfortunately defaults to being defined). To
> > > my knowledge, no part of GCC will use this structure.
>
> Perhaps this provides an easier fix route:
> a) what causes __APPLE_API_UNSTABLE to be defined?
> b) what uses __APPLE_API_UNSTABLE?
> c) could we undef it locally to solve the issue?

Unfortunately this route does not work:

:

#ifndef __APPLE_API_UNSTABLE
#define __APPLE_API_UNSTABLE
#endif /* __APPLE_API_UNSTABLE */

-erik
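
The reason a local #undef does not help can be reproduced with a few preprocessor lines. In the C++ sketch below, `DEMO_API_UNSTABLE` and `guarded_struct_visible` are hypothetical stand-ins for the real macro and for the guarded declaration; the middle block imitates the header shown above.

```cpp
// Attempted workaround: make sure the macro is not defined before the
// header is read.
#undef DEMO_API_UNSTABLE

// Stand-in for the system header, which defines the macro whenever it is
// missing and thereby cancels the #undef above.
#ifndef DEMO_API_UNSTABLE
#define DEMO_API_UNSTABLE
#endif /* DEMO_API_UNSTABLE */

// Stand-in for the guarded declaration further down the include chain.
#ifdef DEMO_API_UNSTABLE
constexpr bool guarded_struct_visible = true;
#else
constexpr bool guarded_struct_visible = false;
#endif
```

Here `guarded_struct_visible` ends up `true` despite the #undef, because the header restores the definition before the guard is evaluated.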

> - GCC is not currently claiming to be capable of building the kernel (I've
> not tried building any kernel > darwin9 with GCC).

--
Erik Schnetter 
http://www.perimeterinstitute.ca/personal/eschnetter/

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-04 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #31 from Erik Schnetter  ---
Here is an updated version of the patch that fixincludes both
 and , and does not need to touch any source
files in GCC any more:

Index: fixincludes/inclhack.def
===
--- fixincludes/inclhack.def (revision 270127)
+++ fixincludes/inclhack.def (working copy)
@@ -1298,6 +1298,69 @@ fix = {
 };

 /*
+ *  macOS 10.14.4  uses the C _Atomic keyword in C++
+ *  code, and this file is included by .
+ */
+fix = {
+hackname  = darwin_sysctl3__Atomic;
+mach  = "*-*-darwin18.5.*";
+files = sys/sysctl.h;
+select= "#include ";
+
+c_fix = wrap;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+#  define _Atomic volatile
+ #endif
+
+ _EOArg_;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+ #  undef _Atomic
+ #endif
+
+ _EOArg_;
+
+test_text = "#include \n";
+};
+
+
+/*
+ *  macOS 10.14.4  uses the C _Atomic keyword in C++
+ *  code.
+ */
+fix = {
+hackname  = darwin_ucred__Atomic;
+mach  = "*-*-darwin18.5.*";
+files = sys/ucred.h;
+select= "_Atomic";
+
+c_fix = wrap;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+#  define _Atomic volatile
+ #endif
+
+ _EOArg_;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+ #  undef _Atomic
+ #endif
+
+ _EOArg_;
+
+test_text = "#include \n";
+};
+
+/*
  *  For the AAB_darwin7_9_long_double_funcs fix to be useful,
  *  you have to not use "" includes.
  */


[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-04 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #33 from Erik Schnetter  ---
Here is an updated version of the patch that fixincludes both
 and , and does not need to touch any sources
files in GCC any more:


Index: fixincludes/inclhack.def
===
--- fixincludes/inclhack.def (revision 270127)
+++ fixincludes/inclhack.def (working copy)
@@ -1298,6 +1298,69 @@ fix = {
 };

 /*
+ *  macOS 10.14.4  uses the C _Atomic keyword in C++
+ *  code, and this file is included by .
+ */
+fix = {
+hackname  = darwin_sysctl3__Atomic;
+mach  = "*-*-darwin18.5.*";
+files = sys/sysctl.h;
+select= "#include ";
+
+c_fix = wrap;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+#  define _Atomic volatile
+ #endif
+
+ _EOArg_;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+ #  undef _Atomic
+ #endif
+
+ _EOArg_;
+
+test_text = "#include \n";
+};
+
+
+/*
+ *  macOS 10.14.4  uses the C _Atomic keyword in C++
+ *  code.
+ */
+fix = {
+hackname  = darwin_ucred__Atomic;
+mach  = "*-*-darwin18.5.*";
+files = sys/ucred.h;
+select= "_Atomic";
+
+c_fix = wrap;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+#  define _Atomic volatile
+ #endif
+
+ _EOArg_;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+ #  undef _Atomic
+ #endif
+
+ _EOArg_;
+
+test_text = "#include \n";
+};
+
+/*
  *  For the AAB_darwin7_9_long_double_funcs fix to be useful,
  *  you have to not use "" includes.
  */


-erik

--
Erik Schnetter 
http://www.perimeterinstitute.ca/personal/eschnetter/

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-04 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #34 from Erik Schnetter  ---
Here is an updated version of the patch that fixincludes both
 and , and does not need to touch any
sources files in GCC any more:


Index: fixincludes/inclhack.def
===
--- fixincludes/inclhack.def (revision 270127)
+++ fixincludes/inclhack.def (working copy)
@@ -1298,6 +1298,69 @@ fix = {
 };

 /*
+ *  macOS 10.14.4  uses the C _Atomic keyword in C++
+ *  code, and this file is included by .
+ */
+fix = {
+hackname  = darwin_sysctl3__Atomic;
+mach  = "*-*-darwin18.5.*";
+files = sys/sysctl.h;
+select= "#include ";
+
+c_fix = wrap;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+#  define _Atomic volatile
+ #endif
+
+ _EOArg_;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+ #  undef _Atomic
+ #endif
+
+ _EOArg_;
+
+test_text = "#include \n";
+};
+
+
+/*
+ *  macOS 10.14.4 <sys/ucred.h> uses the C _Atomic keyword in C++
+ *  code.
+ */
+fix = {
+hackname  = darwin_ucred__Atomic;
+mach  = "*-*-darwin18.5.*";
+files = sys/ucred.h;
+select= "_Atomic";
+
+c_fix = wrap;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+#  define _Atomic volatile
+ #endif
+
+ _EOArg_;
+
+c_fix_arg = <<- _EOArg_
+
+ #if defined(__cplusplus) && __cplusplus >= 201103L
+ #  undef _Atomic
+ #endif
+
+ _EOArg_;
+
+test_text = "#include \n";
+};
+
+/*
  *  For the AAB_darwin7_9_long_double_funcs fix to be useful,
  *  you have to not use "" includes.
  */
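
As a rough illustration of what the two c_fix_arg blocks in the patch are meant to achieve, here is a minimal, self-contained C++ sketch. It assumes that the "wrap" fix places the first block before the header text and the second block after it. The names demo_ucred and cr_ref_is_volatile are hypothetical stand-ins for illustration only; they are not taken from the macOS SDK or from GCC.

```cpp
#include <type_traits>

// Prologue: mirrors the first c_fix_arg of the patch.
#if defined(__cplusplus) && __cplusplus >= 201103L
#  define _Atomic volatile
#endif

// Hypothetical stand-in for the wrapped header text; the real
// header declares far more than this single field.
struct demo_ucred {
    _Atomic unsigned long cr_ref;  // expands to: volatile unsigned long cr_ref;
};

// Epilogue: mirrors the second c_fix_arg, so that code after the
// header never sees the temporary macro.
#if defined(__cplusplus) && __cplusplus >= 201103L
#  undef _Atomic
#endif

// Reports whether the qualifier on the field was rewritten to volatile.
bool cr_ref_is_volatile() {
    return std::is_volatile<decltype(demo_ucred::cr_ref)>::value;
}
```

With the macro in place the declaration parses as ordinary C++, and the #undef keeps the redefinition from leaking into code that includes the header. Note that the field is then merely volatile, not atomic, which is sufficient for parsing but changes nothing about how the kernel treats it.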


--
Erik Schnetter 
http://www.perimeterinstitute.ca/personal/eschnetter/

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-04 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #37 from Erik Schnetter  ---
(In reply to Jürgen Reuter from comment #35)
> The latest fix doesn't work. It fails at the darwin-driver.c. So yes, all
> the files mentioned before have to be modified, asan_mac.cc,
> sanitizer_mac.cc,
> sanitizer_platform_limits_posix.cc, darwin-driver.c

Did you regenerate the autogenerated fixinclude files? The patch only contains
the manual changes (according to GCC's patch submission guidelines). You need
to run "./genincludes" (?) in the "fixinclude" directory.

-erik

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-04 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #38 from Erik Schnetter  ---
(In reply to Iain Sandoe from comment #36)
> (In reply to Jürgen Reuter from comment #35)
> > The latest fix doesn't work. It fails at the darwin-driver.c. So yes, all
> > the files mentioned before have to be modified, asan_mac.cc,
> > sanitizer_mac.cc,
> > sanitizer_platform_limits_posix.cc, darwin-driver.c
> 
> that's not how fix includes work;
> the idea is to make a change such that the includes can be used "normally"
> (which isn't 'modify every use site' ;) )
> 
> grep says that ucred.h is included from a *lot* of places in the SDK, so
> "just fixing bootstrap" isn't enough, we need a fix that works when building
> user code too.

I originally assumed that fixincluded files would come first in the include
search path, so that correcting the offending file prevents all problems. That
doesn't seem to be the case -- only files included from user code seem to see
the fixincluded files, whereas system files don't see them. Is this correct? If
so, that certainly makes it difficult to correct a problem, since it would
require that all header files that (transitively) include the offending header
file also be fixincluded. I'm sure I misunderstand something here.

> I note that there is __APPLE_API_STRICT_CONFORMANCE which gates the
> __APPLE_ABI_UNSTABLE .. perhaps this should be investigated?

I missed this. I'll have a look.

-erik

[Bug bootstrap/89864] [9 regression] gcc fails to build/bootstrap with XCode 10.2

2019-04-09 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89864

--- Comment #46 from Erik Schnetter  ---
The patch does not include the generated files. You need to run "genfixes" in
the "fixincludes" directory after applying the patch.

[Bug rtl-optimization/70471] New: Superfluous move instructions in floating-point instruction sequence

2016-03-30 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70471

Bug ID: 70471
   Summary: Superfluous move instructions in floating-point
instruction sequence
   Product: gcc
   Version: 5.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

In a function consisting of a long chain of floating-point operations, GCC
5.3.0 on Darwin 15.4.0 targeting an "Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz"
with options

-march=native
-Ofast
-fopenmp

generates this sequence (line 6302 ff):

vmovapd %ymm4, %ymm8
vmovapd %ymm4, -10192(%rbp)
vsubpd  %ymm1, %ymm8, %ymm0
vmovapd -4496(%rbp), %ymm8

Am I right with my interpretation that the first vmovapd is strictly
superfluous? The register %ymm8 is set on the first line, is overwritten on the
last line, and is used only once in the third line, where the original value
%ymm4 is still available.
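
The reasoning above can be checked mechanically on the four-instruction window. The following is a minimal sketch under strong assumptions: Insn is a hypothetical, much simplified instruction model (it is not GCC's RTL), address registers such as %rbp are ignored, and only straight-line code is considered. The caller is assumed to pass the index of a register-to-register copy.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified instruction model for illustration only.
struct Insn {
    std::string op;
    std::vector<std::string> reads;  // vector registers read
    std::string writes;              // vector register written; empty for a store
};

// The four instructions from the report, operands in AT&T order.
std::vector<Insn> reported_sequence() {
    return {
        {"vmovapd", {"%ymm4"}, "%ymm8"},          // copy %ymm4 -> %ymm8
        {"vmovapd", {"%ymm4"}, ""},               // store to -10192(%rbp)
        {"vsubpd", {"%ymm1", "%ymm8"}, "%ymm0"},  // the only read of %ymm8
        {"vmovapd", {}, "%ymm8"},                 // reload overwrites %ymm8
    };
}

// Precondition: code[i] copies reads[0] into writes.
// Returns true if, within this window, every later read of the destination
// happens while the source still holds the copied value, and the destination
// is overwritten before the window ends. In that case each read of the
// destination could use the source instead and the copy could be dropped.
bool copy_is_superfluous(const std::vector<Insn>& code, std::size_t i) {
    const Insn& mv = code[i];
    if (mv.reads.size() != 1 || mv.writes.empty()) return false;
    const std::string& src = mv.reads[0];
    const std::string& dst = mv.writes;
    bool src_intact = true;
    for (std::size_t j = i + 1; j < code.size(); ++j) {
        const Insn& in = code[j];
        // An instruction reads its inputs before it writes its output.
        const bool reads_dst =
            std::find(in.reads.begin(), in.reads.end(), dst) != in.reads.end();
        if (reads_dst && !src_intact) return false;  // read cannot be redirected
        if (in.writes == dst) return true;           // copied value is dead here
        if (in.writes == src) src_intact = false;    // source no longer matches
    }
    return false;  // destination may still be live after the window
}
```

Calling copy_is_superfluous(reported_sequence(), 0) examines the first vmovapd. This says nothing about why the register allocator emitted the copy; it only restates the liveness argument for this window.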

Additional comments:

This is not the only place where inspection by eye indicates superfluous move
instructions; these seem to occur in many places. As you see below, both the
input and the generated code are quite long. Most relevant for performance is
likely the cache footprint. Thus, superfluous instructions are worrisome.



Details:

$ uname -a
Darwin Redshift 15.4.0 Darwin Kernel Version 15.4.0: Fri Feb 26 22:08:05 PST
2016; root:xnu-3248.40.184~3/RELEASE_X86_64 x86_64

$
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++
-v
Reading specs from
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64/gcc/x86_64-apple-darwin15.4.0/5.3.0/specs
COLLECT_GCC=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++
COLLECT_LTO_WRAPPER=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/libexec/gcc/x86_64-apple-darwin15.4.0/5.3.0/lto-wrapper
Target: x86_64-apple-darwin15.4.0
Configured with:
/Users/eschnett/src/spack/var/spack/stage/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/gcc-5.3.0/configure
--prefix=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42
--libdir=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64
--disable-multilib --enable-languages=fortran,c,java,objc,c++
--with-mpc=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/mpc-1.0.3-kg7pswhyszxa6vbgohqjhy2pywb76gpc
--with-mpfr=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/mpfr-3.1.4-taib2hirt72ggnirqb2brytc4cvp2igf
--with-gmp=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gmp-6.1.0-ld7rtqn2neg3z47mzg2vnexqeet4pz3i
--enable-lto --with-quad
--with-isl=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/isl-0.14-cn4dbzoocjsf2a5jwamnhnverh2hwccr
Thread model: posix
gcc version 5.3.0 (GCC) 

Compiled with:
$
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++
-fopenmp -march=native -std=gnu++11 -Ofast -S
ML_BSSN_FD4_EvolutionInterior.cc.i

Pre-processed input "ML_BSSN_FD4_EvolutionInterior.cc.i" (3.5 MByte):
https://gist.github.com/eschnett/10bf0b2b1977348f3e15ae29db871bb0

Compiler output "ML_BSSN_FD4_EvolutionInterior.cc.s" (470 kByte):
https://gist.github.com/eschnett/79d31fe08e9588d28763a9ad5c77ccfa

[Bug rtl-optimization/70471] Superfluous move instructions in floating-point instruction sequence

2016-03-31 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70471

--- Comment #2 from Erik Schnetter  ---
Compiler invocation with "-v" appended:

$
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++
-fopenmp -march=native -std=gnu++11 -Ofast -S
ML_BSSN_FD4_EvolutionInterior.cc.i -v
Reading specs from
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64/gcc/x86_64-apple-darwin15.4.0/5.3.0/specs
COLLECT_GCC=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/bin/g++
Target: x86_64-apple-darwin15.4.0
Configured with:
/Users/eschnett/src/spack/var/spack/stage/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/gcc-5.3.0/configure
--prefix=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42
--libdir=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64
--disable-multilib --enable-languages=fortran,c,java,objc,c++
--with-mpc=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/mpc-1.0.3-kg7pswhyszxa6vbgohqjhy2pywb76gpc
--with-mpfr=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/mpfr-3.1.4-taib2hirt72ggnirqb2brytc4cvp2igf
--with-gmp=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gmp-6.1.0-ld7rtqn2neg3z47mzg2vnexqeet4pz3i
--enable-lto --with-quad
--with-isl=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/isl-0.14-cn4dbzoocjsf2a5jwamnhnverh2hwccr
Thread model: posix
gcc version 5.3.0 (GCC) 
COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.11.4' '-fopenmp' '-march=native'
'-std=gnu++11' '-Ofast' '-S' '-v' '-shared-libgcc'

/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/libexec/gcc/x86_64-apple-darwin15.4.0/5.3.0/cc1plus
-fpreprocessed ML_BSSN_FD4_EvolutionInterior.cc.i -march=haswell -mmmx
-mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes
-mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -mbmi2
-mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd
-mf16c -mfsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mxsave -mxsaveopt
-mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1
-mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw
-mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-clwb -mno-pcommit
-mno-mwaitx --param l1-cache-size=32 --param l1-cache-line-size=64 --param
l2-cache-size=6144 -mtune=haswell -fPIC -quiet -dumpbase
ML_BSSN_FD4_EvolutionInterior.cc.i -mmacosx-version-min=10.11.4 -auxbase
ML_BSSN_FD4_EvolutionInterior.cc -Ofast -std=gnu++11 -version -fopenmp -o
ML_BSSN_FD4_EvolutionInterior.cc.s
GNU C++11 (GCC) version 5.3.0 (x86_64-apple-darwin15.4.0)
compiled by GNU C version 5.3.0, GMP version 6.1.0, MPFR version 3.1.4,
MPC version 1.0.3
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++11 (GCC) version 5.3.0 (x86_64-apple-darwin15.4.0)
compiled by GNU C version 5.3.0, GMP version 6.1.0, MPFR version 3.1.4,
MPC version 1.0.3
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: e534f97573a0a1c1d792b0a47110beef
COMPILER_PATH=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/libexec/gcc/x86_64-apple-darwin15.4.0/5.3.0/:/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/libexec/gcc/x86_64-apple-darwin15.4.0/5.3.0/:/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/libexec/gcc/x86_64-apple-darwin15.4.0/:/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64/gcc/x86_64-apple-darwin15.4.0/5.3.0/:/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64/gcc/x86_64-apple-darwin15.4.0/
LIBRARY_PATH=/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64/gcc/x86_64-apple-darwin15.4.0/5.3.0/:/Users/eschnett/src/spack/opt/spack/darwin-x86_64/gcc-4.2.1/gcc-5.3.0-j3ujojmhirf6t2mi5enfosb6545duy42/lib64/gcc/x86_64-apple-darwin15.4.0/5.3.0/../../../:/usr/lib/
COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.11.4' '-fopenmp' '-march=native'
'-std=gnu++11' '-Ofast' '-S' '-v' '-shared-libgcc'

[Bug c++/71057] New: ICE in schedule_generic_params_dies_gen, at dwarf2out.c:24142

2016-05-10 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71057

Bug ID: 71057
   Summary: ICE in schedule_generic_params_dies_gen, at
dwarf2out.c:24142
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

A self-built version of GCC
{{{
$
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/clang-7.3.0-apple/gcc-6.1.0-t6ty2xtlhizrs3elacvgpludfccnekb2/bin/g++
--version
g++ (GCC) 6.1.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
}}}

aborts with an ICE for this command
{{{
/Users/eschnett/src/spack/opt/spack/darwin-x86_64/clang-7.3.0-apple/gcc-6.1.0-t6ty2xtlhizrs3elacvgpludfccnekb2/bin/g++
-g -S fun_test.ii
}}}

with this error
{{{
fun/fun_test.cc:41:1: internal compiler error: in
schedule_generic_params_dies_gen, at dwarf2out.c:24142
 }
 ^

fun/fun_test.cc:41:1: internal compiler error: Abort trap: 6
g++: internal compiler error: Abort trap: 6 (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
}}}

for the preprocessed C++ source code available at
<https://gist.github.com/eschnett/1825784e6da47b80741ddbbd0d742db4>.

Leaving out the "-g" flag avoids the error.

This is on OS X with an x86-64 CPU.

[Bug target/80360] New: internal compiler error: in int_mode_for_mode, at stor-layout.c:405

2017-04-07 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80360

Bug ID: 80360
   Summary: internal compiler error: in int_mode_for_mode, at
stor-layout.c:405
   Product: gcc
   Version: 6.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

Created attachment 41155
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41155&action=edit
Gzipped preprocessed failing source code

I encounter an ICE with SIMD vector intrinsics for Intel's Knights Landing.

{{{
$ /project/projectdirs/m152/schnette/cori/src/spack-view/bin/g++ --version
g++ (GCC) 6.3.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
}}}

{{{
$ /project/projectdirs/m152/schnette/cori/src/spack-view/bin/g++ -march=knl -O
-S test.ii
In file included from
/global/project/projectdirs/m152/schnette/cori/src/spack-view/lib64/gcc/x86_64-pc-linux-gnu/6.3.0/include/immintrin.h:45:0,
 from
/global/project/projectdirs/m152/schnette/cori-knl/Cvanilla/arrangements/CactusUtils/Vectors/src/vectors-4-AVX.h:9,
 from
/global/project/projectdirs/m152/schnette/cori-knl/Cvanilla/arrangements/CactusUtils/Vectors/src/vectors.h:16,
 from
/global/project/projectdirs/m152/schnette/cori-knl/Cvanilla/arrangements/CactusUtils/Vectors/src/test.cc:1:
/global/project/projectdirs/m152/schnette/cori/src/spack-view/lib64/gcc/x86_64-pc-linux-gnu/6.3.0/include/avx512fintrin.h:
In function 'void Vectors_Test(cGH*)':
/global/project/projectdirs/m152/schnette/cori/src/spack-view/lib64/gcc/x86_64-pc-linux-gnu/6.3.0/include/avx512fintrin.h:10018:48:
internal compiler error: in int_mode_for_mode, at stor-layout.c:405
   return (__mmask16) __builtin_ia32_kortestchi ((__mmask16) __A,
  ~~^
   (__mmask16) __B);
   
0xac3ff7 int_mode_for_mode(machine_mode)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/stor-layout.c:405
0x8638ce emit_move_via_integer
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/expr.c:3137
0x86b93a emit_move_insn_1(rtx_def*, rtx_def*)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/expr.c:3518
0x86bc94 emit_move_insn(rtx_def*, rtx_def*)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/expr.c:3586
0x854202 copy_to_reg(rtx_def*)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/explow.c:582
0xd520bd ix86_expand_builtin
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/config/i386/i386.c:41506
0x77753c expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/builtins.c:5626
0x868f70 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/expr.c:10624
0x871d39 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, bool,
tree_node*)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/expr.c:5406
0x872a8f expand_assignment(tree_node*, tree_node*, bool)
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/expr.c:5175
0x792c94 expand_call_stmt
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/cfgexpand.c:2658
0x792c94 expand_gimple_stmt_1
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/cfgexpand.c:3548
0x792c94 expand_gimple_stmt
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/cfgexpand.c:3714
0x794635 expand_gimple_basic_block
   
/global/project/projectdirs/m152/schnette/cori/src/spack/var/spack/stage/gcc-6.3.0-vqq6v5pzhzo6lz3d2phtyirsatu5cbj7/gcc-6.3.0/gcc/cfgexp

[Bug web/78315] New: "Changes" don't explain what "LRA" is

2016-11-11 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78315

Bug ID: 78315
   Summary: "Changes" don't explain what "LRA" is
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

The page <https://gcc.gnu.org/gcc-7/changes.html> points to the GCC wiki page
<https://gcc.gnu.org/wiki/LRAIsDefault> "LRA by default". However, this page
never explains what "LRA" actually is. Since this wiki page is mentioned so
prominently, it could spend a paragraph explaining the acronym.

[Bug web/64469] New: Broken link on main page

2015-01-01 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64469

Bug ID: 64469
   Summary: Broken link on main page
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: web
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com

The link to <https://gcc.gnu.org/gcc-5/> on the main page gcc.gnu.org is
broken. It leads to the error message "You don't have permission to access
/gcc-5/ on this server.".


[Bug target/64897] New: Floating-point "and" not optimized on x86-64

2015-02-01 Thread schnetter at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64897

Bug ID: 64897
   Summary: Floating-point "and" not optimized on x86-64
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com

I notice that gcc does not generate "vandpd" for floating-point "and"
operations. Here is an example code that demonstrates this:
{{{
#include <math.h>
#include <string.h>
double fand1(double x)
{
  unsigned long ix;
  memcpy(&ix, &x, 8);
  ix &= 0x7fffffffffffffffUL;
  memcpy(&x, &ix, 8);
  return x;
}
double fand2(double x)
{
  return fabs(x);
}
}}}

When I compile this via:
{{{
gcc-mp-4.9 -O3 -march=native -S fand.c -o fand-gcc-4.9.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:
movabsq $9223372036854775807, %rax
vmovd %xmm0, %rdx
andq %rdx, %rax
vmovd %rax, %xmm0
ret
_fand2:
vmovsd LC1(%rip), %xmm1
vandpd %xmm1, %xmm0, %xmm0
ret
}}}

This shows that (a) gcc performs the bitwise and operation in an integer
register, which is probably slower, and (b) the implementors of "fabs" assume
that using the "vandpd" instruction is faster.
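
To make the equivalence between the two functions concrete, here is a small C++ sketch that clears the sign bit through an integer mask and compares the result with std::fabs bit for bit. It assumes a 64-bit IEEE 754 double, as on x86-64. The names fabs_via_mask and same_as_fabs are hypothetical helpers, not part of the original test case.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Clears the sign bit of a double through an integer mask.
// Assumes double is a 64-bit IEEE 754 binary64 value.
double fabs_via_mask(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits &= UINT64_C(0x7fffffffffffffff);  // every bit except the sign bit
    std::memcpy(&x, &bits, sizeof bits);
    return x;
}

// True when the mask-based version and std::fabs yield the same bit pattern.
bool same_as_fabs(double x) {
    const double a = fabs_via_mask(x);
    const double b = std::fabs(x);
    return std::memcmp(&a, &b, sizeof a) == 0;
}
```

Since both forms compute the same result, the question raised in the report is only which instruction sequence the compiler selects for the memcpy-based variant.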


[Bug c++/48015] New: ICE: unexpected expression 'std::min' of kind overload

2011-03-07 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48015

   Summary: ICE: unexpected expression 'std::min' of kind overload
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


Created attachment 23568
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23568
bzip2'd preprocessed source code

Using gcc version:

$ g++-mp-4.6 --version
g++-mp-4.6 (GCC) 4.6.0 20110305 (experimental)

I execute the command:

$ g++-mp-4.6 -fopenmp -Wall -g3 -m128bit-long-double -march=native -std=gnu++0x
-fbounds-check -fstack-protector-all -ftrapv -O0 -fopenmp -Wall -Wshadow
-Wpointer-arith -Wcast-qual -Wcast-align -Woverloaded-virtual -c data.ii

and receive the error message:

/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetLib/src/data.cc:
In function 'void call_operator(void (*)(const T*, const ivect3&, T*, const
ivect3&, const ibbox3&, const ibbox3&, const ibbox3&), const T*, const ivect3&,
T*, const ivect3&, const ibbox3&, const ibbox3&, const ibbox3&)':
/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetLib/src/data.cc:80:56:
internal compiler error: unexpected expression 'std::min' of kind overload
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

The same code compiles fine with g++ 4.5.2.


[Bug c++/48015] ICE: unexpected expression 'std::min' of kind overload

2011-03-07 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48015

--- Comment #2 from Erik Schnetter 2011-03-07 15:14:00 UTC ---
Created attachment 23570
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23570
Same source code preprocessed with gcc 4.5.2

If I use gcc 4.5.2:

$ g++-mp-4.5  --version
g++-mp-4.5 (GCC) 4.5.2

things work fine. I added an attachment containing the same original source
code preprocessed with gcc 4.5.2 (as "data-gcc45.ii.bz2"). If I compile this
preprocessed source code with:

$ g++-mp-4.5 -fopenmp -Wall -g3 -m128bit-long-double -march=native -std=gnu++0x
-fbounds-check -fstack-protector-all -ftrapv -O0 -fopenmp -Wall -Wshadow
-Wpointer-arith -Wcast-qual -Wcast-align -Woverloaded-virtual -c data-gcc45.ii

I see no screen output, and the object file is produced fine.


[Bug rtl-optimization/48037] New: Missed optimization: unnecessary register moves

2011-03-08 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48037

   Summary: Missed optimization: unnecessary register moves
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


Created attachment 23587
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23587
Source code

I want to perform certain operations on an SSE double precision vector. I am
using the intrinsics offered by emmintrin.h to decompose the vector into two
scalars, perform the operation on both elements, and reconstruct the vector. As
example operation I calculate the square root using scalar instructions. I am
aware that there is a vector instruction for this; I am only using this as a
placeholder to simplify the code.

I use gcc 4.6.0:

$ g++-mp-4.6 --version
g++-mp-4.6 (GCC) 4.6.0 20110305 (experimental)

on a MacBook Pro:

$ uname -a
Darwin erik-schnetters-macbook-pro.local 10.6.0 Darwin Kernel Version 10.6.0:
Wed Nov 10 18:13:17 PST 2010; root:xnu-1504.9.26~3/RELEASE_I386 i386

with a 2.66 GHz Intel Core i7 processor and I compile with the options

$ g++-mp-4.6 -S -O3 -march=native -ffast-math vecmath.cc

I tried four different ways of extracting the scalars for the vector, and I
find that gcc generates unnecessary register-register moves in almost every
case.
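
For readers without access to the attachment, the decompose / operate / reconstruct pattern described above might look roughly like the following C++ sketch. It shows one possible way of extracting the two scalars and is not necessarily one of the four variants in the attached source. The function name sqrt_lanes is hypothetical, and the scalar fallback exists only so that the sketch also builds on targets without SSE2.

```cpp
#include <cmath>
#if defined(__SSE2__)
#  include <emmintrin.h>
#endif

// Applies a scalar square root to both lanes of a two-element double vector.
void sqrt_lanes(const double in[2], double out[2]) {
#if defined(__SSE2__)
    const __m128d v = _mm_loadu_pd(in);
    // Decompose: the low lane directly, the high lane after moving it down.
    const double lo = _mm_cvtsd_f64(v);
    const double hi = _mm_cvtsd_f64(_mm_unpackhi_pd(v, v));
    // Operate on each scalar, then reconstruct; _mm_set_pd takes (high, low).
    const __m128d r = _mm_set_pd(std::sqrt(hi), std::sqrt(lo));
    _mm_storeu_pd(out, r);
#else
    // Fallback for targets without SSE2.
    out[0] = std::sqrt(in[0]);
    out[1] = std::sqrt(in[1]);
#endif
}
```

Compiling such a function with -S and inspecting the output is how register-register moves of the kind reported here would show up.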


[Bug rtl-optimization/48037] Missed optimization: unnecessary register moves

2011-03-08 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48037

--- Comment #1 from Erik Schnetter 2011-03-09 03:32:42 UTC ---
Created attachment 23588
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23588
Generated assembler code


[Bug c++/47808] internal compiler error: in tsubst_copy_and_build, at cp/pt.c:13326

2011-03-09 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47808

Erik Schnetter  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #1 from Erik Schnetter 2011-03-09 12:50:20 UTC ---
I can compile this source file without problems with

g++ (GCC) 4.6.0 20110308 (experimental)


[Bug c++/47808] internal compiler error: in tsubst_copy_and_build, at cp/pt.c:13326

2011-03-09 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47808

Erik Schnetter  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|FIXED                       |

--- Comment #2 from Erik Schnetter 2011-03-09 12:55:28 UTC ---
Strangely, I now receive the same error message for a different source file. I
am therefore reopening this bug report instead of opening a new one.

I am using:

g++ (GCC) 4.6.0 20110308 (experimental)

When I execute:

$ /Users/eschnett/gcc/bin/g++ -fopenmp -Wall -g3 -m128bit-long-double
-march=native -std=gnu++0x -fbounds-check -fstack-protector-all -ftrapv -O0
-fopenmp -Wall -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align
-Woverloaded-virtual -c iobasic.ii

I receive the error message:

/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetIOBasic/src/iobasic.cc:
In function ‘bool CarpetIOBasic::UseScientificNotation(const T&) [with T =
int]’:
/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetLib/src/typecase.hh:149:118:
  instantiated from here
/Users/eschnett/EinsteinToolkit-hg/arrangements/Carpet/CarpetIOBasic/src/iobasic.cc:703:22:
internal compiler error: in tsubst_copy_and_build, at cp/pt.c:13386

I can compile the same source code without problems with g++ 4.5.2.


[Bug c++/30952] Unclear error message when callling via a function pointer

2011-03-09 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30952

--- Comment #2 from Erik Schnetter 2011-03-09 12:56:16 UTC ---
Created attachment 23596
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23596
failing source code


[Bug c++/30952] Unclear error message when callling via a function pointer

2011-03-09 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30952

Erik Schnetter  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #23596|0                           |1
        is obsolete|                            |

--- Comment #3 from Erik Schnetter 2011-03-09 12:57:23 UTC ---
Comment on attachment 23596
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23596
failing source code

I attached this source file to the wrong bug report. Please disregard it.


[Bug c++/47808] internal compiler error: in tsubst_copy_and_build, at cp/pt.c:13326

2011-03-09 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47808

Erik Schnetter  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #23398|0                           |1
        is obsolete|                            |

--- Comment #3 from Erik Schnetter 2011-03-09 12:58:17 UTC ---
Created attachment 23597
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23597
failing source code


[Bug c++/47808] internal compiler error: in tsubst_copy_and_build, at cp/pt.c:13326

2011-03-09 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47808

Erik Schnetter  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #23597|0                           |1
        is obsolete|                            |

--- Comment #6 from Erik Schnetter 2011-03-09 13:47:38 UTC ---
Created attachment 23600
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23600
failing source code

I reduced the source code slightly and used -DNDEBUG. This code fails with the
simplified command

/Users/eschnett/gcc/bin/g++ -std=gnu++0x -c iobasic.ii


[Bug tree-optimization/48098] New: internal compiler error: in build_vector_from_val, at tree.c:1380

2011-03-12 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48098

   Summary: internal compiler error: in build_vector_from_val, at
tree.c:1380
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: schnet...@gmail.com


Created attachment 23638
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23638
Failing preprocessed source code

I am using

$ /Users/eschnett/gcc/bin/gcc --version
gcc (GCC) 4.6.0 20110311 (experimental)

$ svn info
Path: .
URL: svn://gcc.gnu.org/svn/gcc/trunk
Repository Root: svn://gcc.gnu.org/svn/gcc
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 170879
Node Kind: directory
Schedule: normal
Last Changed Author: jsm28
Last Changed Rev: 170879
Last Changed Date: 2011-03-11 11:51:29 -0500 (Fri, 11 Mar 2011)

I issue the command

$ /Users/eschnett/gcc/bin/gcc -std=gnu99 -O3 -c rotatingsymmetry180.i

and I receive the error message

/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/RotatingSymmetry180/src/rotatingsymmetry180.c:
In function ‘BndRot180VI’:
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/RotatingSymmetry180/src/rotatingsymmetry180.c:495:11:
warning: passing argument 1 of ‘free’ discards ‘restrict’ qualifier from
pointer target type [enabled by default]
/usr/include/stdlib.h:160:6: note: expected ‘void *’ but argument is of type
‘struct slabsetup * restrict* restrict’
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/RotatingSymmetry180/src/rotatingsymmetry180.c:514:12:
warning: passing argument 5 of ‘Slab_MultiTransfer_Apply’ from incompatible
pointer type [enabled by default]
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/Slab/src/slab.h:113:1:
note: expected ‘const void * const restrict* const restrict’ but argument is of
type ‘void * restrict* restrict’
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/RotatingSymmetry180/src/rotatingsymmetry180.c:521:12:
warning: passing argument 7 of ‘Slab_MultiTransfer’ from incompatible pointer
type [enabled by default]
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/Slab/src/slab.h:128:1:
note: expected ‘const void * const restrict* const restrict’ but argument is of
type ‘void * restrict* restrict’
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/RotatingSymmetry180/src/rotatingsymmetry180.c:544:10:
warning: passing argument 10 of ‘Hyperslab_GlobalMappingByIndex’ from
incompatible pointer type [enabled by default]
/Users/eschnett/EinsteinToolkit-hg/configs/sim-gcc46/bindings/include/RotatingSymmetry180_Prototypes.h:61:5:
note: expected ‘int *’ but argument is of type ‘int (*)[3]’
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/RotatingSymmetry180/src/rotatingsymmetry180.c:716:3:
warning: passing argument 1 of ‘free’ discards ‘restrict’ qualifier from
pointer target type [enabled by default]
/usr/include/stdlib.h:160:6: note: expected ‘void *’ but argument is of type
‘void * restrict* restrict’
/Users/eschnett/EinsteinToolkit-hg/arrangements/CactusNumerical/RotatingSymmetry180/src/rotatingsymmetry180.c:19:5:
internal compiler error: in build_vector_from_val, at tree.c:1380
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

The same code compiles fine with gcc 4.5.2. It also compiles fine if I use -O2
instead of -O3.


[Bug tree-optimization/48098] internal compiler error: in build_vector_from_val, at tree.c:1380

2011-03-12 Thread schnetter at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48098

--- Comment #1 from Erik Schnetter 2011-03-12 23:30:41 UTC ---
Here is a reduced test case:

void BndRot180VI ()
{
  static char *restrict * slab_setups;
  static int num_slab_setups = 0;
  num_slab_setups = GetRefinementLevels (0);
  slab_setups = malloc (num_slab_setups);
  for (int rl=0; rl

[Bug target/99912] New: Unnecessary / inefficient spilling of AVX2 ymm registers

2021-04-04 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

Bug ID: 99912
   Summary: Unnecessary / inefficient spilling of AVX2 ymm registers
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

Created attachment 50507
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50507&action=edit
Compressed preprocessed source code

I am using "g++ (Spack GCC) 11.0.1 20210404 (experimental)" (fresh checkout) on
macOS 11.2.3 with an x86-64 Skylake CPU.

I am manually SIMD-vectorizing a loop kernel using AVX2 intrinsics. The
generated code is correct, but has obvious inefficiencies. I find these issues:

1. There are spills (?) of AVX2 ymm registers that are overwritten by another
spill a few instructions later, without being read in the meantime.

2. The same register is spilled into multiple stack slots in consecutive
instructions.

3. After spilling a ymm register, the stack slot is copied to another stack
slot, using xmm registers (i.e. using two loads/stores).

I tried to reproduce the issue in a small example, but failed. If this issue is
really due to spilling, then it might not be possible to have a small test
case.



Here is an example of issues 1 and 2; I show a few lines from the attached
disassembled file to clarify:
{{{
1520: c5 fd 29 8c 24 a0 24 00 00    vmovapd %ymm1, 9376(%rsp)
1529: c5 fd 29 8c 24 20 29 00 00    vmovapd %ymm1, 10528(%rsp)
1532: c5 fd 29 b4 24 80 28 00 00    vmovapd %ymm6, 10368(%rsp)
153b: c5 fd 29 ac 24 a0 28 00 00    vmovapd %ymm5, 10400(%rsp)
1544: c5 fd 29 a4 24 c0 28 00 00    vmovapd %ymm4, 10432(%rsp)
154d: c5 fd 29 9c 24 e0 28 00 00    vmovapd %ymm3, 10464(%rsp)
1556: c5 fd 29 94 24 00 29 00 00    vmovapd %ymm2, 10496(%rsp)
155f: c4 a2 1d 2d 34 30             vmaskmovpd (%rax,%r14), %ymm12, %ymm6
1565: 48 8b 84 24 00 05 00 00       movq 1280(%rsp), %rax
156d: c5 fd 29 b4 24 00 24 00 00    vmovapd %ymm6, 9216(%rsp)
1576: c4 a2 1d 2d 2c 30             vmaskmovpd (%rax,%r14), %ymm12, %ymm5
157c: 48 8b 84 24 38 07 00 00       movq 1848(%rsp), %rax
1584: c5 fd 29 ac 24 20 24 00 00    vmovapd %ymm5, 9248(%rsp)
158d: c4 a2 1d 2d 24 30             vmaskmovpd (%rax,%r14), %ymm12, %ymm4
1593: 48 8b 84 24 60 04 00 00       movq 1120(%rsp), %rax
159b: c5 fd 29 a4 24 40 24 00 00    vmovapd %ymm4, 9280(%rsp)
15a4: c4 a2 1d 2d 1c 30             vmaskmovpd (%rax,%r14), %ymm12, %ymm3
15aa: 48 8b 84 24 68 04 00 00       movq 1128(%rsp), %rax
15b2: c5 fd 29 9c 24 60 24 00 00    vmovapd %ymm3, 9312(%rsp)
15bb: c4 a2 1d 2d 14 30             vmaskmovpd (%rax,%r14), %ymm12, %ymm2
15c1: c5 fd 29 94 24 80 24 00 00    vmovapd %ymm2, 9344(%rsp)
15ca: 48 8b 84 24 08 05 00 00       movq 1288(%rsp), %rax
15d2: c4 a2 1d 2d 0c 30             vmaskmovpd (%rax,%r14), %ymm12, %ymm1
15d8: 48 8b 84 24 70 04 00 00       movq 1136(%rsp), %rax
15e0: c5 fd 29 8c 24 a0 24 00 00    vmovapd %ymm1, 9376(%rsp)
15e9: c5 fd 29 b4 24 40 29 00 00    vmovapd %ymm6, 10560(%rsp)
15f2: c5 fd 29 ac 24 60 29 00 00    vmovapd %ymm5, 10592(%rsp)
15fb: c5 fd 29 a4 24 80 29 00 00    vmovapd %ymm4, 10624(%rsp)
1604: c5 fd 29 9c 24 a0 29 00 00    vmovapd %ymm3, 10656(%rsp)
160d: c5 fd 29 94 24 c0 29 00 00    vmovapd %ymm2, 10688(%rsp)
1616: c5 fd 29 8c 24 e0 29 00 00    vmovapd %ymm1, 10720(%rsp)
}}}

The beginning and end of this sample are what I think might be spill
instructions. The instruction at 1520 writes to 9376(%rsp), and the instruction
at 15e0 overwrites this stack slot. Also, the register %ymm1 is written
multiple times to different stack slots. (That by itself could be fine, but it
looks strange.)

A few instructions later I find this code:
{{{
16d7: c5 79 6f 84 24 80 28 00 00    vmovdqa 10368(%rsp), %xmm8
16e0: c5 79 6f ac 24 20 29 00 00    vmovdqa 10528(%rsp), %xmm13
16e9: c5 79 7f 84 24 e0 19 00 00    vmovdqa %xmm8, 6624(%rsp)
16f2: c5 79 6f 84 24 90 28 00 00    vmovdqa 10384(%rsp), %xmm8
16fb: c5 79 7f ac 24 80 1a 00 00    vmovdqa %xmm13, 6784(%rsp)
1704: c5 79 7f 84 24 f0 19 00 00    vmovdqa %xmm8, 6640(%rsp)
}}}
This copies the 32 bytes at 10368(%rsp) (written above) to 6624(%rsp), using
%xmm8 to move the stack slot in two 16-byte chunks. I see no reason to copy
from one stack slot to another (I know the code, but I could be mistaken
here), and no reason to copy in 16-byte chunks. (All relevant local variables
are ultimately of type __m256d, wrapped in C++ structs, and should thus be
correctly aligned.)



To give some background information: The loop is quite large; it is part of a
complex numerical k

[Bug target/99912] Unnecessary / inefficient spilling of AVX2 ymm registers

2021-04-04 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

--- Comment #1 from Erik Schnetter  ---
Created attachment 50508
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50508&action=edit
Compressed disassembled object file

[Bug target/99912] Unnecessary / inefficient spilling of AVX2 ymm registers

2021-04-04 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

--- Comment #2 from Erik Schnetter  ---
I did not describe the scale of the issue. There are more than just a few
inefficient or unnecessary operations:

The loop kernel (a single basic block) extends from address 0x1240 to 0xbf27 in
the attached disassembled object file.

Out of about 6000 instructions in the loop, 1000 are inefficient (and likely
superfluous) moves that copy one 32-byte stack slot into another, using 16-byte
wide copies.

For example, the stack slot 9376(%rsp) is written 9 times in the loop kernel,
but is read only once.
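
A count like the one above (writes versus reads per stack slot) can be sketched
as a small C++ helper. This is only an illustration, not the tool used for the
report: the names SlotUse and count_slot_uses are hypothetical, offsets are
assumed to be decimal as in the listings in this thread, and the destination is
assumed to be the last operand (AT&T syntax). Read-modify-write instructions
and single-operand reads would be misclassified as stores.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Per-slot counters: how often a stack slot is stored to and loaded from.
struct SlotUse {
    int stores = 0;
    int loads = 0;
};

// Classify every "N(%rsp)" operand in AT&T-syntax disassembly lines.
// Heuristic (an assumption of this sketch): an "N(%rsp)" operand with only
// whitespace after it is the destination, hence a store; any other
// occurrence is counted as a load.
std::map<long, SlotUse> count_slot_uses(const std::vector<std::string>& lines) {
    static const std::regex slot(R"((\d+)\(%rsp\))");
    std::map<long, SlotUse> uses;
    for (const std::string& line : lines) {
        for (auto it = std::sregex_iterator(line.begin(), line.end(), slot);
             it != std::sregex_iterator(); ++it) {
            const std::smatch& m = *it;
            const long offset = std::stol(m[1].str());
            const std::string rest = m.suffix().str();
            const bool is_last_operand =
                std::all_of(rest.begin(), rest.end(), [](unsigned char c) {
                    return std::isspace(c) != 0;
                });
            if (is_last_operand) {
                ++uses[offset].stores;
            } else {
                ++uses[offset].loads;
            }
        }
    }
    return uses;
}
```

Feeding it the instruction text of the loop kernel (one line per instruction)
and inspecting the entry for offset 9376 would give the two counts quoted
above, within the limits of the heuristic.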

[Bug target/99912] Unnecessary / inefficient spilling of AVX2 ymm registers

2021-04-06 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

--- Comment #4 from Erik Schnetter  ---
I build with the compiler options

/Users/eschnett/src/CarpetX/Cactus/view-compilers/bin/g++  -fopenmp -Wall -pipe
-g -march=skylake -std=gnu++17 -O3 -fcx-limited-range -fexcess-precision=fast
-fno-math-errno -fno-rounding-math -fno-signaling-nans
-funsafe-math-optimizations   -c -o configs/sim/build/Z4c/rhs.cxx.o
configs/sim/build/Z4c/rhs.cxx.ii

One of the kernels in question (the one I describe above) is the C++ lambda in
lines 281013 to 281119. The call to the "noinline" function ensures that the
kernel (and surrounding for loops) is compiled as a separate function, which
produces more efficient code. The function "grid.loop_int_device" contains
essentially three nested for loops, and the actual kernel is the C++ lambda in
lines 281015 to 281118.

I'll have a look at -fdump-tree-optimized.

[Bug target/99912] Unnecessary / inefficient spilling of AVX2 ymm registers

2021-04-06 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

--- Comment #5 from Erik Schnetter  ---
As you suggested, the problem is probably not caused by register spills, but by
stores into a struct that are not optimized away. In this case, the respective
struct elements are unused in the code.

I traced the results of the first __builtin_ia32_maskloadpd256:

  _63940 = __builtin_ia32_maskloadpd256 (_63955, prephitmp_86203);
  MEM  [(struct mat3 *)&vars + 992B] = _63940;
  _178613 = .FMA (_63940, _64752, _178609);
  MEM  [(struct mat3 *)&vars + 1312B] = _63940;

The respective struct locations (+ 992B, + 1312B) are indeed not used anywhere
else.

The struct is of type z4c_vars. It (and its parent) are defined in lines 279837
to 280818. It is large.

Is there, e.g., a parameter I could set to make GCC try harder to avoid
unnecessary stores?
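
A minimal sketch of the pattern described in this comment, for readers without
access to the attachment. The struct Vars and both functions are hypothetical
stand-ins, not the actual z4c_vars code. For a struct this small one would
expect GCC to remove the unused stores; the report concerns a much larger
struct, so this sketch shows the shape of the code, not a reproducer.

```cpp
#include <cstddef>

// Hypothetical stand-in for a large variables struct: many fields, few used.
struct Vars {
    double gamma;
    double kappa;
    double unused_a;
    double unused_b;
};

// Pattern from the report: load every field into one struct, then use only
// a subset. The stores to unused_a and unused_b are dead.
double kernel_load_all(const double* g, const double* k, const double* a,
                       const double* b, std::size_t i) {
    Vars vars;
    vars.gamma = g[i];
    vars.kappa = k[i];
    vars.unused_a = a[i];  // never read again
    vars.unused_b = b[i];  // never read again
    return vars.gamma * vars.kappa;
}

// Possible source-level workaround (an assumption, not a confirmed fix):
// load only the values the kernel actually uses into plain locals.
double kernel_load_needed(const double* g, const double* k, std::size_t i) {
    const double gamma = g[i];
    const double kappa = k[i];
    return gamma * kappa;
}
```

Both functions compute the same result; they differ only in how much dead data
the compiler has to recognize and eliminate.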

[Bug tree-optimization/100102] New: ICE in tsubst, at cp/pt.c:15310

2021-04-15 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

Bug ID: 100102
   Summary: ICE in tsubst, at cp/pt.c:15310
   Product: gcc
   Version: 10.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

I am using GCC 10.3.0 on x86_64 GNU/Linux. GCC was built via Spack, and is
called from nvcc.

I encounter the following ICE:

cd
/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-build-eiivnj5/Src
&&
/home/eschnetter/src/CarpetX/spack/opt/spack/linux-ubuntu18.04-skylake_avx512/gcc-10.3.0/cuda-11.2.2-jbyezwujy3vielujb4xz3izwi6q36jnb/bin/nvcc
-forward-unknown-to-host-compiler
-ccbin=/home/eschnetter/src/CarpetX/Cactus/view-cuda-compilers/bin/g++
-Damrex_EXPORTS
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/Base
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/Boundary
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/AmrCore
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/Amr
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/LinearSolvers/MLMG
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/LinearSolvers/Projections
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/Particle
-I/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-build-eiivnj5
-isystem=/home/eschnetter/src/CarpetX/spack/opt/spack/linux-ubuntu18.04-skylake_avx512/gcc-10.3.0/openmpi-4.0.5-jl7qr7jpt3fe6z5rdfkgj2n4t5b4xbdn/include
-isystem=/home/eschnetter/src/CarpetX/spack/opt/spack/linux-ubuntu18.04-skylake_avx512/gcc-10.3.0/hdf5-1.10.7-gkflrn3su7geakoyly56sqebg2pqa2yr/include
-isystem=/home/eschnetter/src/CarpetX/spack/opt/spack/linux-ubuntu18.04-skylake_avx512/gcc-10.3.0/zlib-1.2.11-dd2emzewyp4o4c22f3niqq3dyhjhqkzs/include
-m64 --expt-relaxed-constexpr --expt-extended-lambda
-Wno-deprecated-gpu-targets -gencode=arch=compute_75,code=sm_75
-maxrregcount=255 -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored
--use_fast_math -Xcudafe --display_error_number --Wext-lambda-captures-this
--Werror ext-lambda-captures-this --Werror cross-execution-space-call
--generate-line-info --source-in-ptx -O2 -g -DNDEBUG -Xcompiler=-fPIC
-Xcompiler=-fopenmp -Xcompiler=-Werror=return-type -Xcompiler -pthread
-std=c++14 -MD -MT Src/CMakeFiles/amrex.dir/Base/AMReX_BlockMutex.cpp.o -MF
CMakeFiles/amrex.dir/Base/AMReX_BlockMutex.cpp.o.d -x cu -dc
/tmp/eschnetter/spack-stage/spack-stage-amrex-21.04-eiivnj5bgmpnqg6o7ofgmy4yvdfgxasa/spack-src/Src/Base/AMReX_BlockMutex.cpp
-o CMakeFiles/amrex.dir/Base/AMReX_BlockMutex.cpp.o
/home/eschnetter/src/CarpetX/spack/opt/spack/linux-ubuntu18.04-skylake_avx512/gcc-10.1.0/gcc-10.3.0-74t7ecp2jgn6myrtnrziqo5hg6bncbb4/include/c++/10.3.0/chrono:
In substitution of 'template template using __is_harmonic =
std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep,
_Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den /
std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))),
((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den,
_Period::den)) * (_Period::num / std::chrono::duration<_Rep,
_Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 =
_Period2; _Rep = _Rep; _Period = _Period]':
/home/eschnetter/src/CarpetX/spack/opt/spack/linux-ubuntu18.04-skylake_avx512/gcc-10.1.0/gcc-10.3.0-74t7ecp2jgn6myrtnrziqo5hg6bncbb4/include/c++/10.3.0/chrono:473:154:
  required from here
/home/eschnetter/src/CarpetX/spack/opt/spack/linux-ubuntu18.04-skylake_avx512/gcc-10.1.0/gcc-10.3.0-74t7ecp2jgn6myrtnrziqo5hg6bncbb4/include/c++/10.3.0/chrono:428:27:
internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
  |   ^~
0xc5d6af crash_signal
   
/tmp/eschnetter/spack-stage/spack-stage-gcc-10.3.0-74t7ecp2jgn6myrtnrziqo5hg6bncbb4/spack-src/gcc/toplev.c:328
0x754d6d tsubst(tree_node*, tree_node*, int, tree_node*)
   
/tmp/eschnetter/spack-stage/spack-stage-gcc-10.3.0-74t7ecp2jgn6myrtnrziqo5hg6bncbb4/spack-src/gcc/cp/pt.c:15310
0x767d76 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
   
/tmp/eschnetter/spack-stage/spack-stage-gcc-10.3.0-74t7ecp2jgn6myrtnrziqo5hg6bncbb4/spack-src/gcc/cp/pt.c:13225
0x760766 tsubst_aggr_type
   
/tmp/eschnetter/spack-stage/spack-stage-gcc-10.3.0-74t7ecp2jgn6myrtnrziqo5hg6bncbb4/spack-src/gcc/cp/pt.c:13428
0x76aa5f tsubst_f

[Bug tree-optimization/100102] ICE in tsubst, at cp/pt.c:15310

2021-04-15 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

--- Comment #1 from Erik Schnetter  ---
Created attachment 50605
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50605&action=edit
Compressed preprocessed source code

[Bug tree-optimization/100102] [8/9/10/11 Regression] ICE in tsubst, at cp/pt.c:15310

2021-04-16 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102

--- Comment #6 from Erik Schnetter  ---
I looked for the string "GCC" in the user header files, but could not find any
place where things would differ between GCC 10.2 and 10.3. I assume there could
be a difference in GCC-provided header files (the error message mentions
"chrono" and "gcd"), or it could be that nvcc examines the GCC version and
produces different code.
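
For illustration, this is how code can observe the compiler version without
ever containing the string "GCC". The predefined macros __GNUC__,
__GNUC_MINOR__ and __GNUC_PATCHLEVEL__ are real; the two function names are
hypothetical, and whether any header or nvcc actually branches like this
between 10.2 and 10.3 is an open assumption.

```cpp
#include <string>

// Return the version reported by the GNU-compatible predefined macros.
std::string gnuc_version() {
#if defined(__GNUC__)
    return std::to_string(__GNUC__) + "." + std::to_string(__GNUC_MINOR__) +
           "." + std::to_string(__GNUC_PATCHLEVEL__);
#else
    return "not a GNU-compatible compiler";
#endif
}

// Example of a version-dependent branch that would behave differently for
// GCC 10.2 and GCC 10.3. Clang also defines __GNUC__, so it is excluded.
constexpr bool is_gcc_10_3_or_later() {
#if defined(__GNUC__) && !defined(__clang__)
    return (__GNUC__ > 10) || (__GNUC__ == 10 && __GNUC_MINOR__ >= 3);
#else
    return false;
#endif
}
```

Searching the preprocessed source for code guarded by these macros is one way
to narrow down where the two versions could diverge.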

[Bug target/99912] Unnecessary / inefficient spilling of AVX2 ymm registers

2021-04-27 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

--- Comment #11 from Erik Schnetter  ---
The number of active local variables is likely much larger than the number of
registers, and I expect there to be a lot of spilling. I hope that the compiler
is clever about changing the order in which expressions are evaluated to reduce
spilling as much as possible.

Because the loop is so large, I split it into two, each calculating about half
of the output variables. The code here looks at one of the loops. To simplify
the code, each loop still loads all variables (via masked loads), but may not
use all of them. The unused masked loads do not surprise me per se, but I
expect the compiler to remove them.

[Bug driver/100347] New: GCC 11 does not recognize skylake; translates "march=native" to "x86_64"

2021-04-29 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100347

Bug ID: 100347
   Summary: GCC 11 does not recognize skylake; translates "march=native" to "x86_64"
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schnetter at gmail dot com
  Target Milestone: ---

I just built GCC 11.1.0 (via Spack). I find that "-march=native" does not work
any more. It used to work with GCC 10.3 and earlier. The symptom is that
manually vectorized code does not compile any more.

This demonstrates the problem:

$ ./view-compilers/bin/gcc -march=native -Q --help=target | grep march
  -march=   x86-64
  Known valid arguments for -march= option:

This outputs "x86-64" where I expect "skylake".

$ ./view-compilers/bin/gcc --version
gcc (Spack GCC) 11.1.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

It works with an older version of GCC:

$ /opt/local/bin/gcc -march=native -Q --help=target | grep march
  -march=   skylake
  Known valid arguments for -march= option:

$ /opt/local/bin/gcc --version
gcc (MacPorts gcc10 10.3.0_0) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

My system is:

$ uname -a
Darwin redshift.local 20.4.0 Darwin Kernel Version 20.4.0: Fri Mar  5 01:14:14
PST 2021; root:xnu-7195.101.1~3/RELEASE_X86_64 x86_64 i386 MacBookPro15,1
Darwin

[Bug driver/100347] GCC 11 does not recognize skylake; translates "march=native" to "x86_64"

2021-04-29 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100347

--- Comment #1 from Erik Schnetter  ---
Forgot to add: When I explicitly use "-march=skylake", everything works as
expected.
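
A small check along these lines can show the difference between what the
compiler was told to target and what the CPU supports. It is a sketch, not the
code from the failing build: the function names are hypothetical, while the
macros (__SSE2__, __AVX__, __AVX2__, __FMA__) and the builtins
__builtin_cpu_init and __builtin_cpu_supports are provided by GCC on x86.

```cpp
#include <string>

// Compile-time view: which SIMD extensions did -march=... enable?
std::string enabled_simd_macros() {
    std::string out;
#ifdef __SSE2__
    out += "SSE2 ";
#endif
#ifdef __AVX__
    out += "AVX ";
#endif
#ifdef __AVX2__
    out += "AVX2 ";
#endif
#ifdef __FMA__
    out += "FMA ";
#endif
    return out;
}

// True if the translation unit was compiled with AVX2 enabled.
constexpr bool built_for_avx2() {
#ifdef __AVX2__
    return true;
#else
    return false;
#endif
}

// Run-time view: does the CPU executing this code support AVX2?
// Returns false on non-x86 targets, where the builtin is not available.
bool cpu_has_avx2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}
```

If "-march=native" falls back to x86-64 as shown above, one would expect
built_for_avx2() to be false while cpu_has_avx2() is still true on a Skylake
machine; with an explicit "-march=skylake" both should be true.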

[Bug target/100347] [11/12 Regression] GCC 11 does not recognize skylake; translates "march=native" to "x86_64"

2021-04-30 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100347

--- Comment #5 from Erik Schnetter  ---
This is my hardware configuration:

$ sysctl -a | grep machdep.cpu
machdep.cpu.address_bits.physical: 39
machdep.cpu.address_bits.virtual: 48
machdep.cpu.arch_perf.events: 0
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.fixed_number: 3
machdep.cpu.arch_perf.fixed_width: 48
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.version: 4
machdep.cpu.arch_perf.width: 48
machdep.cpu.cache.L2_associativity: 4
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.size: 256
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.linesize_max: 64
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.sub_Cstates: 286531872
machdep.cpu.thermal.ACNT_MCNT: 1
machdep.cpu.thermal.core_power_limits: 1
machdep.cpu.thermal.dynamic_acceleration: 1
machdep.cpu.thermal.energy_policy: 1
machdep.cpu.thermal.fine_grain_clock_mod: 1
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.package_thermal_intr: 1
machdep.cpu.thermal.sensor: 1
machdep.cpu.thermal.thresholds: 2
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.data.small_level1: 64
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tsc_ccc.denominator: 2
machdep.cpu.tsc_ccc.numerator: 216
machdep.cpu.xsave.extended_state: 31 832 1088 0
machdep.cpu.xsave.extended_state1: 15 832 256 0
machdep.cpu.brand: 0
machdep.cpu.brand_string: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
machdep.cpu.core_count: 6
machdep.cpu.cores_per_package: 8
machdep.cpu.extfamily: 0
machdep.cpu.extfeature_bits: 1241984796928
machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP
TSCI
machdep.cpu.extmodel: 9
machdep.cpu.family: 6
machdep.cpu.feature_bits: 9221960262849657855
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA
CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ
DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC
MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_feature_bits: 43806655 1073741824
machdep.cpu.leaf7_feature_bits_edx: 2617255424
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 HLE AVX2 SMEP
BMI2 ERMS INVPCID RTM FPU_CSDS MPX RDSEED ADX SMAP CLFSOPT IPT SGXLC MDCLEAR
TSXFA IBRS STIBP L1DF SSBD
machdep.cpu.logical_per_package: 16
machdep.cpu.max_basic: 22
machdep.cpu.max_ext: 2147483656
machdep.cpu.microcode_version: 222
machdep.cpu.model: 158
machdep.cpu.processor_flag: 5
machdep.cpu.signature: 591594
machdep.cpu.stepping: 10
machdep.cpu.thread_count: 12
machdep.cpu.vendor: GenuineIntel

[Bug target/100347] [11/12 Regression] GCC 11 does not recognize skylake; translates "march=native" to "x86_64"

2021-05-06 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100347

--- Comment #12 from Erik Schnetter  ---
Yes, GCC 10.3 (built via MacPorts) still works. The sample program reports a
Skylake CPU with both compilers.

[Bug target/100347] [11/12 Regression] GCC 11 does not recognize skylake; translates "march=native" to "x86_64"

2021-05-06 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100347

--- Comment #13 from Erik Schnetter  ---
The failing GCC 11.1.0 is built by Apple Clang 12.0.5 via Spack. Looking at
debug output, I see that Spack inserts a "-march=skylake" command line option.
(I was not aware of this before.) It does so by creating a compiler wrapper
(called "clang++" as well), which calls the actual compiler and adds this (and
some other) flags. 

I seem to recall having read somewhere that GCC's CPU detection code must be
built without any "-march=..." flag.

[Bug target/100347] [11/12 Regression] GCC 11 does not recognize skylake; translates "march=native" to "x86_64"

2021-05-07 Thread schnetter at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100347

--- Comment #15 from Erik Schnetter  ---
When I try to rebuild GCC 10.3 or 10.2, they end up having the same problem.
Also, when I enable bootstrapping, bootstrapping fails with differences in many
files. Given that this used to work on a previous version of the OS, the
problem isn't caused by GCC.

One thing that has changed, for example, is that there is now a newer version
of Apple Clang.

Thank you for the help and suggestions.