[Bug target/104673] powerpc e500mc Error: unrecognized opcode: `isel'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104673 Chris Packham changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #4 from Chris Packham --- Looks like a duplicate of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104090 *** This bug has been marked as a duplicate of bug 104090 ***
[Bug target/104090] [10/11/12 Regression] powerpc: asm machine directive wrong for FSL processors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104090 Chris Packham changed: What|Removed |Added CC||judge.packham at gmail dot com --- Comment #6 from Chris Packham --- *** Bug 104673 has been marked as a duplicate of this bug. ***
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 cuilili changed: What|Removed |Added CC||lili.cui at intel dot com --- Comment #23 from cuilili --- (In reply to Richard Biener from comment #17) > I do wonder though how CLX is fine with such access pattern ;) (did you test > with just -O2?) Actually CLX also has STLF issues, there is 13.7% regression when comparing "gcc trunk + -O2" w/ and w/t "-fno-tree-vectorize"
[Bug target/104683] -march=haswell generates invalid instructions on Celeron Haswell CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104683 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2022-02-25 Status|UNCONFIRMED |WAITING --- Comment #1 from Richard Biener --- I don't think we should fix anything in GCC here - the family names are really aliases and while we now have skylake-avx512 nobody thought of haswell-avx2 (or haswell-noavx2). I really blame Intel for very bad product segmentation here ... Can you please provide the exact CPU model, e.g. by cut&pasting from /proc/cpuinfo?
[Bug bootstrap/84554] make check: FAIL: tversion: ERROR! The versions of gmp.h (5.0.5) and libgmp (4.3.1) do not match.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84554 --- Comment #9 from Krishna --- x86_64 GNU/Linux: I am doing this for the gcc-11.2.0 ../gcc-11.2.0/configure -v --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --prefix=/usr/local/gcc-11.2.0 --enable-checking=release --enable-languages=c,c++,fortran --disable-multilib --program-suffix=-11.2 --enable-stage1-languages=c,c++ I did run the make -k check, make[4]: Entering directory '/home/krishna/latest/mpfr/tests' make[5]: Entering directory '/home/krishna/latest/mpfr/tests' FAIL: tversion PASS: tinternals PASS: tinits PASS: tisqrt PASS: tsgn PASS: tcheck PASS: tisnan PASS: texceptions PASS: tset_exp . . . PASS: tvalist PASS: ty0 PASS: ty1 PASS: tyn PASS: tzeta PASS: tzeta_ui Testsuite summary for MPFR 3.1.6 # TOTAL: 160 # PASS: 158 # SKIP: 1 # XFAIL: 0 # FAIL: 1 # XPASS: 0 # ERROR: 0 See tests/test-suite.log ERROR! The versions of gmp.h (6.1) and libgmp (6.2.0) do not match.
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #24 from cuilili --- (In reply to cuilili from comment #23) > (In reply to Richard Biener from comment #17) > > I do wonder though how CLX is fine with such access pattern ;) (did you > > test > > with just -O2?) > Sorry, correct w/ and w/t order. Actually CLX also has STLF issues, there is 13.7% regression when comparing "gcc trunk + -O2" w/t and w/ "-fno-tree-vectorize"
[Bug target/103196] [12 regression] gcc.target/powerpc/p9-vec-length-full-7.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103196 Kewen Lin changed: What|Removed |Added Last reconfirmed||2022-02-25 Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 --- Comment #3 from Kewen Lin --- The extra stxvl instructions are from the functions with type int8 and uint8. The commit r12-5129 changes the vectorized loop body slightly, which makes the later cunroll pass compute different estimated sizes for the unrolled loops and make different decisions about whether to unroll them completely. For example, for test_npeel_int8_t, before the commit we have: Loop 1 likely iterates at most 3 times. size: 8-2, last_iteration: 8-4 Loop size: 8 Estimated size after unrolling: 14 Not unrolling loop 1: size would grow. Right after that commit, we have: Loop 1 likely iterates at most 2 times. size: 8-4, last_iteration: 8-4 Loop size: 8 Estimated size after unrolling: 8 So the vectorized loop of test_npeel_int8_t gets unrolled completely. The fix could be to disable pass tree-cunroll for this case.
[Bug libstdc++/68350] std::uninitialized_copy overly restrictive for trivially_copyable types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68350 Jonathan Wakely changed: What|Removed |Added Target Milestone|--- |13.0
[Bug tree-optimization/104675] [9/10/11/12 Regression] ICE: in expand_expr_real_2, at expr.cc:9773 at -O with __real__ + __imag__ extraction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104675 --- Comment #7 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:758671b88b78d7629376b118ec6ca6bcfbabbd36 commit r12-7385-g758671b88b78d7629376b118ec6ca6bcfbabbd36 Author: Jakub Jelinek Date: Fri Feb 25 10:55:17 2022 +0100 match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675] We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types, &/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR. So, we should avoid simplifications which turn valid complex type expressions into something that will ICE during expansion. 2022-02-25 Jakub Jelinek PR tree-optimization/104675 * match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for COMPLEX_TYPE. * gcc.dg/pr104675-1.c: New test. * gcc.dg/pr104675-2.c: New test.
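The reasoning in the commit message above can be illustrated with a small C sketch. It is not taken from the patch or its testcases; the function names are hypothetical, and it assumes GCC or Clang in GNU mode, since `__real__`, `__imag__` and `~` on a complex operand are GNU extensions. For ordinary integers `-a - 1` and `~a` agree, which is why the simplification is valid there. On a complex operand `~` means conjugation, so the two expressions compute different things. The sketch uses `_Complex double` only to show the semantic difference; it does not try to reproduce the ICE.

```c
#include <assert.h>
#include <limits.h>

/* Integer case: in two's complement, -a - 1 and ~a are the same value.
   The precondition avoids signed overflow in -a. */
int neg_minus_one(int a)
{
    assert(a != INT_MIN);
    return -a - 1;
}

int bit_not(int a)
{
    return ~a;
}

/* Complex case (GNU C): ~z is the complex conjugate, not a bitwise NOT.
   Returns 1 if -z - 1 and ~z differ for z = re + im*i, 0 otherwise. */
int complex_forms_differ(double re, double im)
{
    _Complex double z;
    __real__ z = re;
    __imag__ z = im;

    _Complex double a = -z - 1;  /* (-re - 1) + (-im)i */
    _Complex double b = ~z;      /* re + (-im)i, the conjugate */

    return __real__ a != __real__ b || __imag__ a != __imag__ b;
}
```

For z = 2 + 3i the first form gives -3 - 3i and the conjugate gives 2 - 3i, so rewriting one into the other would change the meaning even if the expander accepted it.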
[Bug middle-end/104679] [12 Regression] ICE in connect_traces, at dwarf2cfi.c:3071 since r12-6637-g463d9108766dcbb6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104679 --- Comment #2 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:526fbcfa636fb7e544c1ad69101dbccecbee8b28 commit r12-7386-g526fbcfa636fb7e544c1ad69101dbccecbee8b28 Author: Jakub Jelinek Date: Fri Feb 25 10:56:46 2022 +0100 internal-fn: Call do_pending_stack_adjust in expand_SPACESHIP [PR104679] The following testcase is miscompiled on ia32 at -O2, because when expand_SPACESHIP is called, we have pending stack adjustment from the foo call right before it. Now, ix86_expand_fp_spaceship uses emit_jump_insn several times but then emit_jump also several times. While emit_jump_insn doesn't do do_pending_stack_adjust (), emit_jump does, so we end up with: ... 8: call [`_Z3foodl'] argc:0x10 REG_CALL_DECL `_Z3foodl' 9: r88:DF=[`a'] 10: r89:HI=unspec[cmp(r88:DF,0.0)] 25 11: flags:CC=unspec[r89:HI] 26 12: pc={(unordered(flags:CCFP,0))?L27:pc} REG_BR_PROB 536868 66: NOTE_INSN_BASIC_BLOCK 4 13: pc={(uneq(flags:CCFP,0))?L19:pc} REG_BR_PROB 214748364 67: NOTE_INSN_BASIC_BLOCK 5 14: pc={(flags:CCFP>0)?L23:pc} REG_BR_PROB 536870916 68: NOTE_INSN_BASIC_BLOCK 6 15: r86:SI=0x 16: {sp:SI=sp:SI+0x10;clobber flags:CC;} REG_ARGS_SIZE 0 17: pc=L29 18: barrier 19: L19: 69: NOTE_INSN_BASIC_BLOCK 7 ... The sp += 16 pending stack adjust was emitted in the middle of the sequence and is effective only for the single case of the 4 possibilities where .SPACESHIP returns -1, in all other cases the stack isn't adjusted and so we ICE during dwarf2cfi. Now, we could either call do_pending_stack_adjust in ix86_expand_fp_spaceship, or use there calls that actually don't call do_pending_stack_adjust (but having the stack adjustment across branches is generally undesirable), or we can call it in expand_SPACESHIP for all targets (note, just i386 currently implements it). I chose the generic code because e.g.
expand_{addsub,neg,mul}_overflow in the same file also call do_pending_stack_adjust in internal-fn.cc for the same reasons, that it is expected that most if not all targets will expand those through jumps and we don't want all of the targets to need to deal with that. 2022-02-25 Jakub Jelinek PR middle-end/104679 * internal-fn.cc (expand_SPACESHIP): Call do_pending_stack_adjust. * g++.dg/torture/pr104679.C: New test.
[Bug middle-end/104679] [12 Regression] ICE in connect_traces, at dwarf2cfi.c:3071 since r12-6637-g463d9108766dcbb6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104679 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #3 from Jakub Jelinek --- Fixed.
[Bug target/104683] -march=haswell generates invalid instructions on Celeron Haswell CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104683 --- Comment #2 from Richard Biener --- Same for -march=broadwell and -march=skylake (G3900) btw (just browsing Intel Ark database). I can't figure out what the "celeron/pentium" successor to the skylake family is, to check whether even newer SKUs are affected. As said, changing -march=skylake to not include AVX2 would be a bad idea; adding -march=skylake-noavx might work, but then nobody would use that, so I'm not sure what the point would be. But again - fun ... let's see if Intel manages to fuse off AVX for some Alderlake SKUs ...
[Bug target/104683] -march=haswell generates invalid instructions on Celeron Haswell CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104683 Richard Biener changed: What|Removed |Added Resolution|--- |INVALID Status|WAITING |RESOLVED --- Comment #3 from Richard Biener --- G5905 aka Comet Lake also has no AVX (but that's just skylake++ and we don't have any -march= for those). Note that our documentation says: @item haswell Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, FSGSBASE, RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE and HLE instruction set support. so it explicitly mentions AVX2 - that means you cannot match Intel Product Names and GCC -march= identifiers 1:1. So I'd say it works as intended.
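Since a -march= name does not guarantee that every CPU sold under that product name has all the listed features, one possible workaround on the user side is to build for a conservative baseline and check for AVX2 at run time. The following is a minimal sketch, not something proposed in the bug. It assumes GCC or Clang on x86, where `__builtin_cpu_init` and `__builtin_cpu_supports` are available; `has_avx2` is a hypothetical helper name.

```c
/* Returns 1 if the running CPU reports AVX2 support, 0 otherwise.
   On non-x86 targets, or with other compilers, this sketch reports 0. */
int has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Harmless if the CPU model was already initialized at startup. */
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}
```

A check like this only helps when the translation unit containing it is itself built without -march=haswell; otherwise the compiler is free to use AVX2 instructions before the check is ever reached.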
[Bug tree-optimization/104675] [9/10/11/12 Regression] ICE: in expand_expr_real_2, at expr.cc:9773 at -O with __real__ + __imag__ extraction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104675 --- Comment #8 from Jakub Jelinek --- Created attachment 52512 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52512&action=edit gcc12-pr104675-2.patch Untested fix for the issue Marc mentioned above. In theory we could handle also integral vectors if uniform_vector_cst_p, but let's defer that to GCC 13.
[Bug target/104674] [11/12 Regression] i686 sse2: The two results of __divmoddi4 are mixed up
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104674 --- Comment #6 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:eabf7bbe601f2c0d87bd0a1012d7a602df2037da commit r12-7388-geabf7bbe601f2c0d87bd0a1012d7a602df2037da Author: Jakub Jelinek Date: Fri Feb 25 12:06:52 2022 +0100 i386: Use a new temp slot kind for splitter to floatdi2_i387_with_xmm [PR104674] As mentioned in the PR, the following testcase is miscompiled for similar reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various places during expansion and during expansion we can guarantee that the lifetime of those temporary slot doesn't overlap. But the following splitter uses SLOT_TEMP too and in between expansion and split1 there is a possibility that something extends the lifetime of SLOT_TEMP created slots across an instruction that will be split by this splitter. The following patch fixes it by using a new temp slot kind to make sure it doesn't reuse a SLOT_TEMP that could be live across the instruction. 2022-02-25 Jakub Jelinek PR target/104674 * config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387. * config/i386/i386.md (splitter to floatdi2_i387_with_xmm): Use SLOT_FLOATxFDI_387 rather than SLOT_TEMP. * gcc.target/i386/pr104674.c: New test.
[Bug fortran/104684] New: ICE: 'verify_gimple' failed (Error: non-trivial conversion in 'component_ref')
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104684 Bug ID: 104684 Summary: ICE: 'verify_gimple' failed (Error: non-trivial conversion in 'component_ref') Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: asolokha at gmx dot com Target Milestone: --- gfortran-12.0.1 20220220 snapshot (g:e49508ac6b36adb8a2056c5a1fb6e0178de2439d) ICEs when compiling the following testcase, reduced from gfortran-bugs/gfortran-20220127.f90 from the Neil Carlson's Fortran Compiler Tests repository[1], w/ -fcoarray=single or -fcoarray=none: program main type :: index_map integer, allocatable :: send_index(:) end type type(index_map) :: imap contains subroutine sub(this) type(index_map), intent(inout), target :: this type :: box integer, pointer :: array(:) end type type(box), allocatable :: buffer[:] allocate(buffer[*]) buffer%array => this%send_index end subroutine end program % gfortran-12.0.1 -fcoarray=single -c ft2kqibk.f90 ft2kqibk.f90:7:16: 7 | subroutine sub(this) |^ Error: non-trivial conversion in 'component_ref' struct array02_integer(kind=4) struct array01_integer(kind=4) MEM[(struct box *)_5].array = this->send_index; ft2kqibk.f90:7:16: internal compiler error: 'verify_gimple' failed 0xf9af9d verify_gimple_in_seq(gimple*) /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/tree-cfg.cc:5213 0xc8b185 gimplify_body(tree_node*, bool) /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/gimplify.cc:16313 0xc8b34c gimplify_function_tree(tree_node*) /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/gimplify.cc:16384 0xfffc18 gimplify_all_functions /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/tree-nested.cc:3703 0xfffc07 gimplify_all_functions /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/tree-nested.cc:3707 0x1005f40 lower_nested_functions(tree_node*) 
/var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/tree-nested.cc:3724 0xaa3143 cgraph_node::analyze() /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/cgraphunit.cc:681 0xaa6047 analyze_functions /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/cgraphunit.cc:1240 0xaa6ced symbol_table::finalize_compilation_unit() /var/tmp/portage/sys-devel/gcc-12.0.1_p20220220/work/gcc-12-20220220/gcc/cgraphunit.cc:2500 I tend to believe that this issue is distinct from the one reported in PR95338. [1] https://github.com/nncarlson/fortran-compiler-tests
[Bug target/104674] [11 Regression] i686 sse2: The two results of __divmoddi4 are mixed up
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104674 Jakub Jelinek changed: What|Removed |Added Summary|[11/12 Regression] i686 |[11 Regression] i686 sse2: |sse2: The two results of|The two results of |__divmoddi4 are mixed up|__divmoddi4 are mixed up --- Comment #7 from Jakub Jelinek --- Fixed on the trunk so far.
[Bug target/104681] [9/10/11/12 Regression] ppc64le -mabi=ieeelongdouble ICE since r9-6460
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104681 --- Comment #1 from Jakub Jelinek --- Created attachment 52513 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52513&action=edit gcc12-pr104681.patch Patch I'm testing right now.
[Bug gcov-profile/104685] New: multiple common of `__gcov_var'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104685 Bug ID: 104685 Summary: multiple common of `__gcov_var' Product: gcc Version: 8.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: gcov-profile Assignee: unassigned at gcc dot gnu.org Reporter: gejoed at rediffmail dot com CC: marxin at gcc dot gnu.org Target Milestone: --- In the code that I work on in my organization, the gcov build was successfully done with shared libs using the --coverage flag. However, the -Wl,--warn-common option was enabled, and also -Wl,--fatal-warnings, during the linking stage. The multiple common symbol warning appeared as follows: /xxx-linux/8.2.0/real-ld: lib.so and /xxx-linux/8.2.0/libgcov.a(_gcov.o): warning: multiple common of `__gcov_var' If I remove the --fatal-warnings flag, the warning appears and the build succeeds. I also tried adding -Wl,--allow-multiple-definition (with --warn-common and --fatal-warnings present) but that didn't resolve the build failure. I went through the available ld options and couldn't find any solution here. Is the only solution to turn off the --warn-common or the --fatal-warnings option? Since I couldn't make a local working example myself, please bear with my query having just the one-line statement of the warning above (due to info sec limitations). Let me know if you would be able to help here. Thanks
[Bug target/104686] New: [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 Bug ID: 104686 Summary: [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- We see a recent compile-time regression on 538.imagick_r for intel architectures (not reproducible with -march=znver2): https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=32.507.8 it happens on the magick/enhance.c file when building with -Ofast -march=skylake I will attach preprocessed source.
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 Richard Biener changed: What|Removed |Added Target||x86_64-*-* Target Milestone|--- |12.0 Keywords||compile-time-hog, ||needs-bisection
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 --- Comment #1 from Richard Biener --- Created attachment 52514 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52514&action=edit preprocessed source
[Bug tree-optimization/102819] [11 Regression] IFN_COMPLEX_MUL matches things that it shouldn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102819 --- Comment #7 from CVS Commits --- The releases/gcc-11 branch has been updated by Tamar Christina : https://gcc.gnu.org/g:876e7c7f0fe47bae7c1922e2683ccb5e6e3ec9fe commit r11-9621-g876e7c7f0fe47bae7c1922e2683ccb5e6e3ec9fe Author: Tamar Christina Date: Fri Feb 25 11:59:32 2022 + vect: Simplify and extend the complex numbers validation routines. (GCC-11 Backport) This is a backport of the GCC 12 patch backporting only the correctness part of the fix. This also backports two small helper functions and documentation update on the optabs. The patch boosts the analysis for complex mul,fma and fms in order to ensure that it doesn't create an incorrect output. Essentially it adds an extra verification to check that the two nodes it's going to combine do the same operations on compatible values. The reason it needs to do this is that if one computation differs from the other then with the current implementation we have no way to deal with it since we have to remove the permute. When we can keep the permute around we can probably handle these by unrolling. While implementing this since I have to do the traversal anyway I took advantage of it by simplifying the code a bit. Previously we would determine whether something is a conjugate and then try to figure out which conjugate it is and then try to see if the permutes match what we expect. Now the code that does the traversal will detect this in one go and return to us whether the operation is something that can be combined and whether a conjugate is present. Secondly because it does this I can now simplify the checking code itself to essentially just try to apply fixed patterns to each operation. The patterns represent the order operations should appear in. 
For instance a complex MUL operation combines : Left 1 + Right 1 Left 2 + Right 2 with a permute on the nodes consisting of: { Even, Even } + { Odd, Odd } { Even, Odd } + { Odd, Even } By abstracting over these patterns the checking code becomes quite simple. As part of this I was checking the order of the operands which was left in "slp" order. as in, the same order they showed up in during SLP, which means that the accumulator is first. However it looks like I didn't document this. gcc/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * gimple.h (gimple_num_args, gimple_arg): New. * doc/md.texi: Update docs for cfms, cfma. * tree-data-ref.h (same_data_refs): Accept optional offset. * tree-vect-slp-patterns.c (is_linear_load_p): Fix issue with repeating patterns. (vect_normalize_conj_loc): Remove. (is_eq_or_top): Change to take two nodes. (enum _conj_status, compatible_complex_nodes_p, vect_validate_multiplication): New. (class complex_add_pattern, complex_add_pattern::matches, complex_add_pattern::recognize, class complex_mul_pattern, complex_mul_pattern::recognize, class complex_fms_pattern, complex_fms_pattern::recognize,, class complex_fma_pattern, complex_fma_pattern::recognize, class complex_operations_pattern, complex_operations_pattern::recognize, addsub_pattern::recognize): Pass new cache. (complex_fms_pattern::matches, complex_fma_pattern::matches, complex_mul_pattern::matches): Pass new cache and use new validation code. * tree-vect-slp.c (vect_match_slp_patterns_2, vect_match_slp_patterns, vect_analyze_slp): Pass along cache. (compatible_calls_p): Expose. * tree-vectorizer.h (compatible_calls_p, slp_node_hash, slp_compat_nodes_map_t): New. (class vect_pattern): Update signatures include new cache. gcc/testsuite/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * g++.dg/vect/pr99149.cc: xfail for now. * gcc.dg/vect/complex/pr102819-1.c: New test. * gcc.dg/vect/complex/pr102819-2.c: New test. 
* gcc.dg/vect/complex/pr102819-3.c: New test. * gcc.dg/vect/complex/pr102819-4.c: New test. * gcc.dg/vect/complex/pr102819-5.c: New test. * gcc.dg/vect/complex/pr102819-6.c: New test. * gcc.dg/vect/complex/pr102819-7.c: New test. * gcc.dg/vect/complex/pr102819-8.c: New test. * gcc.dg/vect/complex/pr102819-9.c: New test. * gcc.dg/vect/complex/pr103169.c: New test.
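As a rough illustration of the kind of scalar code the complex MUL pattern described above is meant to recognize, here is a minimal C sketch of complex multiplication over interleaved (real, imaginary) arrays. It is not one of the listed testcases; the function name `cmul_interleaved` and the array layout are assumptions made for the example. The comments map each product to the even (real) and odd (imaginary) lanes mentioned in the commit message.

```c
#include <assert.h>

/* Multiplies n complex numbers stored as interleaved floats:
   element i has its real part at index 2*i and imaginary part at 2*i + 1. */
void cmul_interleaved(const float *a, const float *b, float *c, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        /* real: Even*Even minus Odd*Odd */
        c[2 * i] = a[2 * i] * b[2 * i] - a[2 * i + 1] * b[2 * i + 1];
        /* imaginary: Even*Odd plus Odd*Even */
        c[2 * i + 1] = a[2 * i] * b[2 * i + 1] + a[2 * i + 1] * b[2 * i];
    }
}
```

Both output lanes must perform the same shape of computation on matching operands for the combination to be valid, which is the property the added validation is described as checking.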
[Bug tree-optimization/103169] [12 Regression] ICE: verify_ssa failed (error: definition in block 3 does not dominate use in block 4) since r12-4785-ged3de62ac949c92ad41ef6de7cc926fbb2a510ce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103169 --- Comment #8 from CVS Commits --- The releases/gcc-11 branch has been updated by Tamar Christina : https://gcc.gnu.org/g:876e7c7f0fe47bae7c1922e2683ccb5e6e3ec9fe commit r11-9621-g876e7c7f0fe47bae7c1922e2683ccb5e6e3ec9fe Author: Tamar Christina Date: Fri Feb 25 11:59:32 2022 + vect: Simplify and extend the complex numbers validation routines. (GCC-11 Backport) This is a backport of the GCC 12 patch backporting only the correctness part of the fix. This also backports two small helper functions and documentation update on the optabs. The patch boosts the analysis for complex mul,fma and fms in order to ensure that it doesn't create an incorrect output. Essentially it adds an extra verification to check that the two nodes it's going to combine do the same operations on compatible values. The reason it needs to do this is that if one computation differs from the other then with the current implementation we have no way to deal with it since we have to remove the permute. When we can keep the permute around we can probably handle these by unrolling. While implementing this since I have to do the traversal anyway I took advantage of it by simplifying the code a bit. Previously we would determine whether something is a conjugate and then try to figure out which conjugate it is and then try to see if the permutes match what we expect. Now the code that does the traversal will detect this in one go and return to us whether the operation is something that can be combined and whether a conjugate is present. Secondly because it does this I can now simplify the checking code itself to essentially just try to apply fixed patterns to each operation. The patterns represent the order operations should appear in. 
For instance a complex MUL operation combines : Left 1 + Right 1 Left 2 + Right 2 with a permute on the nodes consisting of: { Even, Even } + { Odd, Odd } { Even, Odd } + { Odd, Even } By abstracting over these patterns the checking code becomes quite simple. As part of this I was checking the order of the operands which was left in "slp" order. as in, the same order they showed up in during SLP, which means that the accumulator is first. However it looks like I didn't document this. gcc/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * gimple.h (gimple_num_args, gimple_arg): New. * doc/md.texi: Update docs for cfms, cfma. * tree-data-ref.h (same_data_refs): Accept optional offset. * tree-vect-slp-patterns.c (is_linear_load_p): Fix issue with repeating patterns. (vect_normalize_conj_loc): Remove. (is_eq_or_top): Change to take two nodes. (enum _conj_status, compatible_complex_nodes_p, vect_validate_multiplication): New. (class complex_add_pattern, complex_add_pattern::matches, complex_add_pattern::recognize, class complex_mul_pattern, complex_mul_pattern::recognize, class complex_fms_pattern, complex_fms_pattern::recognize,, class complex_fma_pattern, complex_fma_pattern::recognize, class complex_operations_pattern, complex_operations_pattern::recognize, addsub_pattern::recognize): Pass new cache. (complex_fms_pattern::matches, complex_fma_pattern::matches, complex_mul_pattern::matches): Pass new cache and use new validation code. * tree-vect-slp.c (vect_match_slp_patterns_2, vect_match_slp_patterns, vect_analyze_slp): Pass along cache. (compatible_calls_p): Expose. * tree-vectorizer.h (compatible_calls_p, slp_node_hash, slp_compat_nodes_map_t): New. (class vect_pattern): Update signatures include new cache. gcc/testsuite/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * g++.dg/vect/pr99149.cc: xfail for now. * gcc.dg/vect/complex/pr102819-1.c: New test. * gcc.dg/vect/complex/pr102819-2.c: New test. 
* gcc.dg/vect/complex/pr102819-3.c: New test. * gcc.dg/vect/complex/pr102819-4.c: New test. * gcc.dg/vect/complex/pr102819-5.c: New test. * gcc.dg/vect/complex/pr102819-6.c: New test. * gcc.dg/vect/complex/pr102819-7.c: New test. * gcc.dg/vect/complex/pr102819-8.c: New test. * gcc.dg/vect/complex/pr102819-9.c: New test. * gcc.dg/vect/complex/pr103169.c: New test.
[Bug tree-optimization/102819] [11 Regression] IFN_COMPLEX_MUL matches things that it shouldn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102819 --- Comment #8 from CVS Commits --- The releases/gcc-11 branch has been updated by Tamar Christina : https://gcc.gnu.org/g:6bb338eab3debf7742b1455146dd556b3ce3737c commit r11-9622-g6bb338eab3debf7742b1455146dd556b3ce3737c Author: Tamar Christina Date: Fri Feb 25 12:00:11 2022 + AArch64: use canonical ordering for complex mul, fma and fms After the first patch in the series this updates the optabs to expect the canonical sequence. gcc/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * config/aarch64/aarch64-simd.md (cml4): Use canonical order. * config/aarch64/aarch64-sve.md (cml4): Likewise.
[Bug tree-optimization/103169] [12 Regression] ICE: verify_ssa failed (error: definition in block 3 does not dominate use in block 4) since r12-4785-ged3de62ac949c92ad41ef6de7cc926fbb2a510ce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103169 --- Comment #9 from CVS Commits --- The releases/gcc-11 branch has been updated by Tamar Christina : https://gcc.gnu.org/g:6bb338eab3debf7742b1455146dd556b3ce3737c commit r11-9622-g6bb338eab3debf7742b1455146dd556b3ce3737c Author: Tamar Christina Date: Fri Feb 25 12:00:11 2022 + AArch64: use canonical ordering for complex mul, fma and fms After the first patch in the series this updates the optabs to expect the canonical sequence. gcc/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * config/aarch64/aarch64-simd.md (cml4): Use canonical order. * config/aarch64/aarch64-sve.md (cml4): Likewise.
[Bug tree-optimization/103169] [12 Regression] ICE: verify_ssa failed (error: definition in block 3 does not dominate use in block 4) since r12-4785-ged3de62ac949c92ad41ef6de7cc926fbb2a510ce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103169 --- Comment #10 from CVS Commits --- The releases/gcc-11 branch has been updated by Tamar Christina : https://gcc.gnu.org/g:7d713d56ec32b8c7101066619b4852b797955e24 commit r11-9623-g7d713d56ec32b8c7101066619b4852b797955e24 Author: Tamar Christina Date: Fri Feb 25 12:00:46 2022 + AArch32: use canonical ordering for complex mul, fma and fms After the first patch in the series this updates the optabs to expect the canonical sequence. gcc/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * config/arm/vec-common.md (cml4): Use canonical order.
[Bug tree-optimization/102819] [11 Regression] IFN_COMPLEX_MUL matches things that it shouldn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102819 --- Comment #9 from CVS Commits --- The releases/gcc-11 branch has been updated by Tamar Christina : https://gcc.gnu.org/g:7d713d56ec32b8c7101066619b4852b797955e24 commit r11-9623-g7d713d56ec32b8c7101066619b4852b797955e24 Author: Tamar Christina Date: Fri Feb 25 12:00:46 2022 + AArch32: use canonical ordering for complex mul, fma and fms After the first patch in the series this updates the optabs to expect the canonical sequence. gcc/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * config/arm/vec-common.md (cml4): Use canonical order.
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 Richard Biener changed: What|Removed |Added Keywords||ra --- Comment #2 from Richard Biener --- Samples: 204K of event 'cycles', Event count (approx.): 222925904390 Overhead Samples Command Shared Object Symbol 92.53%189096 cc1 cc1 [.] update_conflict_hard_regno_costs # 0.51% 1043 cc1 cc1 [.] build_object_conflicts
[Bug tree-optimization/102819] [11 Regression] IFN_COMPLEX_MUL matches things that it shouldn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102819 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #10 from Tamar Christina --- fixed in all affected branches.
[Bug tree-optimization/103037] [11/12 Regression] Wrong code with -O2 since r11-6100-gd41b097350d3c5d0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103037 --- Comment #9 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:e25dce501334053239dcc433e4c46ecbddbcb13e commit r12-7389-ge25dce501334053239dcc433e4c46ecbddbcb13e Author: Richard Biener Date: Thu Feb 24 13:04:29 2022 +0100 tree-optimization/103037 - PRE simplifying valueized expressions This fixes a long-standing issue in PRE where we track valueized expressions in our expression sets that we use for PHI translation, code insertion but also feed into match-and-simplify via vn_nary_simplify. But that's not what is expected from vn_nary_simplify or match-and-simplify which assume we are simplifying with operands available at the point of the expression so they can use contextual information on the SSA names like ranges. While the VN side was updated to ensure this with the rewrite to RPO VN, thereby removing all workarounds that nullified such contextual info on all SSA names, the PRE side still suffers from this. The following patch tries to apply minimal surgery at this point and makes PRE track un-valueized expressions in the expression sets but only for the NARY kind (both NAME and CONSTANT do not suffer from this issue), leaving the REFERENCE kind alone. The REFERENCE kind is important when trying to remove the workarounds still in place in compute_avail for code hoisting, but that's a separate issue and we have a working workaround in place. Doing this comes at the cost of duplicating the VN IL on the PRE side for NARY and eventually some extra overhead for translated expressions that is difficult to assess. 2022-02-25 Richard Biener PR tree-optimization/103037 * tree-ssa-sccvn.h (alloc_vn_nary_op_noinit): Declare. (vn_nary_length_from_stmt): Likewise. (init_vn_nary_op_from_stmt): Likewise. (vn_nary_op_compute_hash): Likewise. * tree-ssa-sccvn.cc (alloc_vn_nary_op_noinit): Export. (vn_nary_length_from_stmt): Likewise. (init_vn_nary_op_from_stmt): Likewise. 
(vn_nary_op_compute_hash): Likewise. * tree-ssa-pre.cc (pre_expr_obstack): New obstack. (get_or_alloc_expr_for_nary): Pass in the value-id to use, (re-)compute the hash value and if the expression is not found allocate it from pre_expr_obstack. (phi_translate_1): Do not insert the NARY found in the VN tables but build a PRE expression from the valueized NARY with the value-id we eventually found. (find_or_generate_expression): Assert we have an entry for constant values. (compute_avail): Insert not valueized expressions into EXP_GEN using the value-id from the VN tables. (init_pre): Allocate pre_expr_obstack. (fini_pre): Free pre_expr_obstack. * gcc.dg/torture/pr103037.c: New testcase.
[Bug tree-optimization/103037] [11 Regression] Wrong code with -O2 since r11-6100-gd41b097350d3c5d0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103037 Richard Biener changed: What|Removed |Added Summary|[11/12 Regression] Wrong|[11 Regression] Wrong code |code with -O2 since |with -O2 since |r11-6100-gd41b097350d3c5d0 |r11-6100-gd41b097350d3c5d0 Known to fail|12.0| Known to work||12.0 --- Comment #10 from Richard Biener --- Fixed on trunk so far. Let's watch for fallout.
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment #3 from Martin Liška --- I see 2 regressions: 1) r12-7246-g4963079769c99c40 5.77s -> 35.56s and then: 2) r12-7319-g90d693bdc9d71841
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 --- Comment #4 from Martin Liška --- > > 2) r12-7319-g90d693bdc9d71841 to 57s
[Bug gcov-profile/104685] multiple common of `__gcov_var'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104685 --- Comment #1 from Richard Biener --- I suppose your setup will warn for t.c --- int i; t2.c --- int i; as well. -Wl,--warn-common isn't something I'd recommend, esp. the 'multiple common of ' kind is prone to false positives. It does catch errors like when one of the 'i' above is 'float' instead of 'int'. __gcov_var is a common symbol intentionally, there's not much you can do here.
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 --- Comment #5 from Martin Liška --- Created attachment 52515 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52515&action=edit Partially reduced test-case For the test-case I get: Bisecting latest revisions a9e2ebe839d56416(24 Feb 2022 22:16)(ol...@adacore.com): [took: 16.38 s] result: OK 250f234988b62316(20 Apr 2021 09:51)(stefa...@linux.ibm.com): [took: 1.75 s] result: OK
[Bug gcov-profile/104685] multiple common of `__gcov_var'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104685 --- Comment #2 from Richard Biener --- Using link-time optimization is a more reliable way of detecting real errors here btw: rguenther@ryzen:/tmp> cat t1.c int i; rguenther@ryzen:/tmp> cat t2.c float i; int main() { return i; } rguenther@ryzen:/tmp> gcc-7 t1.c t2.c -flto t2.c:1:7: warning: type of 'i' does not match original declaration [-Wlto-type-mismatch] float i; ^ t1.c:1:5: note: type 'int' should match type 'float' int i; ^ t1.c:1:5: note: 'i' was previously declared here t1.c:1:5: note: code may be misoptimized unless -fno-strict-aliasing is used
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 --- Comment #6 from Richard Biener --- Both revisions affect vectorizer cost modeling only. With -fno-vect-cost-model it compiles faster for me but still a slow 30s and 91% in RA.
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 --- Comment #7 from Martin Liška --- (In reply to Richard Biener from comment #6) > Both revisions affect vectorizer cost modeling only. With > -fno-vect-cost-model it compiles faster for me but still a slow 30s and 91% > in RA. Here are the numbers with -fno-vect-cost-model: Bisecting latest revisions a9e2ebe839d56416(24 Feb 2022 22:16)(ol...@adacore.com): [took: 36.06 s] result: OK 250f234988b62316(20 Apr 2021 09:51)(stefa...@linux.ibm.com): [took: 18.35 s] result: OK I'm going to find out where the change happened.
[Bug libbacktrace/104463] Split debug info not loaded from .debug/ if .gnu_debuglink points to binary itself
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104463 Mark Wielaard changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2022-02-25 --- Comment #1 from Mark Wielaard --- Note. The term "split dwarf" is often associated with the -gsplit-dwarf option, which uses .dwo sections (and files) to split DWARF debuginfo. This bug isn't about that. I can replicate the issue, but haven't fully traced why it happens. It seems libbacktrace gets confused about where /proc/self/exe points to and tries to open /proc/self/bug, /proc/self/.debug/bug and /usr/lib/debug//proc/self/bug
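The three paths listed in the comment above are what the conventional .gnu_debuglink search order yields when the executable path is taken literally as /proc/self/exe instead of being resolved first. A minimal sketch of that search order follows. It is not libbacktrace's actual code; the function name debuglink_candidates, the DEBUG_ROOT macro and the 256-byte buffers are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Global debug directory; hypothetical macro name, conventional location. */
#define DEBUG_ROOT "/usr/lib/debug/"

/* Build the three conventional candidate paths for a .gnu_debuglink name:
     <dir>/<name>, <dir>/.debug/<name>, DEBUG_ROOT<dir>/<name>
   where <dir> is the directory part of exe_path, including the trailing '/'.
   Returns 0 on success, -1 if exe_path has no '/' or a path is truncated.  */
static int debuglink_candidates(const char *exe_path, const char *link_name,
                                char out[3][256])
{
    const char *slash = strrchr(exe_path, '/');
    if (slash == NULL)
        return -1;

    int dir_len = (int) (slash - exe_path) + 1;  /* keep the trailing '/' */

    int n0 = snprintf(out[0], 256, "%.*s%s", dir_len, exe_path, link_name);
    int n1 = snprintf(out[1], 256, "%.*s.debug/%s", dir_len, exe_path, link_name);
    int n2 = snprintf(out[2], 256, "%s%.*s%s", DEBUG_ROOT, dir_len, exe_path,
                      link_name);

    if (n0 < 0 || n0 >= 256 || n1 < 0 || n1 >= 256 || n2 < 0 || n2 >= 256)
        return -1;
    return 0;
}
```

Calling debuglink_candidates ("/proc/self/exe", "bug", out) produces exactly the three paths quoted in the comment, which suggests the directory is derived from the unresolved symlink rather than from the real location of the binary.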
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 Martin Liška changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #8 from Martin Liška --- (In reply to Martin Liška from comment #7) > (In reply to Richard Biener from comment #6) > > Both revisions affect vectorizer cost modeling only. With > > -fno-vect-cost-model it compiles faster for me but still a slow 30s and 91% > > in RA. > > There are numbers with -fno-vect-cost-model: > > Bisecting latest revisions > a9e2ebe839d56416(24 Feb 2022 22:16)(ol...@adacore.com): [took: 36.06 s] > result: OK > 250f234988b62316(20 Apr 2021 09:51)(stefa...@linux.ibm.com): [took: 18.35 > s] result: OK > > I'm going to find out where the change happensed. Which started with r12-2463-ga6291d88d5b6c17d.
[Bug fortran/104684] [9/10/11/12 Regression] ICE: 'verify_gimple' failed (Error: non-trivial conversion in 'component_ref') since r7-5771-gde91486c745d5ff6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104684 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |NEW CC||marxin at gcc dot gnu.org, ||vehre at gcc dot gnu.org Ever confirmed|0 |1 Summary|ICE: 'verify_gimple' failed |[9/10/11/12 Regression] |(Error: non-trivial |ICE: 'verify_gimple' failed |conversion in |(Error: non-trivial |'component_ref')|conversion in ||'component_ref') since ||r7-5771-gde91486c745d5ff6 Last reconfirmed||2022-02-25 --- Comment #1 from Martin Liška --- Started with r7-5771-gde91486c745d5ff6.
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 Richard Biener changed: What|Removed |Added CC||vmakarov at gcc dot gnu.org --- Comment #9 from Richard Biener --- (In reply to Martin Liška from comment #8) > (In reply to Martin Liška from comment #7) > > (In reply to Richard Biener from comment #6) > > > Both revisions affect vectorizer cost modeling only. With > > > -fno-vect-cost-model it compiles faster for me but still a slow 30s and > > > 91% > > > in RA. > > > > There are numbers with -fno-vect-cost-model: > > > > Bisecting latest revisions > > a9e2ebe839d56416(24 Feb 2022 22:16)(ol...@adacore.com): [took: 36.06 s] > > result: OK > > 250f234988b62316(20 Apr 2021 09:51)(stefa...@linux.ibm.com): [took: 18.35 > > s] result: OK > > > > I'm going to find out where the change happensed. > > Which started with r12-2463-ga6291d88d5b6c17d. I think you want to keep r12-7293 in the tree - the above introduced a huge regression that was already fixed. So this testcase was probably always slow to compile and spending all time in RA? When using callgrind on the reduced testcase and a -O0 compiler I see most time spent in ira_object_conflict_iter_cond, in particular the loop /* Skip bits that are zero. */ for (; (word & 1) == 0; word >>= 1) bit_num++; and the load obj = ira_object_id_map[bit_num + i->base_conflict_id]; Maybe we can use ctz_hwi here (hopefully we optimize this loop with -O2). This function is mostly called from allocnos_conflict_p, which is called from update_conflict_hard_regno_costs. In particular we have 677544 calls to get_next_update_cost () and 1824339 calls to allocnos_conflict_p () there from just 34480 calls to update_conflict_hard_regno_costs. queue_update_cost is called 1365947 times. Maybe we can improve things or at least cut things off with decreasing precision somehow? Vlad?
[Bug c/85487] Support '#pragma region' and '#pragma endregion' to allow code folding with Visual Studio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85487 rsandifo at gcc dot gnu.org changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #12 from rsandifo at gcc dot gnu.org --- (In reply to Jonathan Wakely from comment #9) > (In reply to Jakub Jelinek from comment #7) > > I would say that is a terrible design... > > Yes, I completely agree, but I don't see why GCC should be in the business > of diagnosing other people's junk :-) > > Maybe Visual Studio's editor and VScode do have checks, just not the VC++ > compiler. And if so, then that's even more reason that we don't need GCC to > do its own checking. Agreed. And if people with strict linting requirements want a warning about pragmas that are recognised but have no effect on the compiler, we could still provide an option to do that (but it shouldn't be in -Wall or even -Wextra). That shouldn't be a requirement for this PR though, unless anyone can show that someone somewhere really does want these pragmas to generate a warning. Jeff said at the end of the thread that he wouldn't mind if someone else approves it, so it's probably worth posting again. The patch LGTM FWIW: only (very) minor comment is that the unused argument name in handle_pragma_region can be dropped. I think the patch would need to wait for GCC 13 now though.
[Bug target/104144] [12 Regression] build fails due to: Error: unknown architecture `armv9-a'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104144 --- Comment #7 from Martin Liška --- (In reply to Przemyslaw Wirkus from comment #6) > Yes, I will update docs, cheers! Any update on this, please?
[Bug target/104637] [9/10/11/12 Regression] ICE: maximum number of LRA assignment passes is achieved (30) with -Og -fno-forward-propagate -mavx since r9-5221-gd8fcab689435a29d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104637 Jakub Jelinek changed: What|Removed |Added Target Milestone|--- |9.5 CC||vmakarov at gcc dot gnu.org Priority|P3 |P2 Summary|ICE: maximum number of LRA |[9/10/11/12 Regression] |assignment passes is|ICE: maximum number of LRA |achieved (30) with -Og |assignment passes is |-fno-forward-propagate |achieved (30) with -Og |-mavx since |-fno-forward-propagate |r9-5221-gd8fcab689435a29d |-mavx since ||r9-5221-gd8fcab689435a29d
[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686 --- Comment #10 from Richard Biener --- diff --git a/gcc/ira-int.h b/gcc/ira-int.h index 957604b22e9..7465af72e98 100644 --- a/gcc/ira-int.h +++ b/gcc/ira-int.h @@ -1379,8 +1379,9 @@ ira_object_conflict_iter_cond (ira_object_conflict_iterator *i, } /* Skip bits that are zero. */ - for (; (word & 1) == 0; word >>= 1) - bit_num++; + int off = ctz_hwi (word); + bit_num += off; + word >>= off; obj = ira_object_id_map[bit_num + i->base_conflict_id]; i->bit_num = bit_num + 1; improves compile-time from 31s to 24s for the full preprocessed source with -fno-vect-cost-model.
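The change in the diff above replaces a bit-at-a-time loop with a single count-trailing-zeros operation. The sketch below shows the equivalence in isolation. It uses the GCC/Clang builtin __builtin_ctzll as a stand-in for GCC's internal ctz_hwi helper, and the function names skip_zero_bits_loop and skip_zero_bits_ctz are made up for illustration. Both versions require a non-zero argument.

```c
#include <stdint.h>

/* Reference version: shift right one bit at a time until bit 0 is set.
   The caller must guarantee word != 0, otherwise the loop never ends.  */
static unsigned skip_zero_bits_loop(uint64_t word)
{
    unsigned n = 0;
    for (; (word & 1) == 0; word >>= 1)
        n++;
    return n;
}

/* Same result in one step via count-trailing-zeros.
   __builtin_ctzll is undefined for word == 0, matching the
   precondition of the loop version.  */
static unsigned skip_zero_bits_ctz(uint64_t word)
{
    return (unsigned) __builtin_ctzll(word);
}
```

For a sparse conflict word the loop costs one iteration per zero bit, while the ctz version is a single instruction on most targets, which is consistent with the compile-time improvement reported in the comment.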
[Bug target/104637] [9/10/11/12 Regression] ICE: maximum number of LRA assignment passes is achieved (30) with -Og -fno-forward-propagate -mavx since r9-5221-gd8fcab689435a29d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104637 --- Comment #2 from Jakub Jelinek --- If I change the testcase to following (so that it doesn't rely on __builtin_convertvector), it started ICEing with r0-122162-gb7aa4e9afcd3da4f09d6f982a663ea2094b1f2cf typedef short __attribute__((__vector_size__ (64))) U; typedef unsigned long long __attribute__((__vector_size__ (32))) V; typedef long double __attribute__((__vector_size__ (64))) F; int i; U u; F f; void foo (char a, char b, _Complex char c, V v) { u = (U) { u[0] / 0, u[1] / 0, u[2] / 0, u[3] / 0, u[4] / 0, u[5] / 0, u[6] / 0, u[7] / 0, u[8] / 0, u[0] / 0, u[9] / 0, u[10] / 0, u[11] / 0, u[12] / 0, u[13] / 0, u[14] / 0, u[15] / 0, u[16] / 0, u[17] / 0, u[18] / 0, u[19] / 0, u[20] / 0, u[21] / 0, u[22] / 0, u[23] / 0, u[24] / 0, u[25] / 0, u[26] / 0, u[27] / 0, u[28] / 0, u[29] / 0, u[30] / 0, u[31] / 0 }; c += i; f = (F) { v[0], v[1], v[2], v[3] }; i = (char) (__imag__ c + i); } In any case, I don't see anything wrong on the GIMPLE side and it isn't clear on reloading which insn it is ICEing.
[Bug testsuite/104687] New: gcc.dg/lto/20090717_[01].c is an invalid execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104687 Bug ID: 104687 Summary: gcc.dg/lto/20090717_[01].c is an invalid execution test Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: acoplan at gcc dot gnu.org Target Milestone: --- 20090717_0.c has: struct variable { const char *string; }; struct variable table[] = { }; and 20090717_1.c has: struct variable { const char *string; }; extern struct variable table[]; int main(int argc, char *argv[]) { struct variable *p; for(p = table; p->string; p++) ; return 0; } but the access of p->string dereferences a pointer to an empty array, so it accesses out of bounds, and is therefore invalid. Therefore the test should not be an execution test in the testsuite. Or am I missing something?
[Bug testsuite/104687] gcc.dg/lto/20090717_[01].c is an invalid execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104687 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2022-02-25 CC||marxin at gcc dot gnu.org Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org --- Comment #1 from Martin Liška --- You are correct of course: gcc pr104687.c pr104687-2.c -fsanitize=address -g && ./a.out = ==3100==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00404100 at pc 0x004011e9 bp 0x7fffde10 sp 0x7fffde08 READ of size 8 at 0x00404100 thread T0 #0 0x4011e8 in main /home/marxin/Programming/testcases/pr104687-2.c:8 #1 0x773ca62f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #2 0x773ca6ef in __libc_start_main_impl ../csu/libc-start.c:392 #3 0x4010a4 in _start (/home/marxin/Programming/testcases/a.out+0x4010a4) 0x00404100 is located 0 bytes to the right of global variable 'table' defined in 'pr104687.c:4:17' (0x404100) of size 0 'table' is ascii string '' SUMMARY: AddressSanitizer: global-buffer-overflow /home/marxin/Programming/testcases/pr104687-2.c:8 in main Shadow bytes around the buggy address: 0x800787d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x800787e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x800787f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x80078800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x80078810: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00 =>0x80078820:[f9]f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00 0x80078830: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x80078840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x80078850: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x80078860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x80078870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Let me fix that.
[Bug testsuite/104687] gcc.dg/lto/20090717_[01].c is an invalid execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104687 --- Comment #2 from CVS Commits --- The master branch has been updated by Martin Liska : https://gcc.gnu.org/g:219a8826cd5d7ee91165491034795b8876811817 commit r12-7391-g219a8826cd5d7ee91165491034795b8876811817 Author: Martin Liska Date: Fri Feb 25 15:08:44 2022 +0100 testsuite: Fix ASAN error [PR104687] PR testsuite/104687 gcc/testsuite/ChangeLog: * gcc.dg/lto/20090717_0.c: Fix asan error.
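The loop in 20090717_1.c walks the table until it finds an entry whose string is null, which is only valid if the table actually contains such an entry. The sketch below shows that idea with an explicit sentinel. It is an illustration of the pattern, not necessarily what the commit above changed; the entries "first" and "second" and the helper count_entries are made up.

```c
#include <stddef.h>

struct variable {
    const char *string;
};

/* A non-empty table whose last entry is an all-NULL sentinel, so a
   "walk until p->string is NULL" loop always stays in bounds.  */
static struct variable table[] = {
    { "first" },
    { "second" },
    { NULL },   /* sentinel */
};

/* Count the entries that precede the sentinel.  */
static size_t count_entries(const struct variable *p)
{
    size_t n = 0;
    for (; p->string != NULL; p++)
        n++;
    return n;
}
```

With an empty initializer there is no sentinel to stop at, so the first read of p->string is already past the end of the object, which is what AddressSanitizer reports above.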
[Bug testsuite/104687] gcc.dg/lto/20090717_[01].c is an invalid execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104687 Martin Liška changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Martin Liška --- Fixed.
[Bug target/104688] New: gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 Bug ID: 104688 Summary: gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: xry111 at mengyan1223 dot wang Target Milestone: --- In Dec 2021, Intel updated the SDM and added the following content: > Processors that enumerate support for Intel® AVX (by setting the feature flag > CPUID.01H:ECX.AVX[bit 28]) guarantee that the 16-byte memory operations > performed by the following instructions will always be carried out atomically: > - MOVAPD, MOVAPS, and MOVDQA. > - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128. > - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and > k0 (masking disabled). > > (Note that these instructions require the linear addresses of their memory > operands to be 16-byte aligned.) (see Change 13, https://cdrdv2.intel.com/v1/dl/getContent/671294) So we can use SSE for Intel CPUs with AVX, instead of a loop with LOCK CMPXCHG16B. AMD has no such guarantee (at least for now), so we still need LOCK CMPXCHG16B on old Intel CPUs and (old or new) AMD CPUs.
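The SDM text quoted above ties the guarantee to the feature flag CPUID.01H:ECX.AVX[bit 28]. A minimal sketch of testing that bit follows, assuming GCC's or Clang's <cpuid.h> on x86; the names CPUID_1_ECX_AVX, ecx_reports_avx and cpu_reports_avx are hypothetical. It deliberately omits the vendor check (the guarantee is stated for Intel only), which a real dispatcher would also need.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID leaf 1, ECX bit 28: AVX feature flag.  */
#define CPUID_1_ECX_AVX (UINT32_C(1) << 28)

/* Pure helper: does this ECX value (from CPUID leaf 1) enumerate AVX?  */
static int ecx_reports_avx(uint32_t ecx)
{
    return (ecx & CPUID_1_ECX_AVX) != 0;
}

/* Query the running CPU.  Returns 0 on non-x86 targets or when
   CPUID leaf 1 is not available.  */
static int cpu_reports_avx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx_reports_avx(ecx);
#else
    return 0;
#endif
}
```

A runtime-selected 16-byte load could use a result like this, together with a vendor check, to choose between an aligned SSE load and the LOCK CMPXCHG16B fallback.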
[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 Jakub Jelinek changed: What|Removed |Added CC||hjl.tools at gmail dot com, ||jakub at gcc dot gnu.org, ||rth at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- So, shall we just handle it in libatomic by adding yet another ifunc selected version for __atomic_load_16 ? Or do we want to expand it back inline if some new -m* option selected by default for -march= of Intel made CPUs with AVX is set?
[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 Xi Ruoyao changed: What|Removed |Added URL||https://gcc.gnu.org/piperma ||il/gcc-help/2022-February/1 ||41279.html --- Comment #2 from Xi Ruoyao --- See option 4 of https://gcc.gnu.org/legacy-ml/gcc/2017-01/msg00167.html.
[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #3 from Florian Weimer --- I feel we should give AMD some time to comment here. If they can commit supporting it like Intel did, that alters the design space somewhat.
[Bug target/104683] -march=haswell generates invalid instructions on Celeron Haswell CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104683 --- Comment #4 from Preston Crow --- I can see how it's documented, but many people reading that won't realize that the instruction sets don't match all Haswell chips. Simply adding a line below that to say "Note: Some lower-end Haswell processors do not include all of the above instruction sets." And do the same for the others. Then at least anyone reading the documentation will immediately be aware of the potential problem and hopefully avoid using them in inappropriate situations.
[Bug c/85487] Support '#pragma region' and '#pragma endregion' to allow code folding with Visual Studio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85487 --- Comment #13 from Jonathan Wakely --- (In reply to rsand...@gcc.gnu.org from comment #12) > Jeff said at the end of the thread that he wouldn't mind > if someone else approves it, so it's probably worth posting > again. OK, will do. I think Jeff's desire for a framework to ignore arbitrary pragmas might be nice, but nobody's going to do that, and this small improvement has a working patch. We shouldn't let the perfect be the enemy of the good. > The patch LGTM FWIW: only (very) minor comment is that > the unused argument name in handle_pragma_region can be dropped. Yeah, that was consistent with the rest of the file, but I already changed the rest in r12-7282-g73a118c209fcbb so I'd update my patch in the same way. I'll also update it to ignore the very similar Xcode pragmas described in PR 61593. > I think the patch would need to wait for GCC 13 now though. Indeed.
[Bug c/85487] Support '#pragma region' and '#pragma endregion' to allow code folding with Visual Studio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85487 --- Comment #14 from Jonathan Wakely --- (In reply to Florian Weimer from comment #11) > Clang does not appear to treat this pragma as a statement. Is this also the > MSVC behavior? Yes, I think so. I'll check that my patch is consistent.
[Bug target/104144] [12 Regression] build fails due to: Error: unknown architecture `armv9-a'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104144 --- Comment #8 from Przemyslaw Wirkus --- > Subject: [Bug target/104144] [12 Regression] build fails due to: Error: > unknown > architecture `armv9-a' > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104144 > > --- Comment #7 from Martin Liška --- (In reply to > Przemyslaw Wirkus from comment #6) > > Yes, I will update docs, cheers! > > Any update on this, please? Apologies, Was recently ill (it's that famous thing) and still digging through TODOs. Please give me a day or two to sort this out. Kind regards, Przemyslaw
[Bug target/104689] New: aarch64: libgcc: DW_CFA_val_expression is not supported for RA_SIGN_STATE register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104689 Bug ID: 104689 Summary: aarch64: libgcc: DW_CFA_val_expression is not supported for RA_SIGN_STATE register Product: gcc Version: 10.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: nsz at gcc dot gnu.org Target Milestone: --- gcc emits DW_CFA_AARCH64_negate_ra_state (DW_CFA_window_save) for pac-ret but it's valid to set the RA_SIGN_STATE pseudo register via other dwarf instructions. currently libgcc unwinder can crash if DW_CFA_val_expression is used to set the register value directly. (reportedly the cranelift compiler can generate such code.)
[Bug sanitizer/104690] New: UBSan does not detect undefined behavior on function without a specified return value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104690 Bug ID: 104690 Summary: UBSan does not detect undefined behavior on function without a specified return value Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: vincent-gcc at vinc17 dot net CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org Target Milestone: --- Consider the following C code: #include <stdio.h> static int f (void) { } int main (void) { printf ("%d\n", f ()); return 0; } According to ISO C17 6.9.1p12, the behavior is undefined: "If the } that terminates a function is reached, and the value of the function call is used by the caller, the behavior is undefined." I don't know what "used by the caller" means exactly, but in the above code, the value is clearly used, since it is printed. However, when one compiles it with "gcc -std=c17 -fsanitize=undefined" (with or without -O), running the code does not trigger an error. (Well, I hope that UBSan doesn't think that the value isn't necessarily used because the printf may fail before printing the value.) Tested with gcc-12 (Debian 12-20220222-1) 12.0.1 20220222 (experimental) [master r12-7325-g2f59f067610] and some earlier versions. Note: with g++, one gets a "runtime error: execution reached the end of a value-returning function without returning a value" as expected.
[Bug sanitizer/104690] UBSan does not detect undefined behavior on function without a specified return value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104690 --- Comment #1 from Jakub Jelinek --- Unlike C++ where ubsan does detect it because the language makes it easy to do so, I really don't see how ubsan could detect this. It requires that the callee tells the caller that it reached the end of a non-void function without return, and that the caller checks if the value is actually used there. That would effectively require an ABI change; so far -fsanitize=undefined is an opt-in instrumentation and allows free combination of callers built with sanitization vs. callees not built with it or vice versa. -fsanitize=undefined certainly doesn't claim to catch all kinds of undefined behavior, only those that it documents...
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #25 from H.J. Lu --- (In reply to cuilili from comment #24) > (In reply to cuilili from comment #23) > > (In reply to Richard Biener from comment #17) > > > I do wonder though how CLX is fine with such access pattern ;) (did you > > > test > > > with just -O2?) > > > Sorry, correct w/ and w/t order. > > Actually CLX also has STLF issues, there is 13.7% regression when comparing > "gcc trunk + -O2" w/t and w/ "-fno-tree-vectorize" Can this be mitigated by removing redundant load and store?
[Bug target/100085] Bad code for union transfer from __float128 to vector types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #21 from Steven Munroe --- Yes I was told by Peter Bergner that the fix from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085#c15 had been backported to AT15.0-1. But when I ran this test with AT15.0-1 I saw: : 0: 20 00 20 39 li r9,32 4: d0 ff 41 39 addi r10,r1,-48 8: 57 12 42 f0 xxswapd vs34,vs34 c: 99 4f 4a 7c stxvd2x vs34,r10,r9 10: ce 48 4a 7c lvx v2,r10,r9 14: 20 00 80 4e blr 0030 : 30: 20 00 20 39 li r9,32 34: d0 ff 41 39 addi r10,r1,-48 38: 57 12 42 f0 xxswapd vs34,vs34 3c: 99 4f 4a 7c stxvd2x vs34,r10,r9 40: ce 48 4a 7c lvx v2,r10,r9 44: 20 00 80 4e blr 0060 : 60: 20 00 20 39 li r9,32 64: d0 ff 41 39 addi r10,r1,-48 68: 57 12 42 f0 xxswapd vs34,vs34 6c: 99 4f 4a 7c stxvd2x vs34,r10,r9 70: 99 4e 4a 7c lxvd2x vs34,r10,r9 74: 57 12 42 f0 xxswapd vs34,vs34 78: 20 00 80 4e blr 0090 : 90: 57 12 42 f0 xxswapd vs34,vs34 94: 20 00 40 39 li r10,32 98: d0 ff 01 39 addi r8,r1,-48 9c: f0 ff 21 39 addi r9,r1,-16 a0: 99 57 48 7c stxvd2x vs34,r8,r10 a4: 00 00 69 e8 ld r3,0(r9) a8: 08 00 89 e8 ld r4,8(r9) ac: 20 00 80 4e blr So either the patch for AT15.0-1 is not applied correctly or is non-functional because of some difference between GCC11/GCC12. Or it regressed because of some other change/patch. In my experience this part of GCC is fragile (based on the long/sad history of IBM long double). So this needs to be monitored with each new update.
[Bug target/104681] [9/10/11/12 Regression] ppc64le -mabi=ieeelongdouble ICE since r9-6460
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104681 --- Comment #2 from Segher Boessenkool --- Could you just change the insn condition to test if at least one of the operands is a reg?
[Bug target/104681] [9/10/11/12 Regression] ppc64le -mabi=ieeelongdouble ICE since r9-6460
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104681 --- Comment #3 from Jakub Jelinek --- No. /* The movmisalign pattern cannot fail, else the assignment would silently be omitted. */
[Bug c/93432] variable is used uninitialized, but gcc shows no warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93432 Vincent Lefèvre changed: What|Removed |Added CC||vincent-gcc at vinc17 dot net --- Comment #6 from Vincent Lefèvre --- (In reply to Manuel López-Ibáñez from comment #4) > It warns in gcc 10.1. It may good idea to add this one as a testcase, since > it seems it got fixed without noticing. Has this really been fixed, or does it work now just by chance? This looks like PR18501, where the concerned variable is initialized only at one place in the loop. Here, "z = 1" is followed by "z=z+1", so that this is equivalent to "z = 2". But if in the code, I do this change, the warning disappears (tested with gcc-12 (Debian 12-20220222-1) 12.0.1 20220222 (experimental) [master r12-7325-g2f59f067610] and -O1, -O2, -O3).
[Bug c++/104691] New: SFINAE does not disable static_assert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104691 Bug ID: 104691 Summary: SFINAE does not disable static_assert Product: gcc Version: og10 (devel/omp/gcc-10) Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: github at quantconsulting dot com Target Milestone: --- Created attachment 52516 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52516&action=edit File generated by -save-temps Description: "Substitution failure is not an error" (SFINAE) correctly disables a second instantiation of a templated function and compilation continues to completion -- unless the implementation to be disabled by SFINAE contains a failing static_assert, in which case compilation reports the static_assert failure as an error and halts. Expected behavior: Even when the failing static_assert is present, compilation should continue to completion because there is substitution failure -- which is *earlier* in that function instantiation, no less -- and thus the static_assert failure should be ignored. 
gcc -v: COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 10.3.0-1ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-10-S4I5Pr/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-S4I5Pr/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-mutex Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1~20.04) Command line that triggers the bug: g++ -save-temps -o test3 test3.cc Error message: test3.cc: In function ‘std::enable_if_t::value, TContainer> MakeFilled(const typename std::remove_reference::type&)’: test3.cc:23:17: error: static assertion failed: Substitution failure so this should never be checked! 23 | static_assert(false, "Substitution failure so this should never be checked!"); | ^
[Bug c++/104691] SFINAE does not disable static_assert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104691 --- Comment #1 from Leengit --- Closely related bug: although not demonstrated in the supplied code, note that compilation should complete successfully even if the static_assert fails to compile because its first argument is not `constexpr` given these template arguments. Whether because of the substitution error for the first argument of the static_assert or the substitution error earlier in the instantiation -- one or both of these should cause the failure to be `constexpr` to be ignored.
[Bug sanitizer/104690] UBSan does not detect undefined behavior on function without a specified return value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104690 --- Comment #2 from Vincent Lefèvre --- (In reply to Jakub Jelinek from comment #1) > It requires that the callee tells the caller that it reached end of non-void > function without return and the callee checks if the value is actually used > there. Note that the rule makes sense only within the same translation unit (otherwise, this is probably out of scope of the standard, since functions may be written in different languages). So I think that part of the check between the caller and the callee can be done at compile time.
[Bug sanitizer/104690] UBSan does not detect undefined behavior on function without a specified return value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104690 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #4 from Jakub Jelinek --- Created attachment 52517 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52517&action=edit gcc12-pr104688.patch Untested patch to handle it so far on the libatomic side only. Not sure about what exactly to use for 16-byte __atomic_store_n, for 8-byte we use a store for relaxed and xchg for seq_cst (haven't checked other models), we don't have any xchg, so I'm using vmovdqa + mfence, but am not sure if that is fastest.
[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel CPUs with AVX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #5 from Jakub Jelinek --- Of course, if AMD confirms the same, we could just revert the __libat_feat1_init change.
[Bug target/104637] [9/10/11/12 Regression] ICE: maximum number of LRA assignment passes is achieved (30) with -Og -fno-forward-propagate -mavx since r9-5221-gd8fcab689435a29d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104637 --- Comment #3 from Vladimir Makarov ---
(In reply to Jakub Jelinek from comment #2)
> If I change the testcase to following (so that it doesn't rely on
> __builtin_convertvector), it started ICEing with
> r0-122162-gb7aa4e9afcd3da4f09d6f982a663ea2094b1f2cf
> typedef short __attribute__((__vector_size__ (64))) U;
> typedef unsigned long long __attribute__((__vector_size__ (32))) V;
> typedef long double __attribute__((__vector_size__ (64))) F;
>
> int i;
> U u;
> F f;
>
> void
> foo (char a, char b, _Complex char c, V v)
> {
>   u = (U) { u[0] / 0, u[1] / 0, u[2] / 0, u[3] / 0, u[4] / 0, u[5] / 0, u[6] / 0, u[7] / 0,
>             u[8] / 0, u[0] / 0, u[9] / 0, u[10] / 0, u[11] / 0, u[12] / 0, u[13] / 0, u[14] / 0, u[15] / 0,
>             u[16] / 0, u[17] / 0, u[18] / 0, u[19] / 0, u[20] / 0, u[21] / 0, u[22] / 0, u[23] / 0,
>             u[24] / 0, u[25] / 0, u[26] / 0, u[27] / 0, u[28] / 0, u[29] / 0, u[30] / 0, u[31] / 0 };
>   c += i;
>   f = (F) { v[0], v[1], v[2], v[3] };
>   i = (char) (__imag__ c + i);
> }
>
> In any case, I don't see anything wrong on the GIMPLE side and it isn't
> clear on reloading which insn it is ICEing.

It is a pitfall of the LRA hard reg split subpass. It is a small subpass used as the last resort for LRA when it cannot assign a hard reg to a reload pseudo in other ways (e.g. by spilling non-reload pseudos). For simplicity the subpass works on a one-split basis (as each split changes pseudo live range info). To solve the problem the subpass should make as many splits as possible. This requires checking for overlapping hard reg splits. In other words, the subpass should be considerably modified. I hope to commit the patch next week.
[Bug c++/104691] SFINAE does not disable static_assert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104691 --- Comment #2 from Andrew Pinski --- The unincluded source: #include #include #include template std::enable_if_t MakeFilled(const typename std::remove_reference::type & value) { TContainer result{}; std::fill(result.begin(), result.end(), value); return result; } template std::enable_if_t::value, TContainer> MakeFilled(const typename std::remove_reference::type & value) { static_assert(false, "Substitution failure so this should never be checked!"); } int main() { using ArrayWithPositiveSize = std::array; ArrayWithPositiveSize a = MakeFilled(8); std::cout << "a[" << a.size() - 1 << "] = " << a[a.size() - 1] << std::endl; } CUT clang also causes the assert to happen. I have not looked further into why though.
[Bug target/104681] [9/10/11/12 Regression] ppc64le -mabi=ieeelongdouble ICE since r9-6460
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104681 --- Comment #4 from Segher Boessenkool --- We also do the same in define_insn bodies, with a force_reg if needed. But we do indirect via rs6000_emit_move elsewhere, so let's do that here as well; it isn't a great idea, but consistency wins, certainly in stage 4. The patch is okay for trunk. Thank you! Any backports as well, if you want.
[Bug c++/104691] SFINAE does not disable static_assert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104691 Jonathan Wakely changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #3 from Jonathan Wakely --- Not a bug, you need to make your static_assert condition type dependent or value dependent, static_assert(false) will always fire.
[Bug c++/104691] SFINAE does not disable static_assert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104691 --- Comment #4 from Jonathan Wakely ---
Reduced:

template<typename T>
typename T::foo f(T) { }

template<typename T>
typename T::bar f(T) { static_assert(false, ""); }

struct X { using foo = void; };

int main()
{
  X x;
  f(x);
}

All compilers reject this code. Something as simple as static_assert(sizeof(T) != 0, ""); can be used to make it dependent, and then it will only fail when the template is instantiated.
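As a minimal sketch of the suggestion in comment #4 (not the reporter's code): the two overloads below differ only in which nested type they require, and the static_assert condition depends on T, so it is checked when the template is instantiated rather than when it is parsed. The types HasFoo and HasBar and the return values are hypothetical names chosen for illustration.

```cpp
// Hypothetical types, introduced only for this illustration.
struct HasFoo { using foo = int; };
struct HasBar { using bar = int; };

// Viable only when T::foo names a type; otherwise substitution fails
// and the overload is silently removed from the candidate set.
template <typename T>
typename T::foo f(T) { return 1; }

// Viable only when T::bar names a type. The condition depends on T,
// so it is evaluated on instantiation, not at template definition time.
template <typename T>
typename T::bar f(T) {
    static_assert(sizeof(T) != 0, "checked only when instantiated");
    return 2;
}

// Small helpers so the overload choice can be observed by a caller.
int call_with_foo() { return f(HasFoo{}); }
int call_with_bar() { return f(HasBar{}); }
```

Calling call_with_foo() selects the first overload and call_with_bar() the second; neither call trips the assertion, because its condition holds for every complete type.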
[Bug c/104692] New: Constant data at fixed address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104692 Bug ID: 104692 Summary: Constant data at fixed address Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: henrique.coser at tex dot com.br Target Milestone: --- Hello, I need some help. I have been trying to solve a problem for weeks. I have embedded software that is a boot loader. It puts the boot loader version at a specific address. My memory starts at 0x40 with 0x1400 size. My constant version string value must be placed at 0x401000 with a length of 8 bytes. If I place this const value into a section like this: const unsigned char Version[8] __attribute__ ((section (".bootversion"))) = "V1.0.1a"; I get this error: section .bootversion LMA [00401000,00401007] overlaps section .text LMA [0040,00401013] collect2.exe(0,0): error: ld returned 1 exit status I have already tried to split the flash memory using a linker script, but it did not work. I would like to find something like an "automatic" split. For example, this code was compiled using ARM Keil. With ARM Keil I have an attribute that does all the magic: const unsigned char Version[8] __attribute__((at(0x0401000))) = "V1.0.1a"; I don't know if it is possible to have something as practical as the ARM Keil attribute in GCC. I really need to make this work. If this is not the best channel to ask, please, could you recommend one? Thank you very much!
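For reference, a rough sketch of the GCC side of this setup, under stated assumptions: the `section` and `used` attributes are existing GCC variable attributes, but the helper version_matches is hypothetical, and the linker-script line in the comment is only one possible way to pin the section. It does not by itself keep .text from overlapping that address, which is the part the report is about.

```cpp
#include <cstring>

// Place the version string in its own section. `used` keeps the object in
// the output even when nothing in this translation unit refers to it.
const unsigned char Version[8]
    __attribute__((section(".bootversion"), used)) = "V1.0.1a";

// The address itself is chosen by the linker, not the compiler. With GNU ld
// an output section can be given a fixed start in the SECTIONS command, e.g.
//   .bootversion 0x00401000 : { KEEP(*(.bootversion)) }
// (assumption: the rest of the script must still arrange for .text to end
// before, or resume after, that address).

// Hypothetical helper: compare all 8 bytes, including the terminating NUL.
bool version_matches(const char *expected) {
    return std::memcmp(Version, expected, sizeof Version) == 0;
}
```

Without a linker script the section is simply placed wherever the default layout puts it, so this compiles and runs on a hosted system but only demonstrates the attribute, not the fixed address.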
[Bug c++/104691] SFINAE does not disable static_assert
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104691 --- Comment #5 from Jonathan Wakely --- N.B. this has nothing to do with SFINAE. The static_assert fails when the template is first parsed *not* when instantiated. You can verify this easily: template typename T::bar f(T) { static_assert(false, ""); } There is no SFINAE here, because there is no call and no substitution. The relevant rule in the standard is: The program is ill-formed, no diagnostic required, if: — no valid specialization can be generated for a template or a substatement of a constexpr if statement (8.5.2) within a template and the template is not instantiated, or [...] No valid specialization can ever be generated for that template. Every possible set of template arguments will result in a failing static assert.
[Bug c/104693] New: Can't disable "comparison between pointer and integer"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104693 Bug ID: 104693 Summary: Can't disable "comparison between pointer and integer" Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: charles.nicholson at gmail dot com Target Milestone: --- The following code causes gcc to emit a warning that does not seem to have a diagnostic name, which makes it hard (impossible?) to disable: = #include #include bool foo(unsigned long not_a_pointer) { return not_a_pointer == NULL; } = : In function 'foo': :5:24: warning: comparison between pointer and integer 5 | return not_a_pointer == NULL; |^~ = Godbolt link: https://gcc.godbolt.org/z/chcxYevvf I don't think anyone would argue that this is "good" (portable, defined, etc) C code; it is vendor code that I'm stuck with. Additionally, on the architecture I'm targeting, the implementation does the right thing: the unsigned long is pointer-sized, and is compared against the NULL literal 0, which is used in this context as a sentinel value. The code should use the literal 0 instead of NULL, but it does not. I have 3 suggestions, if they're helpful: 1. If this warning can be disabled, update the diagnostic message to include the name as a hint on how to disable it. 2. If this warning can not be disabled, consider giving it a formal diagnostic name and making it controllable via the standard methods (-Wno-, pragma, etc) 3. If this is truly heinous / unsafe enough, promote it to an error. Thanks for all of your efforts on gcc! Best, Charles
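The rewrite the reporter describes (comparing against the literal 0 instead of NULL) can be sketched as follows. This is a C++ illustration with hypothetical function names; the first comparison is also valid C. The second function shows an explicit conversion for the case where a real pointer has to be compared with an integer.

```cpp
#include <cstdint>

// Integer sentinel check: both operands are integers, so no
// pointer/integer comparison is involved and no diagnostic is expected.
bool is_null_handle(unsigned long not_a_pointer) {
    return not_a_pointer == 0UL;
}

// When a pointer really must be compared with an integer, convert the
// pointer explicitly to an integer type wide enough to hold it.
bool same_address(const void *p, std::uintptr_t addr) {
    return reinterpret_cast<std::uintptr_t>(p) == addr;
}
```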
[Bug c/104693] Can't disable "comparison between pointer and integer"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104693 --- Comment #1 from Charles Nicholson --- Oh, also, this warning appears to go all the way back to gcc 4.1.2, the earliest that godbolt still supports.
[Bug target/104681] [9/10/11/12 Regression] ppc64le -mabi=ieeelongdouble ICE since r9-6460
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104681 --- Comment #5 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:3885a122f817a1b6dca4a84ba9e020d5ab2060af commit r12-7393-g3885a122f817a1b6dca4a84ba9e020d5ab2060af Author: Jakub Jelinek Date: Fri Feb 25 18:58:48 2022 +0100 rs6000: Use rs6000_emit_move in movmisalign expander [PR104681] The following testcase ICEs, because for some strange reason it decides to use movmisaligntf during expansion where the destination is MEM and source is CONST_DOUBLE. For normal mov expanders the rs6000 backend uses rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG and a few other things, but for movmisalign nothing enforced this. The middle-end documents that movmisalign shouldn't fail, so we can't force that through predicates or condition on the expander. 2022-02-25 Jakub Jelinek PR target/104681 * config/rs6000/vector.md (movmisalign): Use rs6000_emit_move. * g++.dg/opt/pr104681.C: New test.
[Bug target/104681] [9/10/11 Regression] ppc64le -mabi=ieeelongdouble ICE since r9-6460
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104681 Jakub Jelinek changed: What|Removed |Added Ever confirmed|0 |1 Summary|[9/10/11/12 Regression] |[9/10/11 Regression] |ppc64le |ppc64le |-mabi=ieeelongdouble ICE|-mabi=ieeelongdouble ICE |since r9-6460 |since r9-6460 Last reconfirmed||2022-02-25 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #6 from Jakub Jelinek --- Fixed on the trunk so far.
[Bug middle-end/104692] Constant data at fixed address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104692 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- >If this is not the best channel to ask, please, could you recommend me one? binut...@sourceware.org is a better place to get help with linker scripts (gcc-h...@gcc.gnu.org is the place if you need help with GCC, but I think you have the GCC side working).
[Bug middle-end/104692] Constant data at fixed address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104692 Martin Sebor changed: What|Removed |Added Last reconfirmed||2022-02-25 Ever confirmed|0 |1 CC||msebor at gcc dot gnu.org Resolution|INVALID |--- Status|RESOLVED|NEW --- Comment #2 from Martin Sebor --- I recommended opening this enhancement request, and it looks like it has support from others as well (see the thread at the link below), so with that let me confirm it. https://gcc.gnu.org/pipermail/gcc-help/2022-February/141294.html
[Bug testsuite/104694] New: New test case g++.dg/pr104540.C has excess errors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104694 Bug ID: 104694 Summary: New test case g++.dg/pr104540.C has excess errors Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: ---
g:a026b67f8f70d6cf35daf42a4b0909f78c9d7f40, r12-7382-ga026b67f8f70d6
make -k check-gcc RUNTESTFLAGS="dg.exp=g++.dg/pr104540.C"
FAIL: g++.dg/pr104540.C -std=gnu++98 (test for excess errors)
FAIL: g++.dg/pr104540.C -std=gnu++14 (test for excess errors)
FAIL: g++.dg/pr104540.C -std=gnu++17 (test for excess errors)
FAIL: g++.dg/pr104540.C -std=gnu++20 (test for excess errors)
# of unexpected failures 4

Excess errors:
xg++: error: unrecognized command-line option '-mforce-drap'
xg++: error: unrecognized command-line option '-mstackrealign'

Are those some sort of target specific options perhaps?

commit a026b67f8f70d6cf35daf42a4b0909f78c9d7f40 (HEAD, refs/bisect/bad)
Author: Alexandre Oliva
Date: Thu Feb 24 22:03:34 2022 -0300

    Cope with NULL dw_cfi_cfa_loc
[Bug testsuite/104694] New test case g++.dg/pr104540.C has excess errors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104694 Marek Polacek changed: What|Removed |Added Resolution|--- |FIXED CC||mpolacek at gcc dot gnu.org Status|UNCONFIRMED |RESOLVED --- Comment #1 from Marek Polacek --- Jakub already fixed this in r12-7392-gcc187fbca79ee9.
[Bug c++/104618] [12 Regression] trunk 20220221 on x86_64-linux-gnu ICEs building sh.cc for sh4-linux-gnu (in build_call_a, at cp/call.cc:381)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104618 Jason Merrill changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jason at gcc dot gnu.org CC||jason at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug c++/104695] New: different bit patterns in __builtin_nans and libquadmath::nanq
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104695 Bug ID: 104695 Summary: different bit patterns in __builtin_nans and libquadmath::nanq Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: g.peterh...@t-online.de Target Milestone: --- Hello gcc-Team, the related __builtin_nans return different values and libquadmath::nanq ignores the parameter. Please see my test case https://godbolt.org/z/fda5vevPe regards Gero
[Bug libquadmath/104695] different bit patterns in __builtin_nans and libquadmath::nanq
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104695 Andrew Pinski changed: What|Removed |Added Component|c++ |libquadmath --- Comment #1 from Andrew Pinski --- Hmm: https://en.cppreference.com/w/c/numeric/math/nan Note I don't know if Nan needs to keep its bit pattern all the time and/or when it can change.
[Bug libquadmath/104695] different bit patterns in __builtin_nans and libquadmath::nanq
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104695 --- Comment #2 from g.peterh...@t-online.de --- Yes, that is very vaguely worded. However, the std functions or builtins must always return the same values on the same platform.
quiet nan:
  libquadmath::nanq != __builtin_nanf128
signaling nan:
  __builtin_nansf64x != __builtin_nansl
  __builtin_nansf64 != __builtin_nans
  __builtin_nansf32 != __builtin_nansf
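One way to inspect such bit patterns, sketched here for double only and not taken from the reporter's godbolt test case: copy the value into an integer with memcpy and look at the exponent and mantissa fields. The helper names are hypothetical, and the code assumes double is IEEE 754 binary64; __builtin_nan and __builtin_nans are existing GCC builtins.

```cpp
#include <cstdint>
#include <cstring>

// Return the raw bit pattern of a double (assumes a 64-bit IEEE 754 double).
std::uint64_t bits_of(double d) {
    static_assert(sizeof(std::uint64_t) == sizeof(double),
                  "double is expected to be 64 bits wide");
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

// A NaN has all exponent bits set and a non-zero mantissa.
bool is_nan_pattern(std::uint64_t u) {
    return (u & 0x7ff0000000000000ULL) == 0x7ff0000000000000ULL
        && (u & 0x000fffffffffffffULL) != 0;
}

// Patterns produced by the quiet and signaling NaN builtins.
std::uint64_t quiet_nan_bits()     { return bits_of(__builtin_nan("")); }
std::uint64_t signaling_nan_bits() { return bits_of(__builtin_nans("")); }
```

Printing the two results in hexadecimal and comparing them with the patterns from the corresponding _FloatN builtins would show whether the variants agree on a given target; the exact values are target dependent, so none are claimed here.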
[Bug tree-optimization/104675] [9/10/11/12 Regression] ICE: in expand_expr_real_2, at expr.cc:9773 at -O with __real__ + __imag__ extraction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104675 --- Comment #9 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:f62115c9b770a66c5378f78a2d5866243d560573 commit r12-7394-gf62115c9b770a66c5378f78a2d5866243d560573 Author: Jakub Jelinek Date: Fri Feb 25 21:25:12 2022 +0100 match.pd: Further complex simplification fixes [PR104675] Mark mentioned in the PR further 2 simplifications that also ICE with complex types. For these, eventually (but IMO GCC 13 materials) we could support it for vector types if it would be uniform vector constants. Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs and we can't use bit_and etc. for complex type. 2022-02-25 Jakub Jelinek Marc Glisse PR tree-optimization/104675 * match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1): Restrict simplifications to INTEGRAL_TYPE_P. * gcc.dg/pr104675-3.c : New test.
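The two simplifications named in the ChangeLog entry are bit identities on unsigned integers, which is why they only make sense for integral types. A small sketch that checks them for `unsigned` (the function names are hypothetical, and this exercises the arithmetic only, not match.pd itself):

```cpp
// t * 2U wraps modulo 2^N and drops the top bit; dividing by 2 then
// restores the remaining bits, so the result is t with its top bit cleared.
bool mul_div_identity(unsigned t) {
    return t * 2U / 2 == (t & (~0U / 2));
}

// t / 2U drops the low bit; multiplying by 2 restores the remaining bits,
// so the result is t with its low bit cleared.
bool div_mul_identity(unsigned t) {
    return t / 2U * 2 == (t & ~1U);
}
```

Neither identity has a counterpart for complex operands, since bit_and is not defined for them.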
[Bug fortran/104350] ICE in gfc_array_dimen_size(): Bad dimension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104350 anlauf at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2022-02-25 --- Comment #1 from anlauf at gcc dot gnu.org ---
Confirmed.

There are multiple ways to avoid the ICE:

- replace gfc_internal_error in gfc_array_dimen_size by gfc_error, e.g.

diff --git a/gcc/fortran/array.cc b/gcc/fortran/array.cc
index f1d92e00c98..a42fb8f59fe 100644
--- a/gcc/fortran/array.cc
+++ b/gcc/fortran/array.cc
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see
 #include "options.h"
 #include "gfortran.h"
 #include "parse.h"
+#include "intrinsic.h"
 #include "match.h"
 #include "constructor.h"

@@ -2571,7 +2572,13 @@ gfc_array_dimen_size (gfc_expr *array, int dimen, mpz_t *result)
     return false;

   if (dimen < 0 || dimen > array->rank - 1)
-    gfc_internal_error ("gfc_array_dimen_size(): Bad dimension");
+    {
+      gfc_error ("DIM argument (%d) to intrinsic %qs at %L out of range "
+                 "(1:%d)", dimen+1, gfc_current_intrinsic,
+                 gfc_current_intrinsic_where, array->rank);
+      return false;
+    }
+//gfc_internal_error ("gfc_array_dimen_size(): Bad dimension");

   switch (array->expr_type)
     {

This however produces two error messages of the kind:

pr104350-z1.f90:4:21:

    4 | print *, product([(size(x, dim=k), k=0,rank(x))])
      |                     1
Error: DIM argument (0) to intrinsic 'size' at (1) out of range (1:1)
pr104350-z1.f90:4:10:

    4 | print *, product([(size(x, dim=k), k=0,rank(x))])
      |          1
Error: DIM argument (0) to intrinsic 'product' at (1) out of range (1:1)

The second error is misleading. :-(

- a similar patch to simplify_size generates a similar error twice. :-(
[Bug tree-optimization/104675] [9/10/11 Regression] ICE: in expand_expr_real_2, at expr.cc:9773 at -O with __real__ + __imag__ extraction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104675 Jakub Jelinek changed: What|Removed |Added Summary|[9/10/11/12 Regression] |[9/10/11 Regression] ICE: |ICE: in expand_expr_real_2, |in expand_expr_real_2, at |at expr.cc:9773 at -O with |expr.cc:9773 at -O with |__real__ + __imag__ |__real__ + __imag__ |extraction |extraction --- Comment #10 from Jakub Jelinek --- Fixed on the trunk so far.
[Bug target/100085] Bad code for union transfer from __float128 to vector types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #22 from Segher Boessenkool --- Well, we do not do anything AT here; but the patch is not on the GCC 11 branch either. Xiong Hu, does it backport there cleanly?
[Bug target/103302] wrong code with -fharden-compares
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103302 --- Comment #16 from Alexandre Oliva --- Created attachment 52518 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52518&action=edit Candidate patch The problem is in undo_optional_reloads. Here's a fix I'm testing.
[Bug fortran/104696] New: [12 Regression][OpenMP] Implicit mapping breaks struct mapping
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104696 Bug ID: 104696 Summary: [12 Regression][OpenMP] Implicit mapping breaks struct mapping Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: openmp, wrong-code Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: cltang at gcc dot gnu.org, jakub at gcc dot gnu.org Target Milestone: --- Created attachment 52519 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52519&action=edit implicit.f90 - compile with -fopenmp and run with a non-shared-memory device. In my understanding, the following is valid and should work. However, it fails – and I bet it is due to the implicit mapping of 'var3'. The code uses: !$omp target map(to: var3%R(2)%d) ... Printing on the host the 'loc (var3%R(2)%d)' shows: 21516C0 7F51E5E001D8 STOP 11 As var%R(2)%d (or in the dump 'var3.r[1].d.data') is now in device address space, accessing it after the target region crashes the program. implicit.f90.005t.original: #pragma omp target map(to:var3.r[1].d [len: 88]) map(to:*(struct t2[0:] *) var3.r[1].d.data [len: D.4243 * 4]) map(always_pointer:(struct t2[0:] *) var3.r[1].d.data [pointer assign, bias: 0]) implicit.f90.006t.gimple: #pragma omp target num_teams(1) thread_limit(0) map(tofrom:var3 [len: 440][implicit]) map(to:var3.r[1].d [len: 88]) map(to:MEM [(struct t2[0:] *)_9] [len: _8]) map(always_pointer:var3.r[1].d.data [pointer assign, bias: 0])
[Bug fortran/104697] New: Memory leak with ALLOCATABLE COMPONENTS and SOURCE= expression and POINTER
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104697 Bug ID: 104697 Summary: Memory leak with ALLOCATABLE COMPONENTS and SOURCE= expression and POINTER Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org Target Milestone: ---
The following leaks twice.

Once for the SOURCE= expression:

  allocate(var%D, source=reshape([t2(1), t2(2), t2(3), t2(4)], [2,2]))

This seems to create temporaries which do not get deallocated.

Secondly, the deep freeing does not seem to work for:

  deallocate (var%D)

This can be fixed by using

  deallocate (var%D(1,1)%x, var%D(1,2)%x, var%D(2,1)%x, var%D(2,2)%x)
  deallocate (var%D)

instead.

Compiling with -fsanitize=address shows the 8 memory leaks - or 4 memory leaks when commenting in the additional deallocate.

implicit none
type t2
  integer, allocatable :: x
end type t2
type t
  type(t2), pointer :: D(:,:) => null()
end type t
type(t) :: var
allocate(var%D, source=reshape([t2(1), t2(2), t2(3), t2(4)], [2,2]))
if (any (shape(var%D) /= [2,2])) stop 1
if (var%D(1,1)%x /= 1 .or. var%D(1,2)%x /= 3 .or. &
    var%D(2,1)%x /= 2 .or. var%D(2,2)%x /= 4) stop 2
! deallocate (var%D(1,1)%x, var%D(1,2)%x, var%D(2,1)%x, var%D(2,2)%x)
deallocate (var%D)
end