[Bug gcov-profile/94029] [9 Regression] gcc crash in coverage.c:655 since r9-4216-g390e529e2b98983d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94029 --- Comment #16 from Bernd Edlinger --- Sandra, I am pretty sure it should exist, can you check which git revision you are looking at?
[Bug gcov-profile/94029] [9 Regression] gcc crash in coverage.c:655 since r9-4216-g390e529e2b98983d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94029 --- Comment #19 from Bernd Edlinger --- Okay, forget my previous comment, I overlooked that you say the .c.gcov is missing...
[Bug target/91614] [10 regression][arm] gcc.target/arm/unaligned-memcpy-2.c FAIL since r274986
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91614 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #3 from Bernd Edlinger --- My attempt at fixing this failure: https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg00370.html I am not sure if it was rejected or just not reviewed
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #1 from Bernd Edlinger ---
Hi,

I use a newer binutils version FWIW, and built GCC-10 from a few days ago using those binutils.

$ readelf -version
GNU readelf (GNU Binutils) 2.32
Copyright (C) 2019 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.

instead of

#+BEGIN_EXAMPLE
 <2><8db>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
    <8dc>   DW_AT_abstract_origin: <0x9cc>
    <8e0>   DW_AT_entry_pc: 0x400545
    <8e8>   Unknown AT value: 2138: 3
    <8e9>   DW_AT_ranges : 0x30
    <8ed>   DW_AT_call_file : 1
    <8ee>   DW_AT_call_line : 52
    <8ef>   DW_AT_call_column : 10
    <8f0>   DW_AT_sibling : <0x92f>
#+END_EXAMPLE

I see:

 <2><8b3>: Abbrev Number: 38 (DW_TAG_inlined_subroutine)
    <8b4>   DW_AT_abstract_origin: <0x9a4>
    <8b8>   DW_AT_entry_pc: 0x401165
    <8c0>   DW_AT_GNU_entry_view: 0
    <8c1>   DW_AT_ranges : 0x30
    <8c5>   DW_AT_call_file : 1
    <8c6>   DW_AT_call_line : 52
    <8c7>   DW_AT_call_column : 10
    <8c8>   DW_AT_sibling : <0x907>

As you can see, there is a view number where the range begins, but there is no view number where the range ends. I am not sure if there are any view numbers when the inline has multiple ranges (I don't remember).

Therefore I am trying to change gdb to ignore is-stmt breakpoints which are at the end of the inline block.

I am not sure if an entirely new DWARF version is necessary, or just a new AT value, like 2139 or so, to give us an idea if the is-stmt is per its view in the subroutine or in the calling program.

Bernd.
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #3 from Bernd Edlinger ---
(In reply to Andrew Burgess from comment #2)
> Sorry for including the wrong DWARF dump output in the bug report. I too
> had seen the DW_AT_GNU_entry_view using a more recent binutils.
>

NP.

> When you pose the question:
>
> I am not sure if there are any view numbers when the inline has multiple
> ranges, (I dont remember)
>
> I think you're asking do we get DW_AT_GNU_entry_view when we have multiple
> ranges. And the answer is yes, this case already has multiple ranges (it
> has a DW_AT_ranges, which contains multiple ranges), and this is super
> confusing, because each range could, I guess, could have a different view
> number for its start, right?
>

I am talking about the end of the range, for each of the subranges. They should have view numbers, and gdb should note the view number when parsing the line program.

I think we need a DW_AT_GNU_exit_view, you know, young jedi :-)

Bernd.
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #4 from Bernd Edlinger --- Can you please approve my patch now? https://sourceware.org/pipermail/gdb-patches/2020-April/167385.html Thanks Bernd.
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #6 from Bernd Edlinger ---
Right,

#+BEGIN_EXAMPLE
    0030 00400545 00400545 (start == end)
    0030 00400549 00400553
    0030 00400430 00400435
    0030
#+END_EXAMPLE

I don't see any view numbers here, but I would need them.
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #7 from Bernd Edlinger ---
> I don't understand why each range wouldn't need its own view number?

The end PC of each sub-range can be an exit point. At least that is how I see it.

Please have a look at my patch. It adds the end address of each range to the end address list, and there are two cases: one where there is only one range, and one where there are multiple ranges. Then the end PC addresses are used to modify the is-stmt bits of the corresponding line entries, if there are any. Those possibly cannot be used for breakpoints per line number, but I don't care.

Thanks
Bernd.
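The approach described in this comment can be illustrated with a minimal sketch. It is not the actual gdb patch: the types `addr_range` and `line_entry` and the function `clear_is_stmt_at_range_ends` are hypothetical names invented here, and the sketch assumes that a line entry is matched to a range end by simple address equality.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; gdb's real structures differ. */
struct addr_range { uint64_t start; uint64_t end; };
struct line_entry { uint64_t pc; unsigned line; bool is_stmt; };

/* Clear the is-stmt flag of every line entry whose address equals the
   end address of one of the inline function's ranges.
   Returns the number of entries that were changed.  */
size_t
clear_is_stmt_at_range_ends (const struct addr_range *ranges, size_t n_ranges,
                             struct line_entry *lines, size_t n_lines)
{
  size_t changed = 0;

  for (size_t i = 0; i < n_ranges; ++i)
    for (size_t j = 0; j < n_lines; ++j)
      if (lines[j].is_stmt && lines[j].pc == ranges[i].end)
        {
          lines[j].is_stmt = false;
          ++changed;
        }

  return changed;
}
```

Fed with the three ranges quoted earlier in this thread (0x400545..0x400545, 0x400549..0x400553, 0x400430..0x400435), this would clear is-stmt for entries at 0x400545 and 0x400553 but leave an entry at 0x400549 alone, since that address only starts a range.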
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #9 from Bernd Edlinger --- Andrew, please update the reproducer, and explain in more detail what you would like to be changed. I still do not understand your idea. But I try hard to do so. Please be patient with me. Bernd.
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #11 from Bernd Edlinger ---
Andrew,

(In reply to Andrew Burgess from comment #10)
> Further, I've seen no mention of exit views anywhere, and I think they
> would also be needed.
>

Yes, that is also my idea, when I say the dwarf2 spec needs to be fixed.

> With the addition of these two features we would be able to support (I
> believe) fully merged callers and callees, with lines marked as
> is-stmt from both the caller and callee appearing at the same address.
>
> For now, without this, I think GCC needs to restrict itself when
> inlining. When an address represents a line from both the caller and
> callee, then that address should only be is-stmt true for EITHER lines
> from the caller, or lines from the callee. If the address is is-stmt
> true for a line from the caller, then the address should NOT be within
> the callee's range(s), and if the address is is-stmt true for a line
> from the callee, then the address MUST be within the callee's
> range(s).
>

I tried to do something similar in my original gcc-patch, which was unfortunately not accepted. But what I learned from writing the patch is that gcc cannot easily tell if a range will be empty or not. That is because the assembler emits the line info and the views, and the assembler decides how many bytes, if any, a certain construct will take in the binary representation.

That is why this empty range appears: it was supposed to be the start of the inline function, but instructions from the function prologue were moved in there. What causes the problem is the is-stmt line info, which is the only remaining thing from the original plan.

On the other hand, it is not a big issue for gdb to ignore those artefacts when they rarely occur. My gdb-patch does this by changing the is-stmt bit of this line info.

> On a final note. These are just my personal thoughts from the
> perspective of a debug consumer, though I use words like "must" or
> "should" above this reflects my thoughts on how I believe the debug
> should appear, and is not an attempt to prescribe how GCC should
> be. I know there are limitations to what GCC can achieve, and also, I
> could be totally wrong in my understanding of DWARF. I'm always happy
> to be corrected!

Yeah, that is true for myself as well.

Ping, Alexandre Oliva, are you still with us? I would be curious to know what you think about this. How should we proceed?

Thanks
Bernd.
[Bug debug/94474] Incorrect DWARF range information for inlined function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94474 --- Comment #13 from Bernd Edlinger --- Hi Andrew, You are right about the instruction re-ordering, that is done in a compiler pass, which simply re-orders RTL instruction lists. But I think when the code motion happens, we just have no easy access to the range markers. And it may be the case that this is-stmt location mentions a register value that is indeed the parameter of the inline function, so it may be no instruction but only a side note, to the debugger, that a certain value would be already here available in a certain register. Also that is only a vague guess, since although I did already a number of gcc patches, I learn new things each time :-) Bernd.
[Bug c++/92365] [10 Regression] ice unexpected expression ‘int16_t()’ of kind cast_expr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92365 --- Comment #4 from Bernd Edlinger --- Created attachment 47180 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47180&action=edit possible fix This seems to fix the issue, although a fix in cxx_eval_constant_expression might be preferable.
[Bug c++/92024] crash in check_local_shadow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92024 --- Comment #5 from Bernd Edlinger --- (In reply to Arseny Solokha from comment #4) > Is there a backport pending? I cannot reproduce this ICE on release branches. Hmm, interesting, I would have expected this to ICE. However this patch is not complete yet as it seems, since it exposed another hidden bug: #92365 With proposed patch here: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00639.html
[Bug target/91615] [10 regression][armeb] ICEs since r274986
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91615 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #11 from Bernd Edlinger --- (In reply to Jakub Jelinek from comment #9) > Is this fixed now? everything except the regression in arm/unaligned-memcpy-2/3.c the patch was considered too ugly: the last version was here (probably still too ugly): https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00370.html but I did not consider this important enough to send any pings.
[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #26 from Bernd Edlinger ---
(In reply to fdlbxtqi from comment #2)
> Also find a bug of __memmove
>
>   /*
>    * A constexpr wrapper for __builtin_memmove.
>    * @param __num The number of elements of type _Tp (not bytes).
>    */
>   template<bool _IsMove, typename _Tp>
>     _GLIBCXX14_CONSTEXPR
>     inline void*
>     __memmove(_Tp* __dst, const _Tp* __src, size_t __num)
>     {
> #ifdef __cpp_lib_is_constant_evaluated
>       if (std::is_constant_evaluated())
>         {
>           for(; __num > 0; --__num)
>             {
>               if constexpr (_IsMove)
>                 *__dst = std::move(*__src);
>               else
>                 *__dst = *__src;
>               ++__src;
>               ++__dst;
>             }
>           return __dst;
>         }
>       else
> #endif
>         return __builtin_memmove(__dst, __src, sizeof(_Tp) * __num);
>       return __dst;
>     }
>
> The last 2nd line return __dst is wrong. It should not exist.

Sorry, I don't know what this function is all about. But to me the code in the ifdef looks totally bogus. First, it returns __dst+__num, while memmove is supposed to return __dst. And is it somehow clear that __dst and __src do not overlap? Because if they do, the loop would overwrite the __dst buffer before __src is fully copied.
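Both objections can be illustrated with a small sketch in plain C. This is not libstdc++ code: `move_ints` is a hypothetical helper written for this note. It returns the original destination pointer, as memmove does, and it picks the copy direction so that an overlapping source is never overwritten before it has been read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration: element-wise copy of n ints with
   memmove semantics.  Returns the ORIGINAL destination pointer.  */
int *
move_ints (int *dst, const int *src, size_t n)
{
  int *const ret = dst;   /* remember the start; memmove returns this */

  if ((uintptr_t) dst <= (uintptr_t) src)
    {
      /* Destination is at or below the source: a forward copy is safe.  */
      for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
    }
  else
    {
      /* Destination is above the source: copy backwards, otherwise the
         loop would overwrite source elements that are still unread.  */
      for (size_t i = n; i > 0; --i)
        dst[i - 1] = src[i - 1];
    }

  return ret;
}
```

With `int a[] = {1, 2, 3, 4, 5}`, calling `move_ints(a + 1, a, 4)` shifts the first four elements up by one; a plain forward loop like the quoted one would instead smear the first element over the whole buffer.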
[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 --- Comment #31 from Bernd Edlinger --- Yes, you usually need to run a full bootstrap / make check twice with the same svn revision, once with and once without your patch. You should also make sure that the test case is actually able to fail before your patch. You run "$(srcdir)/contrib/test_summary -t" each time and compare the output.
[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 --- Comment #34 from Bernd Edlinger --- (In reply to fdlbxtqi from comment #33) > Created attachment 47574 [details] > copy_backward bug fixed for the last patch > > going to further run testsuite Your test does not contain any test cases.
[Bug c++/92024] crash in check_local_shadow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92024 --- Comment #7 from Bernd Edlinger --- Yes, I guess so.
[Bug bootstrap/93548] New: gcc build tries to modify source tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93548 Bug ID: 93548 Summary: gcc build tries to modify source tree Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de Target Milestone: ---

I build gcc with a read-only source tree. This worked all the time, but now it no longer does:

../gcc-trunk-0/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf --enable-languages=all --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard
make

It fails because it attempts to modify the source code:

HEADERS="auto-host.h ansidecl.h" DEFINES="" \
/bin/sh ../../gcc-trunk-0/gcc/mkconfig.sh config.h
TARGET_CPU_DEFAULT="\"arm10e\"" \
HEADERS="options.h insn-constants.h config/vxworks-dummy.h config/dbxelf.h config/elfos.h config/gnu-user.h config/linux.h config/linux-android.h config/glibc-stdint.h config/arm/elf.h config/arm/linux-gas.h config/arm/linux-elf.h config/arm/bpabi.h config/arm/linux-eabi.h config/arm/aout.h config/arm/arm.h config/initfini-array.h defaults.h" DEFINES="LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3 LIBC_MUSL=4 DEFAULT_LIBC=LIBC_GLIBC ANDROID_DEFAULT=0" \
/bin/sh ../../gcc-trunk-0/gcc/mkconfig.sh tm.h
TARGET_CPU_DEFAULT="" \
HEADERS="config/arm/arm-flags.h config/arm/arm-protos.h config/arm/aarch-common-protos.h config/linux-protos.h tm-preds.h" DEFINES="" \
/bin/sh ../../gcc-trunk-0/gcc/mkconfig.sh tm_p.h
gawk -f ../../gcc-trunk-0/gcc/config/arm/parsecpu.awk -v cmd=md \
	../../gcc-trunk-0/gcc/config/arm/arm-cpus.in > arm-tune.new
mv arm-tune.new ../../gcc-trunk-0/gcc/config/arm/arm-tune.md
mv: can't rename 'arm-tune.new': Permission denied
make[3]: *** [../../gcc-trunk-0/gcc/config/arm/arm-tune.md] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf/gcc'
make[2]: *** [all-stage1-gcc] Error 2
make[2]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make: *** [all] Error 2
[Bug bootstrap/93548] gcc build tries to modify source tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93548 --- Comment #1 from Bernd Edlinger --- git Revision c3ccce5b47f85d70127f5bb894bc5e83f8d2510e If absolutely necessary that should only be done in maintainer mode.
[Bug bootstrap/93548] gcc build tries to modify source tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93548 --- Comment #3 from Bernd Edlinger --- Ah, thanks I will do that. Apparently the git conversion is to blame :)
[Bug bootstrap/93548] gcc build tries to modify source tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93548 Bernd Edlinger changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #4 from Bernd Edlinger --- thanks, seems to work now.
[Bug gcov-profile/94029] New: gcc crash in coverage.c:655
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94029 Bug ID: 94029 Summary: gcc crash in coverage.c:655 Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: gcov-profile Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de CC: marxin at gcc dot gnu.org Target Milestone: --- This causes a crash in gcc: $ cat test.c #define impl_test(name) void test_##name() { } impl_test(t1 ) impl_test(t2) $ gcc -ftest-coverage -c test.c during IPA pass: profile test.c: In function ‘test_t2’: test.c:2:1: internal compiler error: in coverage_begin_function, at coverage.c:655 2 | impl_test(t1 | ^ 0x66ce7f coverage_begin_function(unsigned int, unsigned int) ../../gcc-trunk/gcc/coverage.c:655 0xd2cc66 branch_prob(bool) ../../gcc-trunk/gcc/profile.c:1307 0xea6687 tree_profiling ../../gcc-trunk/gcc/tree-profile.c:779 0xea6687 execute ../../gcc-trunk/gcc/tree-profile.c:885 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. (this was reduced from openssl-1.1.1d where this also happens, when configured with -ftest-coverage)
[Bug gcov-profile/94029] [9/10 Regression] gcc crash in coverage.c:655 since r9-4216-g390e529e2b98983d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94029 --- Comment #4 from Bernd Edlinger --- Martin, in the gcc-8 branch is the gcov working, or has it the same issue as bug#88045 ?
[Bug gcov-profile/94029] [9/10 Regression] gcc crash in coverage.c:655 since r9-4216-g390e529e2b98983d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94029 --- Comment #6 from Bernd Edlinger --- openssl workaround is here: https://github.com/openssl/openssl/pull/11246
[Bug c/56341] New: GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 Bug #: 56341 Summary: GCC produces unaligned data access Classification: Unclassified Product: gcc Version: unknown Status: UNCONFIRMED Severity: major Priority: P3 Component: c AssignedTo: unassig...@gcc.gnu.org ReportedBy: bernd.edlin...@hotmail.de Created attachment 29464 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29464 test program produces alignment faults Hello, The attached test program causes two problems when compiled with GCC 4.6.3 for ARM: 1. test() fails to write the high word of an unaligned volatile struct member. 2. test1() crashes because it uses an unaligned word access. This code did compile and execute correctly with GCC 4.3.2 As a workaround, the bug goes away if the code is compiled with -fno-strict-volatile-bitfields, but this is probably less efficient code. The attached patch is a backport of the following patch, and seems to resolve this issue: http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01449.html the original patch ignored the alignment of the target, and fixed only the first test case, but not the crash in the second test case. Regards Bernd Edlinger
[Bug c/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #1 from Bernd Edlinger 2013-02-15 13:12:56 UTC --- Created attachment 29465 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29465 proposed patch attached is a patch for gcc-4.6.3 that should resolve this issue. volatile packed struct members are accessed in words if structure is aligned by 2 and in bytes if structure is aligned by 1.
[Bug c/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #3 from Bernd Edlinger 2013-02-15 14:46:39 UTC --- (In reply to comment #2) > The test case causes alignment exceptions for me on armv5tel-linux-gnueabi, > when compiled with any one of gcc 4.8, 4.7, or 4.6. Was Sandra's patch ever > applied? apparently not. not in 4.6.x not in 4.7.2. When I used the original patch the assignment in test() was fixed, but the crash in test1() was still there, because the patch did not pay attention to the alignment of the structure. Therefore I added a check for the alignment in both read and write instructions. Regards, Bernd Edlinger.
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #6 from Bernd Edlinger 2013-02-18 18:41:55 UTC --- hhmm... could some one give an example where packedp would be false but the value is packed or unaligned?
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 Bernd Edlinger changed: What|Removed |Added Attachment #29465|0 |1 is obsolete|| --- Comment #7 from Bernd Edlinger 2013-02-20 01:38:04 UTC ---
Created attachment 29506 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29506 proposed patch

This patch uses packedp only for warn_misaligned_bitfield() but always uses multiple loads or stores. So even if packedp is unreliable, it will only influence the warning text.

Reason: if packedp == false, the code will always use a single but mis-aligned instruction, which is known to abort at runtime. So that is always wrong.

Note: there are two almost identical formulas used for packedp.

packedp as it is used in extract_bit_field (old code):

  if (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (exp, 0)))
      || (TREE_CODE (TREE_OPERAND (exp, 1)) == FIELD_DECL
          && DECL_PACKED (TREE_OPERAND (exp, 1))))
    packedp = true;

packedp as it is used in store_field (new code):

  if (TREE_CODE(to) == COMPONENT_REF
      && (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (to, 0)))
          || (TREE_CODE (TREE_OPERAND (to, 1)) == FIELD_DECL
              && DECL_PACKED (TREE_OPERAND (to, 1)))))
    packedp = true;

However, if we cannot trust the second one, why should we trust the first one? Therefore packedp should not influence code generation at all. That would only take unnecessary risks.

Well, I think that should resolve your objections... Right?
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 Bernd Edlinger changed: What|Removed |Added Attachment #29506|0 |1 is obsolete|| --- Comment #8 from Bernd Edlinger 2013-02-26 18:24:58 UTC ---
Created attachment 29546 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29546 proposed patch, cleaned up

In the last version of this patch the packedp parameter only had an impact on the generated warning.

If packedp==true the warning is:
"multiple accesses to volatile structure member/bitfield because of packed attribute"

If packedp==false the warning is:
"mis-aligned access used for structure member/bitfield when a volatile object spans multiple type-sized locations, the compiler must choose between using a single mis-aligned access to preserve the volatility, or using multiple aligned accesses to avoid runtime faults; this code may fail at runtime if the hardware does not allow this access"

The second warning says in short: "I am going to generate mis-aligned code, and I know it will fail at runtime."

However, this patch is supposed to avoid mis-aligned code, at least for ARM. Therefore it is only natural that the second warning is no longer needed.

Now I have removed all packedp code in extract_bit_field and store_bit_field, including the second warning. Fortunately that makes the patch much smaller.

I bootstrapped the patched compiler several times, and everything looks good.

TODO: remove translations of the obsolete warnings. (I don't know how to)
[Bug c/56712] New: constuctor function is called twice
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56712 Bug #: 56712 Summary: constuctor function is called twice Classification: Unclassified Product: gcc Version: 4.6.3 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c AssignedTo: unassig...@gcc.gnu.org ReportedBy: bernd.edlin...@hotmail.de
Created attachment 29713 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29713 cnt++ is accidentally called twice at -O2 or higher

The attached test program has a constructor function with __attribute__((constructor)) that is split up into two parts, construct.part.0 and construct. construct.part.0 is the part after "if (xx != 0) return;".

The problem is that both are put into the .ctors section, first construct.part.0 and then construct. Unfortunately the construct function is called before construct.part.0, which has the check removed.

Therefore the constructor is effectively called twice: cnt=2 at -O2 or -O3, but cnt=1 at -O1 or less.
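The attachment itself is not reproduced in this archive. The following is only a guessed reconstruction of the shape such a test could have, based on the names mentioned in the report (`construct`, `xx`, `cnt`); the assignment to `xx` and everything else is an assumption. It relies on the GCC `constructor` attribute.

```c
/* Hypothetical reconstruction; the real test case is attachment 29713. */
static int xx;   /* guard flag checked at the top of the constructor */
int cnt;         /* counts how often the constructor body has run */

__attribute__((constructor))
static void
construct (void)
{
  if (xx != 0)   /* early-exit check; the split-off part lacks it */
    return;

  /* Per the report, everything from here on is what the compiler
     outlined into construct.part.0.  */
  xx = 1;
  cnt++;
}
```

On a correctly working compiler the body runs once before main, so `cnt` is 1 when main starts. If both `construct` and the outlined part were registered as constructors, as the report describes for 4.6 at -O2, the outlined part would run a second time without the check and `cnt` would end up as 2.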
[Bug middle-end/56712] [4.6 Regression] constructor function is called twice
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56712 --- Comment #4 from Bernd Edlinger 2013-03-26 06:13:55 UTC --- (In reply to comment #2) > Works for me with 4.7/4.8/4.9, and 4.5 and older, but fails with 4.6. > The bug was fixed for 4.7.0 by r180700; that change has no BZ PR entry, but > the > patch submission (http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02486.html) > described a scenario involving cloned constructor functions much like this > one. OK, confirmed. with this fix the bug went away in the test example and in the original more complex context.
[Bug middle-end/56712] [4.6 Regression] constructor function is called twice
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56712 --- Comment #5 from Bernd Edlinger 2013-03-26 06:15:52 UTC --- Created attachment 29724 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29724 backport of the above mentioned fix
[Bug middle-end/56712] [4.6 Regression] constructor function is called twice
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56712 Bernd Edlinger changed: What|Removed |Added Attachment #29724|0 |1 is obsolete|| --- Comment #7 from Bernd Edlinger 2013-03-26 19:18:59 UTC --- Created attachment 29735 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29735 backport of commit r180700 in gcc-4.7 branch OK, now I see... I tested the new patch again. Everything looks good.
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #9 from Bernd Edlinger 2013-03-27 10:36:48 UTC --- Hello GCC-Maintainers, what do you think? Shouldn't this patch be in the 4.6.4 release?
[Bug middle-end/56341] GCC produces unaligned data access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 --- Comment #11 from Bernd Edlinger ---
Created attachment 30248 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30248&action=edit another example of the alignment faults

Hello Sandra,

good that you continue to work on that bug again. I agree that there may be two completely different aspects of this bug.

Attached you'll find a new test program that came to my mind when I looked at PR 41809: the structure s is aligned 2 and packed. If you make it an array of size 10, every second call of f is given an odd pointer. But the compiler should know that because of the aligned(2) attribute.

The difference from PR 41809 is this:

1. PR 41809 is not a correct C program at all, and never has been, while the attached new test program is a correct C program. Previous GCC versions compiled it correctly; current GCC does not even emit a warning.

2. PR 41809 is not about volatile at all. However, if you remove the "volatile" in the test program(s), the code is correct and no longer uses unaligned addresses.

On the other hand, "volatile" might mean that the compiler should try not to optimize the read instructions, for instance in loops. But of course not to an extent that the generated code is no longer valid.
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #6 from Bernd Edlinger ---
(In reply to Sandra Loosemore from comment #5)
> Patch posted here: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00750.html

Hi Sandra,

I tried your patch, but I don't like the code that it generates:

  printf("%x\n", (unsigned int)g.b);
  g.b = 0xAAA;

is compiled to invalid code (in ARMv5):

        ldr     r4, .L2
        ldr     r1, [r4]
        ldr     r3, [r4, #4]
        and     r3, r3, #7
        mov     r3, r3, asl #25
        orr     r1, r3, r1, lsr #7
        ldr     r0, .L2+4
        bl      printf
        ldr     r2, [r4]
        ldr     r3, .L2+8
        and     r2, r2, #127
        orr     r3, r2, r3
        str     r3, [r4]
        ldr     r3, [r4, #4]
        bic     r3, r3, #7
        orr     r3, r3, #5
        str     r3, [r4, #4]

The code is invalid because the object "g" is only 5 bytes large, but the first statement reads 2x4 bytes and ignores the 3 extra bytes. This can fault if g is close to a segment boundary.

The second statement reads the 3 bytes beyond g and writes them back unmodified. That is problematic if a task switch occurs between the read and store sequence, and the other task modifies something in the 3 bytes.

Previous versions of gcc produced single 5x1 byte read/write sequences for that structure, as apparently does the x86 version.

Regards
Bernd.
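What a "5x1 byte" access sequence means can be sketched in portable C. This is an illustration, not compiler output. It assumes the little-endian layout implied by the assembly quoted above (a 7-bit field in bits 0..6, followed by the 28-bit field `b` in bits 7..34 of a 5-byte object); the names `read_b`, `write_b` and `OBJ_SIZE` are made up for this sketch.

```c
#include <stdint.h>

enum { OBJ_SIZE = 5 };   /* 7 + 28 bits, rounded up to whole bytes */

/* Read the 28-bit field with exactly five single-byte loads, so no
   memory beyond the object is touched.  */
uint32_t
read_b (const volatile uint8_t *obj)
{
  uint64_t bits = 0;

  for (int i = 0; i < OBJ_SIZE; ++i)
    bits |= (uint64_t) obj[i] << (8 * i);

  return (uint32_t) ((bits >> 7) & 0x0FFFFFFFu);
}

/* Update the 28-bit field with five byte loads and five byte stores.
   Neighbouring bits inside the object are preserved; bytes after the
   object are neither read nor written.  */
void
write_b (volatile uint8_t *obj, uint32_t value)
{
  const uint64_t mask = (uint64_t) 0x0FFFFFFFu << 7;
  uint64_t bits = 0;

  for (int i = 0; i < OBJ_SIZE; ++i)
    bits |= (uint64_t) obj[i] << (8 * i);

  bits = (bits & ~mask) | (((uint64_t) value << 7) & mask);

  for (int i = 0; i < OBJ_SIZE; ++i)
    obj[i] = (uint8_t) (bits >> (8 * i));
}
```

The read-modify-write is still not atomic, but unlike the two 4-byte accesses in the quoted assembly it cannot fault on, or clobber, the 3 bytes that follow the object.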
[Bug libstdc++/57691] New: freestanding libstdc++ has compile error
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57691 Bug ID: 57691 Summary: freestanding libstdc++ has compile error Product: gcc Version: 4.8.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de
Created attachment 30349 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30349&action=edit Proposed fix for this problem

Hello,

I want to compile gcc-4.8.1 in a freestanding environment (eCos), but I encountered a compile error in libstdc++-v3/libsupc++/atexit_thread.cc:

../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc: In function 'void {anonymous}::key_init()':
../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc:87:21: error: no matches converting function 'run' to type 'void (*)(...)'
     std::atexit (run);
                     ^
../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc:66:8: note: candidates are: void {anonymous}::run()
   void run ()
        ^
../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc:58:8: note: void {anonymous}::run(void*)
   void run (void *p)
        ^
../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc: In function 'int __cxxabiv1::__cxa_thread_atexit(void (*)(void*), void*, void*)':
../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc:109:20: error: no matches converting function 'run' to type 'void (*)(...)'
    std::atexit (run);
                    ^
../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc:66:8: note: candidates are: void {anonymous}::run()
   void run ()
        ^
../../../../gcc-4.8.1/libstdc++-v3/libsupc++/atexit_thread.cc:58:8: note: void {anonymous}::run(void*)
   void run (void *p)
        ^

The config parameters used are:

../gcc-4.8.1/configure --target=arm-eabi --prefix=/home/ed/gnu/arm-eabi --with-newlib --enable-languages=c,c++ --disable-hosted-libstdcxx --disable-__cxa_atexit

The compiler is simply right to complain about the ambiguity here. The problem is the function atexit(), which is declared in cstdlib to take a parameter 'void (*)()', which means any parameter or nothing can match. Now there are two global functions named run in this scope, one declared 'void run()' and one declared 'void run(void*)'. The first one would be the correct choice.

To fix that I had to change the declaration of atexit() in cstdlib:

atexit(void (*)()) => atexit(void (*)(void))

which is consistent with glibc's atexit() declaration in stdlib.h.

Furthermore, the declaration of at_quick_exit() has the same bug.
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 --- Comment #7 from Bernd Edlinger ---
aehmm sorry, the object "g" from the above code is actually from PR#48784:

#pragma pack(1)
volatile struct S0 {
  signed a : 7;
  unsigned b : 28;
} g = {0,-1};

=> sizeof(g) = 5

but the code from this example has pretty much the same problems:

typedef struct s{
  unsigned char Prefix;
  test_type Type;
}__attribute((__packed__)) ss;

=> sizeof(ss) = 5

foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L2
        ldr     r2, [r3]
        and     r2, r2, #255
        orr     r2, r2, r0, asl #8
        str     r2, [r3]
        ldr     r2, [r3, #4]
        bic     r2, r2, #255
        orr     r0, r2, r0, lsr #24
        str     r0, [r3, #4]
        bx      lr

accesses 8 bytes
[Bug libstdc++/57691] freestanding libstdc++ has compile error
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57691 --- Comment #9 from Bernd Edlinger ---
(In reply to Jonathan Wakely from comment #7)
> (In reply to Paolo Carlini from comment #4)
> > ... by the way, I'm *very* surprised that nobody noticed this over the
> > years: the freestanding atexit is declared like this in <cstdlib> in
> > 4.0.0!?!
>
> It only matters on some less-well-tested targets, because
> NO_IMPLICIT_EXTERN_C is defined for sane targets. It might also not matter
> if the system headers declare:
> int atexit(void (*)(void));
> as GNU Libc does, because then the function is declared portably for both C
> and C++.
>
> Presumably the eCos headers either don't declare atexit or declare it
> without an abominable (void) parameter list.

the eCos stdlib.h declares it the right way:

stdlib.h:/* Type of function used by atexit() */
stdlib.h:typedef void (*__atexit_fn_t)( void );
stdlib.h:atexit( __atexit_fn_t /* func_to_register */ );

but because the cstdlib has this guard block it declares everything by itself:

#if !_GLIBCXX_HOSTED
// The C standard does not require a freestanding implementation to
// provide <stdlib.h>. However, the C++ standard does still require
// <cstdlib> -- but only the functionality mentioned in
// [lib.support.start.term].
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 --- Comment #9 from Bernd Edlinger ---
1. you should never touch memory that lies outside the struct.
2. if you have to generate multiple accesses you should generate code as if "volatile" was not used at all.
3. if -mno-unaligned-access is given you should not use accesses that are larger than the struct's __attribute__((aligned(x)))
4. otherwise if unaligned accesses are allowed, you may generate an unaligned ldr/str instruction.

Note: please do not use ldmia/stmia with unaligned addresses, because that does still segfault even in ARMv7. (that may be handled by a Linux IRQ but not for other O/S like eCos)
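Rule 1 above can be illustrated with a small sketch. It is not taken from the bug report: test_type is assumed here to be a 4-byte unsigned type (the report does not show its definition), and store_type and guards_survive are hypothetical helper names. The store goes through memcpy, which writes exactly sizeof(test_type) bytes, so the guard bytes placed around the 5-byte struct must stay unchanged.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: test_type is a 4-byte unsigned type. */
typedef uint32_t test_type;

typedef struct s {
    unsigned char Prefix;
    test_type Type;
} __attribute__((__packed__)) ss;

/* Hypothetical helper: write the packed member and nothing else. */
void store_type(ss *p, test_type value)
{
    memcpy((unsigned char *)p + offsetof(ss, Type), &value, sizeof value);
}

/* Put the struct between guard bytes, store into it, and report
   whether the guards and the Prefix byte are still untouched. */
int guards_survive(void)
{
    unsigned char buf[3 + sizeof(ss) + 3];
    test_type readback;
    size_t i;

    memset(buf, 0xAA, sizeof buf);
    store_type((ss *)(buf + 3), 0x11223344u);

    for (i = 0; i < 3; ++i)
        if (buf[i] != 0xAA || buf[3 + sizeof(ss) + i] != 0xAA)
            return 0;
    if (buf[3 + offsetof(ss, Prefix)] != 0xAA)
        return 0;

    memcpy(&readback, buf + 3 + offsetof(ss, Type), sizeof readback);
    return readback == 0x11223344u;
}
```

This only shows the expected observable behaviour; it says nothing about which instructions a given GCC version emits for a direct member access.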
[Bug target/56997] Incorrect write to packed field when strict-volatile-bitfields enabled on aarch32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56997 --- Comment #10 from Bernd Edlinger --- incredibly... gcc 4.3.7 was the last version that did only write 5 bytes in foo(). starting with gcc 4.4 all variants read/write 8 bytes in foo(). that applies only to the arm code. the x86 code does not use more than 5 bytes.
[Bug c++/57699] Disable empty parameter list misinterpretation in libstdc++ headers when !defined(NO_IMPLICIT_EXTERN_C)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57699 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #3 from Bernd Edlinger --- The eCos libc headers are 100% C++ compatible, in fact the whole system is based on C++. Is this NO_IMPLICIT_EXTERN_C define set by the configure script? Should I make sure that the configure script sets NO_IMPLICIT_EXTERN_C?
[Bug boehm-gc/57761] New: USE_PROC_FOR_LIBRARIES does not work correctly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57761 Bug ID: 57761 Summary: USE_PROC_FOR_LIBRARIES does not work correctly Product: gcc Version: 4.8.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: boehm-gc Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de
Created attachment 30410 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30410&action=edit
Proposed patch to fix this defect.

Usually this code is not used, but if the define USE_PROC_FOR_LIBRARIES is set, there is Linux code in dyn_load.c in function GC_register_map_entries that does not work correctly. What happens is that this code computes heap bounds, but it may happen that some root segments are in between the heap segments.

    /* Compute heap bounds. FIXME: Should be done by add_to_heap? */
    least_ha = (word)(-1);
    greatest_ha = 0;
    for (i = 0; i < GC_n_heap_sects; ++i) {
        word sect_start = (word)GC_heap_sects[i].hs_start;
        word sect_end = sect_start + GC_heap_sects[i].hs_bytes;
        if (sect_start < least_ha) least_ha = sect_start;
        if (sect_end > greatest_ha) greatest_ha = sect_end;
    }
    if (greatest_ha < (word)GC_scratch_last_end_ptr)
        greatest_ha = (word)GC_scratch_last_end_ptr;

Later the map file is parsed, and all segments that fall in between the least_ha and greatest_ha are ignored:

    if (start >= least_ha && end <= greatest_ha) continue;

Usually the heap segments start from small addresses and the map file concatenates adjacent segments with identical attributes. If the least_ha is below the data segment and the greatest_ha above the data segment, the gc can discard objects that are still in use. This leads to all kinds of crashes, and occasionally an assertion at finalize.c line 648:

    GC_ASSERT(GC_is_marked(GC_base((ptr_t)curr_fo)));

The reason for this is that "fo_head" may reside in a data segment that is removed from the root set by the above if statement.

The attached patch fixes this by intersecting the heap segments from the writable data segments which form the root segments.
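The failure mode can be reproduced with made-up numbers. The following is a minimal sketch, not the collector's code and not the attached patch: the section table, the addresses and both function names are hypothetical. It compares the single bounding-range test described above with a test against each heap section separately.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap section table with a gap between the sections. */
struct sect { uintptr_t start; size_t bytes; };

static const struct sect heap_sects[] = {
    { 0x1000, 0x1000 },   /* [0x1000, 0x2000) */
    { 0x5000, 0x1000 },   /* [0x5000, 0x6000) */
};
enum { N_SECTS = sizeof heap_sects / sizeof heap_sects[0] };

/* Bounding-range test as described in the report: a segment is
   skipped if it lies between the lowest and highest heap address. */
int skipped_by_bounds(uintptr_t start, uintptr_t end)
{
    uintptr_t least = UINTPTR_MAX, greatest = 0;
    size_t i;

    for (i = 0; i < N_SECTS; ++i) {
        uintptr_t s = heap_sects[i].start;
        uintptr_t e = s + heap_sects[i].bytes;
        if (s < least) least = s;
        if (e > greatest) greatest = e;
    }
    return start >= least && end <= greatest;
}

/* Illustrative alternative: skip only a segment that lies entirely
   inside one heap section. */
int skipped_per_section(uintptr_t start, uintptr_t end)
{
    size_t i;

    for (i = 0; i < N_SECTS; ++i) {
        uintptr_t s = heap_sects[i].start;
        uintptr_t e = s + heap_sects[i].bytes;
        if (start >= s && end <= e)
            return 1;
    }
    return 0;
}
```

With a data segment at [0x3000, 0x4000), which sits in the gap, the bounding-range test drops it from the root set while the per-section test keeps it.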
[Bug tree-optimization/56982] [4.8 Regression] Bad optimization with setjmp()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56982 --- Comment #13 from Bernd Edlinger ---
Created attachment 30431 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30431&action=edit
another example of wrong compilation

This is another example where the optimization can go wrong. The attached program produces expected results if compiled with -O0:

x=0, a=1
x=1, a=1
a=1

But if compiled with -O3 and if the value "a" is placed in a register the result is like this:

x=0, a=1
x=1, a=0
a=0

That is because longjmp has more semantics than just a branch: It branches to the setjmp, and restores all callee saved registers to the previous value.
[Bug tree-optimization/56982] [4.8 Regression] Bad optimization with setjmp()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56982 --- Comment #15 from Bernd Edlinger --- (In reply to Mikael Pettersson from comment #14) > Your example is invalid C. Referring to WG14 n1494.pdf (there may be more > recent C1x documents, but it's the one I had available right now): > > - you violate 7.13.1.1 which specifies where setjmp() may be called, an > assignment statement is not one of the permitted contexts > > - more importantly, your auto variable a is not volatile-qualified, which > means that its value is indeterminate after the longjmp (7.13.2.1). > > Please fix these issues and check again if it yields wrong results. Thanks for pointing that out. When I add volatile to the auto variable, the code is OK.
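The two points from comment #14 can be put into a short sketch. This is not the attached program (which is not shown here); the function name is hypothetical. setjmp() appears only as the controlling expression of a comparison, and the automatic variable that is modified between setjmp() and longjmp() is volatile-qualified, so its value is well defined after the jump.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical reduction: modify an automatic variable after
   setjmp() and read it again after longjmp() has returned control. */
int value_after_longjmp(void)
{
    volatile int a = 0;         /* volatile keeps the value determinate */

    if (setjmp(env) == 0) {     /* permitted context: comparison in an if */
        a = 1;
        longjmp(env, 1);        /* control comes back to the setjmp above */
    }
    return a;
}
```

Without the volatile qualifier the value of a would be indeterminate at the return statement, which is why the different results at -O0 and -O3 do not indicate a compiler bug.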
[Bug target/57837] ARM function pointer tailcall miscompilation regression
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57837 --- Comment #2 from Bernd Edlinger --- (In reply to Ramana Radhakrishnan from comment #1) > mine. fixed with revision 201240 ?
[Bug c++/57699] Disable empty parameter list misinterpretation in libstdc++ headers when !defined(NO_IMPLICIT_EXTERN_C)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57699 --- Comment #5 from Bernd Edlinger --- Well, if a portable O/S like eCos would need such special treatment, the NO_IMPLICIT_EXTERN_C should not be bound to the target architecture, it would be far more appropriate to define the NO_IMPLICIT_EXTERN_C from the configure command line instead. Actually I would have expected that some fixinclude scripts should be able to fix this kind of coding errors in the header files. Are you sure we still need that kind of hack ?
[Bug c++/57699] Disable empty parameter list misinterpretation in libstdc++ headers when !defined(NO_IMPLICIT_EXTERN_C)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57699 --- Comment #7 from Bernd Edlinger --- (In reply to Jonathan Wakely from comment #6) > (In reply to Bernd Edlinger from comment #5) > > Well, if a portable O/S like eCos would need such special treatment, > > eCos doesn't need it Of course. In that case, it would be much better, to be able to *disable* the implicit extern "C" feature for X86 and ARM and whatever architecture, just because I certainly know it when I call configure. To my surprise a cross compiler for --target=arm-eabi has a completely different syntax in the system headers than one for --target=i686-pc-linux-gnu. In my eyes, configuring that somewhere in gcc/config is awkward in that use case. Do you see my point?
[Bug middle-end/57748] [4.8/4.9 Regression] ICE on ARM with -mfloat-abi=softfp -mfpu=neon
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748 --- Comment #6 from Bernd Edlinger ---
(In reply to Martin Jambor from comment #5)
> expand_assignment, offset as filled in get_inner_reference is the same,
> however get_object_alignment (tem) used to return 64, and now only returns
> 32 which then pushes us the wrong path which does not handle this case. So
> now I guess I should figure out why get_object_alignment thinks the
> alignment is so small...

hhmm.. set_ptr_info_alignment is always called with align=4, and by the way, the crash goes away if I change this line (but I cannot tell if the code is correct):

--- builtins.c.jj 2013-07-06 11:34:17.0 +0200
+++ builtins.c 2013-07-29 21:50:56.0 +0200
@@ -503,7 +503,7 @@ get_pointer_alignment_1 (tree exp, unsig
       *bitposp = ptr_misalign * BITS_PER_UNIT;
       *alignp = ptr_align * BITS_PER_UNIT;
       /* We cannot really tell whether this result is an approximation. */
-      return true;
+      return false;
     }
   else
     {
[Bug middle-end/57748] [4.8/4.9 Regression] ICE on ARM with -mfloat-abi=softfp -mfpu=neon
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748 --- Comment #8 from Bernd Edlinger ---
(In reply to Martin Jambor from comment #7)
> In any event, it is clear that
> the code in expand_assignment cannot cope with unaligned tem and non-NULL
> offset. So currently I'm considering the following patch, although I am not
> really sure it is enough (it does fix the ICE, though). If you can run the
> testcase on the platform, would you run it with this patch applied, please?

No, unfortunately I can only look at the assembler listing.

But wait a moment... If the object is assumed to be unaligned here, this patch will likely just compute the unaligned address, add the offset, and store the result there without any special precautions. While the code in the if statement seems to store the expression in a register and move that register to the final destination.

Well, I believe these unaligned arrays are generally broken. Consider this example:

struct test
{
  char x;
  long long y[10];
} __attribute__((packed));

long long foo(struct test *x, long long y, long long z)
{
  long long a = x->y[y];
  x->y[y] = z;
  return a;
}

gets compiled to:

foo:
        @ Function supports interworking.
        @ args = 8, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        stmfd   sp!, {r4, r5}
        add     r5, sp, #8
        ldmia   r5, {r4-r5}
        add     r2, r0, r2, asl #3
        add     r1, r2, #1
        ldmia   r1, {r0-r1}
        str     r4, [r2, #1]
        str     r5, [r2, #5]
        ldmfd   sp!, {r4, r5}
        bx      lr

Won't these ldmia statements fault on unaligned addresses, even on a cortex-a9? Furthermore, str on odd addresses are always there, regardless of the -mno-unaligned-access setting.
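The odd addresses in the listing follow directly from the struct layout: y starts at byte offset 1, so element i sits at 1 + 8*i, which matches the "asl #3" scaling and the "#1" and "#5" displacements above. The sketch below spells this out; element_offset and foo_bytewise are hypothetical names, and the memcpy-based variant is only an illustration of an access that stays valid for any alignment, not a description of what GCC should emit.

```c
#include <stddef.h>
#include <string.h>

struct test {
    char x;
    long long y[10];
} __attribute__((packed));

/* Byte offset of x->y[i] from the start of the struct. */
size_t element_offset(size_t i)
{
    return offsetof(struct test, y) + i * sizeof(long long);
}

/* Same effect as foo() from the comment above, but every access goes
   through memcpy and so makes no assumption about alignment. */
long long foo_bytewise(struct test *x, long long y, long long z)
{
    unsigned char *p = (unsigned char *)x + element_offset((size_t)y);
    long long a;

    memcpy(&a, p, sizeof a);    /* read the old element */
    memcpy(p, &z, sizeof z);    /* write the new element */
    return a;
}
```

The offsets in the comments assume an 8-byte long long, as on the ARM target discussed here.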
[Bug middle-end/57748] [4.8/4.9 Regression] ICE on ARM with -mfloat-abi=softfp -mfpu=neon
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748 --- Comment #12 from Bernd Edlinger --- (In reply to Martin Jambor from comment #11) >> Well, I believe this >> unaligned arrays are generally broken. >> >> consider this example: > With or > without the patch? If without the patch and you are reasonably confident > the output is indeed wrong, please open a new PR (and CC me, I'm interested) > because this particular ICE is certainly caused by trailing zero sized > arrays. I have tried reproducing your problem with x86_64 MMX vectors but > couldn't. I do not have access to an ARM machine to verify myself. Thanks. My example just produces wrong code. I tried everything, trunk with or without the patch does not matter, and it does not hit the ICE at all, but I tried to write the example to go into the if-statement here, and was somehow surprised, it produces wrong code instead. Your example aborts without the patch, and correctly produces 16 strb instructions with the patch. That means that store_field can handle unaligned address in to_rtx in *some* cases. Is this if-statement a work around for something, that should have been fixed in store_field instead? I'm chasing problems with unaligned structures that exist on the ARM GCC but not on Intel. All that started probably in 2011 with GCC 4.6, and meanwhile I'm concerned, because new processors arrive all the time and we must soon fix that or think of alternatives to GCC :-( I started with the discovery that volatile accesses to packed structure members are completely broken: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56341 However this seems to be a similar but completely different bug. I will file it today. PS: If you want to build a cross compiler for ARM, I can help you out with eCos sys-include headers, if that is present in the install tree, the cross-compiler can be built on X86_64 too.
[Bug middle-end/58041] New: Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 Bug ID: 58041 Summary: Unaligned access to arrays in packed structure Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de

The attached test case shows that unaligned accesses can be generated on the arm architecture, despite the -mno-unaligned-access option. This does not happen at -O0 and -Og, but it always happens at -Os -O1 -O2 and -O3 to name a few.

assembler output for foo shows unaligned opcodes:

foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        add     ip, r0, r1, asl #3
        add     r1, ip, #1
        ldmia   r1, {r0-r1}
        str     r2, [ip, #1]
        str     r3, [ip, #5]
        bx      lr

reproduced with latest trunk:

$ ../gcc-4.9-20130728/configure --target=arm-eabi --prefix=/home/ed/gnu/arm-eabi --with-newlib --enable-languages=c,c++ --disable-hosted-libstdcxx --disable-__cxa_atexit
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #1 from Bernd Edlinger --- Created attachment 30579 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30579&action=edit test case to show the bug
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #2 from Bernd Edlinger --- Sandra, this seems to be unrelated to your strict-volatile-bitfields patch, as it happens with or without that patch.
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #13 from Bernd Edlinger --- Hi, just one question, how about the -m[no-]unaligned-access option? If -munaligned-access had been given the code was almost right, I mean AFAIK ldr/str should be handled in hardware but ldmia generates an alignment exception and _may_ be emulated by an IRQ handler, but that would not be very efficient. When -mno-unaligned-access is given any ldr/str on unaligned addresses have to be avoided.
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #16 from Bernd Edlinger ---
(In reply to Bill Schmidt from comment #15)
> Bernd, Mikael, Martin: Could you please test this on your respective
> targets?

Congratulations! It works.

If I compile with -mno-unaligned-access all accesses are ldrb/strb as it should be. And if I compile with -mcpu=cortex-a9 -munaligned-access the code is also OK, no ldmia's any more. The back end seems to have fixed that for us.

foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        add     ip, r0, r1, asl #3
        ldr     r0, [ip, #1]    @ unaligned
        ldr     r1, [ip, #5]    @ unaligned
        str     r2, [ip, #1]    @ unaligned
        str     r3, [ip, #5]    @ unaligned
        bx      lr

which is perfectly OK for cortex-a9.
[Bug middle-end/57748] [4.8/4.9 Regression] ICE when expanding assignment to unaligned zero-sized array
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748 --- Comment #14 from Bernd Edlinger --- Martin, your patch is of course OK, but the MALLOC_ABI_ALIGNMENT is probably wrong too. At least in targets with neon processor it should be raised to 64 bits. If the malloc would really return 4 byte aligned pointers, and you pass such a pointer to an external function, that function may assume naturally 8 byte aligned pointers and fault at runtime, right? I've just re-read the relevant ARM ABI document AAPCS: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf I have not found specific alignment requirements for malloc, but they specify alignments of different basic types up to 8 byte. Therefore I assume that the default value for MALLOC_ABI_ALIGNMENT is generally too low for the ARM architecture. The usual Doug Lea malloc implementation has by design a lowest possible alignment of 8 bytes. What I mean is, maybe the default for MALLOC_ABI_ALIGNMENT should be changed to BIGGEST_ALIGNMENT. What do you think?
[Bug target/58065] New: ARM MALLOC_ABI_ALIGNMENT is wrong
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58065 Bug ID: 58065 Summary: ARM MALLOC_ABI_ALIGNMENT is wrong Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de Target: arm*-*-* the ARM target architecture does not define the MALLOC_ABI_ALIGNMENT, therefore the default is used as BITS_PER_WORD, 32 in this case. This produces sometimes suboptimal code, because the front-end assumes that the function malloc() returns only word-aligned pointers, which is likely wrong. I have not found any specific requirements on the malloc alignment in the AAPCS document at http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf but I believe that the intention is to align everything including stack pointers to 8 bytes. Therefore I would suggest the attached patch which defines MALLOC_ABI_ALIGNMENT as BIGGEST_ALIGNMENT, which is 64 bits. As a proof that this has indeed some subtle influence on the generated code I attach a test case. The function foo is called by bar, and bar uses malloc to allocate the memory, with compiler options "-O3 -g0 -mfpu=neon -mfloat-abi=softfp" the function foo is inlined into bar, but the inlined version does not use vstr instructions any more, because the front-end does assume that malloc returns 4 byte aligned memory. If that was really true, foo must fail, if it is called without inlining. Therefore this code is just clumsy and less optimal than it could be.
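What malloc() actually returns on a given system can be checked at run time. The helper below is a hypothetical sketch, not part of the attached test case: it allocates a block and tests the address against a requested alignment. The C standard requires the result to be suitably aligned for any object with a fundamental alignment requirement, so a check against the alignment of long long should hold on a conforming implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if malloc(size) yields an address that
   is a multiple of alignment, 0 if not, and -1 if the allocation fails. */
int malloc_is_aligned(size_t size, size_t alignment)
{
    void *p = malloc(size);
    int ok;

    if (p == NULL)
        return -1;
    ok = ((uintptr_t)p % alignment) == 0;
    free(p);
    return ok;
}
```

Calling malloc_is_aligned(64, 8) on the target shows whether the 8-byte assumption behind the proposed BIGGEST_ALIGNMENT value matches the allocator in use; a single successful check is of course no proof for all allocations.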
[Bug target/58065] ARM MALLOC_ABI_ALIGNMENT is wrong
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58065 --- Comment #1 from Bernd Edlinger --- Created attachment 30598 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30598&action=edit test case
[Bug target/58065] ARM MALLOC_ABI_ALIGNMENT is wrong
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58065 --- Comment #2 from Bernd Edlinger --- Created attachment 30599 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30599&action=edit compiler output without this patch
[Bug target/58065] ARM MALLOC_ABI_ALIGNMENT is wrong
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58065 --- Comment #3 from Bernd Edlinger --- Created attachment 30600 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30600&action=edit correct compiler output with patch
[Bug target/58065] ARM MALLOC_ABI_ALIGNMENT is wrong
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58065 --- Comment #4 from Bernd Edlinger --- Created attachment 30601 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30601&action=edit Proposed patch
[Bug middle-end/57748] [4.8/4.9 Regression] ICE when expanding assignment to unaligned zero-sized array
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748 --- Comment #16 from Bernd Edlinger --- (In reply to Martin Jambor from comment #15) > Anyway, the policy of GCC > seems to be that the default of MALLOC_ABI_ALIGNMENT is ultra-safe and > targets should override it. So I would suggest, again :-), that you open a > separate bug and CC ARM maintainers that should take care of it. Done. Bug#58065
[Bug testsuite/58070] New: gcc.c-torture: useless check "-O3 -fomit-frame-pointer"
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58070 Bug ID: 58070 Summary: gcc.c-torture: useless check "-O3 -fomit-frame-pointer" Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de The -fomit-frame-pointer is now (since 4.6) the default at -O3. Therefore I would suggest to change that to test "-O3" and "-O3 -fno-omit-frame-pointer" instead.
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #27 from Bernd Edlinger --- (In reply to Martin Jambor from comment #24) > Created attachment 30594 [details] > Proposed patch I think it would be safe to put my initial test case under gcc/testsuite/gcc.target/arm/pr58041.c It passes with your patch at least in my environment.
[Bug testsuite/58070] gcc.c-torture: useless check "-O3 -fomit-frame-pointer"
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58070 --- Comment #2 from Bernd Edlinger --- (In reply to Andreas Schwab from comment #1) > This is target dependent. OK, my target is --target=arm-eabi What exactly is target dependent?
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #30 from Bernd Edlinger --- Hi Martin, I have bootstrapped this patch for i686-pc-linux-gnu and have seen some "excess errors" in your test script: /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c: In function 'foo': /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: note: The ABI for passing parameters with 16-byte alignment has changed in GCC 4.6 /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: warning: SSE vector argument without SSE enabled changes the ABI [enabled by default] /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: warning: SSE vector argument without SSE enabled changes the ABI [enabled by default] output is: /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c: In function 'foo': /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: note: The ABI for passing parameters with 16-byte alignment has changed in GCC 4.6 /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: warning: SSE vector argument without SSE enabled changes the ABI [enabled by default] /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: warning: SSE vector argument without SSE enabled changes the ABI [enabled by default] FAIL: gcc.dg/torture/pr58041.c -O0 (test for excess errors) Excess errors: /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: warning: SSE vector argument without SSE enabled changes the ABI [enabled by default] /home/ed/gnu/gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c:15:11: warning: SSE vector argument without SSE enabled changes the ABI [enabled by default]
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #33 from Bernd Edlinger --- (In reply to Martin Jambor from comment #31) > I can't reproduce this with the -m32 flag on my x86_64... do > you still have the compiler built on an i686? If so, could you try and make > function foo static in that testcase and see if the error goes away? static does not help. If I add -msse the warning goes away, but the compiled executable crashes because of illegal instruction. Dual Pentium II, with mmx but obviously no sse, whatever that may be: flags: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pse36 mmx fxsr
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #34 from Bernd Edlinger --- by the way the initializer of "struct s a = " seems to generate warnings at -Wall, because some brackets are missing: changed that to struct s a = {0,{{0,0},{0,0}}}; but somehow I wonder what forced us to generate sse instructions here, when that same example works on an ARMv5 target?
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #36 from Bernd Edlinger ---
(In reply to Martin Jambor from comment #35)
> (In reply to Bernd Edlinger from comment #34)
> > by the way the initializer of "struct s a = "
> > seems to generate warnings at -Wall, because some brackets are missing:
> >
> > changed that to
> > struct s a = {0,{{0,0},{0,0}}};
> >
> > but somehow I wonder what forced us to generate sse instructions here?
> > when that same example works on a ARMv5 target?
>
> Strange, does the correct initializer make the warning go away?
> If so, I'll fix it in the testsuite in a moment.

no, that is just a different warning with -Wall, that one did not make the test case fail however.

and in line 6 the "typedef struct S { V v; } P __attribute__((aligned (1)));" is superfluous too.

hmm, maybe the problem is I should not say -msse in the first place. do you get the warning if you use -m32 -mno-sse ?

what's funny about that warning is that it does not need to be enabled with -Wall like the other warning.
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #37 from Bernd Edlinger ---
this version fixes the warning:

--- ../gcc-4.9-20130728/gcc/testsuite/gcc.dg/torture/pr58041.c 2013-08-02 20:59:38.0 +0200
+++ pr58041.c 2013-08-06 18:30:51.0 +0200
@@ -3,8 +3,6 @@
 typedef long long V
   __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
 
-typedef struct S { V v; } P __attribute__((aligned (1)));
-
 struct s
 {
   char u;
@@ -12,24 +10,24 @@
 } __attribute__((packed,aligned(1)));
 
 __attribute__((noinline, noclone))
-long long foo(struct s *x, int y, V z)
+long long foo(struct s *x, int y, V *z)
 {
   V a = x->v[y];
-  x->v[y] = z;
+  x->v[y] = *z;
   return a[1];
 }
 
-struct s a = {0,{0,0}};
+struct s a = {0,{{0,0},{0,0}}};
 
 int main()
 {
   V v1 = {0,1};
   V v2 = {0,2};
-  if (foo(&a,0,v1) != 0)
+  if (foo(&a,0,&v1) != 0)
     __builtin_abort();
-  if (foo(&a,0,v2) != 1)
+  if (foo(&a,0,&v2) != 1)
     __builtin_abort();
-  if (foo(&a,1,v1) != 0)
+  if (foo(&a,1,&v1) != 0)
     __builtin_abort();
   return 0;
 }
[Bug middle-end/58041] Unaligned access to arrays in packed structure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58041 --- Comment #39 from Bernd Edlinger --- (In reply to Martin Jambor from comment #38) >> (In reply to Bernd Edlinger from comment #37) >> this version fixes the warning: > And I confirm that it still tests the bug. If you want to commit > it yourself, go ahead, otherwise let me now and I'll do it before I leave > today. Thanks a lot! no thanks, just go ahead.
[Bug target/58065] ARM MALLOC_ABI_ALIGNMENT is wrong
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58065 --- Comment #7 from Bernd Edlinger --- Patch was posted here: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00350.html
[Bug rtl-optimization/58048] [4.8/4.9 Regression] internal compiler error: Max. number of generated reload insns per insn is achieved (90)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58048 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #8 from Bernd Edlinger --- I see the same error with recent 4.9 i686-pc-linux-gnu in the following test case: gcc -O2 -msse -mno-avx -S testsuite/gcc.target/i386/intrinsics_4.c intrinsics_4.c: In function 'foo': intrinsics_4.c:14:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90) } ^ 0x849e4c3 lra_constraints(bool) ../../gcc-4.9-20130728/gcc/lra-constraints.c:3724 0x849136c lra(_IO_FILE*) ../../gcc-4.9-20130728/gcc/lra.c:2319 0x8456beb do_reload ../../gcc-4.9-20130728/gcc/ira.c:4689 0x8456beb rest_of_handle_reload ../../gcc-4.9-20130728/gcc/ira.c:4801 Please submit a full bug report, with preprocessed source if appropriate.
[Bug c++/58105] New: wrong code generation for multiversioned functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58105 Bug ID: 58105 Summary: wrong code generation for multiversioned functions Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de the following test cases fail on i686-*: g++.dg/ext/mv2.C g++.dg/ext/mv5.C g++.dg/ext/mv12.C The code is OK on -O0, -O1, but fails on -O2 and -O3. The problem seems to be that for multiversioned functions an internal dispatcher function is generated by ix86_generate_version_dispatcher_body(), which is being inlined in -O2 and above. But the inlined function does no longer call the target-specific function. Instead the return value is the address of the target-specific function.
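The difference between the dispatcher's return value and the call that has to follow it can be shown in plain C. This is only a sketch of the idea, not what ix86_generate_version_dispatcher_body() generates: all names are hypothetical and a simple flag stands in for the CPU feature detection.

```c
/* Signature shared by all versions of the function. */
typedef int (*impl_fn)(void);

static int foo_generic(void) { return 1; }
static int foo_sse4(void)    { return 2; }

/* Hypothetical stand-in for the run-time CPU feature check. */
int cpu_has_sse4 = 0;

/* Resolver: returns the ADDRESS of the version to use. */
impl_fn foo_resolver(void)
{
    return cpu_has_sse4 ? foo_sse4 : foo_generic;
}

/* Correct dispatch: call through the pointer the resolver returned.
   Using the resolver's return value directly as the result of foo()
   corresponds to the failure described above. */
int foo(void)
{
    return foo_resolver()();
}
```

In the real mechanism the resolver runs once through the ifunc stub; here it runs on every call, which keeps the sketch short.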
[Bug rtl-optimization/58048] [4.8/4.9 Regression] internal compiler error: Max. number of generated reload insns per insn is achieved (90)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58048 --- Comment #10 from Bernd Edlinger --- (In reply to Vladimir Makarov from comment #9) so this test case has no chance to pass on a target without avx. maybe this should be added to the test case then? /* { dg-require-effective-target avx } */
[Bug rtl-optimization/58048] [4.8/4.9 Regression] internal compiler error: Max. number of generated reload insns per insn is achieved (90)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58048 --- Comment #11 from Bernd Edlinger --- hmm, this test compiles correctly if -msse2 is used. gcc -O2 -msse2 -mno-avx -S intrinsics_4.c
[Bug target/58115] New: testcase gcc.target/i386/intrinsics_4.c failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58115 Bug ID: 58115 Summary: testcase gcc.target/i386/intrinsics_4.c failure Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de Target: i386-pc-linux-gnu Build: gcc-4.9-20130728

this test case fails on i686-pc-linux with internal error.

intrinsics_4.c: In function 'foo':
intrinsics_4.c:15:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90)
 }
 ^
0x849e4c3 lra_constraints(bool) ../../gcc-4.9-20130728/gcc/lra-constraints.c:3724
0x849136c lra(_IO_FILE*) ../../gcc-4.9-20130728/gcc/lra.c:2319
0x8456beb do_reload ../../gcc-4.9-20130728/gcc/ira.c:4689
0x8456beb rest_of_handle_reload ../../gcc-4.9-20130728/gcc/ira.c:4801
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

The crash happens only with this command line "gcc -O2 -msse -mno-avx -S intrinsics_4.c" but not with "-O1 -msse -mno-avx" or "-O2 -msse2 -mno-avx"

The problem seems to be triggered when the movv8sf_internal is split up to 8 movsi_internal which leads to the crash in the lra.

The wrong decision is happening in expr.c in emit_move_insn:

#0  emit_move_multi_word(machine_mode, rtx_def*, rtx_def*) () at ../../gcc-4.9-20130728/gcc/expr.c:3344
#1  0x08332390 in emit_move_insn(rtx_def*, rtx_def*) () at ../../gcc-4.9-20130728/gcc/expr.c:3526
#2  0x08311cb1 in force_reg(machine_mode, rtx_def*) [clone .part.2] ()
#3  0x08833225 in ix86_fixup_binary_operands(rtx_code, machine_mode, rtx_def**) () at ../../gcc-4.9-20130728/gcc/config/i386/i386.c:16729
#4  0x08833250 in ix86_fixup_binary_operands_no_copy(rtx_code, machine_mode, rtx_def**) () at ../../gcc-4.9-20130728/gcc/config/i386/i386.c:16763
#5  0x088d3af0 in gen_andv8sf3(rtx_def*, rtx_def*, rtx_def*) ()
#6  0x081149db in ix86_expand_args_builtin(builtin_description const*, tree_node*, rtx_def*) () at ../../gcc-4.9-20130728/gcc/config/i386/i386.c:30457
#7  0x0884b762 in ix86_expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int) () at ../../gcc-4.9-20130728/gcc/config/i386/i386.c:32769
#8  0x082367cc in expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int) () at ../../gcc-4.9-20130728/gcc/builtins.c:5823
#9  0x0832bd9d in expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**) () at ../../gcc-4.9-20130728/gcc/expr.c:10270
#10 0x0833953d in store_expr(tree_node*, rtx_def*, int, bool) () at ../../gcc-4.9-20130728/gcc/expr.c:5287
#11 0x0833bf96 in expand_assignment(tree_node*, tree_node*, bool) () at ../../gcc-4.9-20130728/gcc/expr.c:5073
#12 0x0825831c in expand_gimple_stmt(gimple_statement_d*) () at ../../gcc-4.9-20130728/gcc/cfgexpand.c:2178
#13 0x0825971c in expand_gimple_basic_block(basic_block_def*, bool) () at ../../gcc-4.9-20130728/gcc/cfgexpand.c:4204
#14 0x0825b458 in gimple_expand_cfg() () at ../../gcc-4.9-20130728/gcc/cfgexpand.c:4723
#15 0x084f09eb in execute_one_pass(opt_pass*) () at ../../gcc-4.9-20130728/gcc/passes.c:1965
#16 0x084f0e15 in execute_pass_list(opt_pass*) () at ../../gcc-4.9-20130728/gcc/passes.c:2017
#17 0x0827c09e in expand_function(cgraph_node*) () at ../../gcc-4.9-20130728/gcc/cgraphunit.c:1591
#18 0x0827df4d in compile() () at ../../gcc-4.9-20130728/gcc/cgraphunit.c:1695
#19 0x0827e59a in finalize_compilation_unit() () at ../../gcc-4.9-20130728/gcc/cgraphunit.c:2106
#20 0x08146075 in c_write_global_declarations() () at ../../gcc-4.9-20130728/gcc/c/c-decl.c:10125
#21 0x08595d5d in compile_file() ()
#22 0x08597d42 in toplev_main(int, char**) ()
#23 0x08127ebb in main () at ../../gcc-4.9-20130728/gcc/main.c:36

This is probably an error from the back-end optab_handler (?)

The failure started from 2013-06-23 and continues till today.
[Bug target/58115] testcase gcc.target/i386/intrinsics_4.c failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58115 --- Comment #1 from Bernd Edlinger --- Hi Sriraman, I'm putting you on CC since you are the author of that test case. I am not sure whether the test case should use -msse2 instead of -msse, but an internal compiler error should certainly be avoided in any case. Regards, Bernd.
[Bug target/58111] 32-bit gcc.target/i386/pr55342.c FAILs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58111 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #1 from Bernd Edlinger --- I see the same issue on i686-pc-linux-gnu. The simple solution would be to change the test case: The notb instruction would go away if -O3 is used instead of -O2.
[Bug target/58105] wrong code generation for multiversioned functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58105 Bernd Edlinger changed: What|Removed |Added Target||i686*-*-* Component|c++ |target --- Comment #1 from Bernd Edlinger --- That is a pretty weird bug! The backend generates a resolver function and an ifunc stub, but the tree is somehow completely wrong. Everything is good until the optimizer inlines the resolver function instead of calling the ifunc stub.
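To make the failure mode easier to picture, here is a minimal sketch of the resolver/stub split, emulated with an ordinary function pointer. It is not GCC's actual multiversioning code: the names foo_generic, foo_sse4, have_sse4 and foo_resolver are hypothetical, and a real ifunc is resolved once by the dynamic loader rather than on every call.

```c
/* Two hypothetical versions of the same function.  */
static int foo_generic (void) { return 1; }
static int foo_sse4 (void) { return 2; }

/* Stand-in for a CPU feature query (assumption for illustration only).  */
static int have_sse4;

typedef int (*foo_fn) (void);

/* Resolver: returns the ADDRESS of the version to use, not its result.  */
static foo_fn foo_resolver (void)
{
  return have_sse4 ? foo_sse4 : foo_generic;
}

/* Stub: callers of foo must end up in the version the resolver picked.
   Treating the resolver body as if it were the body of foo would hand
   the caller a function address instead of foo's return value.  */
int foo (void)
{
  return foo_resolver () ();
}
```

With have_sse4 set to 0 a call to foo () reaches foo_generic, with have_sse4 set to 1 it reaches foo_sse4. The point of the sketch is only that the resolver and the dispatched function have different return types and must not be substituted for each other.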
[Bug target/58105] wrong code generation for multiversioned functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58105 --- Comment #2 from Bernd Edlinger --- OK, this seems to be a possible fix:

--- i386.c.jj 2013-07-23 17:56:37.0 +0200
+++ i386.c 2013-08-11 01:41:38.0 +0200
@@ -29830,7 +29830,7 @@
   DECL_IGNORED_P (decl) = 0;
   /* IFUNC resolvers have to be externally visible. */
   TREE_PUBLIC (decl) = 1;
-  DECL_UNINLINABLE (decl) = 0;
+  DECL_UNINLINABLE (decl) = 1;
   /* Resolver is not external, body is generated. */
   DECL_EXTERNAL (decl) = 0;
[Bug tree-optimization/58137] [trunk, ICE] full unroll + AVX2 vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58137 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #2 from Bernd Edlinger --- Also reproduced with arm-none-eabi: ../arm-eabi/bin/arm-eabi-gcc -O3 -mfpu=neon -mfloat-abi=softfp 1.c
[Bug tree-optimization/58137] [trunk, ICE] full unroll + AVX2 vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58137 --- Comment #3 from Bernd Edlinger --- Created attachment 30639 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30639&action=edit possible fix This seems to be a bug in the constant folding of constant vector values at forwprop4. Could someone check whether the generated code is now correct? Thanks.
[Bug target/58105] wrong code generation for multiversioned functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58105 --- Comment #3 from Bernd Edlinger --- Patch was posted here: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00770.html
[Bug target/58105] wrong code generation for multiversioned functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58105 --- Comment #4 from Bernd Edlinger --- Sorry to bother you... Richard approved this patch in his e-mail today. Could you, as i386 port maintainer, please do the check-in for me? Thanks.
[Bug middle-end/58143] wrong code at -O3 on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #4 from Bernd Edlinger --- Created attachment 30674 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30674&action=edit slightly simplified test case

Attached is a slightly simplified test case. The problem seems to start in the optimizer pass 097t.lim1, where the following expression is identified as loop-invariant in loop 3:

Moving statement _23 = -2147483648 - c.1_22; (cost 1) out of loop 3.

Unfortunately it is now executed unconditionally. Later the optimizer pass 110t.ivcanon uses the possible overflow in this statement as an argument for why the loop must be executed exactly once:

Induction variable (int) 2147483647 + 1 * iteration does not wrap in statement _23 = -2147483648 - prephitmp_8; in loop 2.
Statement _23 = -2147483648 - prephitmp_8; is executed at most 0 (bounded by 0) + 1 times in loop 2.

Shortly after that, an apparently pointless loop exit is removed:

Removed pointless exit: if (prephitmp_8 != 0)

However, that is based on a wrong assumption and causes wrong code.
[Bug middle-end/58143] wrong code at -O3 on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 --- Comment #5 from Bernd Edlinger --- Summary: tree-ssa-loop-im.c moves code out of an if statement inside the loop if it cannot cause side effects or faults, but it does not take integer overflow into account. This seems to be an optimization! BUT tree-ssa-loop-niter.c (infer_loop_bounds_from_signedness) assumes that the code will never execute signed integer additions or subtractions with the intention of using the result modulo 2^32, i.e. ignoring overflow. It seems that -O3 together with -fno-strict-overflow fixes the code. However, this comment in tree.h points to another problem:

IMPORTANT NOTE: Any optimization based on TYPE_OVERFLOW_UNDEFINED must issue a warning based on warn_strict_overflow. In some cases it will be appropriate to issue the warning immediately, and in other cases it will be appropriate to simply set a flag and let the caller decide whether a warning is appropriate or not.

This example does not generate any warnings, neither with -Wall nor with -Wstrict-overflow...
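The interaction described above can be pictured with a small sketch. It is not the attached test case: the function last_guarded_value and its parameters are hypothetical and only mirror the shape of the statement "_23 = -2147483648 - c" from the dump, where a loop-invariant signed subtraction sits behind a condition.

```c
#include <limits.h>

/* Hypothetical reduction: the subtraction is loop-invariant, but in the
   source it is evaluated only when flags[i] is nonzero.  INT_MIN - c is
   undefined for c > 0, so the program is well defined for c > 0 exactly
   as long as the guard is never taken.  Hoisting the subtraction in
   front of the loop evaluates it unconditionally.  */
int last_guarded_value (const int *flags, int n, int c)
{
  int v = 0;
  for (int i = 0; i < n; i++)
    if (flags[i])
      v = INT_MIN - c;   /* invariant, overflows only if c > 0 */
  return v;
}
```

Calling last_guarded_value with all flags zero and c = 1 never evaluates the subtraction in the source program, so no overflow may be assumed to happen. That is the situation in which a later pass must not derive loop bounds from the hoisted statement.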
[Bug tree-optimization/58137] [trunk, ICE] full unroll + AVX2 vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58137 --- Comment #5 from Bernd Edlinger --- OK, a slightly improved patch was posted at: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01099.html
[Bug fortran/57904] [4.9 Regression] Bogus(?) "invokes undefined behavior" warning with Fortran's finalization wrapper (gfortran.dg/class_48.f90)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57904 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #3 from Bernd Edlinger --- This is somehow very similar to PR58143. Here too, the warning goes away at -O2 and -Os if I add -fno-strict-overflow or -ftrapv or -fwrapv...
[Bug middle-end/58143] wrong code at -O3 on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 --- Comment #6 from Bernd Edlinger --- Created attachment 30681 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30681&action=edit possible fix This seems to be a possible fix. What do you think of it, Jan?
[Bug middle-end/58143] wrong code at -O3 on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 --- Comment #7 from Bernd Edlinger --- How can I set the status of this bug to CONFIRMED? Shouldn't the component be "tree-optimization" instead of "middle-end"?
[Bug tree-optimization/58143] [4.8/4.9 regression] wrong code at -O3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 --- Comment #9 from Bernd Edlinger --- (In reply to Jakub Jelinek from comment #8)
> That patch looks wrong, and would very likely penalize tons of code, this
> predicate is used in many places in the compiler and the operations don't
> trap.

Yes, thanks, I agree. This means that the "lim" pass (and probably others like "ifcvt" too) will move code out of the inner loop, as long as it does not trap. But this creates undefined results, and those should not be used by the loop optimization to throw away the loop termination code. In this case I'd say the only other simple solution would be to take out the function infer_loop_bounds_from_signedness() in tree-ssa-loop-niter.c completely, right?

To illustrate what this function can do, here is another example:

loop.c:
extern int bar ();
int foo ()
{
  int i, k;
  for (i=0; i<4; i++)
    {
      k=10*i;
      if (bar ())
        break;
    }
  return k;
}

If you compile this function with -O3, the resulting code is very surprising (with zero warnings):

foo:
.LFB0:
    .cfi_startproc
    subl $12, %esp
    .cfi_def_cfa_offset 16
    call bar
    testl %eax, %eax
    jne .L3
    call bar
    testl %eax, %eax
    .p2align 4,,4
    jne .L4
    .p2align 4,,6
    call bar
    movl $20, %eax
.L2:
    addl $12, %esp
    .cfi_remember_state
    .cfi_def_cfa_offset 4
    ret
    .p2align 4,,7
    .p2align 3
.L3:
    .cfi_restore_state
    xorl %eax, %eax
    jmp .L2
    .p2align 4,,7
    .p2align 3
.L4:
    movl $10, %eax
    jmp .L2

Because k will overflow in the fourth iteration, the loop is terminated after the third iteration! The reasoning is that the only way to avoid the undefined behaviour of k is for one of the first three invocations of bar to terminate the loop, and thus the loop is only unrolled 3 times. But if the loop is a bit more complex it will not be unrolled, and in this case the normal loop termination condition "i<4" will not be used at all, resulting in an endless loop.
To prevent the loop unrolling I can add a printf:

loop.c:
extern int bar ();
int foo ()
{
  int i, k;
  for (i=0; i<4; i++)
    {
      k=10*i;
      __builtin_printf("loop %d\n", i);
      if (bar ())
        break;
    }
  return k;
}

Now this is an endless loop (bar always returns 0 but the compiler does not know)!

foo:
.LFB0:
    .cfi_startproc
    pushl %ebx
    .cfi_def_cfa_offset 8
    .cfi_offset 3, -8
    xorl %ebx, %ebx
    subl $24, %esp
    .cfi_def_cfa_offset 32
.L2:
    movl %ebx, 4(%esp)
    movl $.LC0, (%esp)
    call printf
    call bar
    testl %eax, %eax
    jne .L6
    addl $1, %ebx
    .p2align 4,,3
    jmp .L2
    .p2align 4,,7
    .p2align 3
.L6:
    addl $24, %esp
    .cfi_def_cfa_offset 8
    imull $10, %ebx, %eax
    popl %ebx
    .cfi_restore 3
    .cfi_def_cfa_offset 4
    ret
    .cfi_endproc
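Both listings depend on the multiplication overflowing a signed int in the fourth iteration. A minimal sketch of one way to keep such a loop well defined is to compute the product in a wider type. The multiplier 1000000000 is an assumption chosen for illustration so that the product exceeds INT_MAX at i == 3, and foo_wide, bar_never and calls are hypothetical names, with a callback parameter standing in for the external bar.

```c
/* Counts how often the stand-in callback was invoked.  */
static int calls;

/* Hypothetical stand-in for bar: never asks the loop to stop.  */
static int bar_never (void)
{
  calls++;
  return 0;
}

/* Same loop shape, but the product is computed as long long, which is
   at least 64 bits wide, so it cannot overflow for i < 4.  No iteration
   invokes undefined behaviour, hence the compiler has no grounds to
   drop the "i < 4" exit test.  */
long long foo_wide (int (*bar) (void))
{
  int i;
  long long k = 0;
  for (i = 0; i < 4; i++)
    {
      k = 1000000000LL * i;
      if (bar ())
        break;
    }
  return k;
}
```

With bar_never as the callback, all four iterations run and the function returns 3000000000. Compiling the original int version with -fwrapv or -fno-strict-overflow, as mentioned earlier in this thread, is the alternative that leaves the source unchanged.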
[Bug fortran/58113] [4.9 Regression] gfortran.dg/round_4.f90 FAILs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58113 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #4 from Bernd Edlinger --- Created attachment 30692 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30692&action=edit determine rounding support at runtime How about this? With this patch the test case should pass most of the time.
[Bug tree-optimization/58143] [4.8/4.9 regression] wrong code at -O3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 Bernd Edlinger changed: What|Removed |Added Attachment #30681|0 |1 is obsolete|| --- Comment #10 from Bernd Edlinger --- Created attachment 30693 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30693&action=edit possible fix, next try... This variant eliminates the infer_loop_bounds_from_signedness function and some of the "invokes undefined behavior" warnings. Bootstrapped, and regression tested on i686-pc-linux-gnu. And by the way, it fixes PR57904 too. How do you like it now ?
[Bug libmudflap/58230] New: mutliple test fail in german language version
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58230 Bug ID: 58230 Summary: mutliple test fail in german language version Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libmudflap Assignee: unassigned at gcc dot gnu.org Reporter: bernd.edlinger at hotmail dot de Hello, multiple libmudflap tests fail because the test scripts use the english warning text, but the compiler prints the german translation. libmudflap.c/pass35-frag.c libmudflap.c++/error1-frag.cxx libmudflap.c++/error2-frag.cxx libmudflap.c++/pass57-frag.cxx Probably the test environment should set the LANG=C somewhere? from libmudflap.log: spawn /home/ed/gnu/gcc-build/gcc/xgcc -B/home/ed/gnu/gcc-build/gcc/ -ggdb3 -DDEBUG_ASSERT -I/home/ed/gnu/gcc-4.9-20130818/libmudflap/testsuite -I/home/ed/gnu/gcc-4.9-20130818/libmudflap/testsuite/.. -I.. -L/home/ed/gnu/gcc-build/i686-pc-linux-gnu/./libmudflap/.libs /home/ed/gnu/gcc-4.9-20130818/libmudflap/testsuite/libmudflap.c/pass35-frag.c -O0 -fmudflap -lmudflap -L/home/ed/gnu/gcc-build/i686-pc-linux-gnu/./libmudflap/testsuite -ldl -lm -o ./pass35-frag.exe^M /home/ed/gnu/gcc-4.9-20130818/libmudflap/testsuite/libmudflap.c/pass35-frag.c:14:1: Warnung: Schmutzfänger kann nicht externes »end« unbekannter Größe verfolgen [-Wmudflap]^M }^M ^^M output is: /home/ed/gnu/gcc-4.9-20130818/libmudflap/testsuite/libmudflap.c/pass35-frag.c:14:1: Warnung: Schmutzfänger kann nicht externes »end« unbekannter Größe verfolgen [-Wmudflap]^M }^M ^^M FAIL: libmudflap.c/pass35-frag.c (-O0) cannot track unknown size extern (test for warnings, line ) FAIL: libmudflap.c/pass35-frag.c (-O0) (test for excess errors) Excess errors: /home/ed/gnu/gcc-4.9-20130818/libmudflap/testsuite/libmudflap.c/pass35-frag.c:14:1: Warnung: Schmutzfänger kann nicht externes »end« unbekannter Größe verfolgen [-Wmudflap]
[Bug tree-optimization/58143] [4.8/4.9 regression] wrong code at -O3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 --- Comment #12 from Bernd Edlinger --- (In reply to Jakub Jelinek from comment #11)
> No, that is wrong as well.

Because it is too destructive? Maybe. I think this is a general problem here.

1. The undefined behavior warning may be triggered by artefacts from the lim pass, or in the class_48.f90 case.
2. Surprise optimizations may happen without this warning, see my previous comment #9.
3. In the case of integer overflow, "reliable" only says that the operation is executed in every iteration, but not that the result is actually used for something, as in Zhedong's example.

With array bounds I do not have the same problem, as I'd say if the array is accessed beyond the limit, the guarantee is void anyway, and the lim pass would never move an array access out of the if statement, right?

But there are examples where the undefined behavior warning is not emitted after a possible array bounds violation. A nice example for this is gmp-4.3.2/tests/mpz/t-scan.c. This example has an array bounds error:

static const int offset[] = { -2, -1, 0, 1, 2, 3 };
...
for (oindex = 0; oindex <= numberof (offset); oindex++) // off-by-one error here
  {
    o = offset[oindex];
    ...
    if (got != want)
      {
        ...
        exit (1); // this cancels the aggressive-loop-optimizations warning
      }
    ...
  }

The generated code at -O2 is without the loop termination check, surprise surprise... What do you think?
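For comparison, a minimal sketch of the same loop with the bound corrected. The numberof macro below is an assumption about how the GMP test helper is defined, and sum_offsets is a hypothetical function that merely gives the loop body something observable to do.

```c
#include <stddef.h>

/* Assumed definition of the element-count helper.  */
#define numberof(x) (sizeof (x) / sizeof ((x)[0]))

static const int offset[] = { -2, -1, 0, 1, 2, 3 };

/* Visits every element exactly once.  Using "<" instead of "<=" keeps
   oindex within 0 .. numberof (offset) - 1, so no iteration reads past
   the end of the array and the exit test cannot be reasoned away.  */
int sum_offsets (void)
{
  int sum = 0;
  for (size_t oindex = 0; oindex < numberof (offset); oindex++)
    sum += offset[oindex];
  return sum;
}
```

The six offsets add up to 3, which makes it easy to confirm that the loop ran over the whole array and nothing more.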
[Bug tree-optimization/58143] [4.8/4.9 regression] wrong code at -O3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143 Bernd Edlinger changed: What|Removed |Added Attachment #30693|0 |1 is obsolete|| --- Comment #14 from Bernd Edlinger --- Created attachment 30699 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30699&action=edit patch to prevent undefined execution in lim OK, this time only the lim pass should be prevented from introducing undefined behavior that was not there originally. This triggered a minor regression in gcc.target/i386/pr53397-1.c: here lim used to move the expression "2*step" out of the loop, but this may cause undefined behavior in case of overflow. I propose to resolve this by adding -fno-strict-overflow to that test. The test case looks pretty constructed anyway.