[Bug target/94504] New: On powerpc, -ffunction-sections -fdata-sections is not as effective as expected for PIE executables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94504 Bug ID: 94504 Summary: On powerpc, -ffunction-sections -fdata-sections is not as effective as expected for PIE executables. Product: gcc Version: 9.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gcc-bugzilla at mkarcher dot dialup.fu-berlin.de Target Milestone: --- I try to compile the following test program using gcc -ffunction-sections -fdata-sections -pie -Wl,--gc-sections input.c The linking step fails, because g is not defined. On most architectures except powerpc (32 bit), the garbage collection is able to discard both fptr and f, and so the reference to g vanishes. This is not the case on powerpc, where f is referencing fptr through a GOT entry in the .got2 section. This issue (I don't dare to call it "bug" yet) is the cause of of https://bugs.debian.org/955845. The librsvg build process is quirky and building the tests only works if garbage collection is able to collect a hughe amount of dead code. Garbage collection is able to do that on all architectures Debian tried it on except for powerpc (and possibly ppc64, see https://bugs.debian.org/895723). The attached example program does compile fine on ppc64, though. /* dead code */ extern void g(int x, ...); extern void (*fptr)(); void f() { /* using fptr creates a GOT entry for fptr */ g(0, fptr); } /* fptr is reference from the GOT. Let's reference f from fptr */ void (*fptr)() = f; /* Non-dead code */ int x = 5; int main(void) { /* using x goes through the GOT. This prevents the GC to kill it */ return x; }
[Bug target/94504] On powerpc, -ffunction-sections -fdata-sections is not as effective as expected for PIE executables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94504 --- Comment #3 from Michael Karcher --- (In reply to Richard Biener from comment #2) > Huh, looking at the assembly & the object file this seems to be fully a linker > issue who seems to be responsible for building the GOT. I suggest to move > this over to sourceware. Alan? I'm not so sure about this. I think it might actually be a limitation of ELF on ppc32: If I compare the code generated by gcc on ppc32, gcc does output a GOT fragment in the .got2 section that references all globals that are used in the current object: $ objdump -r -j .got2 bla32.o bla32.o: file format elf32-powerpc RELOCATION RECORDS FOR [.got2]: OFFSET TYPE VALUE R_PPC_ADDR32 fptr 0004 R_PPC_ADDR32 x This section can not be elided, because it is referenced from main: $ objdump -r -j .text.main bla32.o bla32.o: file format elf32-powerpc RELOCATION RECORDS FOR [.text.main]: OFFSET TYPE VALUE 0022 R_PPC_REL16_HA.got2+0x8006 0026 R_PPC_REL16_LO.got2+0x800a The linker has no obvious way to detect which GOT slots are actually used inside main, because the offset(s) of the slot(s) inside .got2 used by main are hardcoded inside main. On the other hand, on ppc64, there is no GOT fragment generated by the compiler, but instead the compiler just asks the linker to create a GOT slot and fill in the necessary information: $ objdump -r -j .text.main bla64.o bla64.o: file format elf64-powerpc RELOCATION RECORDS FOR [.text.main]: OFFSET TYPE VALUE 000e R_PPC64_TOC16_HA x 0012 R_PPC64_TOC16_LO x
[Bug target/94504] On powerpc, -ffunction-sections -fdata-sections is not as effective as expected for non-PIE executables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94504 --- Comment #5 from Michael Karcher --- I got the command line of gcc wrong. "-pie" just sets the linker flags for PIE linking, but it does *not* compile source code as PIE. If I use "-fpie", garbage collection does what it is supposed to do. As I found out, the acutal problem I have is completely unrelated to gcc (sorry for the noise here), because it concerns object files created by rustc (which has an llvm backend) linked by binutil's ld tool. At the moment I was able to reproduce the same symptom I have with those rust objects with a ten-line C program, I considered fixing the problem (if possible) for C first a good idea. As it stands now, the issue in gcc (no effective garbage collection for non-PIE executables with function pointers in global data structures) still stands, so I am not closing the bug right away.
[Bug target/88592] New: Passing of packed structures on sparc64 different than in clang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88592 Bug ID: 88592 Summary: Passing of packed structures on sparc64 different than in clang Product: gcc Version: 8.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gcc-bugzilla at mkarcher dot dialup.fu-berlin.de Target Milestone: --- As noted while filing a bug on rustc because rustc does not correctly implement the Sparc64 ABI regarding floating point fields of "small" structures, I stumbled across a difference between gcc an clang, as shown in https://github.com/rust-lang/rust/issues/57103 (the important piece is quoted below): struct str3 { float f; // passed in most-significant half of %o0 (gcc) or in %f0 (clang) int i;// passed in least-significant half of %o0 } __attribute__((packed)); Without __attribute__((packed)), clang, gcc and the ABI standard agree to pass f in %f0. The ABI standard doesn't contain anything about packed structures, so I don't see a way to decide whether gcc or clang is right. I report a bug on both products to raise awareness.
[Bug go/79037] gccgo: Binaries crash with parforsetup: pos is not aligned on m68k
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79037 --- Comment #5 from Michael Karcher --- The root issue now is that the ABI gcc implements on m68k is incompatible with the Go runtime shipped with gcc. The Go runtime uses the lowest two bits in the type information pointer as flags (called PRECISE and LOOP) and relies on the fact that the actual type information structure is aligned to a 32-bit boundary. The type descriptors are Go literals created by the Go frontend as standard Go structures. In the gcc backend for Go, the layout and alignment rules of Go structures match the rules for C structures (which is most likely necessary for interoperability). On m68k this means we get 16-bit alignment for historical reasons. The current interface between the go frontend and its backend has no interface to request stricter alignment, so the Go frontend is unable to ensure that type descriptors are aligned to 32-bit boundaries. A possible way of solving this would be to extend Backend::struct_type(vector<...> fields) by an (optional?) second parameter to communicate alignment requests, such that Gcc_backend::struct_type can use DECL_ALIGN before calling fill_in_struct. This enables the Go frontend to request the required alignment for its type descriptors. Requesting increased alignment for Go type descriptors is impossible to break legacy ABIs on m68k as there is no ABI involving Go type descriptors yet. This change should possibly go align with adding "__attribute__((aligned(4)))" to several types in libgo/runtime/go-type.h if type descriptors are ever created from the C side of things. Proof for the issue: (sid-m68)root:~# nm --dynamic /usr/lib/m68k-linux-gnu/libgo.so.9 | grep __go_td | head -n 6 00bcc410 V __go_td_AAAN5_uint8ee1e 00bcc440 V __go_td_AAAN5_uint8ee1e$gc 00bd0058 V __go_td_AAAN5_uint8eee 00bd0080 V __go_td_AAAN5_uint8eee$gc 00bfc4e6 V __go_td_AAAN6_uint329e3e16e 00bfc5d2 V __go_td_AAAN6_uint329e3e16e$gc shows unaligned type descriptors.
[Bug go/79037] gccgo: Binaries crash with parforsetup: pos is not aligned on m68k
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79037 --- Comment #8 from Michael Karcher --- The patch looks like it should work fine, I guess John Paul Adrian Glaubitz is going to test it soon. But I wonder whether the determination of alignment is in types.cc really needed, as user-specified alignment directives can only make alignment stricter. So always passing 4 to implicit variable should do no harm and keep the code simpler.
[Bug go/79037] gccgo: Binaries crash with parforsetup: pos is not aligned on m68k
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79037 --- Comment #10 from Michael Karcher --- OK, I got it. I retract my last comment.
[Bug tree-optimization/68008] New: Pessimization of simple non-tail-recursive functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68008 Bug ID: 68008 Summary: Pessimization of simple non-tail-recursive functions Product: gcc Version: 5.1.1 URL: http://stackoverflow.com/questions/32974625 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: gcc-bugzilla at mkarcher dot dialup.fu-berlin.de Target Milestone: --- Created attachment 36537 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36537&action=edit preprocessed source code As discussed on Stack Overflow in http://stackoverflow.com/questions/32974625 (Misleading title: "Is gcc's asm volatile equivalent to the gfortran default setting for recursions?"), gcc pessimizes the example fibonacci function, which can be avoided by "-fno-optimize-sibling-calls" (which is the reason I assigned the bug to tree-optimization) The non-optimal code is generated at least by gcc 4.5.3 (Debian 4.5.3-12) gcc version 4.9.2 (Debian 4.9.2-10) gcc version 5.1.1 20150507 (Debian 5.1.1-5) I will focus on gcc 5.1.1 in the remaining report. Configured with: ../src/configure -v --with-pkgversion='Debian 5.1.1-5' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=c++98 --disable-libstdcxx-dual-abi --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu gcc is invoked as: g++-5 -O3 -march=core2 -std=c++0x stack-overflow-32974625.cc; ./a.out compared to g++-5 -O3 -march=core2 -std=c++0x -fno-optimize-sibling-calls stack-overflow-32974625.cc; ./a.out There are no error or warning messages printed by the compiler or linker (even if warning would be enabled by -Wall -Wextra) The file stack-overflow-32974625.ii is attached. Timing with g++ 5.1.1 on a Core2Duo T7200 under 64-bit linux (cpu frequency scaling set to "performance"): With -fno-optimize-sibling-calls: 34.5us/iteration Without -fno-optimize-sibling-calls: 68us/iteration
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 Michael Karcher changed: What|Removed |Added CC||gcc-bugzilla at mkarcher dot dialu ||p.fu-berlin.de --- Comment #3 from Michael Karcher --- Created attachment 33954 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33954&action=edit Remove assumption that "not" is logical negate Indeed the treg combine pass is broken. Thanks to the nicely documented code, it could easily be determined, that this optimization pass contains the wrong assumption that "not" is an assembler instruction that can be used for logical negation of a register. As this is a bitwise negate instruction, it can not be used that way. As I was unable to find a machine instruction that performs logical negation, I prepared a patch that completely removes the parts that rely on a logical negate instruction, while keeping all other aspects on the optimization pass intact.
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #5 from Michael Karcher --- (In reply to Oleg Endo from comment #4) > I'm not sure about this. The first hunk of your patch that removes the > example in the top comment block should be valid, as far as I can see at the > moment. This is because the result of the bitwise not is then again tested > for zero before the conditional jump. As far as I understand the SH4 architecture, "TST r0,r0" sets the T bit if r0 is zero, and clears it otherwise. So MOV r1,r0; store r1 to r0, for exposition only TST r0,r0; T bit set if r1 == 0, cleared if r1 != 0 MOVT r0 ; r0 set to 1 if r1 == 0, cleared if r1 != 0 TST r0,r0; T bit set if r1 != 0, cleared if r1 == 0 while on the other hand NOT r1,r0; bitwise negate r1 to r0 (so r0 gets 0 if r1 == -1) TST r0,r0; T bit set if r1 == -1, cleared if r1 != -1 The first block clear T only if r1 equals zero, the second block clears T for all values of r1 except -1. So they are only equivalent if you can prove that r1 is either zero or -1, which you typically can't, especially as using 1 for true is typical C semantic. Because it seems that it is rare you can successfully prove that r1 either 0 or -1, I decided to remove that transformation alltogether. And to keep the comment accurate, I had to remove it from there, too. For the transformation to be valid, you would need a logical not instruction instead of the bitwise not instruction that sets the desination register to zero if the source register contains any non-zero value, not just if the source register contains -1. Possibly your confusion is created by "NOTT" (if it were available on SH4) being in fact valid in that example to negate the condition. This is due to the fact, that for 1-bit values (only!) bitwise negation and boolean negation are the same. > I haven't looked at the details (RTL dumps etc), but looking at the > problematic code in the description, the problem could be that the variable > 'condition' gets the wrong value assigned. 'condition' should have a value > of either 0 or 1. But the sh_treg_combine pass changes the value. If the > conditional jump is eliminated and the 'condition' value is then added to > 'truecount' directly, it will produce wrong values. Looking at the generated assembly code, I verified that indeed the pattern above is used, so "condition" gets stored in the "T" bit, and a conditional branch (BF) is used to skip the incrementation if it is false, i.e. cleared. After applying the transformation shown above, the value 1 in decision_result gets negated to -2 (0xFFFE), which is non-zero, so truecount gets increased if even though decision_result should be false after boolean negation.
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #7 from Michael Karcher --- (In reply to Oleg Endo from comment #6) > > For the transformation to be valid, you would need a logical not instruction > > instead of the bitwise not instruction that sets the desination register to > > zero if the source register contains any non-zero value, not just if the > > source register contains -1. > > There is a 2 insn sequence to do that on SH: >tst rm,rm >movt rn > > As far as I can see it, emitting this sequence instead of the bitwise not > insn would fix the problem. This can be done by changing the function > sh_treg_combine::make_not_reg_insn. It should be safe to emit that sequence > (which clobbers the T reg) instead of the bitwise not. This seems definitely true. But I wonder whether this is acutally needed, or already caught perfectly in try_eliminate_cstores. Let's have a look: The buggy branch in try_combine_comparisons applies only if (first loop) - All comparisons are the same type - The second operands are either all the same or all registers - Removal of those instruction is safe Furthermore, there need to be mixed cstores and - All minority comparisons need to be comparisons - All minority comparisons need to be against the integral constant zero So if one combines what applies to minority stores only in the test for applicability of the "make_not_reg_insn" idea with the general applicability check of the try_combine_comparisons pass, the end result is: The buggy branch only applies if - All comparisons are equality comparisons - The second operand needs to be the integral constant zero. Now this is exactly what you correctly commented in "example 1": [bb 3] (set (reg:SI 147 t) (eq:SI (reg:SI 173) (const_int 0))) (set (reg:SI 167) (xor:SI (reg:SI 147 t) (const_int 1))) -> bb 5 [bb 4] (set (reg:SI 147 t) (eq:SI (reg:SI 177) (const_int 0))) (set (reg:SI 167) (reg:SI 147 t)) -> bb 5 [bb 5] (set (reg:SI 147 t) (eq:SI (reg:SI 167) (const_int 0))) (set (pc) (if_then_else (ne (reg:SI 147 t) (const_int 0)) (label_ref:SI 50) (pc))) You are going to transform this to, assuming bb 4 is considered minority. (The first RTL instruction in BB4 gets transformed to "test rm, rm", the second RTL instruction gets transformed to "movt rn".) [bb 3] (set (reg:SI 167) (reg:SI 173)) -> bb 5 [bb 4] (set (reg:SI 147 t) (eq:SI (reg:SI 173) (const_int 0))) (set (reg:SI 167) (xor:SI (reg:SI 147 t) (const_int 1))) -> bb 5 [bb 5] (set (reg:SI 147 t) (eq:SI (reg:SI 167) (const_int 0))) (set (pc) (if_then_else (ne (reg:SI 147 t) (const_int 0))) (label_ref:SI 50) (pc))) It is afterwards quite likely that the set instruction in bb3 can be elminated by renaming reg 167 to reg 173. If this transformation is removed, though, the transformation of example 4 applies instead, which will result in [bb 3] (set (reg:SI 147 t) (eq:SI (reg:SI 173) (const_int 0))) -> bb 5 [bb 4] (set (reg:SI 147 t) (eq:SI (reg:SI 176) (const_int 0))) (set (reg:SI 147 t) (xor:SI (reg:SI 147 t) (const_int 1))) -> bb 5 [bb 5] (set (pc) (if_then_else (eq (reg:SI 147 t) (const_int 0)) // inverted (label_ref:SI 50) (pc))) and this gets (except SH2A with nott) transformed to (by define_insn_and_split "nott" in the machine definition) [bb 3] (set (reg:SI 147 t) (eq:SI (reg:SI 173) (const_int 0))) -> bb 5 [bb 4] (set (reg:SI 147 t) (eq:SI (reg:SI 176) (const_int 0))) (set (reg:SI 200) (xor:SI (reg:SI 147 t) (const_int 1))) (set (reg:SI 147 t) (eq:SI (reg:SI 200) (const_int 0))) -> bb 5 [bb 5] (set (pc) (if_then_else (eq (reg:SI 147 t) (const_int 0)) // inverted (label_ref:SI 50) (pc))) which, at least in my example, seems to be transformed by register renaming and common code elimination (combining the to treg setting instructions at the end of bb3 and bb4) to something like [bb 3] -> bb 5 [bb 4] (set (reg:SI 147 t) (eq:SI (reg:SI 176) (const_int 0))) (set (reg:SI 173) (xor:SI (reg:SI 147 t) (const_int 1))) -> bb 5 [bb 5] (set (reg:SI 147 t) (eq:SI (reg:SI 173) (const_int 0))) (set (pc) (if_then_else (eq (reg:SI 147 t) (const_int 0)) // inverted (label_ref:SI 50) (pc))) which is exactly the result you also get with special-casing the compare-to-zero case. > OK, thanks for the confirmation. I had a closer look at the current code > and your patch. I think that the bitwise-not code paths should not be > removed entirely, as they handle the "mixed normal and inverting cstores" > cases. I will prepare a patch in a couple of days. If I am not mistaken in my analysis above, the "mixed cstores" case handling in try_eliminate_cstores does already what you need, while pushing the desicision whether those are "comparisons to zero" and can be further optimized on common subexpression elimination, instead of hand-written code checking for a specia
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #8 from Michael Karcher --- Actually, the whole issue got me curious - I will try prepare a different patch along your suggestions and compare the compiler output. If I don't report back today, I probably won't do that in time, so don't hold your breath, though.
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #10 from Michael Karcher --- Created attachment 33991 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33991&action=edit Fix logical negation of registers, SImode only In fact, it turns out, you were right. I implemented the solution you suggest (quite hacky, as I only implemented it for SImode, I was too lazy to look up whether there are nice patterns for testing DImode operands into T), and compared assembler outputs. In fact, the duplicate test instruction is NOT eliminated in my first patch, but it is with your code after fixing. So the approach of special-casing compare-to-zero seems to have a positive effect. Assembler output: buggy: tst r4,r4 bf .L2 ; initial if, jumps if "flag" is nonzero ; Branch for zero "flag" mov.l .L10,r1 ; loads address of val mov.l @r1,r1 ; loads value of val ; Common code for zero or non-zero flag .L3: tst r1,r1; combined test bt .L8 ; if zero, skip increment mov.l .L11,r2 ; load address of true_count mov.l @r2,r1 ; load, increment, store true_count add #1,r1 mov.l r1,@r2 ; Label for skipping increment of true_count .L8: rts .align 1 ; branch for non-zero "flag" .L2: mov.l .L12,r1 ; load address of decision_result mov.l @r1,r1 ; load data of decision_result tst r1,r1; tst/movt logical negation before bra .L3 ; jumping to common code movtr1 With my earlier patch removing the "compare-to-zero" logic completely, the assembler output changes as following (CAPS comments on changes): ... mov.l @r1,r1 ; loads value of val tst r1,r1; TEST ONLY FOR FLAG == 0 ; Common code for zero or non-zero flag .L3: bt .L8 ; if zero, skip increment ... tst r1,r1; tst/movt logical negation before movtr1 bra .L3 ; jumping to common code tst r1,r1; TEST ONLY FOR FLAG != 0
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #11 from Michael Karcher --- Putting things straight after trying it out: (In reply to Michael Karcher from comment #7) [...] > and this gets (except SH2A with nott) transformed to (by > define_insn_and_split "nott" in the machine definition) > > [bb 3] > (set (reg:SI 147 t) (eq:SI (reg:SI 173) (const_int 0))) > -> bb 5 > > [bb 4] > (set (reg:SI 147 t) (eq:SI (reg:SI 176) (const_int 0))) > (set (reg:SI 200) (xor:SI (reg:SI 147 t) (const_int 1))) This is WRONG. It should read "(set (reg:SI 200) (reg:SI 147 t))" instead! > (set (reg:SI 147 t) (eq:SI (reg:SI 200) (const_int 0))) > -> bb 5 > > [bb 5] > (set (pc) (if_then_else (eq (reg:SI 147 t) (const_int 0)) // inverted > (label_ref:SI 50) (pc))) > > which, at least in my example, seems to be transformed by register renaming > and common code elimination (combining the to treg setting instructions at > the end of bb3 and bb4) to something like This turns out to be wrong too. The treg setting instructions are NOT combined. See the previous comment.
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #12 from Michael Karcher --- Further digging into this showed that there actually is a pass that would merge the two "tst r1,r1" instructions - the jump2 pass in cfgclenup.c. The optimization is called "crossjumping" in gcc, also known as tail merging. For some reason[1], gcc is reluctant to perform crossjumping on small common parts. In this case, the common part is just one instruction, and the parameter min-crossjump-insns is five by default unless optimizing for size. With "-fparam-min-crossjump-insns=1" or "-Os", my first patch removing all the special-casing of zero compares produces the same result as the patch fixing the logical negations. In the current situation, I see no advantage of not cross-jumping in this case, so the minimally invasive solution is definitely the second patch to fix logical negation, but still it somehow feels ugly to me to have a limited reimplementation of crossjumping in the sh-treg-combine pass. Replacing stuff of that pass by improving other optimizations is IMHO beyond the scope of this bug, though. [1] These are actually performance reasons: https://gcc.gnu.org/ml/gcc-patches/2004-07/msg00495.html
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #15 from Michael Karcher --- I did not get around to test your proposed patch yet, but it seems like the new "logical not" operation always compares only the low 32 bit against zero, even if there is a 64 bit operand. If my analysis is correct, the long long test program should fail if you replace "decision = 1;" by "decision = 0x1LL;"
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #18 from Michael Karcher --- As I said, I did not try your patch, but just read the source. The assembly you quoted convinces me that there is no problem in the code actually produced by your patch, which is great. This is caused by the pattern or-then-SImode-compare, as you explained. The or-then-SImode-compare optimization has an adverse effect on the test coverage, it seems. In both cases, GET_MODE(src_reg) and GET_MODE(dst_reg) are SImode, so the DImode output branch is not tested by any of your two example source files. Furthermore, it looks like make_not_reg_insn will actually produce bad code if it were ever called with GET_MODE(src_reg) == DImode. So I would strongly suggest to narrow it to only accept SImode input operands, so it fails to apply instead of generate bad code if something manages to call it with DImode input.
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #20 from Michael Karcher --- (In reply to Oleg Endo from comment #19) > > The or-then-SImode-compare optimization has an adverse effect on the test > > coverage, it seems. In both cases, GET_MODE(src_reg) and GET_MODE(dst_reg) > > are SImode, so the DImode output branch is not tested by any of your two > > example source files. > That is true as it stands now. However, we already anticipate that there > might be something going on with DImode stuff, so just adding the test might > help debugging in the future. Even if it doesn't add any value now, it > doesn't hurt anyone either. The test case is not a problem - but it would be helpful to have a testcase that actually tests the DImode output case. I understand that it likely is not possible with today's gcc to reach that branch, so it seems this has to stay the way it is now. I am fine with it. > > Furthermore, it looks like make_not_reg_insn will > > actually produce bad code if it were ever called with GET_MODE(src_reg) == > > DImode. > Please do explain. Of course. The instructions involving src_reg in make_not_reg_insn dealing with src_reg are completely quoted here: + // On SH we can do only SImode and DImode comparisons. + if (! (GET_MODE (src_reg) == SImode || GET_MODE (src_reg) == DImode)) +return NULL; In this fragment, you accept DImode source operands. So that code may be used to replace a DImode compare. + emit_insn (gen_rtx_SET (VOIDmode, m_ccreg, + gen_rtx_fmt_ee (EQ, SImode, src_reg, const0_rtx))); In this fragment, you are generating the replacement instruction, which is always an SImode compare. Maybe I miss the point, but I fail to undestand how an SImode compare might be acceptable on an DImode operand. Possibly, this even ICEs, I don't know enough about gcc internals to know what happens if src_reg is DImode which is passed to EQ in SImode.
[Bug target/63783] [4.9/5 Regression] [SH] Miscompilation of boolean negation on SH4 using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63783 --- Comment #22 from Michael Karcher --- OK, in that case I retract my objections and I think the patch is fine. I am sorry for that mistake.
[Bug target/67002] [5] [SH]: Bootstrap stage 2/3 comparison failure - gcc/real.o differs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67002 Michael Karcher changed: What|Removed |Added CC||gcc-bugzilla at mkarcher dot dialu ||p.fu-berlin.de --- Comment #15 from Michael Karcher --- Created attachment 36134 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36134&action=edit even further reduced test case I manually reduced the example even more. Interestingly it doesn't contain any C++ specific statements any more, yet it only fails -fcompare-debug with -x c++, but passes with -x c
[Bug target/67002] [5] [SH]: Bootstrap stage 2/3 comparison failure - gcc/real.o differs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67002 --- Comment #16 from Michael Karcher --- The bug seems to be quite similar to the infamous "sloth that was dropped on the head as a baby"-bug Linus discovered (https://lkml.org/lkml/2014/7/24/584 , https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61904). -fcompare-debug passes if -fno-var-tracking-assignments is passed as compiler option.
[Bug c/108483] New: gcc warns about suspicious constructs for unevaluted ?: operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483 Bug ID: 108483 Summary: gcc warns about suspicious constructs for unevaluted ?: operand Product: gcc Version: 10.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: gcc-bugzilla at mkarcher dot dialup.fu-berlin.de Target Milestone: --- Created attachment 54318 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54318&action=edit minimal example A well-known construct to determine array sizes at compile time is #define ARRAY_SIZE(x) (sizeof(x)/sizeof(*(x))) gcc helpfully warns for dangerous mis-use of this macro, as it only works with real arrays, not with pointer, for example. Assuming NULL is defined as ((void*)0), ARRAY_SIZE(NULL) yields a valid C expression, as long as we use the gcc extension that sizeof(void) equals one: ARRAY_SIZE(NULL) is expanded to essentially sizeof(void*)/sizeof(void) which yields 8 on usual 64-bit systems and 4 on usual 32-bit system. While this expression is valid, the result of this expression is likely not what the programmer intended, so the gcc warning "division ‘sizeof (void *) / sizeof (void)’ does not compute the number of array elements" is warranted. The Linux kernel contains a macro essentially being #define ARRAY_SIZE_MAYBENULL(x) ( __builtin_types_compatible_p(typeof(x), void*) ? 0 : (sizeof(x)/sizeof(*x)) ) which is intended to be invocable using actual array operands (returning the array size) or the compile-time constant NULL (returning zero). gcc correctly evaluates ARRAY_SIZE_MAYBENULL(NULL) to zero, but emits about the suspicious pattern in the third operand of the ternary operator. This is not helpful for the programmer, and breaks builds using -Wall -Werror. This is a feature request to omit warnings about dubious constructs like this if it can be statically determined that they are not evaluated. The example in the attachment compiles correctly and initializes x to 1, but emits the spurious warning about the unevaluated sizeof pattern.
[Bug c/108483] gcc warns about suspicious constructs for unevaluted ?: operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483 --- Comment #3 from Michael Karcher --- Thanks for the pointer to #4210. Note that 4210 is slightly different, though. In that report, the condition and the warnable expression are in different statements, and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210#c13 explicitly mentioned using a ternary expression to enable gcc to see the deadness of the code, using a flag called "skip_evaluation". This PR concerns a case that uses ?:, so I wonder whether skip_evaluation still exists, and could be used to gate the sizeof-pointer-div warning.
[Bug c/113973] New: Pleas issue a warning when using plain character values in bitwise operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113973 Bug ID: 113973 Summary: Pleas issue a warning when using plain character values in bitwise operations Product: gcc Version: 13.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: gcc-bugzilla at mkarcher dot dialup.fu-berlin.de Target Milestone: --- This example program compiles without any kind of warning in gcc: static char x = 0xD8; int main(void) { return 0x1200 | x; } The value returned from main is 0xFFD8 on architectures with 32-bit int and signed characters by default. After just fixing a bug that was caused by an unexpected sign expansion when building an int from individual bytes, I'd rather have a warning if 1) A variable of type char is promoted to int. 2) The int value is used in an bitwise expression 3) More than 8 bits of the results are actually used 4) More than 8 bits may be non-zero Because of condition 3, this will yiels no warning on "char y = x | 0x40;" (top bits truncated, so condition 3 fails) and no warning on "int y = x & 0x40;" (all high bits are guaranteed to be zero, so condition 4 fails). The real-world bug that motivates this enhancement proposal is https://github.com/hfst/hfst-ospell/issues/43