[Bug target/85203] cmse_nonsecure_caller intrinsic returns incorrect results
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85203 --- Comment #3 from Thomas Preud'homme --- Author: thopre01 Date: Wed Apr 11 09:47:21 2018 New Revision: 259309 URL: https://gcc.gnu.org/viewcvs?rev=259309&root=gcc&view=rev Log: [ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result __builtin_cmse_nonsecure_caller implementation returns true in almost all cases due to 2 separate bugs: * gen_addsi is used instead of gen_andsi to retrieve the lsb * the lsb boolean value is not negated but the specification [1] says the intrinsic should return true for a nonsecure caller and a nonsecure caller is characterized with LR's lsb being 0 This was not caught due to (1) lack of runtime test and (2) the existing RTL scan not taking into account that '.' matches newline in Tcl regular expressions. This commit fixes the implementation issues and improves testing of cmse_nonsecure_caller by (1) adding a runtime test for the secure caller case and (2) looking for an SET insn of an AND expression in the right function. This leaves the nonsecure caller case only partly tested since the exact value being AND and the negation are not covered by the scan and the existing test infrastructure does not allow 2 separate compilation and link to be performed. It is enough though to catch the current incorrect behavior. The commit also reorganize the scan directives in cmse-1.c to more easily identify what function they are intended to test in the file. 2018-04-11 Thomas Preud'homme Backport from mainline 2018-04-04 Thomas Preud'homme gcc/ PR target/85203 * config/arm/arm-builtins.c (arm_expand_builtin): Change expansion to perform a bitwise AND of the argument followed by a boolean negation of the result. gcc/testsuite/ PR target/85203 * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan to match a single insn of the baz function. Move scan directives at the end of the file below the functions they are trying to test for better readability. * gcc.target/arm/cmse/cmse-16.c: New testcase. Added: branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/cmse/cmse-16.c Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/config/arm/arm-builtins.c branches/gcc-7-branch/gcc/testsuite/ChangeLog branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 --- Comment #4 from Thomas Preud'homme --- Author: thopre01 Date: Wed Apr 11 10:07:25 2018 New Revision: 259310 URL: https://gcc.gnu.org/viewcvs?rev=259310&root=gcc&view=rev Log: [ARM] Fix PR85261: ICE with FPSCR setter builtin Instruction pattern for setting the FPSCR expects the input value to be in a register. However, __builtin_arm_set_fpscr expander does not ensure that this is the case and as a result GCC ICEs when the builtin is called with a constant literal. This commit fixes the builtin to force the input value into a register. It also remove the unneeded volatile in the existing fpscr test and fixes the function prototype. 2018-04-11 Thomas Preud'homme gcc/ PR target/85261 * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand into register. gcc/testsuite/ PR target/85261 * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand into register. Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm-builtins.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/arm/fpscr.c
[Bug middle-end/85344] New: Constant constraint check sign extends unsigned constant input operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 Bug ID: 85344 Summary: Constant constraint check sign extends unsigned constant input operands Product: gcc Version: 8.0.1 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: middle-end Assignee: thopre01 at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi Hi, Compiling the following piece of code with -mthumb -mcpu=cortex-m0 generates an "impossible constraint in 'asm' error" void foo (void) { __asm( "%0" :: "I" ((unsigned char) 0x80)); } It appears that although the operand is unsigned, it gets sign extended to perform the constraint check.
[Bug middle-end/85344] Constant constraint check sign extends unsigned constant input operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-04-11 Ever confirmed|0 |1 Known to fail||6.4.1, 7.3.1, 8.0.1 --- Comment #1 from Thomas Preud'homme --- It believe that the problem is in expand_asm_stmt (). The code calls into expand for the operand which will create a const_int of -128 with the assumption that an outer RTX will tell how to interpret that constant (eg. zero_extend:SI (const_int -128)). However the code uses that value directly calling INTVAL on it. It should instead extend it according to the signedness of the operand so that INTVAL returns the initial value.
[Bug middle-end/85344] Constant constraint check sign extends unsigned constant input operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #2 from Thomas Preud'homme --- I have a patch, starting testing.
[Bug middle-end/85344] Constant constraint check sign extends unsigned constant input operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #3 from Thomas Preud'homme --- More worrying is that this code compiles without error when it should error out: void foo (void) { __asm( "%0" :: "J" ((unsigned char) 0x80)); }
[Bug middle-end/85344] Constant constraint check sign extends unsigned constant input operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 Thomas Preud'homme changed: What|Removed |Added Keywords||wrong-code --- Comment #4 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #3) > More worrying is that this code compiles without error when it should error > out: > > void > foo (void) > { > __asm( "%0" :: "J" ((unsigned char) 0x80)); > } In which case we end up with #-128 in the assembly which is not what the user wrote. Thus adding wrong-code tag.
[Bug middle-end/85344] Constant constraint check sign extends unsigned constant input operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #5 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #4) > (In reply to Thomas Preud'homme from comment #3) > > More worrying is that this code compiles without error when it should error > > out: > > > > void > > foo (void) > > { > > __asm( "%0" :: "J" ((unsigned char) 0x80)); > > } > > In which case we end up with #-128 in the assembly which is not what the > user wrote. Thus adding wrong-code tag. I have difficulty to make up my mind about what is the expected behavior for constant input in inline asm, esp. for immediate constraint. But first let's look at register constraint: asm ("mov %0, %0" : "=r"(result) : "0"((signed char) -128)); will put 0x0080 in the register holding result. It basically set the register in the mode of the immediate, so QImode here. I would have expected for a 32bit register to have the immediate as 2-complement 32bit value, ie 0xFF80 in this case. The documentation says that a general register should be used which is true in both case. Now what about asm ("%0" :: "i"((unsigned char) 128)); This gives -128 in the assembly code, on at least ARM and x86_64 targets but I would expect likewise on all targets. Likewise: asm ("%0" :: "i"(0x8000)); gives #-2147483648 in assembly for the similar reason. Here my expectation is that provided that the constant can be represented in a const_int its value as a C level constant is what should be printed in assembly and the constraint check (eg for constraint I in ARM backend) should be based on that value as well. Thoughts?
[Bug target/85434] New: Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 Bug ID: 85434 Summary: Address of stack protector guard spilled to stack on ARM Product: gcc Version: 8.0.1 Status: UNCONFIRMED Keywords: diagnostic Severity: normal Priority: P3 Component: target Assignee: thopre01 at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org Target Milestone: --- Target: arm-linux-gnueabihf Created attachment 43962 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43962&action=edit Testcase for stack protector spilling guard address When compiling the attached file with -mcpu=cortex-a57 -std=c99 -Os -fpic -fstack-protector-strong the address to the stack gets spilled on the stack where an attacker using buffer overflow could overwrite it and thus control what is the canari checked against: ldr r3, [sp] <--- accessing spilled address of guard from stack mov r0, r4 ldr r2, [sp, #228] ldr r3, [r3] cmp r2, r3 beq .L18 bl __stack_chk_fail(PLT) I can reproduce this on GCC 7 and trunk at least.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-04-17 Ever confirmed|0 |1 --- Comment #1 from Thomas Preud'homme --- This is caused by missing stack_protect_set and stack_protect_test pattern in ARM backend. It would be nice though if the address could be marked such that it doesn't go on the stack to have the default implementation a bit more robust. It might be worth having a warning if the override is not done as well.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #2 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #1) > This is caused by missing stack_protect_set and stack_protect_test pattern > in ARM backend. It would be nice though if the address could be marked such > that it doesn't go on the stack to have the default implementation a bit > more robust. It might be worth having a warning if the override is not done > as well. Nope sorry, the address is put in a register before the test pattern is called.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #3 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #2) > (In reply to Thomas Preud'homme from comment #1) > > This is caused by missing stack_protect_set and stack_protect_test pattern > > in ARM backend. It would be nice though if the address could be marked such > > that it doesn't go on the stack to have the default implementation a bit > > more robust. It might be worth having a warning if the override is not done > > as well. > > Nope sorry, the address is put in a register before the test pattern is > called. This happens when expanding the tree that holds the guard's address. It's a symbol_ref for which the default expand code of loading into a register is used. This happens also for AArch64 and I suspect for x86 as well. What makes it worse on ARM is that cse_local sees 2 SET instructions computing the address of the guard and reuse the register being set in the first instruction instead of recomputing again. Because of the distance between that first SET and the comparison between guard and canari the chances of getting the address spilled are higher. This does not happen for AArch64 because the mov of symbol_ref generates an UNSPEC of a memory address whereas ARM generates a MEM of an UNSPEC symbol_ref. However I suppose with scheduling it's possible for the set of guard address and following test of guard against canari to be moved apart and spill to happen in theory. My feeling is that the target patterns should also do the address computation, ie stack_protect_set and stack_protect_test would take that MEM of symbol_ref instead of expanding it first. Thoughts?
[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 --- Comment #5 from Thomas Preud'homme --- Author: thopre01 Date: Wed Apr 18 11:42:10 2018 New Revision: 259465 URL: https://gcc.gnu.org/viewcvs?rev=259465&root=gcc&view=rev Log: [ARM] Fix PR85261: ICE with FPSCR setter builtin Instruction pattern for setting the FPSCR expects the input value to be in a register. However, __builtin_arm_set_fpscr expander does not ensure that this is the case and as a result GCC ICEs when the builtin is called with a constant literal. This commit fixes the builtin to force the input value into a register. It also remove the unneeded volatile in the existing fpscr test and fixes the function prototype. 2018-04-18 Thomas Preud'homme Backport from mainline 2018-04-11 Thomas Preud'homme gcc/ PR target/85261 * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand into register. gcc/testsuite/ PR target/85261 * gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with literal value. Expect 2 MCR instruction. Fix function prototype. Remove volatile keyword. Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/config/arm/arm-builtins.c branches/gcc-7-branch/gcc/testsuite/ChangeLog branches/gcc-7-branch/gcc/testsuite/gcc.target/arm/fpscr.c
[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 --- Comment #6 from Thomas Preud'homme --- Author: thopre01 Date: Wed Apr 18 13:17:30 2018 New Revision: 259469 URL: https://gcc.gnu.org/viewcvs?rev=259469&root=gcc&view=rev Log: [ARM] Fix PR85261: ICE with FPSCR setter builtin Instruction pattern for setting the FPSCR expects the input value to be in a register. However, __builtin_arm_set_fpscr expander does not ensure that this is the case and as a result GCC ICEs when the builtin is called with a constant literal. This commit fixes the builtin to force the input value into a register. It also remove the unneeded volatile in the existing fpscr test and fixes the function prototype. 2018-04-18 Thomas Preud'homme Backport from mainline 2018-04-11 Thomas Preud'homme gcc/ PR target/85261 * config/arm/arm-builtins.c (arm_expand_builtin): Force input operand into register. gcc/testsuite/ PR target/85261 * gcc.target/arm/fpscr.c: Add call to __builtin_arm_set_fpscr with literal value. Expect 2 MCR instruction. Fix function prototype. Remove volatile keyword. Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/config/arm/arm-builtins.c branches/gcc-6-branch/gcc/testsuite/ChangeLog branches/gcc-6-branch/gcc/testsuite/gcc.target/arm/fpscr.c
[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 Thomas Preud'homme changed: What|Removed |Added Status|ASSIGNED|RESOLVED Known to work||6.4.1 Resolution|--- |FIXED Known to fail|6.4.1 |6.4.0 --- Comment #7 from Thomas Preud'homme --- Fixed in all release branches.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #5 from Thomas Preud'homme --- (In reply to Wilco from comment #4) > Clearly rematerialization isn't working correctly. Immediates and constant > addresses like this should never be spilled (using MOV/MOVK could increase > codesize, but with -Os you should use the literal pool anyway). Check > legitimate_constant_p returns true, see > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00052.html for how it's done > on AArch64. By the time LRA happens, this is what LRA sees of the address computation: (insn 321 6 322 2 (set (reg:SI 274) (unspec:SI [ (symbol_ref:SI ("__stack_chk_guard") [flags 0xc0] ) ] UNSPEC_PIC_SYM)) "test_arm_sp_fpic.c":41 181 {pic_load_addr_32bit} (expr_list:REG_EQUIV (unspec:SI [ (symbol_ref:SI ("__stack_chk_guard") [flags 0xc0] ) ] UNSPEC_PIC_SYM) (nil))) (insn 322 321 11 2 (set (reg/f:SI 175) (mem:SI (plus:SI (reg:SI 176) (reg:SI 274)) [0 S4 A32])) "test_arm_sp_fpic.c":41 876 {*thumb2_movsi_insn} (expr_list:REG_DEAD (reg:SI 274) (expr_list:REG_DEAD (reg:SI 176) (expr_list:REG_EQUIV (symbol_ref:SI ("__stack_chk_guard") [flags 0xc0] ) (nil) It is this 175 which is spilled because LRA sees it doesn't have a register of the right class to allocate that pseudo. Because it's a MEM it doesn't see it as a constant expression and thus does not rematerialize it.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #6 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #3) > > My feeling is that the target patterns should also do the address > computation, ie stack_protect_set and stack_protect_test would take that MEM > of symbol_ref instead of expanding it first. The more I think about it the more I think it's the only way to guarantee the guard's address does not end up in a register that could be spilled. The spilling itself is more likely to happen when reading the GOT entry that holds the guard's address is not represented by an UNSPEC because then the address computed when setting the canari is reused when doing the comparison. UNSPEC seems to prevent that (even though it's not UNSPEC_VOLATILE).
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #7 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #6) > (In reply to Thomas Preud'homme from comment #3) > > > > My feeling is that the target patterns should also do the address > > computation, ie stack_protect_set and stack_protect_test would take that MEM > > of symbol_ref instead of expanding it first. > > The more I think about it the more I think it's the only way to guarantee > the guard's address does not end up in a register that could be spilled. The > spilling itself is more likely to happen when reading the GOT entry that > holds the guard's address is not represented by an UNSPEC because then the > address computed when setting the canari is reused when doing the > comparison. UNSPEC seems to prevent that (even though it's not > UNSPEC_VOLATILE). I'm experimenting with a patch to mark the MEM to access the GOT entry as volatile in ARM backend in order to prevent CSE. It works on this PR's testcase so will give bootstrap and regression testing a try. As I said, this doesn't fully solve the underlying issue IMO but together with implementation of stack_protect_set and stack_protect_test in ARM it should make stack protector as reliable on ARM targets as on AArch64.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #8 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #7) > (In reply to Thomas Preud'homme from comment #6) > > (In reply to Thomas Preud'homme from comment #3) > > > > > > My feeling is that the target patterns should also do the address > > > computation, ie stack_protect_set and stack_protect_test would take that > > > MEM > > > of symbol_ref instead of expanding it first. > > > > The more I think about it the more I think it's the only way to guarantee > > the guard's address does not end up in a register that could be spilled. The > > spilling itself is more likely to happen when reading the GOT entry that > > holds the guard's address is not represented by an UNSPEC because then the > > address computed when setting the canari is reused when doing the > > comparison. UNSPEC seems to prevent that (even though it's not > > UNSPEC_VOLATILE). > > I'm experimenting with a patch to mark the MEM to access the GOT entry as > volatile in ARM backend in order to prevent CSE. It works on this PR's > testcase so will give bootstrap and regression testing a try. As I said, > this doesn't fully solve the underlying issue IMO but together with > implementation of stack_protect_set and stack_protect_test in ARM it should > make stack protector as reliable on ARM targets as on AArch64. Found out the approach needs further changes for Thumb-1 because a PIC access is done in several instructions. I'm addressing those at the moment.
[Bug target/85401] segfault building code for VAX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85401 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-04-25 Ever confirmed|0 |1
[Bug target/85401] segfault building code for VAX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85401 Thomas Preud'homme changed: What|Removed |Added Status|ASSIGNED|UNCONFIRMED CC||m...@3am-software.com Assignee|thopre01 at gcc dot gnu.org|unassigned at gcc dot gnu.org Ever confirmed|1 |0 --- Comment #2 from Thomas Preud'homme --- Actually given where the fault occurs this does not seem to be related to bswap. Might be a missing pattern or cost in the vax backend.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #9 from Thomas Preud'homme --- Managed to reach a state where nothing is spilled on the stack for Thumb-1 either. I want to do 3 more changes before I start full testing: - put some compiler barrier between address computation and canari setting / canari check to ensure no instruction is scheduled between the two - clarify in documentation that it's the target's responsability to ensure guard's address computation is not CSE (in particular for PIC access) - tighten scan pattern for testcase a bit more
[Bug lto/69866] lto1: internal compiler error: in add_symbol_to_partition_1, at lto/lto-partition.c:158
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69866 Thomas Preud'homme changed: What|Removed |Added Known to work||8.1.0 --- Comment #15 from Thomas Preud'homme --- Fixed in r249224 and thus fixed on 8.1
[Bug rtl-optimization/71878] ICE in cselib_record_set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71878 Thomas Preud'homme changed: What|Removed |Added Status|NEW |RESOLVED Known to work||6.3.0 Resolution|--- |FIXED Known to fail|6.2.1 |6.2.0 --- Comment #6 from Thomas Preud'homme --- Backported to GCC 6 as r240257, therefore fixed on all currently supported releases.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #11 from Thomas Preud'homme --- I've started to work on a new patch according to review feedbacks. I've reached the stage where I can compile without -fPIC with the stack protect test being an UNSPEC split after register allocation as suggested. Next steps are: 1) do the same for the stack protect set (ie setting the canari) 2) add support for PIC access to the guard 3) include the conditional branch in the combined stack protect test to ensure the register holding the result of the comparison is not spilled before it's used for the conditional branch 4) clear all registers involved before branching 5) cleanup the patch
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #12 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #11) > I've started to work on a new patch according to review feedbacks. I've > reached the stage where I can compile without -fPIC with the stack protect > test being an UNSPEC split after register allocation as suggested. > > Next steps are: > > 1) do the same for the stack protect set (ie setting the canari) Done > 3) include the conditional branch in the combined stack protect test to > ensure the register holding the result of the comparison is not spilled > before it's used for the conditional branch Done > 5) cleanup the patch In progress > 2) add support for PIC access to the guard > 4) clear all registers involved before branching TODO.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #13 from Thomas Preud'homme --- Remains now: 1) add support for PIC access to the guard 2) finish cleanup of the patch
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #14 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #13) > Remains now: > > 1) add support for PIC access to the guard > 2) finish cleanup of the patch Except for a few missing comments, it's all there. I'll start testing now.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #15 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #14) > (In reply to Thomas Preud'homme from comment #13) > > Remains now: > > > > 1) add support for PIC access to the guard > > 2) finish cleanup of the patch > > Except for a few missing comments, it's all there. I'll start testing now. I'm currently working on fixing the ICE found during testing.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 Thomas Preud'homme changed: What|Removed |Added Alias||CVE-2018-12886 --- Comment #16 from Thomas Preud'homme --- Adding CVE number
[Bug target/84272] [8 Regression] AddressSanitizer: heap-use-after-free ../../gcc/config/aarch64/cortex-a57-fma-steering.c:519 in fma_node::get_parity()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84272 --- Comment #7 from Thomas Preud'homme --- Hi Jakub, First of all, thanks for beating me to it. Wanted to look at it but am fighting strong headache since Wednesday. Regarding the second patch, I believe it is the right approach but found another bug in the surrounding lines which I think should be fixed in the same patch: /* Absence of children might indicate an alternate root of a *chain*. It's ok to skip it here as the chain will be renamed when processing the canonical root for that chain. */ if (node->get_children ()->empty ()) continue; Comment 1: I advocate removing this block altogether. This block prevents node from being deleted when it is a leaf. The comment is also completely outdated and refer to when the dfs was done in the same function as the renaming. Comment 2: Explain why deferred freeing is necessary on top of the line adding the node to to_free. Something along the lines of: /* Defer freeing so that the process_node callback can access the parent and children of the node being processed. */ Best regards, Thomas
[Bug target/84272] [8 Regression] AddressSanitizer: heap-use-after-free ../../gcc/config/aarch64/cortex-a57-fma-steering.c:519 in fma_node::get_parity()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84272 --- Comment #10 from Thomas Preud'homme --- (In reply to Jakub Jelinek from comment #9) > Created attachment 43384 [details] > gcc8-pr84272.patch > > So like this? A few additional changes, Florian Weimer suggested using > preincrement instead of postincrement with the C++ iterators, and I hope > there are no pointers in between different forest, so it should be fine to > free already when processing of the forest is done, rather than waiting > until all forests are processed. > > Could somebody test it on aarch64-linux please? Yes I think it should be fine. Sorry, got to head back home again but Kyrill said he would be testing it.
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #17 from Thomas Preud'homme --- Patch has been posted for review: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00246.html
[Bug lto/69866] lto1: internal compiler error: in add_symbol_to_partition_1, at lto/lto-partition.c:158
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69866 Thomas Preud'homme changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|thopre01 at gcc dot gnu.org|unassigned at gcc dot gnu.org
[Bug lto/69866] lto1: internal compiler error: in add_symbol_to_partition_1, at lto/lto-partition.c:158
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69866 --- Comment #16 from Thomas Preud'homme --- @honza: would you mind backporting to GCC 7? IIRW GCC 6 backport is more tricky. Thanks!
[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673 --- Comment #6 from Thomas Preud'homme --- The following simpler testcase also shows the problem: void fn1() { register const int h asm("r2") = 1; __asm__(".ifnc %0,r2; .err; .endif\n\t" "bl __put_user_4" :: "r"(h)); }
[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673 --- Comment #7 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #6) > The following simpler testcase also shows the problem: > > void fn1() { > register const int h asm("r2") = 1; > __asm__(".ifnc %0,r2; .err; .endif\n\t" > "bl __put_user_4" :: "r"(h)); > } The register label gets optimized during gimple stages.
[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673 --- Comment #8 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #7) > (In reply to Thomas Preud'homme from comment #6) > > The following simpler testcase also shows the problem: > > > > void fn1() { > > register const int h asm("r2") = 1; > > __asm__(".ifnc %0,r2; .err; .endif\n\t" > > "bl __put_user_4" :: "r"(h)); > > } > > The register label gets optimized during gimple stages. Dump for original already shows propagation of the constant has happened which later lead to the removal of the register declaration: ;; Function fn1 (null) ;; enabled by -tree-original { register const int h __asm__ (*r2) = 1; register const int h __asm__ (*r2) = 1; __asm__ __volatile__(".ifnc %0,r2; .err; .endif\n\tbl\t__put_user_4"::"r" 1); }
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #18 from Thomas Preud'homme --- Author: thopre01 Date: Thu Aug 2 09:07:17 2018 New Revision: 263245 URL: https://gcc.gnu.org/viewcvs?rev=263245&root=gcc&view=rev Log: [ARM] Fix PR85434: spilling of stack protector guard's address on ARM In case of high register pressure in PIC mode, address of the stack protector's guard can be spilled on ARM targets as shown in PR85434, thus allowing an attacker to control what the canary would be compared against. This is also known as CVE-2018-12886. ARM does lack stack_protect_set and stack_protect_test insn patterns, defining them does not help as the address is expanded regularly and the patterns only deal with the copy and test of the guard with the canary. This problem does not occur for x86 targets because the PIC access and the test can be done in the same instruction. Aarch64 is exempt too because PIC access insn pattern are mov of UNSPEC which prevents it from the second access in the epilogue being CSEd in cse_local pass with the first access in the prologue. The approach followed here is to create new "combined" set and test standard pattern names that take the unexpanded guard and do the set or test. This allows the target to use an opaque pattern (eg. using UNSPEC) to hide the individual instructions being generated to the compiler and split the pattern into generic load, compare and branch instruction after register allocator, therefore avoiding any spilling. This is here implemented for the ARM targets. For targets not implementing these new standard pattern names, the existing stack_protect_set and stack_protect_test pattern names are used. To be able to split PIC access after register allocation, the functions had to be augmented to force a new PIC register load and to control which register it loads into. This is because sharing the PIC register between prologue and epilogue could lead to spilling due to CSE again which an attacker could use to control what the canary gets compared against. 2018-08-02 Thomas Preud'homme gcc/ PR target/85434 * target-insns.def (stack_protect_combined_set): Define new standard pattern name. (stack_protect_combined_test): Likewise. * cfgexpand.c (stack_protect_prologue): Try new stack_protect_combined_set pattern first. * function.c (stack_protect_epilogue): Try new stack_protect_combined_test pattern first. * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now parameters to control which register to use as PIC register and force reloading PIC register respectively. Insert in the stream of insns if possible. (legitimize_pic_address): Expose above new parameters in prototype and adapt recursive calls accordingly. (arm_legitimize_address): Adapt to new legitimize_pic_address prototype. (thumb_legitimize_address): Likewise. (arm_emit_call_insn): Adapt to new require_pic_register prototype. * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype change. * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address prototype change. (stack_protect_combined_set): New insn_and_split pattern. (stack_protect_set): New insn pattern. (stack_protect_combined_test): New insn_and_split pattern. (stack_protect_test): New insn pattern. * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. (UNSPEC_SP_TEST): Likewise. * doc/md.texi (stack_protect_combined_set): Document new standard pattern name. (stack_protect_set): Clarify that the operand for guard's address is legal. (stack_protect_combined_test): Document new standard pattern name. (stack_protect_test): Clarify that the operand for guard's address is legal. gcc/testsuite/ PR target/85434 * gcc.target/arm/pr85434.c: New test. Added: trunk/gcc/testsuite/gcc.target/arm/pr85434.c Modified: trunk/gcc/ChangeLog trunk/gcc/cfgexpand.c trunk/gcc/config/arm/arm-protos.h trunk/gcc/config/arm/arm.c trunk/gcc/config/arm/arm.md trunk/gcc/config/arm/unspecs.md trunk/gcc/doc/md.texi trunk/gcc/function.c trunk/gcc/target-insns.def trunk/gcc/testsuite/ChangeLog
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #20 from Thomas Preud'homme --- (In reply to Christophe Lyon from comment #19) > Created attachment 44489 [details] > Source file causing ICE on aarch64 > > With your patch, GCC crashes with target aarch64-none-linux-gnu > aarch64-none-linux-gnu-gcc gethnamaddr.i -fstack-protector > > during RTL pass: expand > gethnamaddr.c: In function 'getanswer': > gethnamaddr.c:179:1: internal compiler error: in maybe_gen_insn, at > optabs.c:7307 > getanswer (const querybuf *answer, int anslen, const char *qname, int qtype) > ^ > 0xafcef2 maybe_gen_insn(insn_code, unsigned int, expand_operand*) > /home/christophe.lyon/src/GCC/sources/gcc-fsf/trunk/gcc/optabs.c:7307 > 0xaffb88 maybe_expand_insn(insn_code, unsigned int, expand_operand*) > /home/christophe.lyon/src/GCC/sources/gcc-fsf/trunk/gcc/optabs.c:7351 > 0x7748a0 stack_protect_prologue > > /home/christophe.lyon/src/GCC/sources/gcc-fsf/trunk/gcc/cfgexpand.c:6117 > 0x7748a0 execute > > /home/christophe.lyon/src/GCC/sources/gcc-fsf/trunk/gcc/cfgexpand.c:6357 > Please submit a full bug report, Thanks Christophe, It seems to have impacted x86 as well. I'll look at all those and respin the patch. I've reverted it in the meantime.
[Bug other/86834] [9 regression] several tests fail with ICE starting with r263245
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86834 --- Comment #2 from Thomas Preud'homme --- Thanks for the detailed report.
[Bug other/86834] [9 regression] several tests fail with ICE starting with r263245
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86834 Thomas Preud'homme changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #4 from Thomas Preud'homme --- As pointed by Jakub, commit was reverted in r263252. New iteration of the patch (currently still in testing) has been checked it doesn't have this regression. Best regards, Thomas
[Bug target/87374] [8/9 Regression] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87374 Thomas Preud'homme changed: What|Removed |Added Last reconfirmed|2018-09-21 00:00:00 |2018-9-25 Assignee|unassigned at gcc dot gnu.org |thopre01 at gcc dot gnu.org --- Comment #2 from Thomas Preud'homme --- Can reproduce.
[Bug target/87374] [8/9 Regression] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87374 Thomas Preud'homme changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #3 from Thomas Preud'homme --- The ICE occurs because of a mismatch between the movw/movt splitter for arm_disable_literal_pool (-mslow-flash-data) and the arm_movt instruction pattern. The latter is guarded by -mword-relocations being disabled (via arm_valid_symbolic_address_p) since it requires patching a 16-bit immediate but not the former. Adding a similar check in the splitter causes relocation error at assembly time though as GCC generate symbol+offset references instead of doing the add as a separate instructions.
[Bug target/87374] [8/9 Regression] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87374 --- Comment #4 from Thomas Preud'homme --- My approach was wrong, fundamentally -mslow-flash-data and -mword-relocations cannot both be in effect since there is then no way to load an address: - -mslow-flash-data forbids literal pools - -mword-relocations forbids MOVW/MOVT I've written a patch to make the two options conflicts. Testing it now.
[Bug target/87156] [9 Regression] ICE building libstdc++ for mips64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87156 Thomas Preud'homme changed: What|Removed |Added Last reconfirmed||2018-9-28 CC||thopre01 at gcc dot gnu.org --- Comment #2 from Thomas Preud'homme --- Just hit it again with yesterday's trunk on the compile farm.
[Bug target/87156] [9 Regression] ICE building libstdc++ for mips64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87156 --- Comment #5 from Thomas Preud'homme --- (In reply to Paul Hua from comment #4) > (In reply to Jan Hubicka from comment #3) > > Does the attached patch fix the bootstrap? > > Index: cgraphclones.c > > === > > --- cgraphclones.c (revision 264180) > > +++ cgraphclones.c (working copy) > > @@ -967,6 +967,8 @@ cgraph_node::create_version_clone_with_b > >SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); > >SET_DECL_RTL (new_decl, NULL); > > > > + DECL_VIRTUAL_P (new_decl) = 0; > > + > >/* When the old decl was a con-/destructor make sure the clone isn't. */ > >DECL_STATIC_CONSTRUCTOR (new_decl) = 0; > >DECL_STATIC_DESTRUCTOR (new_decl) = 0; > > Yes, fixed. Thanks. Likewise for me on gcc23.
[Bug c++/79085] [6/7/8 Regression] ICE with placement new to unaligned location
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79085 --- Comment #8 from Thomas Preud'homme --- (In reply to Jakub Jelinek from comment #7) > Created attachment 43668 [details] > gcc8-pr79085.patch > > Untested fix. That fixes the original testcase for me. Thanks!
[Bug c++/79085] [6/7 Regression] ICE with placement new to unaligned location
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79085 --- Comment #11 from Thomas Preud'homme --- (In reply to Jakub Jelinek from comment #10) > Fixed for 8.1+ so far. Hi Jakub, Are you planning to do a backport? Best regards.
[Bug target/85203] New: cmse_nonsecure_caller intrinsic returns incorrect results
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85203 Bug ID: 85203 Summary: cmse_nonsecure_caller intrinsic returns incorrect results Product: gcc Version: 7.3.1 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: target Assignee: thopre01 at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi Hi, The cmse_nonsecure_caller intrinsic for Armv8-M Baseline and Mainline architecture returns true in almost all cases, ie. compiling and running the following test with arm-none-eabi-gcc -Os -mcmse -march=armv8-m.main will return an error: #include int foo (void) { return cmse_nonsecure_caller (); } int main (void) { /* Return success (0) if main is secure, ie if cmse_nonsecure_caller/foo returns false (0). */ return foo (); } Looking at the implementation of the associated __builtin_cmse_nonsecure_caller in gcc/config/arm/arm-builtins.c we can see why: * it performs an add instead of an and to get the lsb of LR * it does not negate the value to return true when the lsb is 0 This means that except for 0x it will always return true. I'm currently testing a patch as I write these lines.
[Bug target/85203] cmse_nonsecure_caller intrinsic returns incorrect results
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85203 --- Comment #1 from Thomas Preud'homme --- Author: thopre01 Date: Wed Apr 4 17:31:46 2018 New Revision: 259097 URL: https://gcc.gnu.org/viewcvs?rev=259097&root=gcc&view=rev Log: [ARM] Fix PR85203: cmse_nonsecure_caller returns wrong result __builtin_cmse_nonsecure_caller implementation returns true in almost all cases due to 2 separate bugs: * gen_addsi is used instead of gen_andsi to retrieve the lsb * the lsb boolean value is not negated but the specification says the intrinsic should return true for a nonsecure caller and a nonsecure caller is characterized with LR's lsb being 0 This was not caught due to (1) lack of runtime test and (2) the existing RTL scan not taking into account that '.' matches newline in Tcl regular expressions. This commit fixes the implementation issues and improves testing of cmse_nonsecure_caller by (1) adding a runtime test for the secure caller case and (2) looking for an SET insn of an AND expression in the right function. This leaves the nonsecure caller case only partly tested since the exact value being AND and the negation are not covered by the scan and the existing test infrastructure does not allow 2 separate compilation and link to be performed. It is enough though to catch the current incorrect behavior. The commit also reorganize the scan directives in cmse-1.c to more easily identify what function they are intended to test in the file. 2018-04-04 Thomas Preud'homme gcc/ PR target/85203 * config/arm/arm-builtins.c (arm_expand_builtin): Change expansion to perform a bitwise AND of the argument followed by a boolean negation of the result. gcc/testsuite/ PR target/85203 * gcc.target/arm/cmse/cmse-1.c: Tighten cmse_nonsecure_caller RTL scan to match a single insn of the baz function. Move scan directives at the end of the file below the functions they are trying to test for better readability. * gcc.target/arm/cmse/cmse-16.c: New testcase. Added: trunk/gcc/testsuite/gcc.target/arm/cmse/cmse-16.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm-builtins.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
[Bug target/85203] cmse_nonsecure_caller intrinsic returns incorrect results
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85203 Thomas Preud'homme changed: What|Removed |Added Known to work||8.0 Known to fail|8.0 | --- Comment #2 from Thomas Preud'homme --- Fixed in trunk
[Bug target/85261] New: __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 Bug ID: 85261 Summary: __builtin_arm_set_fpscr ICEs with constant input Product: gcc Version: 8.0.1 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: target Assignee: thopre01 at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi The following code ICES when compiled with -mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16 on trunk: void test_fpscr (void) { __builtin_arm_set_fpscr (0); }
[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-04-06 Ever confirmed|0 |1 Known to fail||8.0.1 --- Comment #1 from Thomas Preud'homme --- Have a patch, testing it.
[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 Thomas Preud'homme changed: What|Removed |Added Known to fail||6.4.1, 7.3.1 --- Comment #3 from Thomas Preud'homme --- (In reply to Ramana Radhakrishnan from comment #2) > What about earlier branches ? Oh yes, forgot the previous fpscr fix was also backported.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 --- Comment #6 from Thomas Preud'homme --- Happens at expand time. Diving in.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 Thomas Preud'homme changed: What|Removed |Added CC||thopre01 at gcc dot gnu.org --- Comment #7 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #6) > Happens at expand time. Diving in. There's a giant if in expand_expr_real_1 with the following comment: /* In cases where an aligned union has an unaligned object as a field, we might be extracting a BLKmode value from an integer-mode (e.g., SImode) object. Handle this case by doing the extract into an object as wide as the field (which we know to be the width of a basic mode), then storing into memory, and changing the mode to BLKmode. */ The "if" is entered in the big endian unaligned case but not in the other case. In the aligned case, it continues after the if until the call to flip_storage_order which will generate the bswap.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 --- Comment #8 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #7) > (In reply to Thomas Preud'homme from comment #6) > > Happens at expand time. Diving in. > > There's a giant if in expand_expr_real_1 with the following comment: > > /* In cases where an aligned union has an unaligned object >as a field, we might be extracting a BLKmode value from >an integer-mode (e.g., SImode) object. Handle this case >by doing the extract into an object as wide as the field >(which we know to be the width of a basic mode), then >storing into memory, and changing the mode to BLKmode. */ > > The "if" is entered in the big endian unaligned case but not in the other > case. In the aligned case, it continues after the if until the call to > flip_storage_order which will generate the bswap. In the aligned case, the if is not taken because alignment of the memory Vs access is sufficient. There is provision to call flip_storage_order but only if the access is a RECORD and here the mode class is INT. Therefore unaligned access are handled by extract_bit_field. This in turns call extract_bit_field_1 and later extract_integral_bit_field where things are different between little endian and big endian. For little endian, it goes in the following if block: /* If OP0 is a memory, try copying it to a register and seeing if a cheap register alternative is available. */ if (MEM_P (op0) & !reverse) For big endian it continues and calls extract_fixed_bit_field. I'm wondering if an extra call to flip_storage_order when reverse is true would solve the issue. Need to understand better whe is it safe to call flip_storage_order.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 --- Comment #9 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #8) > (In reply to Thomas Preud'homme from comment #7) > > (In reply to Thomas Preud'homme from comment #6) > > > Happens at expand time. Diving in. > > > > There's a giant if in expand_expr_real_1 with the following comment: > > > > /* In cases where an aligned union has an unaligned object > >as a field, we might be extracting a BLKmode value from > >an integer-mode (e.g., SImode) object. Handle this case > >by doing the extract into an object as wide as the field > >(which we know to be the width of a basic mode), then > >storing into memory, and changing the mode to BLKmode. */ > > > > The "if" is entered in the big endian unaligned case but not in the other > > case. In the aligned case, it continues after the if until the call to > > flip_storage_order which will generate the bswap. > > In the aligned case, the if is not taken because alignment of the memory Vs > access is sufficient. There is provision to call flip_storage_order but only > if the access is a RECORD and here the mode class is INT. > > Therefore unaligned access are handled by extract_bit_field. This in turns > call extract_bit_field_1 and later extract_integral_bit_field where things > are different between little endian and big endian. For little endian, it > goes in the following if block: > > /* If OP0 is a memory, try copying it to a register and seeing if a > cheap register alternative is available. */ > if (MEM_P (op0) & !reverse) > > For big endian it continues and calls extract_fixed_bit_field. I'm wondering > if an extra call to flip_storage_order when reverse is true would solve the > issue. Need to understand better whe is it safe to call flip_storage_order. It gives me the expected assembly but I need to convince myself that this is always safe.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-10-09 Assignee|unassigned at gcc dot gnu.org |thopre01 at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #10 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #8) > (In reply to Thomas Preud'homme from comment #7) > > (In reply to Thomas Preud'homme from comment #6) > > > Happens at expand time. Diving in. > > > > There's a giant if in expand_expr_real_1 with the following comment: > > > > /* In cases where an aligned union has an unaligned object > >as a field, we might be extracting a BLKmode value from > >an integer-mode (e.g., SImode) object. Handle this case > >by doing the extract into an object as wide as the field > >(which we know to be the width of a basic mode), then > >storing into memory, and changing the mode to BLKmode. */ > > > > The "if" is entered in the big endian unaligned case but not in the other > > case. In the aligned case, it continues after the if until the call to > > flip_storage_order which will generate the bswap. > > In the aligned case, the if is not taken because alignment of the memory Vs > access is sufficient. There is provision to call flip_storage_order but only > if the access is a RECORD and here the mode class is INT. > > Therefore unaligned access are handled by extract_bit_field. This in turns > call extract_bit_field_1 and later extract_integral_bit_field where things > are different between little endian and big endian. For little endian, it > goes in the following if block: > > /* If OP0 is a memory, try copying it to a register and seeing if a > cheap register alternative is available. */ > if (MEM_P (op0) & !reverse) > > For big endian it continues and calls extract_fixed_bit_field. I'm wondering > if an extra call to flip_storage_order when reverse is true would solve the > issue. Need to understand better whe is it safe to call flip_storage_order. It gives me the expected assembly but I need to convince myself that this is always safe.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 --- Comment #12 from Thomas Preud'homme --- (In reply to Eric Botcazou from comment #11) > > Therefore unaligned access are handled by extract_bit_field. This in turns > > call extract_bit_field_1 and later extract_integral_bit_field where things > > are different between little endian and big endian. For little endian, it > > goes in the following if block: > > > > /* If OP0 is a memory, try copying it to a register and seeing if a > > cheap register alternative is available. */ > > if (MEM_P (op0) & !reverse) > > > > For big endian it continues and calls extract_fixed_bit_field. I'm wondering > > if an extra call to flip_storage_order when reverse is true would solve the > > issue. Need to understand better whe is it safe to call flip_storage_order. > > Where do you want to add a call to flip_storage_order exactly? The correct > thing to do would be to move the !reverse test from the top-level if to the > first inner if (in order to bypass only the extv business) and see what > happens next (and be prepared for adjusting adjust_bit_field_mem_for_reg to > reverse SSO). Forgive my naive question as I'm not too familiar with that part of the compiler: why should the get_best_mem_extraction_insn be guarded with reverse? I thought I'd just ad an if (reverse) if it succeeds and call flip_storage_order there, likewise after the call to extract_bit_field_1 below if successful.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 --- Comment #14 from Thomas Preud'homme --- (In reply to Eric Botcazou from comment #13) > > Forgive my naive question as I'm not too familiar with that part of the > > compiler: why should the get_best_mem_extraction_insn be guarded with > > reverse? I thought I'd just ad an if (reverse) if it succeeds and call > > flip_storage_order there, likewise after the call to extract_bit_field_1 > > below if successful. > > No, the numbering of bits depends on the endianness, i.e. you need to know > the endianness of the source to do a correct extraction. For example, if > you extract bit #2 - bit #9 of a structure in big-endian using HImode, then > you cannot do it in little-endian and just swap the bytes afterwards (as a > matter of fact, there is nothing to swap since the result is byte-sized). > The LE extraction is: > HImode load + HImode right_shift (2) > whereas the BE extraction is: > HImode load + HImode right_shift (6) > > The extv machinery cannot handle reverse SSO for the time being so the guard > is still needed for it in the general case; on the contrary, > extract_bit_field_1 can already and doesn't need an additional call to > flip_storage_order. > > Of course, for specific bitfields, typically verifying > simple_mem_bitfield_p, then you can extract in native order and do > flip_storage_order on the result. > > In other words, the extv path can be used as you envision, but only for > specific bitfields modeled on those accepted by simple_mem_bitfield_p, and > then the call to flip_storage_order will indeed be needed. Right makes sense. So I tried your suggestion (guard the first if with !reverse but not the second) and it didn't work. Problem as you suggested is adjust_bit_field_mem_for_reg which refuses to do an unaligned load (or rather bit_field_mode_iterator's next_mode method refuses). I think get_best_mem_extraction_insn does not have this problem because instead it just queries whether an instruction to do unaligned access exist. Are you aware of a reason why next_mode does not do the same?
[Bug target/87374] [8/9 Regression] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87374 --- Comment #5 from Thomas Preud'homme --- Author: thopre01 Date: Wed Oct 31 10:05:54 2018 New Revision: 265662 URL: https://gcc.gnu.org/viewcvs?rev=265662&root=gcc&view=rev Log: Fix PR87374: ICE with -mslow-flash-data and -mword-relocations GCC ICEs under -mslow-flash-data and -mword-relocations because there is no way to load an address, both literal pools and MOVW/MOVT being forbidden. This patch gives an error message when both options are specified by the user and adds the according dg-skip-if directives for tests that use either of these options. It also explicitely set the option when in PIC mode as per documentation rather than always check for target_word_relocation together with flag_pic. 2018-10-31 Thomas Preud'homme gcc/ PR target/87374 * config/arm/arm.c (arm_option_check_internal): Disable the combined use of -mslow-flash-data and -mword-relocations. (arm_option_override): Enable -mword-relocations if -fpic or -fPIC. * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for flag_pic. * doc/invoke.texi (-mword-relocations): Mention conflict with -mslow-flash-data. (-mslow-flash-data): Reciprocally. gcc/testsuite/ PR target/87374 * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and -mword-relocations would be passed when compiling the test. * gcc.target/arm/movsi_movt.c: Likewise. * gcc.target/arm/pr81863.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. * gcc.target/arm/tls-disable-literal-pool.c: Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm.c trunk/gcc/config/arm/arm.md trunk/gcc/doc/invoke.texi trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/arm/movdi_movt.c trunk/gcc/testsuite/gcc.target/arm/movsi_movt.c trunk/gcc/testsuite/gcc.target/arm/pr81863.c trunk/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c trunk/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c trunk/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c trunk/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c trunk/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c trunk/gcc/testsuite/gcc.target/arm/tls-disable-literal-pool.c
[Bug rtl-optimization/64616] Redundant ldr when accessing var inside and outside a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64616 Thomas Preud'homme changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Thomas Preud'homme --- (In reply to Martin Liška from comment #5) > Can the bug be marked as resolved? Yes, fixed in GCC 6 onward.
[Bug rtl-optimization/34503] Issues with constant/copy propagation implementation in gcse.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34503 Bug 34503 depends on bug 64616, which changed state. Bug 64616 Summary: Redundant ldr when accessing var inside and outside a loop https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64616 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug target/87374] [8/9 Regression] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87374 Thomas Preud'homme changed: What|Removed |Added Known to work||9.0 Known to fail|9.0 | --- Comment #7 from Thomas Preud'homme --- Done. Note that the fact that the ICE does not occur for GCC <= 7 suggests that -mslow-flash-data is not working as intended there since -mword-relocations disables relocations on MOVW/MOVT and -mslow-flash-data should disable all literal pool leaving nothing to do a relocatable load into register. These flags are just not meant to be used together.
[Bug target/87867] [7 regression] ICE on virtual destructor (-mlong-calls -ffunction-sections) on arm-none-eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87867 --- Comment #3 from Thomas Preud'homme --- Author: thopre01 Date: Wed Nov 21 16:50:37 2018 New Revision: 266348 URL: https://gcc.gnu.org/viewcvs?rev=266348&root=gcc&view=rev Log: 2018-11-21 Mihail Ionescu gcc/ PR target/87867 Backport from mainiline 2018-09-26 Eric Botcazou * config/arm/arm.c (arm_reorg): Skip Thumb reorg pass for thunks. (arm32_output_mi_thunk): Deal with long calls. gcc/testsuite/ PR target/87867 Backport from mainiline 2018-09-17 Eric Botcazou * g++.dg/other/thunk2a.C: New test. * g++.dg/other/thunk2b.C: Likewise. Added: branches/gcc-7-branch/gcc/testsuite/g++.dg/other/thunk1.C - copied, changed from r266330, branches/gcc-7-branch/gcc/testsuite/g++.dg/other/vthunk1.C branches/gcc-7-branch/gcc/testsuite/g++.dg/other/thunk2a.C branches/gcc-7-branch/gcc/testsuite/g++.dg/other/thunk2b.C Removed: branches/gcc-7-branch/gcc/testsuite/g++.dg/other/vthunk1.C Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/config/arm/arm.c branches/gcc-7-branch/gcc/testsuite/ChangeLog
[Bug target/85434] Address of stack protector guard spilled to stack on ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434 --- Comment #21 from Thomas Preud'homme --- Author: thopre01 Date: Thu Nov 22 14:46:17 2018 New Revision: 266379 URL: https://gcc.gnu.org/viewcvs?rev=266379&root=gcc&view=rev Log: PR85434: Prevent spilling of stack protector guard's address on ARM In case of high register pressure in PIC mode, address of the stack protector's guard can be spilled on ARM targets as shown in PR85434, thus allowing an attacker to control what the canary would be compared against. ARM does lack stack_protect_set and stack_protect_test insn patterns, defining them does not help as the address is expanded regularly and the patterns only deal with the copy and test of the guard with the canary. This problem does not occur for x86 targets because the PIC access and the test can be done in the same instruction. Aarch64 is exempt too because PIC access insn pattern are mov of UNSPEC which prevents it from the second access in the epilogue being CSEd in cse_local pass with the first access in the prologue. The approach followed here is to create new "combined" set and test standard pattern names that take the unexpanded guard and do the set or test. This allows the target to use an opaque pattern (eg. using UNSPEC) to hide the individual instructions being generated to the compiler and split the pattern into generic load, compare and branch instruction after register allocator, therefore avoiding any spilling. This is here implemented for the ARM targets. For targets not implementing these new standard pattern names, the existing stack_protect_set and stack_protect_test pattern names are used. To be able to split PIC access after register allocation, the functions had to be augmented to force a new PIC register load and to control which register it loads into. This is because sharing the PIC register between prologue and epilogue could lead to spilling due to CSE again which an attacker could use to control what the canary gets compared against. 2018-11-22 Thomas Preud'homme gcc/ PR target/85434 * target-insns.def (stack_protect_combined_set): Define new standard pattern name. (stack_protect_combined_test): Likewise. * cfgexpand.c (stack_protect_prologue): Try new stack_protect_combined_set pattern first. * function.c (stack_protect_epilogue): Try new stack_protect_combined_test pattern first. * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now parameters to control which register to use as PIC register and force reloading PIC register respectively. Insert in the stream of insns if possible. (legitimize_pic_address): Expose above new parameters in prototype and adapt recursive calls accordingly. Use pic_reg if non null instead of cached one. (arm_load_pic_register): Add pic_reg parameter and use it if non null. (arm_legitimize_address): Adapt to new legitimize_pic_address prototype. (thumb_legitimize_address): Likewise. (arm_emit_call_insn): Adapt to require_pic_register prototype change. (arm_expand_prologue): Adapt to arm_load_pic_register prototype change. (thumb1_expand_prologue): Likewise. * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype change. (arm_load_pic_register): Likewise. * config/arm/predicated.md (guard_addr_operand): New predicate. (guard_operand): New predicate. * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address prototype change. (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue prototype change. (stack_protect_combined_set): New expander.. (stack_protect_combined_set_insn): New insn_and_split pattern. (stack_protect_set_insn): New insn pattern. (stack_protect_combined_test): New expander. (stack_protect_combined_test_insn): New insn_and_split pattern. (arm_stack_protect_test_insn): New insn pattern. * config/arm/thumb1.md (thumb1_stack_protect_test_insn): New insn pattern. * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec. (UNSPEC_SP_TEST): Likewise. * doc/md.texi (stack_protect_combined_set): Document new standard pattern name. (stack_protect_set): Clarify that the operand for guard's address is legal. (stack_protect_combined_test): Document new standard pattern name. (stack_protect_test): Clarify that the operand for guard's address is legal. gcc/testsuite/ PR target/85434 * gcc.target/arm/pr85434.c: New test. Added: trunk/gcc/testsuite/gcc.target/arm/pr85434.c Modified: trunk/gcc/ChangeLog trunk/gcc/cfgexpand.c trunk/gcc/config/arm/arm-protos.h trunk/gcc/config/arm/arm.c trunk/gcc/config/arm/arm.md trunk/gcc/config/arm/predicates.md trunk/gcc/config/arm/thumb1.md trunk/gcc/config/arm/unspecs.md trunk/gcc/doc/md.texi trunk/gcc/function.c trunk/gcc/target-insns.def trunk/gcc/testsuite/ChangeLog
[Bug target/87883] [ARM] ICE: Segmentation fault in arm_regno_class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87883 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-11-22 CC||thopre01 at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |mihail.ionescu at arm dot com Ever confirmed|0 |1
[Bug target/88167] [ARM] Function __builtin_return_address returns invalid address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88167 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2018-11-23 CC||thopre01 at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |mihail.ionescu at arm dot com Ever confirmed|0 |1
[Bug testsuite/69586] New: FAIL: gcc.dg/uninit-21.c for target defaulting to short enum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69586 Bug ID: 69586 Summary: FAIL: gcc.dg/uninit-21.c for target defaulting to short enum Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org CC: rguenther at suse dot de Target Milestone: --- Target: arm-none-eabi Created attachment 37537 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37537&action=edit Dump of failing uninit1 pass Hi, The new gcc.dg/uninit-21.c fails on arm-none-eabi with the following error: gcc/testsuite/gcc.dg/uninit-21.c: In function 'yp_update': gcc/testsuite/gcc.dg/uninit-21.c:31:3: warning: 'master' may be used uninitialized in this function [-Wmaybe-uninitialized] FAIL: gcc.dg/uninit-21.c (test for bogus messages, line 31) The test succeeds if passing -fno-short-enum or giving RPC_CANTENCODEARGS a value to big to represent with an unsigned short (65536 or higher) due to the cast creating confusing. See attached dump of uninit1 pass. The second cast using a bitwise and with 255 starts from forwprop1. Best regards.
[Bug rtl-optimization/69764] [5 Regression] ICE on x86_64-linux-gnu at -O0 (in decompose, at rtl.h:2107)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69764 Thomas Preud'homme changed: What|Removed |Added CC||thopre01 at gcc dot gnu.org --- Comment #9 from Thomas Preud'homme --- Also not fixed on arm-none-eabi: gcc/testsuite/c-c++-common/pr69764.c:7:12: internal compiler error: in decompose, at rtl.h:2107 0x845f34e wi::int_traits >::decompose(long long*, unsigned int, std::pair const&) gcc/gcc/rtl.h:2105 0x845f34e wide_int_ref_storage > gcc/wide-int.h:936 0x845f34e generic_wide_int > gcc/wide-int.h:714 0x845f34e convert_modes(machine_mode, machine_mode, rtx_def*, int) gcc/expr.c:697 0x86c018f widen_operand gcc/optabs.c:208 0x86c69bd expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*, rtx_def*, int, optab_methods) gcc/optabs.c:1280 0x843aa46 expand_shift_1 gcc/expmed.c:2458 0x846c6a7 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) gcc/expr.c:9029 0x833a950 expand_gimple_stmt_1 gcc/cfgexpand.c:3642 0x833a950 expand_gimple_stmt gcc/cfgexpand.c:3702 0x833bea0 expand_gimple_basic_block gcc/cfgexpand.c:5708 0x8341e48 execute gcc/cfgexpand.c:6323
[Bug testsuite/69371] UNRESOLVED: special_functions/18_riemann_zeta/check_value.cc compilation failed to produce executable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69371 Thomas Preud'homme changed: What|Removed |Added Status|NEW |RESOLVED Known to work||6.0 Resolution|--- |FIXED --- Comment #7 from Thomas Preud'homme --- Fixed as of r233391.
[Bug testsuite/68632] FAIL: gcc.target/arm/lto/pr65837 c_lto_pr65837_0.o assemble, -flto -mfpu=neon
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68632 Thomas Preud'homme changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Known to fail||6.0 --- Comment #1 from Thomas Preud'homme --- Fixed as of r232403
[Bug testsuite/69076] FAIL: gcc.dg/graphite/id-28.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69076 Thomas Preud'homme changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Known to work||6.0 Resolution|--- |FIXED --- Comment #1 from Thomas Preud'homme --- Fixed as of r231838
[Bug testsuite/69586] [6 Regression] FAIL: gcc.dg/uninit-21.c for target defaulting to short enum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69586 --- Comment #2 from Thomas Preud'homme --- Hi Richard, Any progress on this?
[Bug testsuite/69586] [6 Regression] FAIL: gcc.dg/uninit-21.c for target defaulting to short enum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69586 --- Comment #8 from Thomas Preud'homme --- Thanks!
[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196 Thomas Preud'homme changed: What|Removed |Added CC||thopre01 at gcc dot gnu.org --- Comment #19 from Thomas Preud'homme --- Created attachment 37850 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37850&action=edit vrp1 dump for arm-none-eabi targets Same FAIL on arm-none-eabi targets. Please find attached vrp1 dump.
[Bug target/63503] [AArch64] A57 executes fused multiply-add poorly in some situations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503 Thomas Preud'homme changed: What|Removed |Added Status|WAITING |RESOLVED CC||thopre01 at gcc dot gnu.org Resolution|--- |FIXED --- Comment #26 from Thomas Preud'homme --- Fixed as of r222512
[Bug testsuite/70227] New: pr69589 does not check for -rdynamic availability
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70227 Bug ID: 70227 Summary: pr69589 does not check for -rdynamic availability Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi g++.dg/lto/pr69589 fails on trunk for arm-none-eabi target with the following error: unrecognized command line option '-rdynamic' The reason is that rdynamic is not an available option for bare metal targets. The test could add an effective target requirement to avoid creating a FAIL in such cases or possibly add an xfail for *-none-*.
[Bug target/70232] [6 regression] excessive stack usage with -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232 --- Comment #9 from Thomas Preud'homme --- (In reply to Richard Biener from comment #7) > The bswap pass could learn to split this up into two loads and bswaps and > combine the results with a single shift and or. I'll add it to the bswap TODO list. Thanks.
[Bug testsuite/70553] New: pr70496.c should exclude Thumb only targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70553 Bug ID: 70553 Summary: pr70496.c should exclude Thumb only targets Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org CC: ramana.radhakrishnan at arm dot com Target Milestone: --- Target: arm-none-eabi gcc.target/arm/pr70496.c fails with the following error when targeting Cortex-M3: "selected processor does not support ARM opcodes" Since the test contains a .arm directive, it should skip Thumb only devices.
[Bug testsuite/70553] pr70496.c should exclude Thumb only targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70553 --- Comment #1 from Thomas Preud'homme --- Author: thopre01 Date: Thu Apr 7 16:19:20 2016 New Revision: 234811 URL: https://gcc.gnu.org/viewcvs?rev=234811&root=gcc&view=rev Log: 2016-04-07 Thomas Preud'homme gcc/testsuite/ PR testsuite/70553 * gcc.target/arm/pr70496.c: Also require arm_arm_ok effective target. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/arm/pr70496.c
[Bug testsuite/70553] pr70496.c should exclude Thumb only targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70553 --- Comment #3 from Thomas Preud'homme --- (In reply to Ramana Radhakrishnan from comment #2) > Fixed then ? Yes, sorry.
[Bug libstdc++/71081] New: experimental/memory_resource/1.cc run for targets without atomics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71081 Bug ID: 71081 Summary: experimental/memory_resource/1.cc run for targets without atomics Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org CC: redi at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi Hi, experimental/memory_resource/1.cc test in libstdc++ is run for target without atomic support (such as ARMv6-M devices) and thus fail with the following error: ccAS6AZQ.o: In function `std::__atomic_base::exchange(std::experimental::fundamentals_v2::pmr::memory_resource*, std::memory_order)':^M libstdc++-v3/include/bits/atomic_base.h:730: undefined reference to `__atomic_exchange_4'^M collect2: error: ld returned 1 exit status^M compiler exited with status 1 output is: ccAS6AZQ.o: In function `std::__atomic_base::exchange(std::experimental::fundamentals_v2::pmr::memory_resource*, std::memory_order)':^M libstdc++-v3/include/bits/atomic_base.h:730: undefined reference to `__atomic_exchange_4'^M collect2: error: ld returned 1 exit status^M FAIL: experimental/memory_resource/1.cc (test for excess errors) Excess errors: libstdc++-v3/include/bits/atomic_base.h:730: undefined reference to `__atomic_exchange_4' collect2: error: ld returned 1 exit status UNRESOLVED: experimental/memory_resource/1.cc compilation failed to produce executable The error seems to come from the experimental/memory_resource file included in that test. I believe the test should have an additional dg-require-atomic-builtins dejagnu directive.
[Bug libstdc++/71081] experimental/memory_resource/1.cc run for targets without atomics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71081 --- Comment #3 from Thomas Preud'homme --- Author: thopre01 Date: Fri May 20 12:36:57 2016 New Revision: 236507 URL: https://gcc.gnu.org/viewcvs?rev=236507&root=gcc&view=rev Log: 2016-05-20 Thomas Preud'homme PR libstdc++/71081 * testsuite/experimental/memory_resource/1.cc: Add required argument to dg-require-atomic-builtins. Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/testsuite/experimental/memory_resource/1.cc
[Bug tree-optimization/71490] [7 regression] gcc.dg/tree-ssa/slsr-8.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71490 Thomas Preud'homme changed: What|Removed |Added CC||thopre01 at gcc dot gnu.org --- Comment #2 from Thomas Preud'homme --- The FAIL started at r237185.
[Bug tree-optimization/71490] [7 regression] gcc.dg/tree-ssa/slsr-8.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71490 --- Comment #3 from Thomas Preud'homme --- Differences start at sink phase: @@ -46,17 +56,17 @@ f (int s, int * c) _2 = a1.0_1 * 4; _3 = -_2; x1_14 = c_13(D) + _3; - a2_15 = s_11(D) * 4; - a2.1_4 = (unsigned int) a2_15; - _5 = a2.1_4 * 4; - _6 = -_5; - x2_16 = c_13(D) + _6; if (x1_14 != 0B) goto ; else goto ; : + a2_15 = s_11(D) * 4; + a2.1_4 = (unsigned int) a2_15; + _5 = a2.1_4 * 4; + _6 = -_5; + x2_16 = c_13(D) + _6; goto ; :
[Bug c++/81942] New: ICE on empty constexpr constructor with C++14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81942 Bug ID: 81942 Summary: ICE on empty constexpr constructor with C++14 Product: gcc Version: 8.0 Status: UNCONFIRMED Keywords: ice-on-invalid-code Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi Hi, The following testcase ICEs on arm-none-eabi targets when compiled for C++14: ** foo.cc ** class A { public: constexpr A() { return; } }; A mwi; %arm-none-eabi-gcc -c -std=c++14 foo.cc foo.cc:8:3: in constexpr expansion of 'mwi.A::A()' foo.cc:8:3: internal compiler error: in cxx_eval_constant_expression, at cp/constexpr.c:4556 A mwi; ^~~ 0x83d1f1 cxx_eval_constant_expression gcc/cp/constexpr.c:4556 0x839e25 cxx_eval_statement_list gcc/cp/constexpr.c:3770 0x83cdc8 cxx_eval_constant_expression gcc/cp/constexpr.c:4497 0x83ce47 cxx_eval_constant_expression gcc/cp/constexpr.c:4503 0x839e25 cxx_eval_statement_list gcc/cp/constexpr.c:3770 0x83cdc8 cxx_eval_constant_expression gcc/cp/constexpr.c:4497 0x830d1b cxx_eval_call_expression gcc/cp/constexpr.c:1661 0x83ae52 cxx_eval_constant_expression gcc/cp/constexpr.c:4035 0x83d7ec cxx_eval_outermost_constant_expr gcc/cp/constexpr.c:4668 0x83ea0a maybe_constant_init(tree_node*, tree_node*) gcc/cp/constexpr.c:4990 0x90602e expand_default_init gcc/cp/init.c:1869 0x90655a expand_aggr_init_1 gcc/cp/init.c:1972 0x905264 build_aggr_init(tree_node*, tree_node*, int, int) gcc/cp/init.c:1713 0x899e1d build_aggr_init_full_exprs gcc/cp/decl.c:6094 0x89ac40 check_initializer gcc/cp/decl.c:6242 0x89e482 cp_finish_decl(tree_node*, tree_node*, bool, tree_node*, int) gcc/cp/decl.c:6961 0x992889 cp_parser_init_declarator gcc/cp/parser.c:19674 0x985cd0 cp_parser_simple_declaration gcc/cp/parser.c:13059 0x9857dc cp_parser_block_declaration gcc/cp/parser.c:12877 0x98553b cp_parser_declaration gcc/cp/parser.c:12774 Best regards.
[Bug c++/81942] ICE on empty constexpr constructor with C++14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81942 Thomas Preud'homme changed: What|Removed |Added Keywords|ice-on-invalid-code |ice-on-valid-code --- Comment #2 from Thomas Preud'homme --- (In reply to TC from comment #1) > Why is this tagged ice-on-invalid? The test case is valid C++14. My bad, got confused by the error message when building for C++11.
[Bug target/81909] [nvptx] Missing warning in gcc.dg/pr53037-{2,3}.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81909 Thomas Preud'homme changed: What|Removed |Added Target|nvptx |nvptx, arm-none-eabi Status|UNCONFIRMED |NEW Last reconfirmed||2017-08-24 CC||thopre01 at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Thomas Preud'homme --- Same issue on arm-none-eabi targets
[Bug c++/81942] ICE on empty constexpr constructor with C++14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81942 --- Comment #7 from Thomas Preud'homme --- Hi Paolo, Thanks for working on this. (In reply to Paolo Carlini from comment #6) > It would be nice if somebody with a fully functional ARM toolchain could > check whether something like the below at least avoids the ICE without > causing any regressions. I hope it does. > > Index: constexpr.c > === > --- constexpr.c (revision 251553) > +++ constexpr.c (working copy) > @@ -3679,7 +3679,7 @@ breaks (tree *jump_target) > { >return *jump_target > && ((TREE_CODE (*jump_target) == LABEL_DECL > - && LABEL_DECL_BREAK (*jump_target)) > + && (LABEL_DECL_BREAK (*jump_target) || DECL_ARTIFICIAL (*jump_target))) > || TREE_CODE (*jump_target) == EXIT_EXPR); > } The patch works for me on the testcase I provided, that is compilation process returns a success error code instead of ICEing. Best regards.
[Bug c++/81942] ICE on empty constexpr constructor with C++14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81942 --- Comment #15 from Thomas Preud'homme --- (In reply to Jakub Jelinek from comment #14) > The patch LGTM, but I'll defer the final say to Jason/Nathan. FYI I tried the patch on arm-none-eabi toolchain and it fixes the bug reported. Best regards.
[Bug c++/78269] FAIL: FAIL: g++.dg/cpp1z/noexcept-type11.C and FAIL: g++.dg/cpp1z/noexcept-type9.C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78269 --- Comment #6 from Thomas Preud'homme --- (In reply to Paolo Carlini from comment #5) > Thomas, both trunk and gcc-7-brnanch seem fine to me, I'm tempted to close > the bug. Can you double check the released 7.1.0 on aarch64? Hi Paolo, yes I can confirm g++.dg/cpp1z/noexcept-type11.C passes on both ARM and Aarch64 on trunk and latest GCC 7. This bug can be closed.
[Bug testsuite/82120] New: FAIL: gcc.dg/tree-ssa/pr81588.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82120 Bug ID: 82120 Summary: FAIL: gcc.dg/tree-ssa/pr81588.c Product: gcc Version: 7.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: thopre01 at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi Hi, gcc.dg/tree-ssa/pr81588.c scan-tree-dump-times fails on GCC 7 (but not on trunk) for Arm Cortex-M0, Cortex-M3 and Cortex-M4 cores. It does work when targeting Arm Cortex-M7 though. For the affected target, the string searched is not found in the dump. Best regards.
[Bug c++/78269] FAIL: FAIL: g++.dg/cpp1z/noexcept-type11.C and FAIL: g++.dg/cpp1z/noexcept-type9.C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78269 --- Comment #9 from Thomas Preud'homme --- (In reply to Paolo Carlini from comment #8) > Confirmed that 7.1.0 is fine. It is indeed. As can be seen in my comment #4, this was fixed somewhere before 14th of November 2016, well before GCC 7 release. Best regards.
[Bug c++/78269] FAIL: FAIL: g++.dg/cpp1z/noexcept-type11.C and FAIL: g++.dg/cpp1z/noexcept-type9.C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78269 --- Comment #10 from Thomas Preud'homme --- (In reply to Thomas Preud'homme from comment #9) > (In reply to Paolo Carlini from comment #8) > > Confirmed that 7.1.0 is fine. > > It is indeed. As can be seen in my comment #4, this was fixed somewhere > before 14th of November 2016, well before GCC 7 release. > > Best regards. Exact revision that fixed the bug on ARM is: r242026
[Bug testsuite/82120] FAIL: gcc.dg/tree-ssa/pr81588.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82120 --- Comment #3 from Thomas Preud'homme --- (In reply to Jakub Jelinek from comment #2) > This is a total mess. I've copied a boilerplate from other tests that > require branch cost of 2. > I guess > 2017-09-06 Jakub Jelinek > > PR testsuite/82120 > * gcc.dg/tree-ssa/pr81588.c: Don't run on logical_op_short_circuit > targets. > > --- gcc/testsuite/gcc.dg/tree-ssa/pr81588.c.jj2017-08-02 > 12:10:36.0 > +0200 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr81588.c 2017-09-06 16:09:57.373505808 > +0200 > @@ -1,5 +1,5 @@ > /* PR tree-optimization/81588 */ > -/* { dg-do compile { target { ! "m68k*-*-* mmix*-*-* bfin*-*-* v850*-*-* > moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* xtensa*-*-* > hppa*-*-* nios2*-*-*" } } } */ > +/* { dg-do compile { target { ! { logical_op_short_circuit || { m68k*-*-* > bfin*-*-* v850*-*-* moxie*-*-* m32c*-*-* fr30*-*-* mcore*-*-* xtensa*-*-* > hppa*-*-* } } } } } */ > /* { dg-options "-O2 -fdump-tree-reassoc1-details" } */ > /* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* > s390*-*-* i?86-*-* x86_64-*-* } } */ > > could fix that, but that will mean it will be untested even on mips, avr and > s390 when it could be tested there. > > So perhaps better: > 2017-09-06 Jakub Jelinek > > PR testsuite/82120 > * gcc.dg/tree-ssa/pr81588.c: Don't run on logical_op_short_circuit > targets > except for those where -mbranch-cost=2 is supported. > > --- gcc/testsuite/gcc.dg/tree-ssa/pr81588.c.jj2017-08-02 > 12:10:36.0 > +0200 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr81588.c 2017-09-06 16:17:30.563118589 > +0200 > @@ -1,5 +1,5 @@ > /* PR tree-optimization/81588 */ > -/* { dg-do compile { target { ! "m68k*-*-* mmix*-*-* bfin*-*-* v850*-*-* > moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* xtensa*-*-* > hppa*-*-* nios2*-*-*" } } } */ > +/* { dg-do compile { target { ! { { logical_op_short_circuit && { ! > "mips*-*-* avr-*-* s390*-*-*" } } || { m68k*-*-* bfin*-*-* v850*-*-* > moxie*-*-* m32c*-*-* fr30*-*-* mcore*-*-* xtensa*-*-* hppa*-*-* } } } } } */ > /* { dg-options "-O2 -fdump-tree-reassoc1-details" } */ > /* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* > s390*-*-* i?86-*-* x86_64-*-* } } */ > > > Can you test it on the various arm configurations (just make check-gcc > RUNTESTFLAGS=tree-ssa.exp=pr81588.c)? > Wonder about the targets that aren't included in logical_op_short_circuit, > what they actually do and why they are in the lists in the other testcases. I've tested it for a variety of Arm Cortex-M and Cortex-A cores and it either PASS or is marked as UNSUPPORTED so looks good to me, thanks. Best regards.
[Bug rtl-optimization/80352] Improper reload of operands with equiv pseudo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80352 --- Comment #4 from Thomas Preud'homme --- (In reply to Vladimir Makarov from comment #1) > Thomas, it seems from your description the problem really exists. I tried > to reproduce the problem with the test you provided but I've failed. I used > today trunk. > > Could you provide more info (may be -mcpu/-march etc) or/and another test. Hi Vladimir, My phrasing is incorrect, the testcase I provided does not allow to reproduce the issue but allows to reproduce the conditions necessary for the issue. To actually run into the problem of reload loop you need to undo a couple of changes done to LRA recently that hide this problem. Specifically, you need to undo; * the changes made by r237277 and r238010 relating to the reject += 3 * the changes made by r246808 and r246854 relating to the other reject += 3 below the one mentioned above (note: this one seems redundant) Put a breakpoint on the badop = false in the CT_MEMORY case with the condition that the if is true because of equiv_substitution_p (ie the rest of the expression is false). Then the testcase with the modifications will make alternative 5 (memory) to be chosen and you'll see what happens in curr_insn_transform after the call to get_reload_reg. process_alt_operands will be called again and the reg alternative will fail to match for the register created by get_reload_reg because it was NO_REG and the same will happen again 1-2 other time. Note that without the modification above the memory alternative will have lower cost and thus is not chosen, which prevent the bug to occur. Best regards. if (CONST_POOL_OK_P (mode, op) || MEM_P (op) || REG_P (op) /* We can restore the equiv insn by a reload. */ || equiv_substition_p[nop]) badop = false;
[Bug rtl-optimization/80352] Improper reload of operands with equiv pseudo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80352 --- Comment #5 from Thomas Preud'homme --- FWIW, I've configured GCC with --target=arm-none-eabi --with-cpu=cortex-m7 --with-mode=thumb and then ran make all-gcc. I think it would work equally well without the cpu and mode given that these are given on the command line: arm-none-eabi-gcc -S -mcpu=cortex-m7 -mthumb -O2 foo.c -wrapper gdb,--args