[Bug c++/91297] New: ARM Cortex M0+ gets hard fault due to unexpected pointer content
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91297

Bug ID: 91297
Summary: ARM Cortex M0+ gets hard fault due to unexpected pointer content
Product: gcc
Version: 8.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: murat.ursavas at gmail dot com
Target Milestone: ---

Hello,

I'm not sure whether I can label this issue as a bug. Let me share my findings and let you decide what it is.

The library I have been using for three years was working just fine. Then I decided to switch manufacturers and cores (from Cortex M3 to Cortex M0+), and things started to get weird. The library now triggers a hard fault for no apparent reason. Hard faults are generally triggered by accessing a nullptr or a non-existent memory region, so the usual cause is a dangling pointer. I've checked for that, but everything looks fine.

Here's the minimum reproducible example.

*
#include <cstdint>

int main(void)
{
    uint8_t foo[15];
    void *buffer = foo;
    uint16_t calculated = 1234;
    uint16_t length = sizeof(foo);

    uint16_t *obtained = reinterpret_cast<uint16_t *>(&(reinterpret_cast<uint8_t *>(buffer)[length - 2]));
    if(calculated == *obtained)
    {
        return 0;
    }
    else
    {
        return -1;
    }
}
*

The disassembly generated for this simple main is below.

(g++ compiler options: -mcpu=cortex-m0plus -std=gnu++14 -g3 -O0 -ffunction-sections -fdata-sections -fno-exceptions -fno-rtti -fno-threadsafe-statics -fno-use-cxa-atexit -Wall -fstack-usage --specs=nano.specs -mfloat-abi=soft -mthumb)

Some irrelevant include and define options have been removed.
*
main():
08000108: push {r7, lr}
0800010a: sub sp, #32
0800010c: add r7, sp, #0
6 void *buffer = foo;
0800010e: adds r3, r7, #4
08000110: str r3, [r7, #28]
7 uint16_t calculated = 1234;
08000112: movs r1, #26
08000114: adds r3, r7, r1
08000116: ldr r2, [pc, #52] ; (0x800014c)
08000118: strh r2, [r3, #0]
8 uint16_t length = sizeof(foo);
0800011a: movs r0, #24
0800011c: adds r3, r7, r0
0800011e: movs r2, #15
08000120: strh r2, [r3, #0]
10 uint16_t *obtained = reinterpret_cast<uint16_t *>(&(reinterpret_cast<uint8_t *>(buffer)[length - 2]));
08000122: adds r3, r7, r0
08000124: ldrh r3, [r3, #0]
08000126: subs r3, #2
08000128: ldr r2, [r7, #28]
0800012a: adds r3, r2, r3
0800012c: str r3, [r7, #20]
11 if(calculated == *obtained)
0800012e: ldr r3, [r7, #20]
08000130: ldrh r3, [r3, #0] ;<<<<<< Hard Fault triggering line
08000132: adds r2, r7, r1
08000134: ldrh r2, [r2, #0]
08000136: cmp r2, r3
08000138: bne.n 0x800013e
13 return 0;
0800013a: movs r3, #0
0800013c: b.n 0x8000142
17 return -1;
0800013e: movs r3, #1
08000140: negs r3, r3
19 }
*

I've marked the line that triggers the hard fault. Here are the register contents just before that instruction executes:

*
r0 0x18 (Hex)
r1 0x1a (Hex)
r2 0x27d8 (Hex)
r3 0x27e5 (Hex)
r4 0x (Hex)
r5 0x (Hex)
r6 0x (Hex)
r7 0x27d4 (Hex)
r8 0x (Hex)
r9 0x (Hex)
r10 0x (Hex)
r11 0x (Hex)
r12 0x (Hex)
sp 0x27d4 (Hex)
lr 0x8000253 (Hex)
pc 0x8000130 (Hex)
xpsr 0x100 (Hex)
PRIMASK 0x0 (Hex)
BASEPRI 0x0 (Hex)
FAULTMASK 0x0 (Hex)
CONTROL 0x0 (Hex)
MSP 0x27d4 (Hex)
PSP 0xfffc (Hex)
*

r3 correctly holds the address that 'obtained' points to, and that address is valid on the STM32L011K4. But executing the instruction triggers a hard fault exception.

So from my point of view, the object code generated by the G++ compiler is fine. I'd be glad if you could check my analysis and help me find the root cause. Right now I suspect a weird core erratum.

Regards,
[Bug c++/91297] ARM Cortex M0+ gets hard fault with a valid LDRH instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91297

Murat Ursavaş changed:

What       |Removed |Added
Status     |WAITING |RESOLVED
Resolution |---     |WONTFIX

--- Comment #2 from Murat Ursavaş ---
Richard,

Thank you for pointing out the cause in a blink. You're right, the problem was indeed the unaligned access.

I'm leaving this here for future reference: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473m/dom1359731171041.html

Regards,
[Bug c++/87373] New: Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

Bug ID: 87373
Summary: Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: murat.ursavas at gmail dot com
Target Milestone: ---

Created attachment 44731 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44731&action=edit
Minimum Test Case

Hi,

I ran into weird behavior on my ARM MCU about a year ago and reported it on the launchpad page. In the meantime I tried to report the issue here, but your registration process took longer than expected, so I'm reporting it almost a year later.

We discussed the issue on this page: https://bugs.launchpad.net/gcc-arm-embedded/+bug/1738730

I think this looks like a bug, but with my limited knowledge of GCC internals, I don't want to be rude and call it one directly.

To summarize the discussion: with GCC versions from 5.2 to 7, the compiler generates unexpected code when handling packed structs. The behavior should be the same as in 4.9, but somehow it has been broken since then. Misaligned access should help, but this does not seem to be the case.

I've attached a minimum example which reproduces the issue.

About the seriousness: I cannot update my compiler and still use 4.9 for our products. This is an annoying drawback for me, because I know GCC has received many improvements since then. The IDE I use is still shipped with 4.9, and this is unacceptable from my point of view.

MCU Core: ARM Cortex M3
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #2 from Murat Ursavaş --- Hi Umesh, Could you test it with the following options: -g3 -gdwarf-2 -mcpu=cortex-m3 -mthumb -std=c++1y '-DDEBUG=1' -O0 -pedantic -Wall -Wextra -c -fmessage-length=0 -fno-rtti -fno-exceptions -mno-sched-prolog -fno-builtin -fpack-struct -fshort-enums -ffunction-sections -fdata-sections Thanks in advance.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #6 from Murat Ursavaş --- Hi Jonathan, I just wanted a dramatic entrance :) (There was a discussion about GCC bugzilla on reddit recently.) Of course it didn't take that long. But this is like missing a call: you would have answered at the time, but once you miss it, it becomes difficult to call back :)
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #9 from Murat Ursavaş --- Umesh, The reason is step-by-step debugging. I'd like to debug it first with -O0, then pack it with -Os for the release. Otherwise, with a low-resource MCU, things become messy really fast.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #10 from Murat Ursavaş --- Jonathan, I don't blame any of you, and I'm very well aware of the volunteer effort. Please don't get me wrong. It's just that I attempted multiple times to open the case but got distracted by something else. Let's concentrate on the case, OK? :) Thanks again for your efforts. I appreciate them and try to make more people appreciate them.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #11 from Murat Ursavaş --- Richard, I don't know the standards as well as you do, so please treat me as a newbie. The peripheral parameters of the manufacturer library are all defined as volatile structs and accessed through pointers. This worked up to 4.9 but not from 5.2 onward. So please point me in the right direction: who should fix what, and how? I can't fix my software or use a workaround, because I need packed structs. I tried many other options but none of them worked. To understand the matter, I'd like to ask: trunk generates the code Umesh just sent, and it doesn't look like what I expected. Is the code correct? As far as I know, I should see a shorter sequence with unaligned access enabled, which appears to be the default on the Cortex-M3 architecture. Of course I would like a shorter image, but the longer image was not helping and was assigning wrong values to the variables. The minimum test case was written to show that.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #12 from Murat Ursavaş --- Richard, OK, I remembered the details after reading the old posts on launchpad. The compiler generates normal code if I use the struct variable directly. But if I access it through a pointer, the value that ends up assigned is not the one I tried to assign. Isn't that a compiler bug?
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #13 from Murat Ursavaş --- Richard, Also, as far as I remember, the GNU manual did say something about this case. It said that "if the struct is not packed, it would access the members word by word. But if unaligned access is disabled, it would access the variables byte by byte and create longer and slower images. With unaligned access, it should access the struct members word by word again." I can't show where it's written, but I remember it. With direct access, it does access word by word no matter what the -fpack-struct option is. But if I access the struct indirectly through a pointer, it accesses it byte by byte and still can't assign the values the developer wants.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #16 from Murat Ursavaş --- OK, I understand the conservative approach and won't expect word by word access. But the resulting value in the test case is not 0x401, and it should be. In my original case this was affecting a USART peripheral register: it was activating different switches and making the peripheral useless. To make matters worse, if you have important physical pins assigned on the same port, this bug can make them behave differently. Let's say it could pull a trigger unintentionally, because GPIO peripherals simply wait for bit assignments.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #19 from Murat Ursavaş --- Hi Richard, This source code was designed to show word by word access and may produce the expected results; I'm not sure about that. Let me use the latest stable release and see what happens. It wasn't plug and play last time, but like I said, I have to make sure I'm not the root cause. (Looks like I'm the usual suspect here :) Thanks for verifying that this is the expected code, just longer due to the more conservative access style. This could take a while, as I'm struggling to find extra time to pursue this.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #20 from Murat Ursavaş --- By the way, the hardware peripheral registers are aligned to 32bits.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

--- Comment #25 from Murat Ursavaş ---
(In reply to Eric Gallager from comment #24)
> (In reply to Murat Ursavaş from comment #6)
> > Hi Jonathan,
> >
> > I just wanted a dramatic entrance :) (There was a discussion about GCC
> > bugzilla on reddit recently)
>
> Link to the reddit discussion? I searched and can't seem to find it.

I found it via Twitter. Here: https://twitter.com/blelbach/status/1032586866162196481
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

--- Comment #26 from Murat Ursavaş ---
(In reply to Richard Earnshaw from comment #21)
> (In reply to Murat Ursavaş from comment #20)
> > By the way, the hardware peripheral registers are aligned to 32bits.
>
> So why don't you define your struct as
>
> struct TestStructType
> {
>     volatile unsigned one;
>     unsigned char two;
>     unsigned short three __attribute__((packed));
> };
>
> And get rid of the pragma entirely?

Richard,

Some of the structs are not under my control, since they belong to manufacturer libraries. I need pack-1 for some of them due to storage and communication needs.

And I didn't know that I could pack individual struct members. Please correct me if I'm wrong: this struct's size is 10 bytes in total (one: 4, two: 4 and three: 2), right?
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

Murat Ursavaş changed:

What       |Removed  |Added
Status     |RESOLVED |UNCONFIRMED
Resolution |INVALID  |---

--- Comment #28 from Murat Ursavaş ---
Hi,

I've made the time and added 7.3.1 to my toolchain list. (It is annoyingly hard to add a new toolchain to my configuration due to a bug in the IDE.)

Anyway, I can now compare 4.9.3 and 7.3.1 side by side, and my application does not work with the 7 series. That is exactly how it started in the beginning. I've cleared up some issues that could interfere with this one, and I can now reproduce the problem on my target.

I'm not sure whether this is due to packed structs, but I've found a difference which should not be there. Please bear with me on this.

Here's the disassembly of a problematic part:

4.9.3

121 NVM_SPI->ROUTE = USART_ROUTE_TXPEN | USART_ROUTE_RXPEN | USART_ROUTE_CLKPEN | NVM_SPI_LOCATION;
00029e38: ldr r3,[pc,#0x4c] ; 0x29e84
00029e3a: ldr r2,[r3,#0x54]
00029e3c: movs r2,#0x0
00029e3e: orr r2,r2,#0xb
00029e42: str r2,[r3,#0x54]

7.3.1

121 NVM_SPI->ROUTE = USART_ROUTE_TXPEN | USART_ROUTE_RXPEN | USART_ROUTE_CLKPEN | NVM_SPI_LOCATION;
572e: ldr r3,[pc,#0x70] ; 0x579c
5730: ldrb.w r2,[r3,#0x54]
5734: movs r2,#0x0
5736: orr r2,r2,#0xb
573a: strb.w r2,[r3,#0x54]
573e: ldrb.w r2,[r3,#0x55]
5742: movs r2,#0x0
5744: strb.w r2,[r3,#0x55]
5748: ldrb.w r2,[r3,#0x56]
574c: movs r2,#0x0
574e: strb.w r2,[r3,#0x56]
5752: ldrb.w r2,[r3,#0x57]
5756: movs r2,#0x0
5758: strb.w r2,[r3,#0x57]

4.9.3 sets the ROUTE register to 0xB correctly, but 7.3.1 sets it to 0x30B. The correct value is 0xB (calculated from the bit values). The wrong value maps the USART to the wrong pins, which makes the peripheral physically useless and also cripples other pins.

Like I said, this may not be a bug; it could be my error or the vendor libraries, but something doesn't look right.

Please let me know if you need further info. I may need some guidance to collect more data.
P.S: I'm trying to improve GCC, otherwise I'm just fine with 4.9.3.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

--- Comment #29 from Murat Ursavaş ---
And just out of curiosity, why does the compiler load zero into the register and then OR it with the value?

00029e3c: movs r2,#0x0
00029e3e: orr r2,r2,#0xb

Why doesn't it load the necessary value directly? Like,

00029e3c: movs r2,#0xb

I know the ARM architecture needs a load/store mechanism for RAM, but why this additional step for a register?
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #30 from Murat Ursavaş --- OK, looks like it is possible like this: ldr r2, =0x000b Source: https://stackoverflow.com/questions/38689886/loading-32-bit-values-to-a-register-in-arm-assembly
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

--- Comment #31 from Murat Ursavaş ---
(In reply to Murat Ursavaş from comment #28)
>
> Here's the disassembly of a problematic part:
>
> 4.9.3
>
> 121 NVM_SPI->ROUTE = USART_ROUTE_TXPEN | USART_ROUTE_RXPEN |
> USART_ROUTE_CLKPEN | NVM_SPI_LOCATION;
> 00029e38: ldr r3,[pc,#0x4c] ; 0x29e84
> 00029e3a: ldr r2,[r3,#0x54]
> 00029e3c: movs r2,#0x0
> 00029e3e: orr r2,r2,#0xb
> 00029e42: str r2,[r3,#0x54]
>
> 7.3.1
>
> 121 NVM_SPI->ROUTE = USART_ROUTE_TXPEN | USART_ROUTE_RXPEN |
> USART_ROUTE_CLKPEN | NVM_SPI_LOCATION;
> 572e: ldr r3,[pc,#0x70] ; 0x579c
> 5730: ldrb.w r2,[r3,#0x54]
> 5734: movs r2,#0x0
> 5736: orr r2,r2,#0xb
> 573a: strb.w r2,[r3,#0x54]
> 573e: ldrb.w r2,[r3,#0x55]
> 5742: movs r2,#0x0
> 5744: strb.w r2,[r3,#0x55]
> 5748: ldrb.w r2,[r3,#0x56]
> 574c: movs r2,#0x0
> 574e: strb.w r2,[r3,#0x56]
> 5752: ldrb.w r2,[r3,#0x57]
> 5756: movs r2,#0x0
> 5758: strb.w r2,[r3,#0x57]

My limited assembler knowledge says the new sequence is byte by byte access and should set the register correctly, but somehow it doesn't. Could the actual object code be different from what I see in the disassembly? I'll try to verify it by inspecting the code space.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

Murat Ursavaş changed:

What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |INVALID

--- Comment #32 from Murat Ursavaş ---
OK, I dug into the Thumb-2 reference manual and verified it. The code space shows the correct instructions.

For this line:

strb.w r2,[r3,#0x54]

it shows:

0xF883 0x2054

If I've read it correctly, this is exactly what the disassembly says.

I guess that from GCC's perspective this is a perfectly valid situation and this ticket should be closed. If you have any additional ideas about what could cause this, I'm all ears (peripheral, core, GDB).

Otherwise, thanks for your time. It was enlightening for me to chase this issue.
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373 --- Comment #33 from Murat Ursavaş --- One thing though. Would you accept this as a regression and go back to the 4.9 style? Yes, GCC is doing everything by the book, but the result is not perfect (due to other undocumented issues not related to the GNU team).
[Bug c++/87373] Packed structs are not handled properly on ARM architecture even with misaligned access is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87373

Murat Ursavaş changed:

What       |Removed  |Added
Status     |RESOLVED |UNCONFIRMED
Resolution |INVALID  |---

--- Comment #34 from Murat Ursavaş ---
I think I've got what's going on. (I know this case has turned into a monologue, but I would like to improve it for future search reference.)

In the ARM architecture we have one simple linear address space for everything: flash, RAM and other hardware such as peripherals. This makes many things quite easy. If you would like to set up a USART, you just write some information to this address space and you get what you want, as in this case.

But if you are like me and try to make everything deterministic, you may want to enable packed structs. No problem with that; GCC takes care of the rest. It can access RAM unaligned by default anyway.

But one thing can stay under the radar. We see peripheral registers as ordinary RAM addresses, but they are not. They may have limitations, such as no unaligned access.

In this case, with GCC 4.9.3, accessing the register uses this:

ldr r3,[pc,#0x4c]
ldr r2,[r3,#0x54]
movs r2,#0x0
orr r2,r2,#0xb
str r2,[r3,#0x54] ; Important instruction

This part simply sets a 32-bit register and everything works as expected.

But from GCC 5 onward, it uses byte by byte access and the instructions below:

ldr r3,[pc,#0x70]
ldrb.w r2,[r3,#0x54]
movs r2,#0x0
orr r2,r2,#0xb
strb.w r2,[r3,#0x54] ; Important instruction
ldrb.w r2,[r3,#0x55]
movs r2,#0x0
strb.w r2,[r3,#0x55]
ldrb.w r2,[r3,#0x56]
movs r2,#0x0
strb.w r2,[r3,#0x56]
ldrb.w r2,[r3,#0x57]
movs r2,#0x0
strb.w r2,[r3,#0x57]

There would be nothing wrong with this if it were a normal RAM location; it would set the register to 0x000b. But since this is a peripheral location that has to be accessed aligned, only the first strb.w instruction is taken into consideration and the later ones have no effect. Bits 0-7 are OK, but bits 8-31 are left to entropy. In my case the entropy wants to move the physical pins to a different location.
I'm not sure whether this is a GCC regression or something the hardware manufacturer must take care of, but this is my conclusion in the end.

So here is my workaround: project-wide packed structs are dangerous, so I'll remove the option from the project settings and limit packing to the structs that need it, leaving the others relaxed. This should make the peripheral access aligned.

P.S: I'm reopening this record for one final evaluation by the GNU team. From my perspective this looks like a regression, but it's up to you guys.