https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117017
Bug ID: 117017
Summary: ARM code generation for sequentially consistent load generates too many dmb instructions
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: mike.robins at talktalk dot net
Target Milestone: ---

Created attachment 59296
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59296&action=edit
Source code to demo issue

I am using g++ version 12.2.0-14 on a Raspberry Pi running Raspbian bookworm (currently fully patched) to develop a multi-threaded program for the Pi.

I believe I have found an issue with the code generated for ARM. Looking at the ARM documentation, it seems that an aligned sequentially consistent load should need only one dmb instruction, placed after the actual load, just as for an aligned acquire load. However, two dmb instructions are being emitted in the sequentially consistent case.

Compiled with:

    g++ -march=armv7+fp -pthread -Ofast -Wall -Wextra -Werror -pedantic -S dmb.cpp

I do not have access to a later version of the compiler, so I cannot check whether this has been fixed in a later release.

Please see the attached files (CPP and S).
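
The attachment itself is not reproduced in this message. As a rough illustration only, a reduced test case for the comparison described above might look like the following sketch. The names `shared_value`, `load_acquire`, `load_seq_cst` and `check_loads` are hypothetical and are not taken from the attached dmb.cpp.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared variable; the attached dmb.cpp may use different names and types.
std::atomic<int> shared_value{0};

// Acquire load: the report expects a single dmb after the load on ARMv7.
int load_acquire() {
    return shared_value.load(std::memory_order_acquire);
}

// Sequentially consistent load: the report observes two dmb instructions here.
int load_seq_cst() {
    return shared_value.load(std::memory_order_seq_cst);
}

// Single-threaded sanity check: both orderings must return the stored value.
// This checks only the values read, not the barriers the compiler emits.
void check_loads(int v) {
    shared_value.store(v, std::memory_order_seq_cst);
    assert(load_acquire() == v);
    assert(load_seq_cst() == v);
}
```

Compiling such a file with the command line from the report (with the file name substituted) and comparing the number of dmb instructions in the assembly for the two load functions should show whether the difference is present in a given compiler version.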