https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116299
--- Comment #8 from Ryan <rmaguire314 at gmail dot com> --- The assembly generated on godbolt for ppc64el is indeed different with the "volatile" included. It may be that -O3 is, by default, supposed to aggressively optimize this away, but the splitting trick works for other architectures. -Ryan
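
For readers without the earlier comments at hand, here is a minimal sketch of what a "splitting trick" guarded by "volatile" can look like. It assumes the trick in question is a Veltkamp-style split of a floating-point value into a high and a low part. That is an assumption, not something this comment states. The function name split_double, the use of double (rather than the type in the actual report), and the constant 2^27 + 1 are illustrative and are not taken from the bug report.

```c
/*
 * Hypothetical illustration: Veltkamp-style splitting of a double
 * into hi + lo, where hi keeps roughly the upper half of the
 * significand bits and lo keeps the remainder.
 *
 * The intermediates are declared volatile so that the compiler must
 * perform each store and load, and cannot algebraically simplify
 * (t - (t - x)) down to x.
 */
static void split_double(double x, double *hi, double *lo)
{
    /* 2^27 + 1, the usual splitter for a 53-bit significand. */
    const double splitter = 134217729.0;

    volatile double t = splitter * x;  /* scaled copy of x            */
    volatile double d = t - x;         /* rounding drops the low bits */

    *hi = t - d;    /* high part of x                           */
    *lo = x - *hi;  /* low part; exact when nothing overflows   */
}
```

Comparing the generated assembly with and without the two volatile qualifiers, at the same optimization level, is one way to see whether the compiler folds the split away on a given target.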