reordering of trapping operations and volatile
Hi Richard,

I have a question regarding reordering of volatile accesses and trapping operations. My initial assumption (and hope) was that compilers take care to avoid creating traps that are incorrectly ordered relative to observable behavior.

I had trouble finding examples, and my cursory glance at the code seemed to confirm that GCC carefully avoids this. But then someone showed me this example, where this can happen in GCC:

volatile int x;

int foo(int a, int b, _Bool store_to_x)
{
  if (!store_to_x)
    return a / b;
  x = b;
  return a / b;
}

https://godbolt.org/z/vq3r8vjxr

In this example a division is hoisted before the volatile store (the division by zero, which could trap, is UB, of course).

As Martin Sebor pointed out, this is done as part of redundancy elimination in tree-ssa-pre.c, and it might simply be an oversight (and could then be fixed with a small change).

Could you clarify whether such reordering is intentional and could be exploited in general also in other optimizations, or confirm that this is an oversight that affects only this specific case?

If this is intentional, are there examples where this is important for optimization?

Martin
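To make the reordering concrete, here is a source-level sketch of what the hoisting amounts to. It is a hand-written approximation, not GCC output; the names foo_as_written, foo_as_if_hoisted and check_hoist_sketch are made up for this illustration. For nonzero b the two variants behave identically; they differ only in where a trap for b == 0 would land relative to the volatile store.

```c
#include <assert.h>
#include <stdbool.h>

volatile int x;

/* The function as posted in the message above. */
int foo_as_written(int a, int b, bool store_to_x)
{
    if (!store_to_x)
        return a / b;
    x = b;
    return a / b;
}

/* Hand-written approximation of the code after the division has been
   hoisted above the branch. For b == 0 a trap would now occur before
   the volatile store rather than after it. */
int foo_as_if_hoisted(int a, int b, bool store_to_x)
{
    int q = a / b;      /* evaluated first on every path */
    if (store_to_x)
        x = b;          /* the store now follows the possible trap */
    return q;
}

/* For nonzero b both variants return the same value and leave the
   same value in x. */
void check_hoist_sketch(void)
{
    x = 0;
    assert(foo_as_written(7, 2, true) == 3);
    assert(x == 2);

    x = 0;
    assert(foo_as_if_hoisted(7, 2, true) == 3);
    assert(x == 2);

    assert(foo_as_written(9, 3, false) == foo_as_if_hoisted(9, 3, false));
}
```

Calling check_hoist_sketch() from a test driver exercises only the defined cases; the b == 0 case is deliberately never executed, since it is UB in both variants.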
Re: reordering of trapping operations and volatile
On January 8, 2022 9:32:24 AM GMT+01:00, Martin Uecker wrote:
> [...]
> Could you clarify whether such reordering
> is intentional and could be exploited in
> general also in other optimizations or
> confirm that this is an oversight that
> affects only this specific case?
>
> If this is intentional, are there examples
> where this is important for optimization?

In general there is no data flow information that prevents traps from being reordered with respect to volatile accesses. The specific case could be easily mitigated in PRE. Another case would be

a = c / d;
x = 1;
if (use_a)
  bar (a);

where we'd sink a across x into the guarded BB, I suspect.

(sorry for the odd formatting, writing this on a mobile device).

Richard.
Re: reordering of trapping operations and volatile
On Saturday, 08.01.2022 at 13:41 +0100, Richard Biener wrote:
> On January 8, 2022 9:32:24 AM GMT+01:00, Martin Uecker wrote:

Hi Richard,

thank you for your quick response!

> > [...]
> > If this is intentional, are there examples
> > where this is important for optimization?
>
> In general there is no data flow information that
> prevents traps from being reordered with respect
> to volatile accesses.

Yes, although I think potentially trapping ops are not moved before calls (as this would be incorrect). So do you think it would be feasible to prevent this for volatile too?

> The specific case could be
> easily mitigated in PRE. Another case would be
>
> a = c / d;
> x = 1;
> if (use_a)
>   bar (a);
>
> where we'd sink a across x into the guarded BB, I suspect.

Yes. Related example:

https://godbolt.org/z/5WGhadre3

volatile int x;

void bar(int a);

void foo(int c, int d)
{
  int a = c / d;
  x = 1;
  if (d)
    bar(a);
}

foo:
        mov     DWORD PTR x[rip], 1
        test    esi, esi
        jne     .L4
        ret
.L4:
        mov     eax, edi
        cdq
        idiv    esi
        mov     edi, eax
        jmp     bar

It would be nice to prevent this too, although I am less concerned about this direction, as the UB has already happened so there is not much we could guarantee about this anyway. In the other case, it could affect correct code before the trap.

Martin
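The assembly above corresponds roughly to the following source-level rewrite. This is a hand-written sketch rather than compiler output; foo_as_if_sunk, check_sink_sketch and the recording variable last_bar_arg are hypothetical names added so the example can be run. Note that in the sunk form the d == 0 path no longer divides at all, so the trap does not just move, it disappears.

```c
#include <assert.h>

volatile int x;

/* Hypothetical stand-in for the external bar(): remembers its argument. */
static int last_bar_arg;
void bar(int a) { last_bar_arg = a; }

/* The function as posted: the division comes before the volatile store. */
void foo_as_written(int c, int d)
{
    int a = c / d;
    x = 1;
    if (d)
        bar(a);
}

/* Hand-written approximation of the generated code: the division has
   been sunk past the volatile store into the guarded block. */
void foo_as_if_sunk(int c, int d)
{
    x = 1;
    if (d)
        bar(c / d);
}

void check_sink_sketch(void)
{
    /* Nonzero divisor: both variants store to x and call bar(3). */
    x = 0; last_bar_arg = -1;
    foo_as_written(12, 4);
    assert(x == 1 && last_bar_arg == 3);

    x = 0; last_bar_arg = -1;
    foo_as_if_sunk(12, 4);
    assert(x == 1 && last_bar_arg == 3);

    /* Zero divisor: only the sunk form is defined; the store still
       happens and bar() is not called. */
    x = 0; last_bar_arg = -1;
    foo_as_if_sunk(5, 0);
    assert(x == 1 && last_bar_arg == -1);
}
```

Only foo_as_if_sunk is called with d == 0 here, because foo_as_written has UB for that input.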
Re: reordering of trapping operations and volatile
On Sat, 8 Jan 2022, Martin Uecker via Gcc wrote:

> On Saturday, 08.01.2022 at 13:41 +0100, Richard Biener wrote:
> > [...]
> > where we'd sink a across x into the guarded BB, I suspect.
>
> Yes. Related example:
>
> https://godbolt.org/z/5WGhadre3
>
> [...]
>
> It would be nice to prevent this too, although I am less concerned
> about this direction, as the UB has already happened so there is
> not much we could guarantee about this anyway. In the other case,
> it could affect correct code before the trap.

-fnon-call-exceptions helps with the first testcase but not with the second one. I don't know if that's by accident, but the flag seems possibly relevant.

--
Marc Glisse
Re: reordering of trapping operations and volatile
> Yes, although I think potentially trapping ops
> are not moved before calls (as this would be
> incorrect). So do you think it would be feasible
> to prevent this for volatile too?

Feasible probably, but why would this be desirable in C? It's not Java!

--
Eric Botcazou
Re: reordering of trapping operations and volatile
On 08/01/2022 09:32, Martin Uecker via Gcc wrote:
> [...]
> https://godbolt.org/z/vq3r8vjxr
>
> In this example a division is hoisted
> before the volatile store. (the division
> by zero which could trap is UB, of course).

Doesn't this depend on whether the trap is considered "observable behaviour" or "undefined behaviour"?

If (on the given target CPU and OS, and with any relevant compiler flags) dividing by zero is guaranteed to give a trap with specific known behaviour, then it is observable behaviour and thus should be ordered carefully with respect to the volatile accesses.

On the other hand, if division by 0 is considered undefined behaviour (the C and C++ standards explicitly mark it as undefined, but a compiler can of course define its behaviour) then the compiler can assume it does not happen, or that you don't care about the result of the program if it happens. Undefined behaviour can be freely re-ordered around volatile accesses, as far as I understand it, though that can come as a surprise to some people.

I don't know which of these views gcc takes. I think both are valid. But it might be worth noting in the reference manual.

David
Re: reordering of trapping operations and volatile
On Saturday, 08.01.2022 at 15:41 +0100, Eric Botcazou wrote:
> > Yes, although I think potentially trapping ops
> > are not moved before calls (as this would be
> > incorrect). So do you think it would be feasible
> > to prevent this for volatile too?
>
> Feasible probably, but why would this be desirable in C? It's not Java!

It would allow us to still give at least some guarantees about the observable behavior of programs that later in their execution encounter UB (e.g. that a transaction with an external device is correctly completed). Considering the fact that it is virtually impossible to prove that any realistic C program is completely free of UB, this would be very useful.

As another example, there was recently a proposal about adding a safe memory erasure function to the standard lib. It was pointed out that volatile stores would not be enough to be sure that the compiler safely erased some sensitive information, because an optimization based on later UB in the program could undo this.

There is now also a proposal for C++ to introduce std::observable, which would require similar ordering constraints. But this would require the programmer to annotate the program correctly. Most C programmers would assume that volatile accesses already provide this guarantee, so actually doing so would be good.

Or a more practical example: while debugging some embedded device, it would also be very annoying if the compiler reorders some trap before some debugging output. I could easily imagine losing hours figuring out what happens.

Martin
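Independent of what the compiler guarantees, one way to keep a volatile store from being affected by a later trapping division is to make sure the division is never executed with operands for which it is undefined. The sketch below shows that pattern; it is not something proposed in this thread, and checked_div, store_then_divide and check_guarded_sketch are hypothetical names. Because the function has defined behavior for every input, the volatile store is part of the observable behavior that has to be preserved.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

volatile int x;

/* Stores a / b in *out and returns true if the division is defined;
   returns false (leaving *out untouched) for b == 0 and for the
   overflowing case INT_MIN / -1. */
bool checked_div(int a, int b, int *out)
{
    if (b == 0 || (a == INT_MIN && b == -1))
        return false;
    *out = a / b;
    return true;
}

/* Volatile store followed by a division that cannot be UB. */
bool store_then_divide(int a, int b, int *out)
{
    x = b;
    return checked_div(a, b, out);
}

void check_guarded_sketch(void)
{
    int q = -1;

    /* Zero divisor: the store is performed, no division is attempted. */
    x = 42;
    assert(!store_then_divide(1, 0, &q));
    assert(x == 0 && q == -1);

    /* Ordinary case: store and quotient are both as expected. */
    assert(store_then_divide(8, 2, &q));
    assert(x == 2 && q == 4);
}
```

The cost is an explicit branch before each division, which is the kind of check the optimizer would otherwise be free to assume away.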
Re: Help with an ABI peculiarity
On 1/7/2022 2:55 PM, Paul Koning via Gcc wrote:
> On Jan 7, 2022, at 4:06 PM, Iain Sandoe wrote:
> > Hi Folks,
> >
> > In the aarch64 Darwin ABI we have an unusual (OK, several unusual) feature of the calling convention.
> >
> > When an argument is passed *in a register* and it is integral and less than SI it is promoted (with appropriate signedness) to SI. This applies when the function parm is named only.
> >
> > When the same argument would be placed on the stack (i.e. we ran out of registers) - it occupies its natural size, and is naturally aligned (so, for instance, 3 QI values could be passed as 3 registers - promoted to SI .. or packed into three adjacent bytes on the stack)..
> >
> > The key is that we need to know that the argument will be placed in a register before we decide whether to promote it.
> > (similarly, the promotion is not done in the callee for the in-register case).
> >
> > I am trying to figure out where to implement this.
>
> I don't remember the MIPS machinery well enough, but is that a similar case? It too has register arguments (4 or 8 of them) along with stack arguments (for the rest).

Most targets these days use registers for parameter passing and obviously we can run out of registers on all of them. The key property is that the size/alignment of the argument differs depending on whether it is passed in a register (gets promoted) or passed in memory (not promoted). I'm not immediately aware of another ABI with that feature. Though I haven't really gone looking.

jeff
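The stack half of the rule Iain describes (each argument at its natural size and alignment) can be modelled in a few lines. This is only a toy model of the layout arithmetic, not GCC's argument-passing code; align_up, place_natural and check_stack_layout_sketch are hypothetical names, and it assumes scalar argument sizes of 1, 2, 4 or 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Place one stack-passed argument at its natural size and alignment.
   *next is the first free byte of the argument area; the function
   returns the argument's offset and advances *next past it. */
size_t place_natural(size_t *next, size_t size)
{
    size_t at = align_up(*next, size);
    *next = at + size;
    return at;
}

void check_stack_layout_sketch(void)
{
    size_t next = 0;

    /* Three QI (1-byte) values land in three adjacent bytes. */
    assert(place_natural(&next, 1) == 0);
    assert(place_natural(&next, 1) == 1);
    assert(place_natural(&next, 1) == 2);

    /* A following SI (4-byte) value is aligned up to offset 4. */
    assert(place_natural(&next, 4) == 4);
    assert(next == 8);
}
```

The register half of the rule (promotion to SI) is not modelled here; the point is only that the same three QI arguments occupy three bytes when they spill to the stack.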
Re: reordering of trapping operations and volatile
On Saturday, 08.01.2022 at 16:03 +0100, David Brown wrote:
> [...]
> Doesn't this depend on whether the trap is considered "observable
> behaviour", or "undefined behaviour" ?
>
> If (on the given target cpu and OS, and with any relevant compiler
> flags) dividing by zero is guaranteed to give a trap with specific known
> behaviour, then it is observable behaviour and thus should be ordered
> carefully with respect to the volatile accesses.
>
> On the other hand, if division by 0 is considered undefined behaviour
> (the C and C++ standards explicitly mark it as undefined, but a compiler
> can of course define its behaviour) then the compiler can assume it does
> not happen, or you don't care about the result of the program if it
> happens. Undefined behaviour can be freely re-ordered around volatile
> accesses, as far as I understand it - though that can come as a surprise
> to some people.

C++ has wording that makes it clear that this reordering is allowed. In C, some people also see it this way. In my opinion, this is not clear, and I always read the standard in a different way (i.e. run-time UB happens at a point in time but cannot go backwards and change previous defined behavior).

But in any case, I would find it much more useful if it is guaranteed to not affect previous observable behavior. This would make volatile more useful, which in my opinion is preferable to introducing another language feature to work around this issue.

This of course assumes that this reordering around volatile accesses and I/O is not essential for optimization.

Martin
Re: reordering of trapping operations and volatile
> Most C programmers would assume that volatile accesses already
> provide this guarantee, so actually doing so would be good.

I'm a little skeptical of this statement: if it were true, how come the most recent version of the standard does not provide it 30 years after the language was first standardized?

> Or a more practical example: while debugging some embedded
> device, it would also be very annoying if the compiler reorders
> some trap before some debugging output. I could easily imagine
> losing hours figuring out what happens.

The thing to do to avoid losing these hours is to debug the code at -O0.

--
Eric Botcazou
Re: reordering of trapping operations and volatile
On Sat, Jan 8, 2022 at 12:33 AM Martin Uecker via Gcc wrote:
> [...]
> volatile int x;
>
> int foo(int a, int b, _Bool store_to_x)
> {
>   if (!store_to_x)
>     return a / b;
>   x = b;
>   return a / b;
> }
>
> https://godbolt.org/z/vq3r8vjxr

The question becomes: what is a trapping instruction vs. an undefined instruction?
For floating point types, it is well defined what a trapping instruction is, while for integer types it is not well defined.
On some (many?) targets dividing by 0 is just undefined and does not trap (powerpc, aarch64, arm and many others; on MIPS it depends on the options passed to GCC whether the conditional trap is inserted or not).
The other side is: if there is undefined code on the path, should observable results happen first (stores to volatile/atomics, etc.)?

GCC assumes by default that divide is trappable but stores are not observable. This is where -fnon-call-exceptions comes into play.

In the second case, GCC assumes reducing trappable instructions is fine. Note I thought -fno-delete-dead-exceptions would fix the sink but it didn't.

Thanks,
Andrew Pinski
Re: reordering of trapping operations and volatile
On Saturday, 08.01.2022 at 10:35 -0800, Andrew Pinski wrote:
> [...]
> The question becomes what is a trapping instruction vs an undefined
> instruction?
> For floating point types, it is well defined what is a trapping
> instruction while for integer types it is not well defined.
> On some (many?) targets dividing by 0 is just undefined and does not
> trap (powerpc, aarch64, arm and many others; MIPS it depends on the
> options passed to GCC if the conditional trap should be inserted or
> not).
> The other side is if there is undefined code on the path, should
> observable results happen first (stores to volatile/atomics, etc.)?

For volatile stores and I/O, I think it would be nice if we could guarantee that those happen before the UB ruins the day. (I am not sure about atomics, those are not directly observable.) For I/O this is probably already the case (?). For volatile, it seems this would need some tweaks.

I am trying to figure out whether this is feasible.

> GCC assumes by default that divide is trappable but stores are not
> observable. This is where -fnon-call-exceptions comes into play.

Ok, thanks! I will look at this!

> In the second case, GCC assumes reducing trappable instructions is
> fine.

-fnon-call-exceptions would treat trapping instructions as defined (and trapping) instead of UB? This is then probably even stronger than the requirement above.

> Note I thought -fno-delete-dead-exceptions would fix the sink
> but it didn't.

Martin
gcc-11-20220108 is now available
Snapshot gcc-11-20220108 is now available on https://gcc.gnu.org/pub/gcc/snapshots/11-20220108/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 11 git branch with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-11 revision 32d0d5fe3e522c18fc40109325f9dc055619cedc You'll find: gcc-11-20220108.tar.xz Complete GCC SHA256=a433837a85087c2357a456145ae140bd588e75d44a90031ed57c29de66e46468 SHA1=f89942362a87cb9c49f53177e7fcc57c77238197 Diffs from 11-20220101 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-11 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.