On Wed, Jul 14, 2021 at 9:49 AM Matthias Kretz <m.kr...@gsi.de> wrote:
>
> On Wednesday, 14 July 2021 09:39:42 CEST Richard Biener wrote:
> > -ffast-math decomposes to quite some flag_* and those generally are not
> > reflected into the IL but can be different per function (and then
> > prevent inlining).
>
> Is there any chance the "and then prevent inlining" can be eliminated? Because
> then I could write my own fast<float> class in C++, marking all operators with
> __attribute__((optimize("-Ofast")))...
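
For readers who have not seen the idea in the quoted question, here is a minimal sketch of such a wrapper. It assumes GCC. The names `fast` and `FAST_MATH` are hypothetical and come from no library, and this is not Matthias' actual code. It only shows where the attribute would go; whether the operators still get inlined is exactly the open question.

```cpp
// The optimize attribute is GCC-specific; on other compilers the macro
// expands to nothing and the operators are ordinary functions.
#if defined(__GNUC__) && !defined(__clang__)
#  define FAST_MATH __attribute__((optimize("-Ofast")))
#else
#  define FAST_MATH
#endif

// Hypothetical wrapper type: a plain aggregate holding one value.
template <class T>
struct fast {
    T value;
};

// Each operator asks for -Ofast (and with it -ffast-math) for its own body.
template <class T>
FAST_MATH fast<T> operator+(fast<T> a, fast<T> b) { return {a.value + b.value}; }

template <class T>
FAST_MATH fast<T> operator-(fast<T> a, fast<T> b) { return {a.value - b.value}; }

template <class T>
FAST_MATH fast<T> operator*(fast<T> a, fast<T> b) { return {a.value * b.value}; }
```

With `fast<float> a{1.5f}, b{2.25f};`, the expression `(a + b).value` is computed inside a function that carries the -Ofast setting, while the surrounding code keeps the flags given on the command line.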
The problem is that it's the function (call) that carries the "-Ofast"
information; once you inline, it's lost and the flag settings in the caller
take effect. If one were happy with losing optimization on the inlined
parts, one could rewrite all inlined FP operations as function calls, but
I understand that you want the code stmts to still be optimized.

I also think that how to reflect each of the options enabled by -ffast-math
in the IL will depend on the actual feature. For example,
-ffinite-math-only could be lowered to promises, turning

  _1 = _2 + _3;

into

  _4 = .ASSUME_FINITE (_2);
  _5 = .ASSUME_FINITE (_3);
  _1 = _4 + _5;
  _6 = .ASSUME_FINITE (_1);

which is possibly a bit heavy-weight. -fno-signed-zeros is a bit more
difficult - it's documented as the sign of zero having no significance.
-fno-signaling-nans could be handled similarly. Note things like the above
require some simple propagation engine and reflecting the predicates to
some on-the-side info on SSA names (just like we have range info for
integers).

-fno-associative-math can possibly be handled by emitting loads of
PAREN_EXPRs.

-fno-math-errno could be reflected by a flag on the call statements (but
we're low on flags there).

-fcx-limited-range - that one is reflected in the IL! Yay!

-fno-rounding-math - one of the most difficult, we probably need different
operators.

Then there's -funsafe-math-optimizations - I fear we'd either need different
types / float formats for that or different operators. Note all remaining
-funsafe-math-optimizations guarded optimizations should possibly be
re-classified to something more specific.

> > There's one "related" IL feature used by the Fortran frontend - PAREN_EXPR
> > prevents association across it. So for Fortran (when not
> > -fno-protect-parens which is enabled by -Ofast), (a + b) - b cannot be
> > optimized to a. Eventually this could be used to wrap intrinsic results
> > since most of the issues in the end require association.
> > Note PAREN_EXPR isn't exposed to the C family frontends but we could of
> > course add a builtin-like thing for this _Noassoc ( .... ) or so. Note
> > PAREN_EXPR survives -Ofast so it's the frontends that would need to
> > choose to emit or not emit it (or always emit it).
>
> Interesting. I want that builtin in C++. Currently I use inline asm to achieve
> a similar effect. But the inline asm hammer is really too big for the problem.

I think implementing it similar to how we do __builtin_shufflevector would
be easily possible. PAREN_EXPR is a tree code.

But as said, re-association is only one part of -ffast-math; other parts
include disrespecting signed zeros, NaNs and infinities, etc. - but probably
most people run into the rounding issues resulting from re-association.

Richard.

> --
> ──────────────────────────────────────────────────────────────────────────
>  Dr. Matthias Kretz                           https://mattkretz.github.io
>  GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
>  std::experimental::simd              https://github.com/VcDevel/std-simd
> ──────────────────────────────────────────────────────────────────────────
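
For illustration, the inline asm workaround mentioned in the quoted text might look roughly like the sketch below. This is an assumption about the general technique, not Matthias' actual code. `no_reassoc` and `add_then_sub` are hypothetical names, and the "+m" constraint is chosen only because it is portable across targets. It forces the value through memory, which shows why the hammer is too big.

```cpp
// Hypothetical helper, not a GCC builtin: an empty asm statement that claims
// to read and modify x in memory, so the optimizer cannot look through it.
template <class T>
inline T no_reassoc(T x)
{
#if defined(__GNUC__)
    asm volatile("" : "+m"(x));
#endif
    return x;
}

// Computes (a + b) - b with the sum hidden behind the barrier. Under
// -ffast-math the plain expression may be simplified to a; here the
// subtraction must work on the rounded sum.
inline double add_then_sub(double a, double b)
{
    return no_reassoc(a + b) - b;
}
```

In IEEE double arithmetic `add_then_sub(1.0, 1e17)` yields 0.0, because 1e17 + 1 rounds back to 1e17, whereas a re-associating compiler working on the bare expression could return 1.0. A PAREN_EXPR-based builtin would give the same guarantee without the trip through memory.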