Michael Collison <michael.colli...@arm.com> writes:
> This patch improves additional cases of FP to integer conversions with
> -ffast-math enabled.
>
> Example 1:
>
> double
> f5 (int x)
> {
>   return (double)(float) x;
> }
>
>
> At -O2 with -ffast-math
>
> Trunk generates:
>
> f5:
>         scvtf   s0, w0
>         fcvt    d0, s0
>         ret
>
>
> With the patch we can merge the conversion to float and float-extend and
> reduce the sequence to one instruction at -O2 and -ffast-math
>
> f5:
>         scvtf   d0, w0
>         ret
>
> Example 2
>
> int
> f6 (double x)
> {
>   return (int)(float) x;
> }
>
>
> At -O2 (even with -ffast-math) trunk generates
>
> f6:
>         fcvt    s0, d0
>         fcvtzs  w0, s0
>         ret
>
> We can merge the float_truncate into the fix at the rtl level
>
> With -ffast-math enabled and -O2 we can now generate:
>
> f6:
>         fcvtzs  w0, d0
>         ret
>
> Bootstrapped and regression tested on aarch64-linux-gnu. Okay for trunk?

I don't think these folds belong in target-specific code, since if
they're valid, they're valid for all targets.

The code that handles sequences of two conversions in gimple is the
match.pd rule starting:

    /* Handle cases of two conversions in a row.  */
    (for ocvt (convert float fix_trunc)
     (for icvt (convert float)
      (simplify
       (ocvt (icvt@1 @0))

In particular, one of the conditions has:

    /* Two conversions in a row are not needed unless:
        - some conversion is floating-point (overstrict for now), or
        ...  */

which seems to leave things open for more floating-point folds to be
added.

Are there any targets for which converting to float then double is
cheaper than converting directly to double?  If so, then those targets
might need a hook to turn the fold off.  But if no modern targets are
known to be like that then it seems better to do the fold whenever
the FP settings make it valid.

Thanks,
Richard
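
For reference, here is a minimal sketch of why these folds depend on the FP settings. It assumes IEEE single and double precision and a build without -ffast-math. The function names are hypothetical and chosen only for illustration; the two *_via_float functions have the same bodies as f5 and f6 above. Going through float adds an extra rounding step, so the two-step and direct conversions can give different results:

```c
/* Hypothetical helpers for illustration; assumes IEEE binary32/binary64
   and a build without -ffast-math.  */

/* Same body as f5: int -> float -> double.  */
double
int_to_double_via_float (int x)
{
  return (double)(float) x;
}

/* What the merged conversion computes: int -> double.  */
double
int_to_double_direct (int x)
{
  return (double) x;
}

/* Same body as f6: double -> float -> int.  */
int
double_to_int_via_float (double x)
{
  return (int)(float) x;
}

/* What the merged conversion computes: double -> int.  */
int
double_to_int_direct (double x)
{
  return (int) x;
}
```

For x = 16777217 (2^24 + 1), the int value is not representable in float, so the two-step conversion yields 16777216.0 while the direct one yields 16777217.0. For x = 0.99999999999, the truncation to float rounds up to 1.0f, so the two-step conversion yields 1 while the direct one yields 0. Whether a given set of flags is allowed to ignore these differences is the validity question raised above.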