https://gcc.gnu.org/g:50f8fac5b6e9972bf70764d2e5d0871d38dc518a

commit 50f8fac5b6e9972bf70764d2e5d0871d38dc518a
Author: Michael Meissner <[email protected]>
Date:   Thu Nov 13 11:22:37 2025 -0500

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.ibm | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/gcc/ChangeLog.ibm b/gcc/ChangeLog.ibm
index f0462ec745f4..f74c6a15bd1c 100644
--- a/gcc/ChangeLog.ibm
+++ b/gcc/ChangeLog.ibm
@@ -1,3 +1,72 @@
+==================== Branch ibm/gcc-16-future-float16, patch #209 ====================
+
+Optimize __bfloat16 scalar code.
+
+Optimize __bfloat16 binary operations. Unlike _Float16 where we
+have instructions to convert between HFmode and SFmode as scalar
+values, with BFmode, we only have vector conversions. Thus to do:
+
+	__bfloat16 a, b, c;
+
+	a = b + c;
+
+the GCC compiler generates the following code:
+
+	lxsihzx 0,4,2           // load __bfloat16 value b
+	lxsihzx 12,5,2          // load __bfloat16 value c
+	xxsldwi 0,0,0,1         // shift b into bits 16..31
+	xxsldwi 12,12,12,1      // shift c into bits 16..31
+	xvcvbf16spn 0,0         // vector convert b into V4SFmode
+	xvcvbf16spn 12,12       // vector convert c into V4SFmode
+	xscvspdpn 0,0           // convert b into SFmode scalar
+	xscvspdpn 12,12         // convert c into SFmode scalar
+	fadds 0,0,12            // add b+c
+	xscvdpspn 0,0           // convert b+c into SFmode memory format
+	xvcvspbf16 0,0          // convert b+c into BFmode memory format
+	stxsihx 0,3,2           // store b+c
+
+Using the following combiner patterns that are defined in this patch, the code
+generated would be:
+
+
+	lxsihzx 12,4,2          // load __bfloat16 value b
+	lxsihzx 0,5,2           // load __bfloat16 value c
+	xxspltw 12,12,1         // shift b into bits 16..31
+	xxspltw 0,0,1           // shift c into bits 16..31
+	xvcvbf16spn 12,12       // vector convert b into V4SFmode
+	xvcvbf16spn 0,0         // vector convert c into V4SFmode
+	xvaddsp 0,0,12          // vector b+c in V4SFmode
+	xvcvspbf16 0,0          // convert b+c into BFmode memory format
+	stxsihx 0,3,2           // store b+c
+
+We cannot just define insns like 'addbf3' to keep the operation as
+BFmode because GCC will not generate these patterns unless the user
+uses -Ofast. Without -Ofast, it will always convert BFmode into
+SFmode.
+
+2025-11-13 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/float16.cc (bfloat16_operation_as_v4sf): New function to
+	optimize __bfloat16 scalar operations.
+	* config/rs6000/float16.md (bfloat16_binary_op_internal1): New
+	__bfloat16 scalar combiner insns.
+	(bfloat16_binary_op_internal2): Likewise.
+	(bfloat16_fma_internal1): Likewise.
+	(bfloat16_fma_internal2): Likewise.
+	(bfloat16_fms_internal1): Likewise.
+	(bfloat16_fms_internal2): Likewise.
+	(bfloat16_nfma_internal1): Likewise.
+	(bfloat16_nfma_internal2): Likewise.
+	(bfloat16_nfms_internal3): Likewise.
+	* config/rs6000/predicates.md (fp16_reg_or_constant_operand): New
+	predicate.
+	(bfloat16_v4sf_operand): Likewise.
+	(bfloat16_bf_operand): Likewise.
+	* config/rs6000/rs6000-protos.h (bfloat16_operation_as_v4sf): New
+	declaration.
+
 ==================== Branch ibm/gcc-16-future-float16, patch #208 ====================
 
 Add --with-powerpc-float16 and --with-powerpc-float16-disable-warning.
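
Illustrative note (not part of the commit): both instruction sequences in the
log message widen each __bfloat16 operand to single precision, add in single
precision, and narrow the sum back to __bfloat16. The portable C below is a
rough model of that data flow, not the code GCC emits. The helper names
bf16_to_float, float_to_bf16 and bf16_add are hypothetical, and
round-to-nearest-even narrowing is an assumption made for the sketch; the
behaviour of the hardware conversion is defined by the Power ISA, not here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a bfloat16 bit pattern to float.  A bfloat16
   value is the high 16 bits of an IEEE binary32 value, so widening is a
   16-bit left shift of the bit pattern.  */
static float bf16_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: narrow a float to a bfloat16 bit pattern.
   Assumption: round to nearest, ties to even; NaNs are kept as quiet NaNs.  */
static uint16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7fffffffu) > 0x7f800000u)      /* NaN input */
        return (uint16_t)((bits >> 16) | 0x0040u);
    bits += 0x7fffu + ((bits >> 16) & 1u);       /* round to nearest even */
    return (uint16_t)(bits >> 16);
}

/* Model of "a = b + c": widen both operands, add as float, narrow once.  */
static uint16_t bf16_add(uint16_t b, uint16_t c)
{
    return float_to_bf16(bf16_to_float(b) + bf16_to_float(c));
}
```

In this model the bit patterns 0x3F80 (1.0) and 0x4000 (2.0) add to 0x4040
(3.0). The sum is rounded to __bfloat16 only once, after the single-precision
add, which is the same order of operations as in both sequences in the log.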
