Andrew Pinski <quic_apin...@quicinc.com> writes:
> Instead of waiting to get the combine/RTL optimizations fixed here, this
> fixes the builtins at the gimple level.  It should also give slightly
> faster compile times, since we get the simplification earlier on.

Yeah, and it's more global than combine would be.

> Built and tested for aarch64-linux-gnu.
>
> gcc/ChangeLog:
>
> 	PR target/114522
> 	* config/aarch64/aarch64-builtins.cc (aarch64_fold_aes_op): New
> 	function.
> 	(aarch64_general_gimple_fold_builtin): Call aarch64_fold_aes_op
> 	for crypto_aese and crypto_aesd.
>
> Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com>
> ---
>  gcc/config/aarch64/aarch64-builtins.cc | 30 ++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
> index 6d5479c2e44..ab67194575d 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -4722,6 +4722,31 @@ aarch64_fold_combine (gcall *stmt)
>    return gimple_build_assign (gimple_call_lhs (stmt), ctor);
>  }
>  
> +/* Fold a call to vaeseq_u8 and vaesdq_u8.
> +   That is `vaeseq_u8 (x ^ y, 0)` gets folded
> +   into `vaeseq_u8 (x, y)`.  */
> +static gimple *
> +aarch64_fold_aes_op (gcall *stmt)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = gimple_call_arg (stmt, 1);
> +  if (integer_zerop (arg0))
> +    arg0 = arg1;
> +  else if (!integer_zerop (arg1))
> +    return nullptr;
> +  if (TREE_CODE (arg0) != SSA_NAME)
> +    return nullptr;
> +  if (!has_single_use (arg0))
> +    return nullptr;

I was initially unsure about the single-use check, but in the end I agree
it's the right choice.

> +  gimple *s = SSA_NAME_DEF_STMT (arg0);
> +  if (!is_gimple_assign (s)
> +      || gimple_assign_rhs_code (s) != BIT_XOR_EXPR)
> +    return nullptr;

Very minor, but how about the slightly less LISPy:

  auto *s = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (arg0));
  if (!s || gimple_assign_rhs_code (s) != BIT_XOR_EXPR)
    return nullptr;

so that we're always accessing gimple_assign_* through a gassign.

OK with that change from my POV, but please give others a day or so
to comment.

Thanks,
Richard

> +  gimple_call_set_arg (stmt, 0, gimple_assign_rhs1 (s));
> +  gimple_call_set_arg (stmt, 1, gimple_assign_rhs2 (s));
> +  return stmt;
> +}
> +
>  /* Fold a call to vld1, given that it loads something of type TYPE.  */
>  static gimple *
>  aarch64_fold_load (gcall *stmt, tree type)
> @@ -4983,6 +5008,11 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt,
>  	  gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt));
>  	  break;
>  
> +      VAR1 (BINOPU, crypto_aese, 0, DEFAULT, v16qi)
> +      VAR1 (BINOPU, crypto_aesd, 0, DEFAULT, v16qi)
> +	new_stmt = aarch64_fold_aes_op (stmt);
> +	break;
> +
>        /* Lower sqrt builtins to gimple/internal function sqrt.  */
>        BUILTIN_VHSDF_DF (UNOP, sqrt, 2, FP)
>  	new_stmt = gimple_build_call_internal (IFN_SQRT,
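
For reference, here is the kind of source-level pattern the new fold
targets.  This is a minimal sketch of my own, not taken from the patch or
from the PR 114522 testcase (the function and variable names are mine),
and it assumes a target with the AES extension enabled, e.g.
-march=armv8-a+crypto:

  #include <arm_neon.h>

  /* One AES round written with an explicit XOR of the state and the
     round key, passing a zero key to the intrinsic.  With the fold
     above, the XOR feeding the zero-key vaeseq_u8 call should be
     absorbed into the AESE operands instead of being emitted as a
     separate EOR instruction.  */
  uint8x16_t
  aes_round (uint8x16_t state, uint8x16_t key)
  {
    return vaesmcq_u8 (vaeseq_u8 (veorq_u8 (state, key), vdupq_n_u8 (0)));
  }

The fold is valid because AESE and AESD start with an AddRoundKey step,
i.e. they XOR their two operands before the substitution/shift steps, so
vaeseq_u8 (x ^ y, 0) and vaeseq_u8 (x, y) compute the same value.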