"H.J. Lu" <hjl.to...@gmail.com> writes: > On Thu, May 17, 2018 at 1:56 AM, Richard Sandiford > <richard.sandif...@linaro.org> wrote: >> Richard Biener <richard.guent...@gmail.com> writes: >>>> @@ -2698,23 +2703,26 @@ convert_mult_to_fma_1 (tree mul_result, >>>> } >>> >>>> if (negate_p) >>>> - mulop1 = force_gimple_operand_gsi (&gsi, >>>> - build1 (NEGATE_EXPR, >>>> - type, mulop1), >>>> - true, NULL_TREE, true, >>>> - GSI_SAME_STMT); >>>> + mulop1 = gimple_build (&seq, NEGATE_EXPR, type, mulop1); >>> >>>> - fma_stmt = gimple_build_assign (gimple_assign_lhs (use_stmt), >>>> - FMA_EXPR, mulop1, op2, addop); >>>> + if (seq) >>>> + gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT); >>>> + fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, >>> addop); >>>> + gimple_call_set_lhs (fma_stmt, gimple_assign_lhs (use_stmt)); >>>> + gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal >>> (use_stmt)); >>>> + gsi_replace (&gsi, fma_stmt, true); >>>> + /* Valueize aggressively so that we generate FMS, FNMA and FNMS >>>> + regardless of where the negation occurs. */ >>>> + if (fold_stmt (&gsi, aggressive_valueize)) >>>> + update_stmt (gsi_stmt (gsi)); >>> >>> I think it would be nice to be able to use gimple_build () with IFNs so you >>> can >>> gimple_build () the IFN and then use gsi_replace_with_seq () on it. You >>> only need to fold with generated negates, not with negates already in the >>> IL? >>> The the folding implied with gimple_build will take care of it. >> >> The idea was to pick up existing negates that feed the multiplication >> as well as any added by the pass itself. >> >> On IRC yesterday we talked about how this should handle the ECF_NOTHROW >> flag, and whether things like IFN_SQRT and IFN_FMA should always be >> nothrow (like the built-in functions are). But in the end I thought >> it'd be better to keep things as they are. We already handle >> -fnon-call-exceptions for unfused a * b + c and before the patch also >> handled it for FMA_EXPR. It'd seem like a step backwards if the new >> internal functions didn't handle it too. If anything it seems like the >> built-in functions should change to be closer to the tree_code and >> internal_fn way of doing things, if we want to support -fnon-call-exceptions >> properly. >> >> This also surprised me when doing the if-conversion patch I sent yesterday. >> We're happy to vectorise: >> >> for (int i = 0; i < 100; ++i) >> x[i] = ... ? sqrt (x[i]) : 0; >> >> by doing the sqrt unconditionally and selecting on the result, even with >> the default maths flags, but refuse to vectorise the simpler: >> >> for (int i = 0; i < 100; ++i) >> x[i] = ... ? x[i] + 1 : 0; >> >> in the same way. >> >>> Otherwise can you please move aggressive_valueize to gimple-fold.[ch] >>> alongside no_follow_ssa_edges / follow_single_use_edges and maybe >>> rename it as follow_all_ssa_edges? >> >> Ah, yeah, that's definitely a better name. >> >> I also renamed all_scalar_fma to scalar_all_fma, since I realised >> after Andrew's reply that the old name made it sound like it was >> "all scalars", whereas it meant to mean "all fmas". >> >> Tested as before. >> >> Thanks, >> Richard >> >> 2018-05-17 Richard Sandiford <richard.sandif...@linaro.org> >> >> gcc/ >> * doc/sourcebuild.texi (scalar_all_fma): Document. >> * tree.def (FMA_EXPR): Delete. >> * internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions. >> * internal-fn.c (ternary_direct): New macro. >> (expand_ternary_optab_fn): Likewise. >> (direct_ternary_optab_supported_p): Likewise. >> * Makefile.in (build/genmatch.o): Depend on case-fn-macros.h. >> * builtins.c (fold_builtin_fma): Delete. >> (fold_builtin_3): Don't call it. >> * cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling. >> * expr.c (expand_expr_real_2): Likewise. >> * fold-const.c (operand_equal_p): Likewise. >> (fold_ternary_loc): Likewise. >> * gimple-pretty-print.c (dump_ternary_rhs): Likewise. >> * gimple.c (DEFTREECODE): Likewise. >> * gimplify.c (gimplify_expr): Likewise. >> * optabs-tree.c (optab_for_tree_code): Likewise. >> * tree-cfg.c (verify_gimple_assign_ternary): Likewise. >> * tree-eh.c (operation_could_trap_p): Likewise. >> (stmt_could_throw_1_p): Likewise. >> * tree-inline.c (estimate_operator_cost): Likewise. >> * tree-pretty-print.c (dump_generic_node): Likewise. >> (op_code_prio): Likewise. >> * tree-ssa-loop-im.c (stmt_cost): Likewise. >> * tree-ssa-operands.c (get_expr_operands): Likewise. >> * tree.c (commutative_ternary_tree_code, add_expr): Likewise. >> * fold-const-call.h (fold_fma): Delete. >> * fold-const-call.c (fold_const_call_ssss): Handle CFN_FMS, >> CFN_FNMA and CFN_FNMS. >> (fold_fma): Delete. >> * genmatch.c (combined_fn): New enum. >> (commutative_ternary_tree_code): Remove FMA_EXPR handling. >> (commutative_op): New function. >> (commutate): Use it. Handle more than 2 operands. >> (dt_operand::gen_gimple_expr): Use commutative_op. >> (parser::parse_expr): Allow :c to be used with non-binary >> operators if the commutative operand is known. >> * gimple-ssa-backprop.c (backprop::process_builtin_call_use): Handle >> CFN_FMS, CFN_FNMA and CFN_FNMS. >> (backprop::process_assign_use): Remove FMA_EXPR handling. >> * hsa-gen.c (gen_hsa_insns_for_operation_assignment): Likewise. >> (gen_hsa_fma): New function. >> (gen_hsa_insn_for_internal_fn_call): Use it for IFN_FMA, IFN_FMS, >> IFN_FNMA and IFN_FNMS. >> * match.pd: Add folds for IFN_FMS, IFN_FNMA and IFN_FNMS. >> * gimple-fold.h (follow_all_ssa_edges): Declare. >> * gimple-fold.c (follow_all_ssa_edges): New function. >> * tree-ssa-math-opts.c (convert_mult_to_fma_1): Use the >> gimple_build interface and use follow_all_ssa_edges to fold >> the result. >> (convert_mult_to_fma): Use direct_internal_fn_suppoerted_p >> instead of checking for optabs directly. >> * config/i386/i386.c (ix86_add_stmt_cost): Recognize FMAs as calls >> rather than FMA_EXPRs. >> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Create a >> call to IFN_FMA instead of an FMA_EXPR. >> >> gcc/brig/ >> * brigfrontend/brig-function.cc >> (brig_function::get_builtin_for_hsa_opcode): Use BUILT_IN_FMA >> for BRIG_OPCODE_FMA. >> (brig_function::get_tree_code_for_hsa_opcode): Treat BUILT_IN_FMA >> as a call. >> >> gcc/c/ >> * gimple-parser.c (c_parser_gimple_postfix_expression): Remove >> __FMA_EXPR handlng. >> >> gcc/cp/ >> * constexpr.c (cxx_eval_constant_expression): Remove FMA_EXPR >> handling. >> (potential_constant_expression_1): Likewise. >> >> gcc/testsuite/ >> * lib/target-supports.exp (check_effective_target_scalar_all_fma): >> New proc. >> * gcc.dg/fma-1.c: New test. >> * gcc.dg/fma-2.c: Likewise. >> * gcc.dg/fma-3.c: Likewise. >> * gcc.dg/fma-4.c: Likewise. >> * gcc.dg/fma-5.c: Likewise. >> * gcc.dg/fma-6.c: Likewise. >> * gcc.dg/fma-7.c: Likewise. >> * gcc.dg/gimplefe-26.c: Use .FMA instead of __FMA and require >> scalar_all_fma. >> * gfortran.dg/reassoc_7.f: Pass -ffp-contract=off. >> * gfortran.dg/reassoc_8.f: Likewise. >> * gfortran.dg/reassoc_9.f: Likewise. >> * gfortran.dg/reassoc_10.f: Likewise. >> > > This caused: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85881
Sorry about that. Looking at it, I think it's another instance of the original RA problem in PR 80481. The expand code after this patch is the same as before apart from the numbering of the pseudo registers: after the patch there are fewer unused pseudo registers than before, which in itself should be a good thing. However, it seems to be enough to defeat the fix for PR80481, and in fact reverting that fix makes pr80481.C pass again. Thanks, Richard