> On 29 Oct 2024, at 7:16 PM, Richard Biener <rguent...@suse.de> wrote: > > External email: Use caution opening links or attachments > > > On Mon, 28 Oct 2024, Soumya AR wrote: > >> This patch transforms the following POW calls to equivalent LDEXP calls, as >> discussed in PR57492: >> >> powi (2.0, i) -> ldexp (1.0, i) >> >> a * powi (2.0, i) -> ldexp (a, i) >> >> 2.0 * powi (2.0, i) -> ldexp (1.0, i + 1) >> >> pow (powof2, i) -> ldexp (1.0, i * log2 (powof2)) >> >> powof2 * pow (2, i) -> ldexp (1.0, i + log2 (powof2)) > > For the multiplication cases why not handle powof2 * ldexp (1., i) > to ldexp (1., i + log2 (powof2)) and a * ldexp (1., i) -> ldexp (a, i) > instead? exp2 * ldexp (1., i) is another candidate. > > So please split out the multiplication parts. > > + /* Simplify pow (powof2, i) to ldexp (1, i * log2 (powof2)). */ > > the below pattern handles POWI, not POW. > > + (simplify > + (POWI REAL_CST@0 @1) > + (with { HOST_WIDE_INT tmp = 0; > + tree integer_arg1 = NULL_TREE; } > + (if (integer_valued_real_p (@0) > + && real_isinteger (&TREE_REAL_CST (@0), &tmp) > + && integer_pow2p (integer_arg1 = build_int_cst (integer_type_node, > tmp))) > > && tmp > 0 > && pow2p_hwi (tmp) > > + (LDEXP { build_one_cst (type); } > + (mult @1 { build_int_cst (integer_type_node, > + tree_log2 (integer_arg1)); }))))) > > build_int_cst (integer_type_node, exact_log2 (tmp)) > > + /* Simplify powi (2.0, i) to ldexp (1, i). */ > + (simplify > + (POWI REAL_CST@0 @1) > + (if (real_equal (TREE_REAL_CST_PTR (@0), &dconst2)) > + (LDEXP { build_one_cst (type); } @1))) > + > > You'll have a duplicate pattern here, instead merge them. 2.0 > is power-of-two so I wonder why the pattern is needed.
Thanks for the feedback! I've merged the extra case that was specifically checking for 2.0. Like you suggested, I also added two ldexp specific transforms: • powof2 * ldexp (x, i) -> ldexp (x, i + log2 (powof2)) • a * ldexp(1., i) -> ldexp (a, i) Regarding your suggestion to also fold exp2, a conversion like exp2 (x) -> ldexp (1., x) or exp2 (x) * ldexp (y, i) -> ldexp (y, i + x) is a bit tricky because we'd have to cast it to an integer before passing it to ldexp. real_to_integer only works for constants, which isn't helpful here as exp2 (CST) becomes a power of 2 anyway and matches with the above patterns. We'll have to explicitly convert it for non constants and I'm not sure if that is worth it for this patch. Let me know what you think. Best, Soumya > Richard. > >> >> This is especially helpful for SVE architectures as LDEXP calls can be >> implemented using the FSCALE instruction, as seen in the following patch: >> https://gcc.gnu.org/pipermail/gcc-patches/2024-September/664160.html >> >> SPEC2017 was run with this patch, while there are no noticeable improvements, >> there are no non-noise regressions either. >> >> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. >> OK for mainline? >> >> Signed-off-by: Soumya AR <soum...@nvidia.com> >> >> gcc/ChangeLog: >> PR target/57492 >> * match.pd: Added patterns to fold certain calls to pow to ldexp. >> >> gcc/testsuite/ChangeLog: >> PR target/57492 >> * gcc.dg/tree-ssa/pow-to-ldexp.c: New test. >> >> > > -- > Richard Biener <rguent...@suse.de> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
v2-0001-Match-Fold-pow-calls-to-ldexp-when-possible-PR57492.patch
Description: v2-0001-Match-Fold-pow-calls-to-ldexp-when-possible-PR57492.patch