[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2019-10-09 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

Hongtao.liu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #30

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-18 Thread ubizjak at gmail dot com
--- Comment #29 from ubizjak at gmail dot com 2007-06-18 08:56 --- Patch was committed to SVN, so closing as fixed. -- ubizjak at gmail dot com changed: What|Removed |Added ---

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-16 Thread uros at gcc dot gnu dot org
--- Comment #28 from uros at gcc dot gnu dot org 2007-06-16 09:53 ---
Subject: Bug 31723

Author: uros
Date: Sat Jun 16 09:52:48 2007
New Revision: 125756

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125756
Log:
PR middle-end/31723
* hooks.c (hook_tree_tree_bool_null):

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-15 Thread burnus at gcc dot gnu dot org
--- Comment #27 from burnus at gcc dot gnu dot org 2007-06-15 13:23 --- Cross-pointer: see also PR 32352 (Polyhedron aermod.f90 crashes due to out-of-bounds problems caused by numerical differences when using rsqrt/-mrecip). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-14 Thread ubizjak at gmail dot com
--- Comment #26 from ubizjak at gmail dot com 2007-06-14 09:18 --- Patch at http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00944.html -- ubizjak at gmail dot com changed: What|Removed |Added ---

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-13 Thread ubizjak at gmail dot com
--- Comment #25 from ubizjak at gmail dot com 2007-06-13 20:20 --- RFC patch at http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00916.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread tbptbp at gmail dot com
--- Comment #24 from tbptbp at gmail dot com 2007-06-11 05:58 --- Yes, but there's some fuss at 0 when you pile up a NR round. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread ubizjak at gmail dot com
--- Comment #23 from ubizjak at gmail dot com 2007-06-11 05:51 ---
(In reply to comment #22)
> At some point icc did such transformations (for 1/x and sqrt) but, apparently,
> they're now removed. It didn't bother to plug every hole (i.e. w.r.t. infinities)
> but at least got the case of 0

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread tbptbp at gmail dot com
--- Comment #22 from tbptbp at gmail dot com 2007-06-11 03:32 --- I'm a bit late to the debate but... At some point icc did such transformations (for 1/x and sqrt) but, apparently, they're now removed. It didn't bother to plug every hole (i.e. w.r.t. infinities) but at least got the case of

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread rguenth at gcc dot gnu dot org
--- Comment #21 from rguenth at gcc dot gnu dot org 2007-06-10 21:48 --- The other issue is really about this bug, so not splitting. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread rguenth at gcc dot gnu dot org
--- Comment #20 from rguenth at gcc dot gnu dot org 2007-06-10 21:46 --- PR32279 for 1/sqrt(x/y) to sqrt(y/x) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread rguenther at suse dot de
--- Comment #19 from rguenther at suse dot de 2007-06-10 21:39 --- Subject: Re: Use reciprocal and reciprocal square root with -ffast-math On Sun, 10 Jun 2007, ubizjak at gmail dot com wrote: > > > --- Comment #18 from ubizjak at gmail dot com 2007-06-10 17:34 --- > (In re

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread ubizjak at gmail dot com
--- Comment #18 from ubizjak at gmail dot com 2007-06-10 17:34 ---
(In reply to comment #14)
> The interesting difference between sqrtss, divss and rcpss, rsqrtss is that
> the former have throughput of 1/16 while the latter are 1/1 (latencies compare
> 21 vs. 3). This is on K10. The o

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread ubizjak at gmail dot com
--- Comment #17 from ubizjak at gmail dot com 2007-06-10 16:49 ---
(In reply to comment #0)
> /* Mathematically equivalent to 1/sqrt(b*(1/a)) */
> return sqrtf(a/b);

Whoa, this one is a little gem, but ATM in the opposite direction. At least for -ffast-math we could optimize (a /

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread ubizjak at gmail dot com
--- Comment #16 from ubizjak at gmail dot com 2007-06-10 16:24 ---
(In reply to comment #13)
> > x1 = 0.5 X0 (3.0 - A x0 x0 x0)

Whoops! One x0 too many above. The correct calculation reads: rsqrt = 0.5 rsqrt(a) (3.0 - a rsqrt(a) rsqrt(a)).

> Well, I suppose it depends on the hardware. IIR
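
The corrected refinement formula in this comment can be written as a small C helper. This is only a sketch: the name rsqrt_nr_step is hypothetical, and the initial estimate x0 is passed in by the caller instead of coming from an rsqrtss instruction, so the code stays portable.

```c
/* One Newton-Raphson step refining an estimate x0 of 1/sqrt(a):
 *     x1 = 0.5 * x0 * (3.0 - a * x0 * x0)
 * rsqrt_nr_step is an illustrative name, not a GCC or libm function.
 * The caller supplies x0; on SSE hardware it would typically be the
 * result of rsqrtss (assumption, not shown here). */
float rsqrt_nr_step(float a, float x0)
{
    return 0.5f * x0 * (3.0f - a * x0 * x0);
}
```

For a = 4.0f and a deliberately crude estimate x0 = 0.49f, one step gives a value close to 0.4997, much nearer to the exact 0.5 than the input was. An exact input is a fixed point: passing x0 = 0.5f returns 0.5f.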

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread rguenth at gcc dot gnu dot org
--- Comment #15 from rguenth at gcc dot gnu dot org 2007-06-10 12:09 --- And of course optimizing division or square root this way violates IEEE 754, which specifies these as intrinsic operations. So a flag separate from -funsafe-math-optimizations should be used for this optimization.

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread rguenth at gcc dot gnu dot org
--- Comment #14 from rguenth at gcc dot gnu dot org 2007-06-10 12:07 --- The interesting difference between sqrtss, divss and rcpss, rsqrtss is that the former have throughput of 1/16 while the latter are 1/1 (latencies compare 21 vs. 3). This is on K10. The optimization guide only me

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread jb at gcc dot gnu dot org
--- Comment #13 from jb at gcc dot gnu dot org 2007-06-10 11:06 ---
(In reply to comment #11)

Thanks for the work.

> First, please note that "divss" instruction is quite _fast_, clocking at 23
> cycles, where approximation with NR step would sum up to 20 cycles, not
> counting load of

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread ubizjak at gmail dot com
--- Comment #12 from ubizjak at gmail dot com 2007-06-10 10:47 --- Here are the results of mubench insn timings for various x86 processors: http://mubench.sourceforge.net/results.html (target processor can be benchmarked by downloading mubench from http://mubench.sourceforge.net/index.ht

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-06-10 Thread ubizjak at gmail dot com
--- Comment #11 from ubizjak at gmail dot com 2007-06-10 08:28 --- I have experimented a bit with rcpss, trying to measure the effect of additional NR step to the performance. NR step was calculated based on http://en.wikipedia.org/wiki/N-th_root_algorithm, and for N=-1 (1/A) we can simp
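
For illustration, the extra Newton-Raphson (NR) step for a reciprocal can be sketched in portable C. This is a minimal sketch under stated assumptions, not the code that was benchmarked in this comment: the step is the textbook form x1 = x0 * (2 - a * x0), and approx_rcp is a hypothetical stand-in that emulates a roughly 12-bit estimate (the accuracy figure quoted elsewhere in this thread) by truncating the exact quotient, so that it runs on any target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware reciprocal estimate such as rcpss:
 * compute the exact quotient, then clear the low 12 mantissa bits so that
 * only about 12 significant bits remain. */
float approx_rcp(float a)
{
    float r = 1.0f / a;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFFF000u;   /* keep sign, exponent, top 11 mantissa bits */
    memcpy(&r, &bits, sizeof bits);
    return r;
}

/* One Newton-Raphson step for f(x) = 1/x - a:
 *     x1 = x0 * (2 - a * x0)
 * If x0 has relative error e, x1 has relative error of about e * e. */
float rcp_nr(float a)
{
    float x0 = approx_rcp(a);
    return x0 * (2.0f - a * x0);
}
```

With a = 3.0f, the truncated estimate is off by roughly 1e-4, while the refined value agrees with 1.0f / a to within about 1e-7, which is the error-squaring behaviour expected from a single NR step.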

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread pinskia at gcc dot gnu dot org
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread rguenth at gcc dot gnu dot org
--- Comment #9 from rguenth at gcc dot gnu dot org 2007-04-27 22:03 --- I looked at this at some point and in principle it doesn't require it. For the vectorized call we'd need to support target-dependent pattern vectorization; for the scalar case we would need a new optab to handle 1/x e

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread steven at gcc dot gnu dot org
--- Comment #8 from steven at gcc dot gnu dot org 2007-04-27 21:43 --- I suppose this is something that requires new builtins? -- steven at gcc dot gnu dot org changed: What|Removed |Added ---

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread burnus at gcc dot gnu dot org
--- Comment #7 from burnus at gcc dot gnu dot org 2007-04-27 12:41 ---
> (float) time for 1.0 / sqrt = 5.96 sec (res = 2.845058125000e+05)
> (float) time for rsqrt = 2.49 sec (res = 2.23602250e+05)
> (double) time for 1.0 / sqrt = 7.35 sec (res = 5.9926234364635509e+0

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread rguenth at gcc dot gnu dot org
--- Comment #6 from rguenth at gcc dot gnu dot org 2007-04-27 12:09 --- You are right, they are only available for float precision. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31723

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread jb at gcc dot gnu dot org
--- Comment #5 from jb at gcc dot gnu dot org 2007-04-27 12:01 ---
With the benchmarks at http://www.hlnum.org/english/doc/frsqrt/frsqrt.html I get

~/src/benchmark/rsqrt% g++ -O3 -funroll-loops -ffast-math -funit-at-a-time -march=k8 -mfpmath=sse frsqrt.cc
~/src/benchmark/rsqrt% ./a.out

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread jb at gcc dot gnu dot org
--- Comment #4 from jb at gcc dot gnu dot org 2007-04-27 11:29 ---
(In reply to comment #3)
> 1. Convert to single precision
> 2. Calculate rcp(s|p)s or rsqrt(p|s)s
> 3. Refine with newton iteration
>
> vs. just using div(p|s)d or sqrt(p|s)d?

This should be 1. Convert to single precis

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread jb at gcc dot gnu dot org
--- Comment #3 from jb at gcc dot gnu dot org 2007-04-27 11:27 ---
(In reply to comment #2)
> Note that SSE can vectorize only the float precision variant, not the double
> precision one. So one needs to carefully either disable vectorization for the
> double variant to get reciprocal co

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread rguenth at gcc dot gnu dot org
--- Comment #2 from rguenth at gcc dot gnu dot org 2007-04-27 10:45 --- Note that SSE can vectorize only the float precision variant, not the double precision one. So one needs to carefully either disable vectorization for the double variant to get reciprocal code or the other way around

[Bug middle-end/31723] Use reciprocal and reciprocal square root with -ffast-math

2007-04-27 Thread burnus at gcc dot gnu dot org
--- Comment #1 from burnus at gcc dot gnu dot org 2007-04-27 10:16 --- Comment by Richard Guenther in the same thread: - I think that even with -ffast-math 12 bits of accuracy is not OK. There is the possibility of doing another Newton iteration step to improve accuracy, th