http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54680
Bug #: 54680
Summary: [SH] Unnecessary int-float-int conversion of fsca fixed point input
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: [email protected]
ReportedBy: [email protected]
Target: sh4*-*-*
SH4's fsca insn takes a 16.16 fixed point input. When an SFmode input is used,
it must first be scaled, which is done by multiplying the input by a
constant. It is also possible to pass fixed point inputs directly to the
standard sinf / cosf functions and thereby avoid the scaling:
#include <math.h>
const float pi = 3.14159265359f;
float test02 (int x)
{
  return sinf (x * (2*pi) / 65536);
}
This will result in:
        lds     r4,fpul         ! 32  movsi_ie/19
        float   fpul,fr2        ! 10  floatsisf2_i4
        ftrc    fr2,fpul        ! 11  fix_truncsfsi2_i4
        fsca    fpul,dr2        ! 12  fsca
        rts                     ! 35  *return_i
        fmov    fr3,fr0         ! 14  movsf_ie/1
... which is already good, but the int -> float -> int conversions can be
eliminated.
The fsca pattern is actually already prepared to support this case. Looking at
the combine log it seems this issue could be resolved by making the fpul input
a bit more flexible:
Failed to match this instruction:
(parallel [
        (set (reg:V2SF 172)
            (vec_concat:V2SF (unspec:SF [
                        (mult:SF (float:SF (fix:SI (float:SF (reg/v:SI 164 [ x ]))))
                            (const_double:SF 9.58737992428525700000000000000000000000000000002e-5 [0x0.c90fdaa22168be2882b78df4b0b1803b8a353e85p-13]))
                    ] 16)
                (unspec:SF [
                        (mult:SF (float:SF (fix:SI (float:SF (reg/v:SI 164 [ x ]))))
                            (const_double:SF 9.58737992428525700000000000000000000000000000002e-5 [0x0.c90fdaa22168be2882b78df4b0b1803b8a353e85p-13]))
                    ] 14)))
        (use (reg/v:PSI 151 ))
    ])
Another scenario that does not work however is:
float test03 (int x)
{
  return sinf ( x * 2 * pi / 65536 );
}
(Notice the missing ( ) around 2 * pi).
It seems this is caused by the fact that the fsca pattern checks for a valid
scaling constant by doing:
&& operands[2] == sh_fsca_int2sf ()"
... which compares rtx pointers instead of looking at the value of the
const_double rtx.
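The distinction between the two kinds of check can be illustrated with a
standalone analogy. This is not GCC code: struct fconst and both functions are
hypothetical stand-ins, and a real fix would have to use whatever GCC provides
for comparing const_double values.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a floating point constant node.  */
struct fconst
{
  double value;
};

/* Identity check: true only if both arguments are the very same object.
   This corresponds to the operands[2] == sh_fsca_int2sf () style test.  */
static bool
same_object (const struct fconst *a, const struct fconst *b)
{
  return a == b;
}

/* Value check: true whenever the stored constants are equal, no matter
   which object holds them.  */
static bool
same_value (const struct fconst *a, const struct fconst *b)
{
  return a->value == b->value;
}
```

Two separately created nodes holding the same constant pass same_value but
fail same_object, which is the kind of mismatch suspected above.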