On Thu, Jun 05, 2025 at 09:19:19PM +0900, Takayuki 'January June' Suwa wrote:
> On 2025/06/05 5:09, Max Filippov wrote:
> > On Tue, Jun 3, 2025 at 7:44 AM Takayuki 'January June' Suwa
> > <jjsuwa_sys3...@yahoo.co.jp> wrote:
> > > 
> > > By using the previously unused CEIL|FLOOR|ROUND.S floating-point
> > > coprocessor instructions.  In addition, two instruction operand format
> > > codes are added to output the scale value as assembler source.
> > > 
> > >       /* example */
> > >       int test0(float a) {
> > >         return __builtin_lceilf(a);
> > >       }
> > >       int test1(float a) {
> > >         return __builtin_lfloorf(a * 2);
> > >       }
> > >       int test2(float a) {
> > >         /* __builtin_lroundf() requires -fno-math-errno */
> > >         return __builtin_lroundf(a * 32768);
> > >       }
> > > 
> > >       ;; result
> > >       test0:
> > >          entry   sp, 32
> > >          wfr     f0, a2
> > >          ceil.s  a2, f0, 0
> > >          retw.n
> > >       test1:
> > >          entry   sp, 32
> > >          wfr     f0, a2
> > >          floor.s a2, f0, 1
> > >          retw.n
> > >       test2:
> > >          entry   sp, 32
> > >          wfr     f0, a2
> > >          round.s a2, f0, 15
> > >          retw.n
> > > 
> > > gcc/ChangeLog:
> > > 
> > >          * config/xtensa/xtensa.cc (printx, print_operand):
> > >          Add two instruction operand format codes 'U' and 'V',
> > >          which represent scale factors of the 0th to 15th
> > >          positive/negative power of two.
> > >          * config/xtensa/xtensa.md (c_enum "unspec"):
> > >          Add UNSPEC_CEIL, UNSPEC_FLOOR and UNSPEC_ROUND.
> > >          (int_iterator ANY_ROUND, int_attr m_round):
> > >          New integer iterator and its attribute.
> > >          (fix<s_fix>_truncsfsi2, *fix<s_fix>_truncsfsi2_2x,
> > >          *fix<s_fix>_truncsfsi2_scaled, float<s_float>sisf2,
> > >          *float<s_float>sisf2_scaled):
> > >          Use output templates with the operand formats added above,
> > >          instead of individual output statements.
> > >          (l<m_round>sfsi2, *l<m_round>sfsi2_2x, *l<m_round>sfsi2_scaled):
> > >          New insn patterns.
> > > ---
> > >    gcc/config/xtensa/xtensa.cc | 16 ++++++++++++
> > >    gcc/config/xtensa/xtensa.md | 52 ++++++++++++++++++++++++++++---------
> > >    2 files changed, 56 insertions(+), 12 deletions(-)
> > 
> > This passes without new regressions on targets without FPU, but I see
> > a few new failures in the gfortran testsuite on a target with FPU:
> > 
> > +FAIL: gfortran.dg/nint_2.f90   -O0  execution test
> > +FAIL: gfortran.dg/out_of_range_1.f90   -O0  execution test
> > +FAIL: gfortran.dg/out_of_range_1.f90   -O1  execution test
> > +FAIL: gfortran.dg/out_of_range_1.f90   -O2  execution test
> > +FAIL: gfortran.dg/out_of_range_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> > +FAIL: gfortran.dg/out_of_range_1.f90   -O3 -g  execution test
> > +FAIL: gfortran.dg/out_of_range_1.f90   -Os  execution test
> > +FAIL: gfortran.dg/out_of_range_2.f90   -O0  execution test
> > +FAIL: gfortran.dg/out_of_range_2.f90   -O1  execution test
> > +FAIL: gfortran.dg/out_of_range_2.f90   -O2  execution test
> > +FAIL: gfortran.dg/out_of_range_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> > +FAIL: gfortran.dg/out_of_range_2.f90   -O3 -g  execution test
> > +FAIL: gfortran.dg/out_of_range_2.f90   -Os  execution test
> > 
> > At first glance they may be related to the rounding mode settings,
> > let me take a closer look.
> > 
> gccint says l(ceil/floor/round)sfsi2 do not depend on the current
> rounding mode.

I couldn't find anything in gccint about how lroundsfsi2 should behave;
I was looking for how exactly the rounding is supposed to happen.

> Similarly, Xtensa ISA refman says CEIL/FLOOR/ROUND.S will
> not affect the current rounding mode.
>
> I suspect that the behavior for out-of-range input values, e.g. values
> that do not fit in signed 32 bits, infinity or NaN, may be different.

The issue is that the round.s opcode rounds halfway cases to the nearest
even integer, not away from zero as the failing tests expect, AFAICT.
So e.g. round.s rounds -128.5 to -128, whereas the tests expect -129.
I observe this with both the FPU2000 and DFPU Xtensa options.

-- 
Thanks.
-- Max
