Thanks Jeff and Richard S.
Not sure if I followed up the discussion correct, but this patch only try to
fix the vxrm insn
deleted during late-combine (same scenario as frm) by adding it to global_regs.
If global_regs is not the right place according to the sematic of vxrm, we may
need other fix up to a point.
AFAIK, the most difference between vxrm and frm may look like below, take rvv
intrinsic as example:
13 │ void vxrm ()
14 │ {
15 │ size_t vl = __riscv_vsetvl_e16m1 (N);
16 │ vuint16m1_t va = __riscv_vle16_v_u16m1 (a, vl);
17 │ vuint16m1_t vb = __riscv_vle16_v_u16m1 (b, vl);
18 │ vuint16m1_t vc = __riscv_vaaddu_vv_u16m1 (va, vb, __RISCV_VXRM_RDN,
vl);
19 │
20 │ __riscv_vse16_v_u16m1 (c, vc, vl);
21 │
22 │ call_external ();
23 │ }
24 │
25 │ void frm ()
26 │ {
27 │ size_t vl = __riscv_vsetvl_e16m1 (N);
28 │
29 │ vfloat16m1_t va = __riscv_vle16_v_f16m1(af, vl);
30 │ va = __riscv_vfnmadd_vv_f16m1_rm(va, va, va, __RISCV_FRM_RDN, vl);
31 │ __riscv_vse16_v_f16m1(bf, va, vl);
32 │
33 │ call_external ();
34 │ }
With option "-march=rv64gcv_zvfh -O3"
10 │ vxrm:
11 │ csrwi vxrm,2 // Just set rm directly
...
17 │ vle16.v v2,0(a4)
18 │ vle16.v v1,0(a3)
...
21 │ vaaddu.vv v1,v1,v2
22 │ vse16.v v1,0(a4)
23 │ tail call_external
28 │ frm:
29 │ frrm a2 // backup
30 │ fsrmi 2 // set rm
...
35 │ vle16.v v1,0(a3)
36 │ addi a5,a5,%lo(bf)
37 │ vfnmadd.vv v1,v1,v1
38 │ vse16.v v1,0(a5)
39 │ fsrm a2 // restore
40 │ tail call_external
However, I would like to wait Jeff, or other RISC-V ports for a while before
any potential action to take.
Pan
-----Original Message-----
From: Richard Sandiford <[email protected]>
Sent: Wednesday, February 12, 2025 5:03 PM
To: Jeff Law <[email protected]>
Cc: Andrew Waterman <[email protected]>; Li, Pan2 <[email protected]>;
[email protected]; [email protected]; [email protected];
[email protected]
Subject: Re: [PATCH v1] RISC-V: Make VXRM as global register [PR118103]
Jeff Law <[email protected]> writes:
> On 2/11/25 3:17 PM, Richard Sandiford wrote:
>> Jeff Law <[email protected]> writes:
>>> On 2/11/25 9:08 AM, Richard Sandiford wrote:
>>>> Jeff Law <[email protected]> writes:
>>>>> On 2/7/25 5:59 AM, Andrew Waterman wrote:
>>>>>> This patch runs counter to the ABI spec, which states that vxrm is not
>>>>>> preserved across calls and is volatile upon function entry [1]. vxrm
>>>>>> does not play the same role as frm plays in the calling convention.
>>>>>> (I won't get into the rationale in this email, but the rationale isn't
>>>>>> especially important: we should follow the ABI.)
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/3a79e936eec5491078b1133ac943f91ef5fd75fd/riscv-cc.adoc?plain=1#L119-L120
>>>>> Pan's patch doesn't change the basic property that VXRM has no known
>>>>> state at function entry or upon return from a function call.
>>>>
>>>> I think it will. global_regs[X] means that X is defined on entry,
>>>> defined on exit, and can be changed by calls. If the register is
>>>> call-clobbered/volatile/caller-saved, then I agree with Andrew that
>>>> this doesn't look like the right fix.
>>> But the LCM code we use to manage vxrm assignments makes no assumption
>>> about incoming state and assumes no state is preserved across calls.
>>
>> In that case, I wonder what the patch is fixing. Like you say,
>> the initial mode seems to be VXRM_MODE_NONE, and it looks like
>> riscv_vxrm_mode_after correctly models calls as clobbering the mode.
> Just realized I didn't answer this part of your message. It's not
> really fixing any known issue. Just felt like the right thing to do as
> VXRM is roughly similar to (but clearly not 100% the same) FRM.
But it sounds from the discussion like one of the differences between
FRM and VXRM is also the key difference between marking something as a
global register and marking it as a call-clobbered register.
Rounding modes and exception modes are usually global, because that's
necessary for things like fesetround and fesetexceptflag to work properly.
Like you said in your other reply, there are restrictions about what
the rounding mode can be on entry to certain functions, but that's more
of an API precondition.
It sounds like the ABI defines FRM to be such a global register but that
it defines VXRM (which isn't bound to C library restrictions) to be a
call-clobbered register.
If we want to set a call-clobbered fixed register to a specific local
value, between calls to foo and bar, the sequence would be:
call foo
FIXED_REG := ...
...use FIXED_REG...
call bar
It sounds like this is the correct sequence for VXRM and that it's what
the port generates. If, after introducing the FIXED_REG assignment,
we later delete the uses as dead, the FIXED_REG assignment will also
become dead, since its value is clobbered by the call to bar.
If instead we want to set a global register to a specific local value,
the sequence would be:
call foo
TMP := FIXED_REG
FIXED_REG := ...
...use FIXED_REG...
FIXED_REG := TMP
call bar
It sounds like this is the correct sequence for FRM and it seemed to be
what the port was generating in the PR. If, after introducing the FIXED_REG
assignment, we later delete the uses as dead, the first FIXED_REG assignment
will become dead due to the later FIXED_REG := TMP. Then the TMP := FIXED_REG
and FIXED_REG := TMP collapse into a no-op.
But if we pretend that a call-clobbered register is a global register,
we'd still generate the first sequence above:
call foo
FIXED_REG := ...
...use FIXED_REG...
call bar
but the dataflow would not be as accurate. If we later delete the use
of the fixed register as dead, the assignment would still be kept live
by its assumed use in bar. (Or, if there is no later call, by its
assumed use in the caller.)
Obviously it's not the port I work on, or my call, but if the patch
isn't fixing a known issue then I wonder if it should be reverted.
The justification in the commit message -- that VXRM is a cooperatively-
managed global register -- seems from what Andrew said to be inaccurate.
So it seems like the only effect of the patch is to make the dataflow
less correct than it was before.
But like I say, I realise I'm sticking my oar in here.
Thanks,
Richard