On Wed, 25 Oct 2023 at 10:29, Song Gao <[email protected]> wrote:
>
> The result of the LSX instruction is in the low 128 bits
> of the vreg register. We use clean_lsx_result() to clean up
> the high 128 bits of the vreg register.
>
> Signed-off-by: Song Gao <[email protected]>
> ---
> loongarch64.risu | 2121 ++++++++++++++++++++++++++--------------
> risugen | 2 +-
> risugen_loongarch64.pm | 20 +
> 3 files changed, 1405 insertions(+), 738 deletions(-)
>
> diff --git a/loongarch64.risu b/loongarch64.risu
> index 596de90..531470d 100644
> --- a/loongarch64.risu
> +++ b/loongarch64.risu
> @@ -64,26 +64,26 @@ mulw_d_wu LA64 0000 00000001 11111 rk:5 rj:5 rd:5 \
> #div.{w[u]/d[u]} rd,rj,rk
> # div.w{u}, mod.w[u] rk, rj, need in [0x0 ~0x7FFFFFFF]
> # use function set_reg_w($reg)
> -div_w LA64 0000 00000010 00000 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> - !memory { set_reg_w($rj); set_reg_w($rk); }
> -div_wu LA64 0000 00000010 00010 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> - !memory { set_reg_w($rj); set_reg_w($rk); }
> -div_d LA64 0000 00000010 00100 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> -div_du LA64 0000 00000010 00110 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> -mod_w LA64 0000 00000010 00001 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> - !memory { set_reg_w($rj); set_reg_w($rk); }
> -mod_wu LA64 0000 00000010 00011 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> - !memory { set_reg_w($rj); set_reg_w($rk); }
> -mod_d LA64 0000 00000010 00101 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> -mod_du LA64 0000 00000010 00111 rk:5 rj:5 rd:5 \
> - !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> +#div_w LA64 0000 00000010 00000 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> +# !memory { set_reg_w($rj); set_reg_w($rk); }
> +#div_wu LA64 0000 00000010 00010 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> +# !memory { set_reg_w($rj); set_reg_w($rk); }
> +#div_d LA64 0000 00000010 00100 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> +#div_du LA64 0000 00000010 00110 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> +#mod_w LA64 0000 00000010 00001 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> +# !memory { set_reg_w($rj); set_reg_w($rk); }
> +#mod_wu LA64 0000 00000010 00011 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; } \
> +# !memory { set_reg_w($rj); set_reg_w($rk); }
> +#mod_d LA64 0000 00000010 00101 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> +#mod_du LA64 0000 00000010 00111 rk:5 rj:5 rd:5 \
> +# !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
Why do we comment out all these patterns ? The commit
message doesn't say anything about this.
>
> alsl_w LA64 0000 00000000 010 sa2:2 rk:5 rj:5 rd:5 \
> !constraints { $rk != 2 && $rj != 2 && $rd != 2; }
> @@ -615,665 +615,1248 @@ fstx_d LA64 0011 10000011 11000 rk:5 rj:5 fd:5 \
> # LSX instructions
> #
>
> -vadd_b LSX 0111 00000000 10100 vk:5 vj:5 vd:5
> -vadd_h LSX 0111 00000000 10101 vk:5 vj:5 vd:5
> +vadd_b LSX 0111 00000000 10100 vk:5 vj:5 vd:5 \
> + !clean { clean_lsx_result($vd); }
> +vadd_h LSX 0111 00000000 10101 vk:5 vj:5 vd:5 \
> + !clean { clean_lsx_result($vd); }
If there are patterns that need the clean_lsx_result
handling, then add the support for that first, rather
than adding a broken pattern and then editing it to
add the extra block to every pattern, please.
> diff --git a/risugen b/risugen
> index fa94a39..8a67fdf 100755
> --- a/risugen
> +++ b/risugen
> @@ -43,7 +43,7 @@ my @pattern_re = (); # include pattern
> my @not_pattern_re = (); # exclude pattern
>
> # Valid block names (keys in blocks hash)
> -my %valid_blockname = ( constraints => 1, memory => 1, safefloat =>1 );
> +my %valid_blockname = ( constraints => 1, memory => 1, safefloat =>1, clean
> => 1 );
>
> sub parse_risu_directive($$@)
> {
> diff --git a/risugen_loongarch64.pm b/risugen_loongarch64.pm
> index b3f901d..97f00f3 100644
> --- a/risugen_loongarch64.pm
> +++ b/risugen_loongarch64.pm
> @@ -81,6 +81,18 @@ sub nanbox_s($)
> return $fpreg;
> }
>
> +sub clean_lsx_result($)
> +{
> + my ($vreg) = @_;
> +
> + # xvinsgr2vr.d vd, r0, 2;
> + insn32(0x76ebe000 | 2 << 10 | $vreg);
> + # xvinsgr2vr.d vd, r0, 3;
> + insn32(0x76ebe000 | 3 << 10 | $vreg);
> +
> + return $vreg;
> +}
> +
> sub align($)
> {
> my ($a) = @_;
> @@ -405,6 +417,7 @@ sub gen_one_insn($$)
> my $constraint = $rec->{blocks}{"constraints"};
> my $memblock = $rec->{blocks}{"memory"};
> my $safefloat = $rec->{blocks}{"safefloat"};
> + my $clean = $rec->{blocks}{"clean"};
>
> $insn &= ~$fixedbitmask;
> $insn |= $fixedbits;
> @@ -448,6 +461,13 @@ sub gen_one_insn($$)
> $resultreg = eval_with_fields($insnname, $insn, $rec,
> "safefloat", $safefloat);
> }
>
> + if (defined $clean) {
> + # LSX insns only care about low 128 bit,
> + # so we use clean_lsx_result() make sure that high 128bit is 0x0;
> + my $cleanreg;
> + $cleanreg = eval_with_fields($insnname, $insn, $rec, "clean",
> $clean);
> + }
> +
> if (defined $memblock) {
> # Clean up following a memory access instruction:
> # we need to turn the (possibly written-back) basereg
> --
The handling here is exactly identical to the existing "safefloat"
block, and 'safefloat' does nothing floating point specific and
the new 'clean' block does nothing LSX specific. I think we would
do better to have a single block type for "do anything you need to
do to clean up after the insn", and use it for both the patterns
that need to tidy up a float register and those that need to do
this new LSX related tidying.
Open to suggestions for better naming, but my suggestion is that
we call this block type "!post" (because it is the hook for
doing things after emitting the instruction). In future if
we ever needed it we could add a "!pre".
That is:
* rename "!safefloat" to "!post"
* add the clean_lsx_result() function before adding the LSX
etc patterns, so we don't need to add the patterns and then
change them later to add a "!post" block
thanks
-- PMM