Samuel Iglesias Gonsálvez <[email protected]> writes:

> Previously we were emitting two MOV_INDIRECT instructions by calculating
> source's indirect offsets for each 32-bit half of a DF source. However,
> this is not needed as we can just emit two 32-bit MOV INDIRECT without
> doing that calculation.
>

Maybe mention here the main motivation for fixing this and nominating it
for stable: The lowered BSW/BXT indirect move instructions had incorrect
source types, which luckily wasn't causing incorrect assembly to be
generated due to the bug fixed in the next patch, but would have
confused the remaining back-end IR infrastructure due to the mismatch
between the IR source types and the emitted machine code.

> Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
> Cc: "17.0" <[email protected]>
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 23 ++++++++++-------------
>  1 file changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 991c20fad62..8975940e10b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -3878,31 +3878,28 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
>           unsigned read_size = instr->const_index[1] -
>              (instr->num_components - 1) * type_sz(dest.type);
>  
> -         fs_reg indirect_chv_high_32bit;
> -         bool is_chv_bxt_64bit =
> -            (devinfo->is_cherryview || devinfo->is_broxton) &&
> -            type_sz(dest.type) == 8;
> -         if (is_chv_bxt_64bit) {
> -            indirect_chv_high_32bit = vgrf(glsl_type::uint_type);
> -            /* Calculate indirect address to read high 32 bits */
> -            bld.ADD(indirect_chv_high_32bit, indirect, brw_imm_ud(4));
> -         }
> +         bool supports_64bit_indirects =
> +            !devinfo->is_cherryview && !devinfo->is_broxton;
>  
>           for (unsigned j = 0; j < instr->num_components; j++) {
> -            if (!is_chv_bxt_64bit) {
> +            if (type_sz(dest.type) != 8 || supports_64bit_indirects) {
>                 bld.emit(SHADER_OPCODE_MOV_INDIRECT,
>                          offset(dest, bld, j), offset(src, bld, j),
>                          indirect, brw_imm_ud(read_size));
>              } else {
> +               /* We are going to read 64-bit data in two 32-bit MOV 
> INDIRECTS,
> +                * each one reading half of the data.
> +                */
> +               read_size /= 2;

I don't think this is right, data for both halves is interleaved so the
read size will be roughly the same as for the 64 bit indirect move.

>                 bld.emit(SHADER_OPCODE_MOV_INDIRECT,
>                          subscript(offset(dest, bld, j), 
> BRW_REGISTER_TYPE_UD, 0),
> -                        offset(src, bld, j),
> +                        subscript(offset(src, bld, j), BRW_REGISTER_TYPE_UD, 
> 0),
>                          indirect, brw_imm_ud(read_size));
>  
>                 bld.emit(SHADER_OPCODE_MOV_INDIRECT,
>                          subscript(offset(dest, bld, j), 
> BRW_REGISTER_TYPE_UD, 1),
> -                        offset(src, bld, j),
> -                        indirect_chv_high_32bit, brw_imm_ud(read_size));
> +                        subscript(offset(src, bld, j), BRW_REGISTER_TYPE_UD, 
> 1),
> +                        indirect, brw_imm_ud(read_size));

Given that this makes both emit statements nearly identical except for
the 0/1 indices passed to subscript(), wouldn't it make sense to remove
one of them and wrap it in a loop?

>              }
>           }
>        }
> -- 
> 2.11.0
>
> _______________________________________________
> mesa-stable mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to