Samuel Iglesias Gonsálvez <[email protected]> writes: > From: Iago Toral Quiroga <[email protected]> > > We are going to need the same logic for anything that reads > doubles via untyped messages (CS shared variables and SSBOs). Add a > helper function with that logic so that we can reuse it. > > Reviewed-by: Kenneth Graunke <[email protected]> > --- > src/mesa/drivers/dri/i965/brw_fs.h | 6 +++ > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 64 > ++++++++++++++++++++++++++++++++ > 2 files changed, 70 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h > b/src/mesa/drivers/dri/i965/brw_fs.h > index 1970ad0..1aeacae 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.h > +++ b/src/mesa/drivers/dri/i965/brw_fs.h > @@ -110,6 +110,12 @@ public: > const fs_reg src, > uint32_t components); > > + void do_untyped_vector_read(const brw::fs_builder &bld, > + const fs_reg surf_index, > + const fs_reg offset_reg, > + const fs_reg dest, > + unsigned num_components); > + > bool run_fs(bool do_rep_send); > bool run_vs(gl_clip_plane *clip_planes); > bool run_tcs_single_patch(); > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index 02f1e81..4af5979 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -3106,6 +3106,70 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder > &bld, > } > > void > +fs_visitor::do_untyped_vector_read(const fs_builder &bld, > + const fs_reg surf_index, > + const fs_reg offset_reg, > + const fs_reg dest,
Usually we put the destination register before any source registers in
function argument lists.
> + unsigned num_components)
> +{
> + if (type_sz(dest.type) <= 4) {
The code below isn't going to work for type_sz() < 4, maybe make the
condition 'type_sz(..) == 4'?
> + fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,
> + 1 /* dims */,
> + num_components,
> + BRW_PREDICATE_NONE);
> + read_result.type = dest.type;
> + for (unsigned i = 0; i < num_components; i++)
> + bld.MOV(offset(dest, bld, i), offset(read_result, bld, i));
> + } else {
This can only work for type_sz() == 8, how about you make this an 'else
if' statement and put an else statement at the end with a single
unreachable() call in it?
> + /* Reading a dvec, so we need to:
> + *
> + * 1. Multiply num_components by 2, to account for the fact that we
> + * need to read 64-bit components.
> + * 2. Shuffle the result of the load to form valid 64-bit elements
> + * 3. Emit a second load (for components z/w) if needed.
> + */
> + fs_reg read_offset = vgrf(glsl_type::uint_type);
> + bld.MOV(read_offset, offset_reg);
> +
> + int iters = num_components <= 2 ? 1 : 2;
> +
> + /* Load the dvec, the first iteration loads components x/y, the second
> + * iteration, if needed, loads components z/w
> + */
> + for (int it = 0; it < iters; it++) {
> + /* Compute number of components to read in this iteration */
> + int iter_components = MIN2(2, num_components);
> + num_components -= iter_components;
> +
> + /* Read. Since this message reads 32-bit components, we need to
> + * read twice as many components.
> + */
> + fs_reg read_result = emit_untyped_read(bld, surf_index, read_offset,
> + 1 /* dims */,
> + iter_components * 2,
> + BRW_PREDICATE_NONE);
> +
You'd save some retypes in the code below (actually all of them -- and
also avoid inadvertent type conversion if the destination type is an
integer) if you allocated a separate virtual register here for the
packed 64bit result, like:
| const fs_reg packed_result = bld.vgrf(dest.type, iter_components);
and then use it as destination for the shuffle function. With the above
taken into account:
Reviewed-by: Francisco Jerez <[email protected]>
> + /* Shuffle the 32-bit load result into valid 64-bit data */
> + shuffle_32bit_load_result_to_64bit_data(
> + bld,
> + retype(read_result, BRW_REGISTER_TYPE_DF),
> + retype(read_result, BRW_REGISTER_TYPE_F),
> + iter_components);
> +
> + /* Move each component to its destination */
> + read_result = retype(read_result, BRW_REGISTER_TYPE_DF);
> + for (int c = 0; c < iter_components; c++) {
> + bld.MOV(offset(dest, bld, it * 2 + c),
> + offset(read_result, bld, c));
> + }
> +
> + bld.ADD(read_offset, read_offset, brw_imm_ud(16));
> + }
> + }
> +}
> +
> +
> +void
> fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr
> *instr)
> {
> fs_reg dest;
> --
> 2.5.0
>
> _______________________________________________
> mesa-dev mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
