On Fri, 2017-02-10 at 10:44 -0800, Francisco Jerez wrote: > Samuel Iglesias Gonsálvez <[email protected]> writes: > > > On Thu, 2017-02-09 at 18:28 -0800, Francisco Jerez wrote: > > > Francisco Jerez <[email protected]> writes: > > > > > > > --- > > > > This replaces "[PATCH v2 09/20] i965/fs: indirect addressing > > > > with > > > > doubles is not supported in IVB/BYT". > > > > > > > > > > Note that some of the fp64 indirect addressing test-cases still > > > fail > > > on > > > IVB even with this patch applied, but the reason doesn't seem to > > > have > > > anything to do with indirect addressing. Apparently the > > > remaining > > > failures are caused by illegal code like: > > > > > > > sel(8) g30<1>D g9<0,2,1>DF g8.3<0,2,1>DF > > > > { > > > > align1 1Q }; > > > > > > The problem is that the SEL instruction doesn't support most > > > datatype > > > conversions, so this leads to data corruption on at least IVB. > > > According to the hardware docs, the only valid destination type > > > of > > > SEL > > > is DF if the execution type is DF, and F (or HF on CHV+) if the > > > execution type is F. Integer 16 or 32-bit execution types seem > > > to > > > allow > > > converting to other single-precision integer types. > > > > > > > Some weeks ago I saw this issue while working in a solution for > > IVB's > > MOV INDIRECT. That SEL is added by opt_peephole_sel(): > > > > https://lists.freedesktop.org/archives/mesa-dev/2017-January/142023 > > .htm > > l > > > > However, I did it very restrictive by just don't allowing any > > conversion between different data type sizes. I am going to improve > > that patch to include all the allowed cases. > > > > I wouldn't worry too much about the couple of supported type > conversions, it doesn't seem terribly important, my concern with your > patch is that it does seem to allow several type conversions which > are > invalid, like D->F, F->D, DF->Q, etc. >
Right. I can just disallowing all the conversions (src.type !=
dst.type) in that pass... but I prefer the solutions suggested below.
> Aside from that I'm not particularly confident that your patch will
> fix
> all potential sources of invalid conversions. E.g. do we know that
> no
> other optimization pass will ever change the destination type of the
> SEL
> assuming that the initial type was correct? What about other
> instructions with double-to-single or single-to-double type
> conversion
> restrictions? (A quick look at the PRMs would likely reveal
> instructions
> with similar restrictions) -- I don't think we can answer these
> questions without auditing the rest of the compiler back-end (while
> doing this in a legalization pass would be pretty much obviously
> correct). Alternatively you could fix the EU validator to recognize
> this and other cases of invalid conversions, which would make sure we
> get an assertion failure in the likely case that we've missed
> something
> so we can find the problem quickly instead of spending hours
> debugging
> non-deterministic data corruption issues. I'm okay either way,
> validation or legalization pass, do it as you like.
>
OK, thanks.
Sam
> > > I'm not sure why we haven't seen this cause massive breakage on
> > > HSW+,
> > > but I think we need some sort of legalization pass to do the type
> > > conversion in a separate MOV for any instruction with an invalid
> > > destination conversion. You may be able to do it within the d2x
> > > pass
> > > but then it would make sense to rename it to something more
> > > appropriate
> > > like destination conversion lowering (lower_cvt? :P).
> > >
> >
> > :)
> >
> > > In addition there seem to be other issues with your d2x lowering
> > > code
> > > (I'm bringing this up here because you don't seem to have sent
> > > the
> > > d2x
> > > changes for review to the mailing list yet) -- It removes the
> > > original
> > > instruction and creates a new one with the same opcode and first
> > > few
> > > sources, which will miscompile the program if the original
> > > instruction
> > > had a larger number of sources or any other instruction controls
> > > set. I
> > > suggest you modify the original instruction in-place to point it
> > > to
> > > the
> > > temporary you've allocated, and then copy the data into the
> > > original
> > > destination by using a strided move.
> > >
> >
> > OK.
> >
> > Thanks,
> >
> > Sam
> >
> > > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27
> > > > ++++++++++++++++++++++++--
> > > > 1 file changed, 25 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > index ea4a3fe1399..0e2dbe23571 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > > @@ -440,7 +440,7 @@ fs_generator::generate_mov_indirect(fs_inst
> > > > *inst,
> > > > brw_MOV(p, dst, reg);
> > > > } else {
> > > > /* Prior to Broadwell, there are only 8 address
> > > > registers.
> > > > */
> > > > - assert(inst->exec_size == 8 || devinfo->gen >= 8);
> > > > + assert(inst->exec_size <= 8 || devinfo->gen >= 8);
> > > >
> > > > /* We use VxH indirect addressing, clobbering a0.0
> > > > through
> > > > a0.7. */
> > > > struct brw_reg addr = vec8(brw_address_reg(0));
> > > > @@ -478,7 +478,30 @@
> > > > fs_generator::generate_mov_indirect(fs_inst
> > > > *inst,
> > > > * code, using it saves us 0 instructions and would
> > > > require
> > > > quite a bit
> > > > * of case-by-case work. It's just not worth it.
> > > > */
> > > > - brw_ADD(p, addr, indirect_byte_offset,
> > > > brw_imm_uw(imm_byte_offset));
> > > > + if (devinfo->gen >= 8 || devinfo->is_haswell ||
> > > > type_sz(reg.type) < 8) {
> > > > + brw_ADD(p, addr, indirect_byte_offset,
> > > > brw_imm_uw(imm_byte_offset));
> > > > + } else {
> > > > + /* IVB reads two address register components per
> > > > channel
> > > > for
> > > > + * indirectly addressed 64-bit sources, so we need to
> > > > initialize
> > > > + * adjacent address components to consecutive dwords
> > > > of
> > > > the source
> > > > + * region by emitting two separate ADD
> > > > instructions. Found
> > > > + * empirically.
> > > > + */
> > > > + assert(inst->exec_size <= 4);
> > > > + brw_push_insn_state(p);
> > > > + brw_set_default_exec_size(p, cvt(inst->exec_size) -
> > > > 1);
> > > > +
> > > > + brw_ADD(p, spread(addr, 2), indirect_byte_offset,
> > > > + brw_imm_uw(imm_byte_offset));
> > > > + brw_inst_set_no_dd_clear(devinfo, brw_last_inst,
> > > > true);
> > > > +
> > > > + brw_ADD(p, spread(suboffset(addr, 1), 2),
> > > > indirect_byte_offset,
> > > > + brw_imm_uw(imm_byte_offset + 4));
> > > > + brw_inst_set_no_dd_check(devinfo, brw_last_inst,
> > > > true);
> > > > +
> > > > + brw_pop_insn_state(p);
> > > > + }
> > > > +
> > > > struct brw_reg ind_src = brw_VxH_indirect(0, 0);
> > > >
> > > > brw_inst *mov = brw_MOV(p, dst, retype(ind_src,
> > > > dst.type));
> > > > --
> > > > 2.11.0
> > > >
> > > > _______________________________________________
> > > > mesa-dev mailing list
> > > > [email protected]
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: This is a digitally signed message part
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
