On Wed, 2017-02-15 at 11:41 -0800, Francisco Jerez wrote: > Samuel Iglesias Gonsálvez <[email protected]> writes: > > > From: "Juan A. Suarez Romero" <[email protected]> > > > > In IVB and BYT, both regioning parameters and execution sizes are > > measured as > > 32-bits element size. > > > > So when we have something like: > > > > mov(8) g2<1>DF g3<4,4,1>DF > > > > We are not actually moving 8 doubles (our intention), but 4 > > doubles. > > > > We need to double the parameters to cope with this issue. However, > > horizontal strides don't behave as they're supposed to on IVB > > for DF regions, they will cause each 32-bit half of DF sources to > > be > > strided individually, and doubling the value won't make any > > difference. > > > > v2: > > - Use devinfo directly (Matt). > > - Use Baytrail instead of Valleview (Matt). > > - Use IvyBridge instead of Ivy (Matt) > > - Double the exec_size in code emission (Curro) > > > > v3: > > - Change hstride doubling by an assert and fix commit log (Curro). > > - Substitute remaining compiler->devinfo by devinfo (Curro). > > > > Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> > > --- > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 51 > > ++++++++++++++++++++++---- > > 1 file changed, 44 insertions(+), 7 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > index 26ffbb169d2..b0d5732ac5c 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > @@ -54,13 +54,14 @@ brw_file_from_reg(fs_reg *reg) > > } > > > > static struct brw_reg > > -brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen, bool > > compressed) > > +brw_reg_from_fs_reg(const struct gen_device_info *devinfo, fs_inst > > *inst, > > + fs_reg *reg, bool compressed) > > { > > struct brw_reg brw_reg; > > > > switch (reg->file) { > > case MRF: > > - assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(gen)); > > + assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(devinfo- > > >gen)); > > /* Fallthrough */ > > case VGRF: > > if (reg->stride == 0) { > > @@ -93,6 +94,35 @@ brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, > > unsigned gen, bool compressed) > > const unsigned width = MIN2(reg_width, phys_width); > > brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), > > reg->nr, 0); > > brw_reg = stride(brw_reg, width * reg->stride, width, > > reg->stride); > > + /* From the IvyBridge PRM (EU Changes by Processor > > Generation, page 13): > > + * "Each DF (Double Float) operand uses an element size > > of 4 rather > > + * than 8 and all regioning parameters are twice what > > the values > > + * would be based on the true element size: ExecSize, > > Width, > > + * HorzStride, and VertStride. Each DF operand uses a > > pair of > > + * channels and all masking and swizzing should be > > adjusted > > + * appropriately." > > + * > > + * From the IvyBridge PRM (Special Requirements for > > Handling Double > > + * Precision Data Types, page 71): > > + * "In Align1 mode, all regioning parameters like > > stride, execution > > + * size, and width must use the syntax of a pair of > > packed > > + * floats. The offsets for these data types must be 64- > > bit > > + * aligned. The execution size and regioning parameters > > are in terms > > + * of floats." > > + * > > + * Summarized: when handling DF-typed arguments, > > ExecSize, > > + * VertStride, and Width must be doubled, and HorzStride > > must be > > + * doubled when the region is not scalar. > > + * > > The comment above seems misleading still, doubling the HorzStride > value > is almost guaranteed not to give the intended result, regardless of > whether the region is scalar. With that clarified patch is: > > Reviewed-by: Francisco Jerez <[email protected]> >
OK. Thanks!
Sam
> > + * It applies to BayTrail too.
> > + */
> > + if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > + type_sz(reg->type) == 8) {
> > + brw_reg.width++;
> > + if (brw_reg.vstride > 0)
> > + brw_reg.vstride++;
> > + assert(brw_reg.hstride == BRW_HORIZONTAL_STRIDE_1);
> > + }
> > }
> >
> > brw_reg = retype(brw_reg, reg->type);
> > @@ -1586,9 +1616,8 @@ fs_generator::generate_code(const cfg_t *cfg,
> > int dispatch_width)
> > brw_set_default_group(p, inst->group);
> >
> > for (unsigned int i = 0; i < inst->sources; i++) {
> > - src[i] = brw_reg_from_fs_reg(inst, &inst->src[i],
> > devinfo->gen,
> > - compressed);
> > -
> > + src[i] = brw_reg_from_fs_reg(devinfo, inst,
> > + &inst->src[i], compressed);
> > /* The accumulator result appears to get used for the
> > * conditional modifier generation. When negating a UD
> > * value, there is a 33rd bit generated for the sign in
> > the
> > @@ -1599,7 +1628,8 @@ fs_generator::generate_code(const cfg_t *cfg,
> > int dispatch_width)
> > inst->src[i].type != BRW_REGISTER_TYPE_UD ||
> > !inst->src[i].negate);
> > }
> > - dst = brw_reg_from_fs_reg(inst, &inst->dst, devinfo->gen,
> > compressed);
> > + dst = brw_reg_from_fs_reg(devinfo, inst,
> > + &inst->dst, compressed);
> >
> > brw_set_default_access_mode(p, BRW_ALIGN_1);
> > brw_set_default_predicate_control(p, inst->predicate);
> > @@ -1608,7 +1638,14 @@ fs_generator::generate_code(const cfg_t
> > *cfg, int dispatch_width)
> > brw_set_default_saturate(p, inst->saturate);
> > brw_set_default_mask_control(p, inst->force_writemask_all);
> > brw_set_default_acc_write_control(p, inst-
> > >writes_accumulator);
> > - brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
> > +
> > + unsigned exec_size = inst->exec_size;
> > + if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > + (get_exec_type_size(inst) == 8 || type_sz(inst-
> > >dst.type) == 8)) {
> > + exec_size *= 2;
> > + }
> > +
> > + brw_set_default_exec_size(p, cvt(exec_size) - 1);
> >
> > assert(inst->force_writemask_all || inst->exec_size >= 4);
> > assert(inst->force_writemask_all || inst->group % inst-
> > >exec_size == 0);
> > --
> > 2.11.0
> >
> > _______________________________________________
> > mesa-dev mailing list
> > [email protected]
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: This is a digitally signed message part
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
