Samuel Iglesias Gonsálvez <[email protected]> writes: > From: Iago Toral Quiroga <[email protected]> > > Reviewed-by: Kenneth Graunke <[email protected]> > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 19 +++++++++++++++++-- > 1 file changed, 17 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp > b/src/mesa/drivers/dri/i965/brw_fs.cpp > index 4827dea..2383d2c 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp > @@ -194,8 +194,15 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder > &bld, > else > op = FS_OPCODE_VARYING_PULL_CONSTANT_LOAD; > > + /* The pull load message will load a vec4 (16 bytes). If we are loading > + * a double this means we are only loading 2 elements worth of data. > + * We also want to use a 32-bit data type for the dst of the load > operation > + * so other parts of the driver don't get confused about the size of the > + * result. > + */ > int regs_written = 4 * (bld.dispatch_width() / 8) * scale; > - fs_reg vec4_result = fs_reg(VGRF, alloc.allocate(regs_written), dst.type); > + fs_reg vec4_result = fs_reg(VGRF, alloc.allocate(regs_written), > + BRW_REGISTER_TYPE_F); > fs_inst *inst = bld.emit(op, vec4_result, surf_index, vec4_offset); > inst->regs_written = regs_written; > > @@ -208,7 +215,15 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder > &bld, > inst->mlen = 1 + bld.dispatch_width() / 8; > } > > - bld.MOV(dst, offset(vec4_result, bld, ((const_offset & 0xf) / 4) * > scale)); > + if (type_sz(dst.type) == 8) {
Add 'assert(scale == 1)' here because you're not taking it into account. > + shuffle_32bit_load_result_to_64bit_data( > + bld, retype(vec4_result, dst.type), vec4_result, 2); > + } > + > + vec4_result.type = dst.type; > + int type_slots = MAX2(type_sz(dst.type) / 4, 1); This code is definitely not going to work for type sizes other than 8 or 4, the MAX2(..., 1) will only conceal the problem. It seems weird that you convert the type size into an integer number of 32-bit slots (potentially introducing rounding errors) only to convert it back to bytes in the next line (because const_offset is expressed in bytes regardless). How about: | bld.MOV(dst, offset(vec4_result, bld, | (const_offset & 0xf) / type_sz(vec4_result.type) * scale)); and remove the type_slots variable. With that fixed: Reviewed-by: Francisco Jerez <[email protected]> (It would be nice to do the shuffle_32bit_load_result_to_64bit_data() directly into the destination and clean up the retyping slightly, but I guess that can be done as a follow-up). > + bld.MOV(dst, offset(vec4_result, bld, > + ((const_offset & 0xf) / (4 * type_slots)) * scale)); > } > > /** > -- > 2.5.0 > > _______________________________________________ > mesa-dev mailing list > [email protected] > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
