On 04/03/17 00:32, Francisco Jerez wrote: > Samuel Iglesias Gonsálvez <[email protected]> writes: > >> We need to split DF instructions in two on IVB/BYT as it needs an >> execsize 8 to process 4 DF values (one GRF in total). >> >> v2: >> - Rename helper and make it static inline function (Matt). >> - Fix indention and add braces (Matt). >> >> Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> >> --- >> src/mesa/drivers/dri/i965/brw_ir_vec4.h | 14 ++++++++++++++ >> src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 ++++++- >> src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 15 +++++++++++++-- >> 3 files changed, 33 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h >> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h >> index 57fc6be8f89..9d29c3fb944 100644 >> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h >> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h >> @@ -405,6 +405,20 @@ regs_read(const vec4_instruction *inst, unsigned i) >> reg_size); >> } >> >> +static inline unsigned >> +get_exec_type_size(const vec4_instruction *inst) >> +{ >> + unsigned exec_type_size = 0; >> + >> + for (int i = 0; i < 3; i++) { >> + if (inst->src[i].type != BAD_FILE) { >> + exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type)); >> + } >> + } >> + >> + return exec_type_size; >> +} >> + >> } /* namespace brw */ >> >> #endif >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp >> b/src/mesa/drivers/dri/i965/brw_vec4.cpp >> index 5e60eb657a7..7080c93e550 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp >> @@ -2092,6 +2092,10 @@ get_lowered_simd_width(const struct gen_device_info >> *devinfo, >> if (inst->opcode == BRW_OPCODE_SEL && type_sz(inst->dst.type) == 8) >> lowered_width = MIN2(lowered_width, 4); >> > > Maybe add a short comment here explaining why you need to do this? >
OK
>> + if (devinfo->gen == 7 && !devinfo->is_haswell &&
>> + (get_exec_type_size(inst) == 8 || type_sz(inst->dst.type) == 8))
>> + lowered_width = MIN2(lowered_width, 4);
>> +
>> /* HSW PRM, 3D Media GPGPU Engine, Region Alignment Rules for Direct
>> * Register Addressing:
>> *
>> @@ -2199,7 +2203,8 @@ vec4_visitor::lower_simd_width()
>> inst->insert_before(block, copy);
>> }
>> } else {
>> - dst = horiz_offset(inst->dst, channel_offset);
>> + if (inst->dst.file != ARF)
>> + dst = horiz_offset(inst->dst, channel_offset);
>
> This doesn't look right, you need to give the same treatment to ARF
> registers as to other registers. If what you're trying to avoid here is
> shifting the null register incorrectly, I suggest you fix horiz_offset()
> to return the argument unchanged if it's the null register, because the
> null register logically behaves like a scalar register (this is also
> consistent with the way the FS back-end handles the same situation).
>
OK, thanks.
>> }
>> linst->dst = dst;
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> index 847a01bd43c..7bb1ab1879c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> @@ -1522,14 +1522,25 @@ generate_code(struct brw_codegen *p,
>> brw_set_default_saturate(p, inst->saturate);
>> brw_set_default_mask_control(p, inst->force_writemask_all);
>> brw_set_default_acc_write_control(p, inst->writes_accumulator);
>> - brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
>>
>> - assert(inst->group % inst->exec_size == 0);
>> + bool is_ivb_df = devinfo->gen == 7 &&
>> + !devinfo->is_haswell &&
>> + (get_exec_type_size(inst) == 8 ||
>> + inst->dst.type == BRW_REGISTER_TYPE_DF);
>> +
>> + assert(inst->group % inst->exec_size == 0 ||
>> + is_ivb_df);
>> +
>> assert(inst->group % 8 == 0 ||
>> inst->dst.type == BRW_REGISTER_TYPE_DF ||
>> inst->src[0].type == BRW_REGISTER_TYPE_DF ||
>> inst->src[1].type == BRW_REGISTER_TYPE_DF ||
>> inst->src[2].type == BRW_REGISTER_TYPE_DF);
>> +
>> + if (is_ivb_df && inst->exec_size < 8)
>> + inst->exec_size *= 2;
>> + brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
>> +
>
> Same comment here as for its FS counterpart... Please let's not modify
> the IR from the generator.
>
Right. I am going to fix it.
Sam
>> if (!inst->force_writemask_all)
>> brw_set_default_group(p, inst->group);
>>
>> --
>> 2.11.0
>>
>> _______________________________________________
>> mesa-dev mailing list
>> [email protected]
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
