The pass was discovered to cause problems with the MUL+MACH combination we emit for nir_[iu]mul_high. In an experimental branch of mine, I ran into issues where the MUL+MACH ended up using a strided source due to working on half of a uint64_t and the new lowering pass helpfully tried to fix the multiply which wrote to an unstriated accumulator. Not only did the multiply not need to be fixed but the "fix" ended up breaking it because a MOV to the accumulator is not the same as using it as a multiply destination due to the magic way the 33/64 bits of the accumulator are handled for different instruction types.
Fixes: efa4e4bc5fc "intel/fs: Introduce regioning lowering pass" Cc: Francisco Jerez <[email protected]> --- src/intel/compiler/brw_fs_lower_regioning.cpp | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp b/src/intel/compiler/brw_fs_lower_regioning.cpp index cc4163b4c2c..b8a89e82272 100644 --- a/src/intel/compiler/brw_fs_lower_regioning.cpp +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp @@ -53,7 +53,13 @@ namespace { unsigned required_dst_byte_stride(const fs_inst *inst) { - if (type_sz(inst->dst.type) < get_exec_type_size(inst) && + if (inst->dst.is_accumulator()) { + /* Even though it's not explicitly documented in the PRMs or the + * BSpec, writes to the accumulator appear to not need any special + * treatment with respect too their destination stride alignment. + */ + return inst->dst.stride * type_sz(inst->dst.type); + } else if (type_sz(inst->dst.type) < get_exec_type_size(inst) && !is_byte_raw_mov(inst)) { return get_exec_type_size(inst); } else { @@ -316,6 +322,14 @@ namespace { bool lower_dst_region(fs_visitor *v, bblock_t *block, fs_inst *inst) { + /* We cannot replace the result of an integer multiply which writes the + * accumulator because MUL+MACH pairs act on the accumulator as a 64-bit + * value whereas the MOV will act on only 32 or 33 bits of the + * accumulator. + */ + assert(inst->opcode != BRW_OPCODE_MUL || !inst->dst.is_accumulator() || + brw_reg_type_is_floating_point(inst->dst.type)); + const fs_builder ibld(v, block, inst); const unsigned stride = required_dst_byte_stride(inst) / type_sz(inst->dst.type); -- 2.20.1 _______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
