Francisco Jerez <[email protected]> writes:

> Matt Turner <[email protected]> writes:
>
>> On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <[email protected]> wrote:
>>> Fixes rewrite by the register coalesce pass of references to
>>> individual halves of 16-wide coalesced registers.
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 8 ++++++--
>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>>> index 09f0fad..2a26a46 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
>>> @@ -211,9 +211,13 @@ fs_visitor::register_coalesce()
>>>              continue;
>>>           }
>>>           reg_to_offset[offset] = inst->dst.reg_offset;
>>> -         if (inst->src[0].width == 16)
>>> -            reg_to_offset[offset + 1] = inst->dst.reg_offset + 1;
>>>           mov[offset] = inst;
>>> +
>>> +         if (inst->exec_size * type_sz(inst->src[0].type) > REG_SIZE) {
>>> +            reg_to_offset[offset + 1] = inst->dst.reg_offset + 1;
>>> +            mov[offset + 1] = inst;
>>> +         }
>>> +
>>>           channels_remaining -= inst->regs_written;
>>>        }
>>>
>>> --
>>> 2.1.3
>>
>> I can believe it. It would help me to understand if we had an example
>> of a sequence of instructions that this code didn't handle properly.
>
> The problem is in the "rewrite" phase of the register coalesce pass
> (roughly lines 264-283).  It won't fix up instructions that reference
> some specific offset of the coalesced register if mov[i] is NULL for
> that offset, as is the case for the second half of a 16-wide move.  For
> example:
>
> | ADD (16) vgrf0:f, vgrf0:f, 1.0:f
> | MOV (16) vgrf1:f, vgrf0:f
> | MOV (8)  vgrf2:f, vgrf0+1:f { sechalf }
>
> will get incorrectly register-coalesced into:
>
> | ADD (16) vgrf1:f, vgrf1:f, 1.0:f
> | MOV (8)  vgrf2:f, vgrf0+1:f { sechalf }

Ping.  The SIMD lowering pass emits this kind of code, so this will lead
to actual piglit regressions (e.g.
tests/spec/arb_shader_texture_lod/execution/glsl-fs-shadow2DGradARB-07.shader_test).
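
For illustration, the failure mode described above can be modelled outside
the compiler.  What follows is only a rough sketch under stated
assumptions: Mov, Ref and coalesce_refs() are hypothetical stand-ins, not
the real i965 types or the real pass, and REG_SIZE is assumed to be 32
bytes.  The rewrite loop skips any register offset whose mov[] entry is
NULL, so without the fix the reference to the second half of the 16-wide
value is left pointing at the old register.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical stand-ins; these are NOT the real i965 types.
constexpr int REG_SIZE = 32;    // assumed bytes per hardware register
constexpr int MAX_OFFSETS = 2;  // a 16-wide float value spans two registers

struct Mov {
   int exec_size;   // channels executed by the MOV
   int type_size;   // bytes per channel (4 for :f)
   int dst_reg;     // destination VGRF number
   int dst_offset;  // destination register offset
};

struct Ref {
   int reg;         // VGRF number being referenced
   int reg_offset;  // whole-register offset within that VGRF
};

// Rewrite references to src_reg so they point at the MOV destination.
// with_fix selects whether the second half also gets a mov[] entry,
// mirroring the patch.  Returns the number of references rewritten.
int coalesce_refs(const Mov &m, int src_reg, bool with_fix,
                  std::vector<Ref> &refs)
{
   const Mov *mov[MAX_OFFSETS] = { nullptr, nullptr };
   int reg_to_offset[MAX_OFFSETS] = { 0, 0 };

   mov[0] = &m;
   reg_to_offset[0] = m.dst_offset;

   // The MOV covers more than one register: record the second half too.
   if (m.exec_size * m.type_size > REG_SIZE) {
      reg_to_offset[1] = m.dst_offset + 1;
      if (with_fix)
         mov[1] = &m;  // without this the second half is never rewritten
   }

   int rewritten = 0;
   for (Ref &r : refs) {
      if (r.reg != src_reg)
         continue;
      assert(r.reg_offset >= 0 && r.reg_offset < MAX_OFFSETS);
      if (mov[r.reg_offset] == nullptr)
         continue;  // offsets with no recorded MOV are skipped
      r.reg_offset = reg_to_offset[r.reg_offset];
      r.reg = m.dst_reg;
      ++rewritten;
   }
   return rewritten;
}
```

Calling coalesce_refs() with a 16-wide float MOV from vgrf0 to vgrf1 and
references to vgrf0+0 and vgrf0+1 rewrites only the first reference when
with_fix is false, and both when it is true.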


_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
