Bernd Schmidt <[email protected]> writes:
> On 10/03/11 10:23, Richard Sandiford wrote:
>> Bernd Schmidt <[email protected]> writes:
>>> On 09/14/11 11:03, Richard Sandiford wrote:
>>>> ...I didn't see from an admittedly quick read of the patch how you
>>>> handle memory disambiguation between iterations. If a loop includes:
>>>>
>>>> lb $3,($4)
>>>> sb $5,1($4)
>>>>
>>>> then the two instructions can be reordered by normal ebb scheduling,
>>>> but the inter-iteration conflict is important for modulo scheduling.
>>>
>>> There's nothing special to handle, I think. sched-deps should see that
>>> the ld in iteration 1 is DEP_ANTI against the sb in iteration 0
>>> (assuming there's also an increment).
>>
>> For the record, I don't agree that we should rely on register
>> dependencies to handle memory dependencies. It's possible for MEMs in
>> different iterations to alias without there being a register dependence
>> between them.
>
> I don't know what you mean by "register dependence" here. sched-deps
> analyzes MEMs for whether they depend on each other, but the term
> "register dependence" suggests you aren't thinking about this.

Well, as you said, sched-deps uses more exact memory disambiguation
than SMS. But that's for a reason: if we're scheduling a loop body
using haifa-sched, we only care about intra-iteration memory
dependencies. Modulo scheduling, though, allows movement between
iterations as well.

So my original point was that it looked like you were adding support
for inter-iteration scheduling while still using intra-iteration memory
dependencies. I (probably wrongly, sorry) took your response to mean
that inter-iteration memory dependencies would always be accompanied by
some sort of register dependency, so that the difference wouldn't matter.

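
To make the lb/sb example concrete, here is a rough sketch (Python,
purely illustrative, nothing to do with the patch's code) that finds
the iteration distance at which the store and a later load touch the
same byte. It assumes $4 is bumped by a fixed stride each iteration;
the stride of 1 is my assumption, and conflict_distances is a made-up
name.

```python
def conflict_distances(load_off, store_off, stride, max_dist=4):
    """Distances d >= 0 at which the store in iteration i and the load
    in iteration i + d access the same byte address.

    Assumes (hypothetically) that the base register advances by
    `stride` bytes per iteration and both accesses are one byte wide.
    """
    # store in iteration i:      base + stride * i       + store_off
    # load  in iteration i + d:  base + stride * (i + d) + load_off
    return [d for d in range(max_dist + 1)
            if stride * d + load_off == store_off]


# lb $3,($4) ; sb $5,1($4) with $4 incremented by 1 per iteration
print(conflict_distances(load_off=0, store_off=1, stride=1))  # [1]
# with a stride of 4 the two accesses never meet
print(conflict_distances(load_off=0, store_off=1, stride=4))  # []
```

With a stride of 1 the only conflict is at distance 1, i.e. between
iterations, which is exactly the case an intra-iteration analysis has
no reason to record.
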
> If there was a problem, then rtl loop unrolling would also cause it
> (since the modulo scheduling patch effectively does nothing else). Are
> you sure there really is a problem?

I'm not sure I follow. Unrolling a loop {A, B, C, D} gives:

    A1
    B1
    C1
    D1
    A2
    B2
    C2
    D2
    A3
    B3
    C3
    D3

so inter-iteration dependencies aren't a problem. Whereas I thought your
modulo scheduling patch did:

    A1
    B1 A2
    C1 B2 A3
    D1 C2 B3
       D2 C3
          D3

so if D1 writes to memory that A2 (but not A1) _might_ load, then the
loop doesn't behave the same way.
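
Here is a toy simulation of the two orderings (Python, illustrative
only). The stage bodies are invented for the sake of the example:
A loads mem[i], B and C do arithmetic on the loaded value, and D stores
to mem[i+1], so D of one iteration feeds A of the next. The overlapped
order deliberately ignores that dependence, to show what goes wrong if
the scheduler never sees it.

```python
def run(order, n, mem):
    """Execute (stage, iteration) pairs in the given order.

    Hypothetical stage bodies: A loads mem[i]; B adds 1; C doubles;
    D stores the result to mem[i + 1].
    """
    mem = list(mem)
    regs = [0] * n              # one private register per iteration
    for stage, i in order:
        if stage == "A":
            regs[i] = mem[i]
        elif stage == "B":
            regs[i] += 1
        elif stage == "C":
            regs[i] *= 2
        elif stage == "D":
            mem[i + 1] = regs[i]
    return mem


def unrolled(n):
    # A1 B1 C1 D1 A2 B2 C2 D2 ...
    return [(s, i) for i in range(n) for s in "ABCD"]


def overlapped(n):
    # Row r runs stage (r - i) of iteration i, one new iteration per row.
    return [("ABCD"[row - i], i)
            for row in range(n + 3)
            for i in range(n)
            if 0 <= row - i < 4]


start = [1, 0, 0, 0]
print(run(unrolled(3), 3, start))    # [1, 4, 10, 22]
print(run(overlapped(3), 3, start))  # [1, 4, 2, 2]
```

In the overlapped order A2 runs two rows before D1, so it reads the
stale value and the results diverge from the unrolled loop.
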

Richard