On Friday, December 9, 2016 11:03:29 AM PST Francisco Jerez wrote:
Asking the DC for less than one cacheline (4 owords) of data for
uniform pull constants is suboptimal because the DC cannot request
less than that from L3, resulting in wasted bandwidth and unnecessary
message dispatch overhead, and exacerbating the IVB L3 serialization
bug. The following table su