-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 07/11/2011 12:49 PM, Dan McCabe wrote:
> On 07/09/2011 08:56 AM, Chad Versace wrote:
>> Up until this point, we incorrectly believed that the stencil buffer is
>> Y-tiled. In fact, it is W tiled. From PRM Vol 1 Part 2 Section 4.5.2.1
>> W-Major Tile Format:
>> "W-Major Tile Format is used for separate stencil."
>>
>> Since the stencil buffer is allocated with I915_TILING_Y, the span
>> functions must decode W tiling through a Y tiled fence.
>>
>> On gen5 with intel_screen.hw_must_use_separate_stencil enabled, this fixes:
>> Fixes-Piglit-test: stencil-drawpixels
>> Fixes-Piglit-test: stencil-scissor-clear
>> Fixes-Piglit-test: readpixels-24_8
>>
>> Note: This is a candidate for the 7.11 branch
>> Signed-off-by: Chad Versace <[email protected]>
>> ---
>> src/mesa/drivers/dri/intel/intel_span.c | 52 +++++++++++++++++++++----------
>> 1 files changed, 35 insertions(+), 17 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/intel/intel_span.c b/src/mesa/drivers/dri/intel/intel_span.c
>> index 153803f..f39c008 100644
>> --- a/src/mesa/drivers/dri/intel/intel_span.c
>> +++ b/src/mesa/drivers/dri/intel/intel_span.c
>> @@ -141,28 +141,46 @@ intel_set_span_functions(struct intel_context *intel,
>> /**
>> * \brief Get pointer offset into stencil buffer.
>> *
>> - * The stencil buffer interleaves two rows into one. Yay for crazy hardware.
>> - * The table below demonstrates how the pointer arithmetic behaves for a buffer
>> - * with positive stride (s=stride).
>> - *
>> - * x | y | byte offset
>> - * --------------------------
>> - * 0 | 0 | 0
>> - * 0 | 1 | 1
>> - * 1 | 0 | 2
>> - * 1 | 1 | 3
>> - * ... | ... | ...
>> - * 0 | 2 | s
>> - * 0 | 3 | s + 1
>> - * 1 | 2 | s + 2
>> - * 1 | 3 | s + 3
>> - *
>> + * The stencil buffer is W-tiled, yet the drm buffer is allocated with
>> + * I915_TILING_Y. So here we must decode the W tiling through a Y fence.
>> *
>> + * From PRM Vol 1 Part 2 Section 4.5.2.1 W-Major Tile Format:
>> + * "W-Major Tile Format is used for separate stencil."
>> */
>> static inline intptr_t
>> intel_offset_S8(int stride, GLint x, GLint y)
>> {
>> - return 2 * ((y / 2) * stride + x) + y % 2;
>> + /* f: (x, y) -> (fx, fy) */
>> + int fx = x / 8;
>> + int fy = y / 4;
>> +
>> + /* e: (x, y) -> (ex, 0) */
>> + int ex = (x % 8) / 4;
>> +
>> + /* d: (x, y) -> (dx, dy) */
>> + int dx = (x % 4) / 2;
>> + int dy = (y % 4) / 2;
>> +
>> + /* c: (x, y) -> (cx, cy) */
>> + int cx = x % 2;
>> + int cy = y % 2;
>> +
>> + int s = stride;
>> + intptr_t o = 0;
>> +
>> + if (s > 0) {
>> + /*f*/ o += 16 * fx + 4 * s * fy;
>> + /*e*/ o += 2 * s * ex;
>> + /*d*/ o += 4 * dx + 8 * dy;
>> + /*c*/ o += cx + 2 * cy;
>> + } else {
>> + /*f*/ o += 16 * fx + 4 * s * fy;
>> + /*e*/ o += 2 * s * (1 - ex);
>> + /*d*/ o += 4 * dx + 8 * (1 - dy);
>> + /*c*/ o += cx + 2 * (1 - cy);
>> + }
>> +
>> + return o;
>> }
>>
>> #define WRITE_STENCIL(x, y, src) buf[intel_offset_S8(stride, x, y)] = src;
> Can stride ever be negative? If so, why?
Yes. The stride is negative for window-system renderbuffers.
> If the app ever specified a negative stride, it could have been fixed at buffer
> creation time (by also adjusting the buffer base address). No need to worry
> about that issue thereafter.
No-can-do. As far as I know, X demands that its buffers have negative stride.
> On the other hand, negative strides could be considered evil :).
Yes, they are :)
> Also, can x or y ever be negative?
No.
[snip]
> Using "*", "/" and "%" for bit manipulations of pixel addresses should be
> avoided. Shifts and masks are clearer for bit manipulation, IMO.
>
> Regarding the offset computation, it might be useful to think of x,y
> addresses of the tile and then of x,y addresses within the tile to make the
> code more readable and perhaps simplify your computations.
Damn crazy hardware. Here we need to decode a W tile through a Y fence, so the
"x,y addresses of the tile and then of x,y addresses within the tile" are
non-existent. The fence mapping has carved up the W tile and made a mess of it.
So, there are no tile addresses to compute.
Here is what little meaning my equations possess:
- (2 * fx + ex, fy) is the address of the 4x4 block in which (x, y) resides.
- Decompose that 4x4 block into 2x2 blocks. (dx, dy) is
the address, within the 4x4 block, of the 2x2 block containing (x, y).
- (cx, cy) is the address of (x, y) within that 2x2 block.
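For what it's worth, the same decomposition can be written with shifts and masks. The sketch below covers only the positive-stride branch of intel_offset_S8(). The name offset_S8_pos_stride_bits is hypothetical and not part of the patch, and the code assumes x and y are non-negative, since a right shift only matches "/" under that assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shift/mask form of the (s > 0) branch of
 * intel_offset_S8(). Not part of the patch. */
static inline intptr_t
offset_S8_pos_stride_bits(int stride, int x, int y)
{
   /* Shifts only agree with "/" and "%" for non-negative operands. */
   assert(stride > 0 && x >= 0 && y >= 0);

   intptr_t s = stride;

   int fx = x >> 3;         /* x / 8       */
   int fy = y >> 2;         /* y / 4       */
   int ex = (x >> 2) & 1;   /* (x % 8) / 4 */
   int dx = (x >> 1) & 1;   /* (x % 4) / 2 */
   int dy = (y >> 1) & 1;   /* (y % 4) / 2 */
   int cx = x & 1;          /* x % 2       */
   int cy = y & 1;          /* y % 2       */

   intptr_t o = 0;
   /*f*/ o += ((intptr_t)fx << 4) + 4 * s * fy;
   /*e*/ o += 2 * s * ex;
   /*d*/ o += (dx << 2) + (dy << 3);
   /*c*/ o += cx + (cy << 1);
   return o;
}
```

The stride terms stay as multiplications because nothing here guarantees that the stride is a power of two.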
> For example, if stride is a power of 2, tile size is 2x2 pixels, and your x
> and y address bits look like (upper case bits are tile bits and lower case
> letters are intra-tile bits):
> XXXXXx
> and
> YYYYYy
> then the offset for that pixel and a power of two stride has a bit pattern
> that looks like
> YYYYXXXXyx
> But the devil is in the details and this might not be valid for our
> particular (crazy?) hardware. YMMV.
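To make the 2x2 example concrete, here is a minimal sketch of that bit interleaving. It assumes a plain row-major array of 2x2 tiles and a power-of-two stride given in pixels. The function name and the log2_stride parameter are hypothetical, and this is not the layout that the stencil buffer uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of pixel (x, y) in a buffer made of
 * 2x2 tiles, one byte per pixel, with stride == 1 << log2_stride pixels.
 * The offset bits are laid out as Y...Y X...X y x. */
static inline intptr_t
offset_tiled_2x2(int log2_stride, int x, int y)
{
   assert(log2_stride >= 1);
   assert(x >= 0 && x < (1 << log2_stride) && y >= 0);

   /* One row of tiles holds (stride / 2) tiles of 4 bytes each,
    * i.e. 2 * stride bytes. */
   intptr_t tile_row = (intptr_t)(y >> 1) << (log2_stride + 1);
   /* Each tile to the left contributes 4 bytes. */
   intptr_t tile_col = (intptr_t)(x >> 1) << 2;
   /* Position inside the tile: bit 1 is y's low bit, bit 0 is x's. */
   intptr_t intra = ((y & 1) << 1) | (x & 1);

   return tile_row | tile_col | intra;
}
```

With a stride of 8 pixels (log2_stride = 3), pixel (2, 0) lands at offset 4 and pixel (0, 2) at offset 16. The three fields never overlap, which is why they can be combined with OR.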
Can a similar set of bit operations replicate intel_offset_S8()?
> cheers, danm
- --
Chad Versace
[email protected]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iQIcBAEBAgAGBQJOG4EuAAoJEAIvNt057x8iZowQAIJXGQBGlBDUC9US7U+r83Jp
SulXEua8kE/KHfptTN7WoPgmdmrRKH2JMcIC2kcCUE6I9/PdSRV2VIaVgWr/pMdP
Z1SUVM4lQP57rPRe5iZmlS43u4gUSQaEr0fbYU6RynD4tXphH5W8gnF5QdF6/pGo
LNIaGJ4ya3euAWX+aJwYI7Z+skbZ1ucyDUbII+RpDOnrO80v4E8uVrrGizr5PQtp
VrX6Zpr1s800kgUXBtkiREK71VSSUYSxFEQ3xP/xBYzUcyhJxpqapN7qOWu9tOI+
2OGy9MX2JuMwqmjVDLg8ap6dsQ92OjKtsgatNJ94Llg5xfP1hN7t91IRWtYpK9/V
v90d/OEu4R1z+8864oqmo7oKlAKersFhhrvIUQ22N6lVxzQqjoR+Ph9BXUwhgnKb
iJ3JMDE90UrJWeO+EnqBbL75HgvgcjZr15PlqZSKNvcNd5M5R/7lPKg/3Zbk88cy
NRJtnrQdWrFk7aGrdi2bfIqm1teCq9kWBBbYuKQn1buXsXtMEC46SJKS9vSua/pF
nVvr1361M9c+Whd4jTSErd/vLvgUA0dxptWPAestalT6i7yYoeq+ZxA236TIqvDZ
1AbSiDeXzslv3ArshbWrtQAZw33cop1z9HGiYMNCdbsdLVJtN1zf0Y25VV3nBcom
3anb5cgbYF601N7wI+CN
=Y7Df
-----END PGP SIGNATURE-----
_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev