On 14.02.2017 11:58, Marek Olšák wrote:
On Feb 14, 2017 9:06 AM, "Nicolai Hähnle" <[email protected]
<mailto:[email protected]>> wrote:
On 13.02.2017 18:01, Marek Olšák wrote:
From: Marek Olšák <[email protected] <mailto:[email protected]>>
So that we can disable u_vbuf for GL core profiles.
This is a v2 of the previous VI-only patch.
It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
---
src/gallium/drivers/radeonsi/si_descriptors.c | 1 -
src/gallium/drivers/radeonsi/si_pipe.c | 12 +++++++++---
2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 3c98176..8abbf10 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -1014,21 +1014,20 @@ bool
si_upload_vertex_buffer_descriptors(struct si_context *sctx)
* up so that the hardware sees four
components as
* being inside the buffer if and only
if the first
* three components are in the buffer.
*
* Since the offset and stride are
guaranteed to be
* 4-byte aligned, this alignment will
never cross the
* winsys buffer boundary.
*/
size3 = (fix_size3 >> (2 * i)) & 3;
if (vb->stride && size3) {
- assert(offset % 4 == 0 &&
vb->stride % 4 == 0);
assert(size3 <= 2);
desc[2] = align(desc[2], size3 * 2);
I think there's still a bug with this. Consider the following setup:
3-element GL_UNSIGNED_BYTE vertex attribute, stride = 3, pointing at
an offset e.g. 12 bytes before the end of a buffer for a total of 4
vertices.
We use unaligned 8_8_8_8 to read this vertex attribute, but when the
last vertex is read, its last byte (of the extended 8_8_8_8) is
out-of-bounds of the buffer. The hardware does bounds checks
all-or-nothing (i.e., not per-channel) and drops the entire load as
if the vertex data were out-of-bounds.
This is admittedly a bit of a ridiculous case that should not
prevent us from dropping u_vbuf. It'll probably need a shader
work-around of loading each channel individually, like with your
recent GL_DOUBLE patch. At least I can't think of anything else
right now.
In any case, the comment above that code snippet is no longer really
correct.
We can also say that such cases don't exist in the real world and call
it a day.
I think we should strive for better than that.
Or we can increase the buffer size and potentially risk a VM
fault.
The small problem with fix_fetch is that we have only one unused value
left in the 4 bits, but we need one fix for 888 and another for 161616.
I see the problem. To be honest, I'd still prefer a proper fix, even if
it requires an additional bit there, but I guess we can increase the
buffer size and print a warning when it crosses a page boundary.
Nicolai
Marek
Cheers,
Nicolai
}
}
desc[3] = velems->rsrc_word3[i];
if (first_vb_use_mask & (1 << i)) {
radeon_add_to_buffer_list(&sctx->b,
&sctx->b.gfx,
(struct
r600_resource*)vb->buffer,
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index 8806027..ec324b8 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -353,23 +353,20 @@ static int si_get_param(struct
pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
case PIPE_CAP_SM3:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
case PIPE_CAP_PRIMITIVE_RESTART:
case PIPE_CAP_CONDITIONAL_RENDER:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_INDEP_BLEND_ENABLE:
case PIPE_CAP_INDEP_BLEND_FUNC:
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
- case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
- case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
- case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_USER_INDEX_BUFFERS:
case PIPE_CAP_USER_CONSTANT_BUFFERS:
case PIPE_CAP_START_INSTANCE:
case PIPE_CAP_NPOT_TEXTURES:
case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
case PIPE_CAP_TGSI_INSTANCEID:
@@ -455,20 +452,29 @@ static int si_get_param(struct
pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_GLSL_FEATURE_LEVEL:
if (si_have_tgsi_compute(sscreen))
return 450;
return HAVE_LLVM >= 0x0309 ? 420 :
HAVE_LLVM >= 0x0307 ? 410 : 330;
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
return MIN2(sscreen->b.info.max_alloc_size,
INT_MAX);
+ case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
+ case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
+ case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
+ /* SI doesn't support unaligned loads.
+ * CIK needs DRM 2.50.0 on radeon. */
+ return sscreen->b.chip_class == SI ||
+ (sscreen->b.info.drm_major == 2 &&
+ sscreen->b.info.drm_minor < 50);
+
/* Unsupported features. */
case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_USER_VERTEX_BUFFERS:
case PIPE_CAP_FAKE_SW_MSAA:
case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
case PIPE_CAP_VERTEXID_NOBASE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev