On Sat, Feb 17, 2018 at 1:42 AM, Jason Ekstrand <[email protected]> wrote:
> On Sat, Feb 17, 2018 at 1:21 AM, Kenneth Graunke <[email protected]>
> wrote:
>
>> On Saturday, February 17, 2018 1:09:10 AM PST Jason Ekstrand wrote:
>> > This fixes a pile of hangs caused by the recent shuffling of resolves
>> > and transitions.  The particularly problematic case is when you have at
>> > least three attachments with load ops of CLEAR, LOAD, CLEAR.  In this
>> > case, we execute the first CLEAR followed by a MI memcpy to copy the
>> > clear values over for the LOAD followed by a second CLEAR.  The MI
>> > commands cause the first CLEAR to hang which causes us to get stuck on
>> > the 3DSTATE_MULTISAMPLE in the second CLEAR.
>> >
>> > We also add guards for BLORP to fix the same issue.  These shouldn't
>> > actually do anything right now because the only use of indirect clears
>> > in BLORP today is for resolves which are already guarded by a render
>> > cache flush and CS stall.  However, this will guard us against potential
>> > issues in the future.
>> >
>> > Cc: Kenneth Graunke <[email protected]>
>> > Cc: Nanley Chery <[email protected]>
>> > ---
>> >  src/intel/vulkan/genX_blorp_exec.c | 10 ++++++++++
>> >  src/intel/vulkan/genX_gpu_memcpy.c | 22 ++++++++++++++++++++++
>> >  2 files changed, 32 insertions(+)
>> >
>> > diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
>> > index 04f7675..f956715 100644
>> > --- a/src/intel/vulkan/genX_blorp_exec.c
>> > +++ b/src/intel/vulkan/genX_blorp_exec.c
>> > @@ -200,6 +200,16 @@ genX(blorp_exec)(struct blorp_batch *batch,
>> >        genX(cmd_buffer_config_l3)(cmd_buffer, cfg);
>> >     }
>> >
>> > +#if GEN_GEN == 7
>> > +   /* The MI_LOAD/STORE_REGISTER_MEM commands which BLORP uses to implement
>> > +    * indirect fast-clear colors can cause GPU hangs if we don't stall first.
>> > +    * See genX(cmd_buffer_mi_memcpy) for more details.
>> > +    */
>> > +   assert(params->src.clear_color_addr.buffer == NULL);
>> > +   if (params->dst.clear_color_addr.buffer)
>> > +      cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_CS_STALL_BIT;
>> > +#endif
>> > +
>> >     genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
>> >
>> >     genX(flush_pipeline_select_3d)(cmd_buffer);
>> > diff --git a/src/intel/vulkan/genX_gpu_memcpy.c b/src/intel/vulkan/genX_gpu_memcpy.c
>> > index f3ada93..1d527fb 100644
>> > --- a/src/intel/vulkan/genX_gpu_memcpy.c
>> > +++ b/src/intel/vulkan/genX_gpu_memcpy.c
>> > @@ -62,6 +62,28 @@ genX(cmd_buffer_mi_memcpy)(struct anv_cmd_buffer *cmd_buffer,
>> >     assert(dst_offset % 4 == 0);
>> >     assert(src_offset % 4 == 0);
>> >
>> > +#if GEN_GEN == 7
>> > +   /* On gen7, the combination of commands used here (MI_LOAD_REGISTER_MEM
>> > +    * and MI_STORE_REGISTER_MEM) can cause GPU hangs if any rendering is
>> > +    * in-flight when they are issued, even if the memory touched is not
>> > +    * currently active for rendering.  The weird bit is that it is not the
>> > +    * MI_LOAD/STORE_REGISTER_MEM commands which hang but rather the in-flight
>> > +    * rendering hangs such that the next stalling command after the
>> > +    * MI_LOAD/STORE_REGISTER_MEM commands will catch the hang.
>> > +    *
>> > +    * It is unclear exactly why this hang occurs.  Both MI commands come with
>> > +    * warnings about the 3D pipeline but that doesn't seem to fully explain
>> > +    * it.  My (Jason's) best theory is that it has something to do with the
>> > +    * fact that we're using a GPU state register as our temporary and that
>> > +    * something with reading/writing it is causing problems.
>> > +    *
>> > +    * In order to work around this issue, we emit a PIPE_CONTROL with the
>> > +    * command streamer stall bit set.
>> > +    */
>> > +   cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_CS_STALL_BIT;
>> > +   genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
>> > +#endif
>> > +
>> >     for (uint32_t i = 0; i < size; i += 4) {
>> >        const struct anv_address src_addr =
>> >           (struct anv_address) { src, src_offset + i};
>> >
>>
>> This seems reasonable enough.  I can't say I understand why this is
>> necessary,
>
>
> That makes two of us and, when Nanley gets around to looking at it, I'm
> sure it'll make 3. :-)
>
>
>> but if it fixes the problem...it's probably best to just
>> do it.  We can always change it later if we find new information.
>>
>
> I'm fairly sure that this is more-or-less the correct fix and 10 Jenkins
> runs along with 50 (soon to be 100) runs of
> dEQP-VK.renderpass.dedicated_allocation.attachment.*
> agree.
>

At this point, that number is more like 250.  I'm now very sure this fixes
the problem.

> Acked-by: Kenneth Graunke <[email protected]>
>>
>
> Thanks!
>
>
>> I'll let Nanley review as well.
>>
>
> I can wait.  I've only been fighting this for a week and a half...
>
> --Jason
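
For anyone skimming the diff above: the workaround boils down to forcing a
command streamer stall out of anv's pipe-flush machinery before the MI copy
loop runs.  Below is a minimal sketch of the resulting gen7 path, assuming
anv's usual anv_batch_emit helper and genxml field names; TEMP_REG is a
hypothetical stand-in for the scratch register the real code uses:

   #if GEN_GEN == 7
      /* Request a CS stall and emit it now as a PIPE_CONTROL so that no
       * 3D rendering is in-flight when the MI commands below execute.
       */
      cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_CS_STALL_BIT;
      genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
   #endif

      /* Copy one dword at a time by bouncing it through a register. */
      for (uint32_t i = 0; i < size; i += 4) {
         anv_batch_emit(&cmd_buffer->batch, GENX(MI_LOAD_REGISTER_MEM), lrm) {
            lrm.RegisterAddress = TEMP_REG; /* hypothetical scratch register */
            lrm.MemoryAddress = (struct anv_address) { src, src_offset + i };
         }
         anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM), srm) {
            srm.RegisterAddress = TEMP_REG;
            srm.MemoryAddress = (struct anv_address) { dst, dst_offset + i };
         }
      }

The stall is flagged via pending_pipe_bits and flushed immediately, rather
than left for the next apply_pipe_flushes call, so that it is guaranteed to
land before the first MI_LOAD_REGISTER_MEM of the copy.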
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
