> -----Original Message-----
> From: Tom Stellard>
> On Wed, Apr 23, 2014 at 01:27:11PM +0000, Dorrington, Albert wrote:
<snip>
> > When I try to call clEnqueueReadImage(), after a
> clEnqueueNDRangeKernel(); the clover/aop/transfer.cpp again generates a
> Pixel Shader, which gets integrated into the command stream after the
> Compute Shader entries (so I send up with:
Command Sequence
1 [Initial Configuration]
2 [PixelShader]
Need PS_PARTIAL_FLUSH here <-- this one I think I have with what I added to
evergren_init_atom_start_compute_cs()
3 [ComputeShader]
Need CS_PARTIAL_FLUSH here
4 [PixelShader]
Need PS_PARTIAL_FLUSH here
5 [Final Configuration/Cleanup/Wait]
The other two, I've been experimenting trying to add them in various places,
but I haevn't seen a change in behavior yet (probably not putting them in the
right place yet...)
>
> You also may need to flush the various caches after the pixel shader and
> compute shaders have completed. See r600_flush_emit in
> r600_hw_context.c
That could be why I'm not seeing any changes yet... I have been looking at
r600_flush_emit() this morning and experimenting with its use.
> Does the documentation say that setting the RAT bit means it can only be
> used by compute shaders, or have you discovered this from your testing?
> If this is the case, you may have to use a different CB_COLOR# for the image
> when coying it with a pixel shader.
The documentation I'm referring to is "Radeon Evergreen/Northern Islands
Acceleration" Rev 1.0 Dated May 24, 2011
Section 8 CB Programming
Section 8.6 Compute Shader
Compute shaders can perform atomic writes ("device reduction
operations") to memory via the CB. The order of
execution of the operations is not guaranteed, only that they are
atomic. These writes can include simple operations
(min, max, add, and, or, exchange, compare-exchange) and can optionally
return a value (pre-op) back to the shader.
The CF_export adds two new opcodes for RAT exports: EXPORT_RAT and
EXPORT_RAT_CACHELESS.
If CB_COLOR<mrt>_INFO.RAT is programmed, the surface is treated as a
Random Access Target and can only be
drawn by Compute Shader operations. A set of MRTs can be configured for
RATs and normal rendering. The only
stipulation is that all RAT MRTs must be assigned to higher number MRTs
than normal rendering MRTs.
I take the statement in the 3rd paragraph to mean that if the RAT bit is set,
that a CB setup within a Compute Shader
will not work in a Pixel Shader. However, looking at the command stream
executed, the 'r600_draw_vbo()' function
that gets called appears to reconfigure all of the Color buffers, so I don't
think this is the issue causing the conflict.
I'm getting more convinced that the issue is that the Compute Shader needs to
run completely, before the Pixel Shader runs.
(presumably the need for the CS_PARTIAL_FLUSH and PS_PARTIAL_FLUSH directives)
Frustrating thing is, sometimes DRM can recover from these stalls, other times
my box locks up.
> >
>
> Looking forward to your contribution. As always the sooner you can post the
> code the better as that will make it easier to review and may help uncover
> some of your issues.
>
I'd love to contribute some of what I've done back soon, unfortunately with our
teams choice to use the AMD SDK OpenCL compiler rather than LLVM, I can't
easily migrate my changes back to the Mesa baseline. We have quite a few
work-arounds (hacks? haha) to setup the RATs and Vertex Buffers the way the AMD
compiler needs them. Once the crazy hours at work slowdown, I'll be able to
have some more free time at home where i can contribute. :)
After I reboot the hung box (yet again) I'll be taking another look to verify
if the partial flushes are in the command stream or not.
Thanks!
_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev