Hi, all,
I have a problem with SLM (shared local memory) usage and would appreciate
some help.
Problem Description:
My hardware is Gen7.5, so the total SLM is 64KB. I have been told that once a
workgroup uses SLM, it is allocated at least 4KB of SLM. In addition, to use
the full 64B of bandwidth in SIMD16 mode, every workitem should read/write 4B
per access (16 workitems * 4B = 64B).
On the other hand, every subslice has 10 EUs and every EU can run 7 threads, so
one subslice can run 7*10=70 threads at once.
Currently, each of my workgroups contains only 32 workitems, which are split
into 2 threads in SIMD16 mode. Because of the SLM usage, one subslice can only
hold 64KB/4KB=16 workgroups at a time. Thus, one subslice can only run
2*16=32 threads at a time, which is less than half of its full capacity of 70
threads. It's a waste.
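To make the arithmetic above concrete, here is a small Python sketch of the
calculation. The function name and parameters are only mine for illustration,
and it assumes that SLM and the EU thread count are the only limits on how many
workgroups a subslice can hold (I am ignoring barriers and any other limits).

```python
def resident_threads(slm_total_kb=64, slm_per_group_kb=4,
                     workitems_per_group=32, simd_width=16,
                     eus=10, threads_per_eu=7):
    """Return (threads in use, thread capacity) for one subslice.

    Illustrative model only: SLM and EU thread slots are assumed to be
    the only resources that limit resident workgroups.
    """
    # 32 workitems in SIMD16 mode -> 2 hardware threads (ceiling division)
    threads_per_group = -(-workitems_per_group // simd_width)
    # 64KB / 4KB -> at most 16 workgroups fit in SLM
    groups_by_slm = slm_total_kb // slm_per_group_kb
    # 10 EUs * 7 threads -> 70 hardware threads per subslice
    hw_threads = eus * threads_per_eu
    groups_by_threads = hw_threads // threads_per_group
    groups = min(groups_by_slm, groups_by_threads)
    return groups * threads_per_group, hw_threads


used, capacity = resident_threads()
print(used, capacity)  # 32 70
```

With my numbers, SLM is the limiting resource: 16 workgroups fit by SLM, while
35 would fit by thread count.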
What do I want to do?
In fact, 4KB of SLM is more than one of my workgroups needs. Can I change this,
for example so that only 2KB of SLM is preallocated per workgroup?
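For reference, this is roughly what a smaller allocation would buy me, as a
Python sketch. The 2KB and 1KB values are hypothetical (whether the hardware or
the driver allows them is exactly my question), and the helper name is made up
for this example. It again assumes that SLM and thread slots are the only
limits.

```python
def threads_for_granularity(slm_per_group_kb, slm_total_kb=64,
                            threads_per_group=2, hw_threads=70):
    """Threads one subslice could run for a given per-workgroup SLM size."""
    # Workgroups that fit in SLM at this (hypothetical) allocation size
    groups_by_slm = slm_total_kb // slm_per_group_kb
    # Workgroups that fit in the 70 hardware thread slots
    groups_by_threads = hw_threads // threads_per_group
    return min(groups_by_slm, groups_by_threads) * threads_per_group


for kb in (4, 2, 1):
    # 4KB is what I have today; 2KB and 1KB are hypothetical
    print(kb, threads_for_granularity(kb))
```

Under these assumptions 4KB gives 32 threads, 2KB gives 64 threads, and 1KB
would already be capped by the 70 available thread slots.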
Thanks for your help!
Regards