Quoting Lionel Landwerlin (2018-02-15 14:02:02)
> With the introduction of asymmetric slices in CNL, we cannot rely on
> the previous SUBSLICE_MASK getparam to tell userspace what subslices
> are available. Here we introduce a more detailed way of querying the
> Gen's GPU topology that doesn't aggregate numbers.
>
> This is essential for monitoring parts of the GPU with the OA unit,
> because counters need to be normalized to the number of
> EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
> not give us sufficient information.
>
> As a bonus we can draw representations of the GPU :
>
> https://imgur.com/a/vuqpa
>
> v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
> Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
> Add uapi macros to read data from *_info structs (Tvrtko)
>
> v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts
> (Tvrtko)
>
> v4: Factorize query item writing (Tvrtko)
> Tweak uapi struct/define names (Tvrtko)
>
> v5: Replace ALIGN() macro (Chris)
>
> v6: Updated uapi comments (Tvrtko)
> Moved flags != 0 checks into vfuncs (Tvrtko)
>
> v7: Use access_ok() before copying anything, to avoid overflows (Chris)
> Switch BUG_ON() to GEM_WARN_ON() (Tvrtko)
>
> v8: Tweak uapi comments style to match the coding style (Lionel)
>
> v9: Fix error in comment about computation of enabled subslice (Tvrtko)
>
> v10: Fix/update comments in uAPI (Sagar)
>
> v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single
> drm_i915_query_topology_info (Joonas)
>
> Signed-off-by: Lionel Landwerlin <[email protected]>
<SNIP>
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1620,6 +1620,7 @@ struct drm_i915_perf_oa_config {
>
> struct drm_i915_query_item {
> __u64 query_id;
> +#define DRM_I915_QUERY_TOPOLOGY_INFO 0x01
Wouldn't a plain number be sufficient here? Hex would suggest a mask.
> +/*
> + * Data written by the kernel with query DRM_I915_QUERY_TOPOLOGY_INFO :
> + *
> + * data: contains the 3 pieces of information :
> + *
> + * - the slice mask with one bit per slice telling whether a slice is
> + * available. The availability of slice X can be queried with the following
> + * formula :
> + *
> + * (data[X / 8] >> (X % 8)) & 1
> + *
> + * - the subslice mask for each slice with one bit per subslice telling
> + * whether a subslice is available. The availability of subslice Y in slice
> + * X can be queried with the following formula :
> + *
> + * (data[subslice_offset +
> + * X * DIV_ROUND_UP(max_subslices, 8) +
> + * Y / 8] >> (Y % 8)) & 1
> + *
> + * - the EU mask for each subslice in each slice with one bit per EU telling
> + * whether an EU is available. The availability of EU Z in subslice Y in
> + * slice X can be queried with the following formula :
> + *
> + * (data[eu_offset +
> + * X * max_subslices * DIV_ROUND_UP(max_eus_per_subslice, 8) +
> + * Y * DIV_ROUND_UP(max_eus_per_subslice, 8) +
> + * Z / 8] >> (Z % 8)) & 1
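For reference, the three formulas in the comment above can be written out as userspace helpers. This is only a rough sketch: struct topo and its field names are hypothetical stand-ins for whatever the final uapi struct looks like, and DIV_ROUND_UP is redefined locally. Only the indexing arithmetic is taken from the comment.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-in for the uapi struct; field names are assumed. */
struct topo {
	uint16_t max_slices;
	uint16_t max_subslices;
	uint16_t max_eus_per_subslice;
	uint16_t subslice_offset;	/* byte offset of the subslice masks in data */
	uint16_t eu_offset;		/* byte offset of the EU masks in data */
	const uint8_t *data;
};

/* Returns 1 if slice x is available, 0 otherwise. */
static int slice_available(const struct topo *t, int x)
{
	return (t->data[x / 8] >> (x % 8)) & 1;
}

/* Returns 1 if subslice y of slice x is available, 0 otherwise. */
static int subslice_available(const struct topo *t, int x, int y)
{
	return (t->data[t->subslice_offset +
			x * DIV_ROUND_UP(t->max_subslices, 8) +
			y / 8] >> (y % 8)) & 1;
}

/* Returns 1 if EU z of subslice y in slice x is available, 0 otherwise. */
static int eu_available(const struct topo *t, int x, int y, int z)
{
	int eu_bytes = DIV_ROUND_UP(t->max_eus_per_subslice, 8);

	return (t->data[t->eu_offset +
			x * t->max_subslices * eu_bytes +
			y * eu_bytes +
			z / 8] >> (z % 8)) & 1;
}
```

Userspace has to recompute DIV_ROUND_UP() of the max_* values on every lookup (or cache it), which is the part the question below is about.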
I'm still contemplating whether providing *_stride to make this more
straightforward would be a good or bad thing. The cases would become:
data[X / 8] & BIT(X % 8)
data[subslice_offset + X * subslice_stride + Y/8] & BIT(Y % 8)
data[eu_offset + (X * max_subslices + Y) * eu_stride + Z/8] & BIT(Z % 8)
I think I'm heavily leaning towards that, as it comes with the option
that we increase eu_stride two-fold to report more information per EU
(or subslice for that matter).
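To illustrate, the stride-based lookups could look roughly like this from userspace. The struct topo_strided layout and the subslice_stride/eu_stride fields are hypothetical (they are the proposal, not existing uapi), and BIT() is defined locally. Note that a bare "& BIT()" yields a non-zero value rather than 0/1, so the sketch normalizes with !!.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical layout: the *_stride fields are the proposal, not existing uapi. */
struct topo_strided {
	uint16_t max_subslices;
	uint16_t subslice_offset;	/* byte offset of the subslice masks */
	uint16_t subslice_stride;	/* bytes per slice in the subslice masks */
	uint16_t eu_offset;		/* byte offset of the EU masks */
	uint16_t eu_stride;		/* bytes per subslice in the EU masks */
	const uint8_t *data;
};

/* Returns 1 if slice x is available, 0 otherwise. */
static int strided_slice(const struct topo_strided *t, int x)
{
	return !!(t->data[x / 8] & BIT(x % 8));
}

/* Returns 1 if subslice y of slice x is available, 0 otherwise. */
static int strided_subslice(const struct topo_strided *t, int x, int y)
{
	return !!(t->data[t->subslice_offset +
			  x * t->subslice_stride + y / 8] & BIT(y % 8));
}

/* Returns 1 if EU z of subslice y in slice x is available, 0 otherwise. */
static int strided_eu(const struct topo_strided *t, int x, int y, int z)
{
	return !!(t->data[t->eu_offset +
			  (x * t->max_subslices + y) * t->eu_stride +
			  z / 8] & BIT(z % 8));
}
```

With this shape the reader never derives the per-row byte count from max_* itself, so the kernel could later report a larger eu_stride without breaking these helpers.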
Thoughts?
Regards, Joonas