Re: [PATCH v3] uapi/drm/i915: Document memory residency and Flat-CCS capability of obj

2022-05-16 Thread Lionel Landwerlin

On 14/05/2022 00:06, Jordan Justen wrote:

On 2022-05-13 05:31:00, Lionel Landwerlin wrote:

On 02/05/2022 17:15, Ramalingam C wrote:

Capture the impact of memory region preference list of the objects, on
their memory residency and Flat-CCS capability.

v2:
Fix the Flat-CCS capability of an obj with {lmem, smem} preference
list [Thomas]
v3:
Reworded the doc [Matt]

Signed-off-by: Ramalingam C
cc: Matthew Auld
cc: Thomas Hellstrom
cc: Daniel Vetter
cc: Jon Bloomfield
cc: Lionel Landwerlin
cc: Kenneth Graunke
cc:mesa-dev@lists.freedesktop.org
cc: Jordan Justen
cc: Tony Ye
Reviewed-by: Matthew Auld
---
   include/uapi/drm/i915_drm.h | 16 ++++++++++++++++
   1 file changed, 16 insertions(+)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a2def7b27009..b7e1c2fe08dc 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3443,6 +3443,22 @@ struct drm_i915_gem_create_ext {
* At which point we get the object handle in &drm_i915_gem_create_ext.handle,
* along with the final object size in &drm_i915_gem_create_ext.size, which
* should account for any rounding up, if required.
+ *
+ * Note that userspace has no means of knowing the current backing region
+ * for objects where @num_regions is larger than one. The kernel will only
+ * ensure that the priority order of the @regions array is honoured, either
+ * when initially placing the object, or when moving memory around due to
+ * memory pressure.
+ *
+ * On Flat-CCS capable HW, compression is supported for the objects residing
+ * in I915_MEMORY_CLASS_DEVICE. When such (compressed) objects have another
+ * memory class in @regions and are migrated (by I915, due to memory
+ * constraints) to a non I915_MEMORY_CLASS_DEVICE region, then I915 needs to
+ * decompress the content. But I915 doesn't have the required information to
+ * decompress the userspace compressed objects.
+ *
+ * So I915 supports Flat-CCS only on objects which can reside only on
+ * I915_MEMORY_CLASS_DEVICE regions.

I think it's fine to assume Flat-CCS surfaces will always be in lmem.

I see no issue for the Anv Vulkan driver.

Maybe Nanley or Ken can speak for the Iris GL driver?
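
For anyone following along, the rule in the doc above could be sketched in C roughly as below. This is only an illustration, not the i915 implementation: the `sketch_*` names are hypothetical stand-ins for the uapi memory class and class:instance definitions, and the assumption is simply that an object is Flat-CCS eligible when every entry in its placement list is device-local memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uapi memory classes (system vs device). */
enum sketch_memory_class {
	SKETCH_MEMORY_CLASS_SYSTEM = 0,
	SKETCH_MEMORY_CLASS_DEVICE = 1,
};

/* Hypothetical stand-in for the class:instance pair in @regions. */
struct sketch_memory_class_instance {
	uint16_t memory_class;
	uint16_t memory_instance;
};

/*
 * Returns true only if the placement list is non-empty and every entry is
 * device-local memory, i.e. the object can never be migrated to a region
 * where the kernel would have to decompress it.
 */
static bool
sketch_placements_allow_flat_ccs(const struct sketch_memory_class_instance *regions,
				 size_t num_regions)
{
	size_t i;

	if (num_regions == 0)
		return false;

	for (i = 0; i < num_regions; i++) {
		if (regions[i].memory_class != SKETCH_MEMORY_CLASS_DEVICE)
			return false;
	}

	return true;
}
```

With this check an {lmem} list is eligible, while {lmem, smem} is not.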


Acked-by: Jordan Justen

I think Nanley has accounted for this on iris with:

https://gitlab.freedesktop.org/mesa/mesa/-/commit/42a865730ef72574e179b56a314f30fdccc6cba8

-Jordan


Thanks Jordan,


We might want to throw in an additional assert((flags & BO_ALLOC_SMEM) == 0);
in the CCS case.

-Lionel


Re: [PATCH v3] uapi/drm/i915: Document memory residency and Flat-CCS capability of obj

2022-05-16 Thread Jordan Justen
On 2022-05-16 00:47:43, Lionel Landwerlin wrote:
> On 14/05/2022 00:06, Jordan Justen wrote:
>> 
>> Acked-by: Jordan Justen 
>> 
>> I think Nanley has accounted for this on iris with:
>> 
>> 
>> https://gitlab.freedesktop.org/mesa/mesa/-/commit/42a865730ef72574e179b56a314f30fdccc6cba8
>> 
> 
> Thanks Jordan,
> 
> We might want to throw in an additional
> assert((flags & BO_ALLOC_SMEM) == 0); in the CCS case

Yeah. I noticed this potential for concern when looking at the
small-bar uapi on iris. I added an assert, and I haven't seen it get
triggered yet.
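
A minimal sketch of the kind of check being discussed, for illustration only. The flag values and the `sketch_*` helpers here are hypothetical and are not copied from the iris source; the only assumption carried over from the thread is that a CCS allocation must not request system memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, chosen for illustration only. */
#define SKETCH_BO_ALLOC_SMEM (1u << 0)
#define SKETCH_BO_ALLOC_LMEM (1u << 1)

/* A CCS buffer must never ask for system memory placement. */
static bool sketch_ccs_alloc_flags_ok(uint32_t flags)
{
	return (flags & SKETCH_BO_ALLOC_SMEM) == 0;
}

/*
 * Hypothetical helper on the CCS allocation path: trips the assert if the
 * caller requested smem, otherwise forces a device-local placement.
 */
static uint32_t sketch_ccs_alloc_flags(uint32_t flags)
{
	assert(sketch_ccs_alloc_flags_ok(flags));
	return flags | SKETCH_BO_ALLOC_LMEM;
}
```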

-Jordan


[PATCH v3] drm/doc: add rfc section for small BAR uapi

2022-05-16 Thread Matthew Auld
Add an entry for the new uapi needed for small BAR on DG2+.

v2:
  - Some spelling fixes and other small tweaks. (Akeem & Thomas)
  - Rework error capture interactions, including no longer needing
NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
  - Add probed_cpu_visible_size. (Lionel)
v3:
  - Drop the vma query for now.
  - Add unallocated_cpu_visible_size as part of the region query.
  - Improve the docs some more, including documenting the expected
behaviour on older kernels, since this came up in some offline
discussion.
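
As a rough illustration of the older-kernel behaviour documented in this patch, userspace might interpret the region query along these lines. This is a sketch under stated assumptions: `struct sketch_region_info` and both helpers are hypothetical and only mirror the two size fields from the proposed struct, and treating an unknown (-1) size as "not small BAR" is a choice made for the example, not something the patch specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the two size fields in the proposed region info. */
struct sketch_region_info {
	uint64_t probed_size;
	uint64_t probed_cpu_visible_size;
};

/*
 * A zero probed_cpu_visible_size means an older kernel without the
 * small-BAR uAPI, so the whole region is treated as CPU visible.
 */
static uint64_t
sketch_effective_cpu_visible_size(const struct sketch_region_info *info)
{
	if (info->probed_cpu_visible_size == 0)
		return info->probed_size;

	return info->probed_cpu_visible_size;
}

/* True when only part of the region can be reached by the CPU. */
static bool sketch_region_is_small_bar(const struct sketch_region_info *info)
{
	uint64_t visible = sketch_effective_cpu_visible_size(info);

	/* Assumption: an unknown (-1) size is reported as "not small BAR". */
	if (info->probed_size == UINT64_MAX || visible == UINT64_MAX)
		return false;

	return visible < info->probed_size;
}
```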

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Tvrtko Ursulin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
 Documentation/gpu/rfc/i915_small_bar.h   | 164 +++
 Documentation/gpu/rfc/i915_small_bar.rst |  47 +++
 Documentation/gpu/rfc/index.rst  |   4 +
 3 files changed, 215 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..4079d287750b
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,164 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as known to the
+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct
+ * drm_i915_query. For this new query we are adding the new query id
+ * DRM_I915_QUERY_MEMORY_REGIONS at &drm_i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+   /** @region: The class:instance pair encoding */
+   struct drm_i915_gem_memory_class_instance region;
+
+   /** @rsvd0: MBZ */
+   __u32 rsvd0;
+
+   /** @probed_size: Memory probed by the driver (-1 = unknown) */
+   __u64 probed_size;
+
+   /**
+* @unallocated_size: Estimate of memory remaining (-1 = unknown)
+*
+* Note this is only currently tracked for I915_MEMORY_CLASS_DEVICE
+* regions, and also requires CAP_PERFMON or CAP_SYS_ADMIN to get
+* reliable accounting. Without this (or if this is an older kernel) the
+* value here will always match the @probed_size.
+*/
+   __u64 unallocated_size;
+
+   union {
+   /** @rsvd1: MBZ */
+   __u64 rsvd1[8];
+   struct {
+   /**
+* @probed_cpu_visible_size: Memory probed by the driver
+* that is CPU accessible. (-1 = unknown).
+*
+* This will always be <= @probed_size, and the
+* remainder (if there is any) will not be CPU
+* accessible.
+*
+* On systems without small BAR, the @probed_size will
+* always equal the @probed_cpu_visible_size, since all
+* of it will be CPU accessible.
+*
+* Note that if the value returned here is zero, then
+* this must be an old kernel which lacks the relevant
+* small-bar uAPI support (including
+* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS), but on
+* such systems we should never actually end up with a
+* small BAR configuration, assuming we are able to load
+* the kernel module. Hence it should be safe to treat
+* this the same as when @probed_cpu_visible_size ==
+* @probed_size.
+*/
+   __u64 probed_cpu_visible_size;
+
+   /**
+* @unallocated_cpu_visible_size: Estimate of CPU
+* visible memory remaining (-1 = unknown).
+*
+* Note this is only currently tracked for
+* I915_MEMORY_CLASS_DEVICE regions, and also requires
+* CAP_PERFMON or CAP_SYS_ADMIN to get reliable
+* accounting. Without this the value here will always
+* equal the @probed_cpu_visible_size.
+*/
+   __u64 unallocated_cpu_visible_size;
+   };
+   };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the stuff that
+ * is immutable. Previously we would have two ioctls, one to create the object
+ * with gem_create, and another to apply various parameters, however this
+ * creates some