[PATCH v5 08/19] drm/i915/uapi: document behaviour for DG2 64K support
From: Matthew Auld On discrete platforms like DG2, we need to support a minimum page size of 64K when dealing with device local-memory. This is quite tricky for various reasons, so try to document the new implicit uapi for this. v3: fix typos and less emphasis v2: Fixed suggestions on formatting [Daniel] Signed-off-by: Matthew Auld Signed-off-by: Ramalingam C Signed-off-by: Robert Beckett Acked-by: Jordan Justen Reviewed-by: Ramalingam C Reviewed-by: Thomas Hellström cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- include/uapi/drm/i915_drm.h | 44 - 1 file changed, 39 insertions(+), 5 deletions(-) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 5e678917da70..77e5e74c32c1 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 { /** * When the EXEC_OBJECT_PINNED flag is specified this is populated by * the user with the GTT offset at which this object will be pinned. +* * When the I915_EXEC_NO_RELOC flag is specified this must contain the * presumed_offset of the object. +* * During execbuffer2 the kernel populates it with the value of the * current GTT offset of the object, for future presumed_offset writes. +* +* See struct drm_i915_gem_create_ext for the rules when dealing with +* alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with +* minimum page sizes, like DG2. */ __u64 offset; @@ -3145,11 +3151,39 @@ struct drm_i915_gem_create_ext { * * The (page-aligned) allocated size for the object will be returned. * -* Note that for some devices we have might have further minimum -* page-size restrictions(larger than 4K), like for device local-memory. -* However in general the final size here should always reflect any -* rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS -* extension to place the object in device local-memory. +* +* DG2 64K min page size implications: +* +* On discrete platforms, starting from DG2, we have to contend with GTT +* page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE +* objects. Specifically the hardware only supports 64K or larger GTT +* page sizes for such memory. The kernel will already ensure that all +* I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page +* sizes underneath. +* +* Note that the returned size here will always reflect any required +* rounding up done by the kernel, i.e 4K will now become 64K on devices +* such as DG2. +* +* Special DG2 GTT address alignment requirement: +* +* The GTT alignment will also need to be at least 2M for such objects. +* +* Note that due to how the hardware implements 64K GTT page support, we +* have some further complications: +* +* 1) The entire PDE (which covers a 2MB virtual address range), must +* contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same +* PDE is forbidden by the hardware. +* +* 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM +* objects. +* +* To keep things simple for userland, we mandate that any GTT mappings +* must be aligned to and rounded up to 2MB. As this only wastes virtual +* address space and avoids userland having to copy any needlessly +* complicated PDE sharing scheme (coloring) and only affects DG2, this +* is deemed to be a good compromise. */ __u64 size; /** -- 2.20.1
[PATCH v5 09/19] Doc/gpu/rfc/i915: i915 DG2 64k pagesize uAPI
Details of the 64k pagesize support added as part of DG2 enabling and its implicit impact on the uAPI. v2: improvised the Flat-CCS documentation [Danvet & CQ] v3: made only for 64k pagesize support Signed-off-by: Ramalingam C cc: Daniel Vetter cc: Matthew Auld cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- Documentation/gpu/rfc/i915_dg2.rst | 25 + Documentation/gpu/rfc/index.rst| 3 +++ 2 files changed, 28 insertions(+) create mode 100644 Documentation/gpu/rfc/i915_dg2.rst diff --git a/Documentation/gpu/rfc/i915_dg2.rst b/Documentation/gpu/rfc/i915_dg2.rst new file mode 100644 index ..f4eb5a219897 --- /dev/null +++ b/Documentation/gpu/rfc/i915_dg2.rst @@ -0,0 +1,25 @@ + +I915 DG2 RFC Section + + +Upstream plan += +Plan to upstream the DG2 enabling is: + +* Merge basic HW enabling for DG2 (Still without pciid) +* Merge the 64k support for lmem +* Merge the flat CCS enabling patches +* Add the pciid for DG2 and enable the DG2 in CI + + +64K page support for lmem += +On DG2 hw, local-memory supports minimum GTT page size of 64k only. 4k is not +supported anymore. + +DG2 hw doesn't support the 64k (lmem) and 4k (smem) pages in the same ppgtt +Page table. Refer the struct drm_i915_gem_create_ext for the implication of +handling the 64k page size. + +.. kernel-doc:: include/uapi/drm/i915_drm.h +:functions: drm_i915_gem_create_ext diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst index 91e93a705230..afb320ed4028 100644 --- a/Documentation/gpu/rfc/index.rst +++ b/Documentation/gpu/rfc/index.rst @@ -20,6 +20,9 @@ host such documentation: i915_gem_lmem.rst +.. toctree:: +i915_dg2.rst + .. toctree:: i915_scheduler.rst -- 2.20.1
[PATCH v5 18/19] drm/i915/Flat-CCS: Document on Flat-CCS memory compression
Documents the Flat-CCS feature and kernel handling required along with modifiers used. Signed-off-by: Ramalingam C cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- drivers/gpu/drm/i915/gt/intel_migrate.c | 47 + 1 file changed, 47 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index 3e1cf224cdf0..5bdab0b3c735 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -596,6 +596,53 @@ intel_context_migrate_copy(struct intel_context *ce, return err; } +/** + * DOC: Flat-CCS - Memory compression for Local memory + * + * On Xe-HP and later devices, we use dedicated compression control state (CCS) + * stored in local memory for each surface, to support the 3D and media + * compression formats. + * + * The memory required for the CCS of the entire local memory is 1/256 of the + * local memory size. So before the kernel boot, the required memory is reserved + * for the CCS data and a secure register will be programmed with the CCS base + * address. + * + * Flat CCS data needs to be cleared when a lmem object is allocated. + * And CCS data can be copied in and out of CCS region through + * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data directly. + * + * When we exaust the lmem, if the object's placements support smem, then we can + * directly decompress the compressed lmem object into smem and start using it + * from smem itself. + * + * But when we need to swapout the compressed lmem object into a smem region + * though objects' placement doesn't support smem, then we copy the lmem content + * as it is into smem region along with ccs data (using XY_CTRL_SURF_COPY_BLT). + * When the object is referred, lmem content will be swaped in along with + * restoration of the CCS data (using XY_CTRL_SURF_COPY_BLT) at corresponding + * location. + * + * + * Flat-CCS Modifiers for different compression formats + * + * + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate the buffers of Flat CCS + * render compression formats. Though the general layout is same as + * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, new hashing/compression algorithm is + * used. Render compression uses 128 byte compression blocks + * + * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS -used to indicate the buffers of Flat CCS + * media compression formats. Though the general layout is same as + * I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, new hashing/compression algorithm is + * used. Media compression uses 256 byte compression blocks. + * + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate the buffers of Flat + * CCS clear color render compression formats. Unified compression format for + * clear color render compression. The genral layout is a tiled layout using + * 4Kb tiles i.e Tile4 layout. + */ + static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags) { /* Mask the 3 LSB to use the PPGTT address space */ -- 2.20.1
[PATCH v5 19/19] Doc/gpu/rfc/i915: i915 DG2 flat-CCS uAPI
Details of the Flat-CCS getting added as part of DG2 enabling and its implicit impact on the uAPI. Signed-off-by: Ramalingam C cc: Daniel Vetter cc: Matthew Auld cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- Documentation/gpu/rfc/i915_dg2.rst | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/gpu/rfc/i915_dg2.rst b/Documentation/gpu/rfc/i915_dg2.rst index f4eb5a219897..9d28b1812bc7 100644 --- a/Documentation/gpu/rfc/i915_dg2.rst +++ b/Documentation/gpu/rfc/i915_dg2.rst @@ -23,3 +23,10 @@ handling the 64k page size. .. kernel-doc:: include/uapi/drm/i915_drm.h :functions: drm_i915_gem_create_ext + + +Flat CCS support for lmem += + +.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_migrate.c +:doc: Flat-CCS - Memory compression for Local memory -- 2.20.1
[PATCH v7 5/5] drm/i915/uapi: document behaviour for DG2 64K support
From: Matthew Auld On discrete platforms like DG2, we need to support a minimum page size of 64K when dealing with device local-memory. This is quite tricky for various reasons, so try to document the new implicit uapi for this. v3: fix typos and less emphasis v2: Fixed suggestions on formatting [Daniel] Signed-off-by: Matthew Auld Signed-off-by: Ramalingam C Signed-off-by: Robert Beckett Acked-by: Jordan Justen Reviewed-by: Ramalingam C Reviewed-by: Thomas Hellström cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- include/uapi/drm/i915_drm.h | 44 - 1 file changed, 39 insertions(+), 5 deletions(-) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 5e678917da70..77e5e74c32c1 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 { /** * When the EXEC_OBJECT_PINNED flag is specified this is populated by * the user with the GTT offset at which this object will be pinned. +* * When the I915_EXEC_NO_RELOC flag is specified this must contain the * presumed_offset of the object. +* * During execbuffer2 the kernel populates it with the value of the * current GTT offset of the object, for future presumed_offset writes. +* +* See struct drm_i915_gem_create_ext for the rules when dealing with +* alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with +* minimum page sizes, like DG2. */ __u64 offset; @@ -3145,11 +3151,39 @@ struct drm_i915_gem_create_ext { * * The (page-aligned) allocated size for the object will be returned. * -* Note that for some devices we have might have further minimum -* page-size restrictions(larger than 4K), like for device local-memory. -* However in general the final size here should always reflect any -* rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS -* extension to place the object in device local-memory. +* +* DG2 64K min page size implications: +* +* On discrete platforms, starting from DG2, we have to contend with GTT +* page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE +* objects. Specifically the hardware only supports 64K or larger GTT +* page sizes for such memory. The kernel will already ensure that all +* I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page +* sizes underneath. +* +* Note that the returned size here will always reflect any required +* rounding up done by the kernel, i.e 4K will now become 64K on devices +* such as DG2. +* +* Special DG2 GTT address alignment requirement: +* +* The GTT alignment will also need to be at least 2M for such objects. +* +* Note that due to how the hardware implements 64K GTT page support, we +* have some further complications: +* +* 1) The entire PDE (which covers a 2MB virtual address range), must +* contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same +* PDE is forbidden by the hardware. +* +* 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM +* objects. +* +* To keep things simple for userland, we mandate that any GTT mappings +* must be aligned to and rounded up to 2MB. As this only wastes virtual +* address space and avoids userland having to copy any needlessly +* complicated PDE sharing scheme (coloring) and only affects DG2, this +* is deemed to be a good compromise. */ __u64 size; /** -- 2.25.1