Patches #1, #2, #4 and #6 are Reviewed-by: Christian König
<[email protected]>.
Patch #3:
+ /* Select the first entry VMID */
+ idle = list_first_entry(&id_mgr->ids_lru, struct amdgpu_vm_id, list);
+ list_del_init(&idle->list);
+ vm->reserved_vmid[vmhub] = idle;
+ mutex_unlock(&id_mgr->lock);
I think we should wait for the VMID to be completely idle before we can
use it, but that could possibly also go into the handling in patch #5.
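Just for illustration, a minimal sketch of waiting at this point could look
like the following (error handling omitted; note that we would still hold
id_mgr->lock while waiting here, which is one more reason this probably
belongs into the handling in patch #5 instead):

	struct fence *f;

	/* Wait for all still pending work using this ID to finish. */
	while ((f = amdgpu_sync_peek_fence(&idle->active, NULL)))
		fence_wait(f, false);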
Patch #5:
+ if ((amdgpu_vm_had_gpu_reset(adev, id)) ||
+ (atomic64_read(&id->owner) != vm->client_id) ||
+ (job->vm_pd_addr != id->pd_gpu_addr) ||
+ (updates && (!flushed || updates->context != flushed->context ||
+ fence_is_later(updates, flushed)))) {
You need this check here as well:
if (!id->last_flush ||
(id->last_flush->context != fence_context &&
!fence_is_signaled(id->last_flush)))
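I.e. fold it into the condition above, roughly like this (a sketch only):

	if ((amdgpu_vm_had_gpu_reset(adev, id)) ||
	    (atomic64_read(&id->owner) != vm->client_id) ||
	    (job->vm_pd_addr != id->pd_gpu_addr) ||
	    (updates && (!flushed || updates->context != flushed->context ||
			 fence_is_later(updates, flushed))) ||
	    (!id->last_flush || (id->last_flush->context != fence_context &&
				 !fence_is_signaled(id->last_flush)))) {
		needs_flush = true;
		...
	}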
+ tmp = amdgpu_sync_get_fence(&id->active);
+ if (tmp) {
+ r = amdgpu_sync_fence(adev, sync, tmp);
+ fence_put(tmp);
+ return r;
+ }
That won't work correctly.
The first problem is that amdgpu_sync_get_fence() removes the fence from
the active fences. So another command submission from a different
context won't wait for all the necessary fences. I think using
amdgpu_sync_peek_fence() instead should work here.
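E.g. something like this (a sketch; since amdgpu_sync_peek_fence() doesn't
take a reference and leaves the fence in id->active, the fence_put() is
dropped; this also assumes the ring is available at this point):

	tmp = amdgpu_sync_peek_fence(&id->active, ring);
	if (tmp) {
		r = amdgpu_sync_fence(adev, sync, tmp);
		return r;
	}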
The second problem is that a context could be starved when it needs a VM
flush while another context can still submit jobs without a flush. I
think you could work around that by setting id->pd_gpu_addr to an
invalid value when we hit this case. This way all other contexts would
need to do a VM flush as well.
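E.g. in the sketch above, just before syncing and returning (assuming 0 is
never a valid page directory address):

		/* Invalidate the saved PD address so every other context
		 * sees a mismatch and needs a VM flush as well. */
		id->pd_gpu_addr = 0;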
+ id->pd_gpu_addr = job->vm_pd_addr;
+ id->current_gpu_reset_count = atomic_read(&adev->gpu_reset_counter);
+ atomic64_set(&id->owner, vm->client_id);
+ job->vm_needs_flush = needs_flush;
If we need a flush, id->last_flush needs to be set to NULL here as well.
E.g. do:
fence_put(id->last_flush);
id->last_flush = NULL;
Regards,
Christian.
On 27.04.2017 at 07:00, Chunming Zhou wrote:
The current kernel implementation, which grabs an idle VMID from the pool
when emitting a job, may:
- use different VMIDs for back-to-back submissions from one process, and
- use different VMIDs for submissions to different queues from a single
  process.
This works well in most cases, but it cannot work for SQ thread trace
capture.
The VMID for the submission that sets {SQTT}_BASE, which refers to the
address of the trace buffer, is stored in the shader engine.
If the profiling application has to use different VMIDs to submit IBs over
its life cycle:
- Some traces are not captured, since the application actually uses a
  different VMID to submit jobs.
- Some parts of the captured trace may come from a different application,
  since it accidentally uses the owner's VMID to submit jobs.
V2:
1. Address Christian's comments:
   a. Drop the context flag for tagging the process; add a vm ioctl instead.
   b. Change the order of the patches.
   c. Sync waiting only when a vm flush is needed.
2. Address Alex's comment:
   bump the module version
V3:
Address Jerry's and Christian's comments,
and only reserve the gfxhub vmid.
V4:
Address Jerry's and Christian's comments,
fix some race conditions.
Chunming Zhou (6):
drm/amdgpu: add vm ioctl
drm/amdgpu: add reserved vmid field in vm struct v2
drm/amdgpu: reserve/unreserve vmid by vm ioctl v4
drm/amdgpu: add limitation for dedicated vm number v4
drm/amdgpu: implement grab reserved vmid V3
drm/amdgpu: bump module version for reserved vmid
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 152 ++++++++++++++++++++++++++++++--
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 6 ++
include/uapi/drm/amdgpu_drm.h | 22 +++++
5 files changed, 178 insertions(+), 6 deletions(-)