> [This is a resend -- I believe the first one was stuck infinitely in
> moderation, so I subscribed and re-sent.]
>
> Hi,
>
> I'm on a Haswell system, using the new Mesa 11 EGL/VAAPI interop to render
> directly into VAAPI buffers and get H.264 video out. I have three relevant
> threads:
>
>  * Thread #1 drives OpenGL rendering into the right textures, and sends
>    frame numbers (and fences) into the encoding queue.
>
>  * Thread #2 reads from the encoding queue, waits for the fence (so that
>    the GPU is done rendering) and asks VAAPI to encode the frame through
>    vaBeginPicture() etc., then sends the frame numbers into the storage
>    queue.
>
>  * Thread #3 reads from the storage queue, waits for the frame to be
>    done encoding (through vaSyncSurface) and stores it to disk.
>
> Now, my problem is that thread #3 is using a lot of CPU; in particular,
> __i915_wait_request uses around 30% of the total CPU of my application
> (which uses almost 1.5 of the two cores when the thermal constraints come
> and clock down my CPU). Looking at the stack trace from perf, this comes
> from the vaSyncSurface call, which as I understand it waits for the
> encoder to be done with the frame.

Are you sure all of the __i915_wait_request() time comes from vaSyncSurface()?

> Is it possible to make it wait without busylooping? It seems like a strange
> way to use the CPU. (I'm fine with extra latency if need be.)

vaSyncSurface() just calls drm_intel_bo_wait_rendering() to make sure all
GPU operations on the surface have finished. drm_intel_bo_wait_rendering()
is implemented with the SET_DOMAIN ioctl, so I don't think it busy-loops
while waiting.

BTW, are HW semaphores enabled on your system? You can check
/sys/kernel/debug/dri/0/i915_semaphore_status for the status.

> /* Steinar */

_______________________________________________
Libva mailing list
http://lists.freedesktop.org/mailman/listinfo/libva
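
In case it is easier to check the semaphore status from inside the application than from a shell, here is a minimal sketch in C. The only thing taken from this thread is the debugfs path; read_text_file() and print_semaphore_status() are hypothetical helper names made up for illustration, and the sketch assumes debugfs is mounted and that the process is allowed to read it (normally that means root). It makes no assumption about what the file contains and simply prints it.

```c
#include <stdio.h>

/* Path mentioned in the reply above. Reading it normally requires root
 * and a mounted debugfs (assumption about the local setup). */
#define SEMAPHORE_STATUS_PATH "/sys/kernel/debug/dri/0/i915_semaphore_status"

/*
 * Hypothetical helper: read at most size-1 bytes of a text file into buf
 * and NUL-terminate it. Returns the number of bytes read, or -1 if size
 * is 0 or the file cannot be opened.
 */
long read_text_file(const char *path, char *buf, size_t size)
{
    FILE *f;
    size_t n;

    if (size == 0)
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long)n;
}

/*
 * Hypothetical helper: print the raw contents of the status file to
 * stdout. Returns 0 on success, -1 if the file could not be read.
 */
int print_semaphore_status(void)
{
    char buf[4096];

    if (read_text_file(SEMAPHORE_STATUS_PATH, buf, sizeof(buf)) < 0) {
        perror(SEMAPHORE_STATUS_PATH);
        return -1;
    }
    fputs(buf, stdout);
    return 0;
}
```

Calling print_semaphore_status() once at startup would be enough; a return value of -1 most likely means debugfs is not mounted or the process lacks permission, not that semaphores are disabled.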
