Is there a way to do scaling without displaying the video on the monitor? I need to show a high-res version of the video on the display with vaPutSurface and keep a lower-res version for further processing. vaPutSurface will display the video at a lower resolution, but the surface resolution itself is unmodified. Unfortunately the GPU I am using does not have the post-processing (VPP) capability, or else I could copy the surface to another post-processing surface, apply the filter, and then derive the image.
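For drivers that do expose the video-processing entrypoint, the off-screen path looks roughly like the sketch below: create a small render-target surface, create a VAEntrypointVideoProc context on it, and submit a VAProcPipelineParameterBuffer with the decoded surface as input. This is only a rough sketch under that assumption (which, as noted, my GPU does not satisfy); va_dpy, src_surface and the YUV420 render-target format are placeholders, it uses the newer 8-argument vaCreateSurfaces(), and all error checking is omitted:

#include <string.h>
#include <va/va.h>
#include <va/va_vpp.h>

/* Scale src_surface into a new, never-displayed dst_w x dst_h surface. */
static VAStatus scale_offscreen(VADisplay va_dpy, VASurfaceID src_surface,
                                unsigned int dst_w, unsigned int dst_h,
                                VASurfaceID *dst_surface)
{
    VAConfigID cfg;
    VAContextID ctx;
    VABufferID pipeline_buf;
    VAProcPipelineParameterBuffer *pipeline;

    /* Low-resolution target surface (placeholder YUV420 format). */
    vaCreateSurfaces(va_dpy, VA_RT_FORMAT_YUV420, dst_w, dst_h,
                     dst_surface, 1, NULL, 0);

    /* Video-processing context whose render target is the small surface. */
    vaCreateConfig(va_dpy, VAProfileNone, VAEntrypointVideoProc,
                   NULL, 0, &cfg);
    vaCreateContext(va_dpy, cfg, dst_w, dst_h, VA_PROGRESSIVE,
                    dst_surface, 1, &ctx);

    /* Pipeline parameters: whole source frame in, whole target frame out. */
    vaCreateBuffer(va_dpy, ctx, VAProcPipelineParameterBufferType,
                   sizeof(*pipeline), 1, NULL, &pipeline_buf);
    vaMapBuffer(va_dpy, pipeline_buf, (void **)&pipeline);
    memset(pipeline, 0, sizeof(*pipeline));
    pipeline->surface        = src_surface;
    pipeline->surface_region = NULL;                      /* full source   */
    pipeline->output_region  = NULL;                      /* full target   */
    pipeline->filter_flags   = VA_FILTER_SCALING_DEFAULT; /* scaler choice */
    pipeline->num_filters    = 0;                         /* scaling only  */
    vaUnmapBuffer(va_dpy, pipeline_buf);

    /* Run the pipeline into the off-screen surface and wait for it. */
    vaBeginPicture(va_dpy, ctx, *dst_surface);
    vaRenderPicture(va_dpy, ctx, &pipeline_buf, 1);
    vaEndPicture(va_dpy, ctx);
    vaSyncSurface(va_dpy, *dst_surface);

    vaDestroyBuffer(va_dpy, pipeline_buf);
    vaDestroyContext(va_dpy, ctx);
    vaDestroyConfig(va_dpy, cfg);
    return VA_STATUS_SUCCESS;
}

The low-res surface can then be mapped or exported for further processing without ever being put on the display.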
On Mon, Feb 25, 2013 at 6:35 PM, Ratin <[email protected]> wrote:
> Just to let you know, I tried the VPP code for de-noise. I set the
> de-noise level all the way to (max) and it doesn't seem to make any
> noticeable difference as far as quality. I noticed the path for setting
> the scaling type in VAProcPipelineParameterBuffer, so counting the
> putSurface flag as well as an additional filter, there seem to be three
> different ways to do this. Anyway, with VA_FILTER_SCALING_NL_ANAMORPHIC
> and de-noise, the GPU usage increases to 31% (first measure below). With
> VA_FILTER_SCALING_NL_ANAMORPHIC as part of the putSurface flag and no
> de-noise filtering, GPU usage drops 10%. With VA_FILTER_SCALING_FAST as
> part of the putSurface flag, it drops another 4%. The stream I am
> decoding / rendering is 1280 x 720 H.264, 3 Mbps.
>
> render clock: unknown              sampler clock: unknown
> render busy:    31%: ██████▎       render space: 37/131072
> bitstream busy:  4%: ▉             bitstream space: 1/131072
> blitter busy:   24%: ████▉         blitter space: 10/131072
>
>   task  percent busy
>    GAM:  33%: ██████▋      vert fetch: 0 (0/sec)
>    TSG:  19%: ███▉         prim fetch: 0 (0/sec)
>    VFE:  19%: ███▉         VS invocations: 104062 (0/sec)
>     VF:  19%: ███▉         GS invocations: 0 (0/sec)
>   GAFS:  14%: ██▉          GS prims: 0 (0/sec)
>    TDG:   0%:              CL invocations: 44604 (0/sec)
>   GAFM:   0%:              CL prims: 47270 (0/sec)
>    SOL:   0%:              PS invocations: 3011525360 (0/sec)
>     GS:   0%:              PS depth pass: 3010388146 (0/sec)
>
> render clock: unknown              sampler clock: unknown
> render busy:    21%: ████▎         render space: 20/131072
> bitstream busy:  4%: ▉             bitstream space: 1/131072
> blitter busy:   21%: ████▎         blitter space: 8/131072
>
>   task  percent busy
>    GAM:  23%: ████▋        vert fetch: 0 (0/sec)
>    TSG:  10%: ██           prim fetch: 0 (0/sec)
>    VFE:  10%: ██           VS invocations: 104062 (0/sec)
>     VF:  10%: ██           GS invocations: 0 (0/sec)
>   GAFS:   9%: █▉           GS prims: 0 (0/sec)
>    TDG:   0%:              CL invocations: 44604 (0/sec)
>   GAFM:   0%:              CL prims: 47270 (0/sec)
>     DS:   0%:              PS invocations: 3011525360 (0/sec)
>     GS:   0%:              PS depth pass: 3010388146 (0/sec)
>
> render clock: unknown              sampler clock: unknown
> render busy:    17%: ███▎          render space: 10/131072
> bitstream busy:  4%: ▉             bitstream space: 1/131072
> blitter busy:   17%: ███▎          blitter space: 7/131072
>
>   task  percent busy
>    GAM:  17%: ███▌         vert fetch: 0 (0/sec)
>   GAFS:   4%: ▉            prim fetch: 0 (0/sec)
>     VS:   0%:              VS invocations: 104062 (0/sec)
>     VF:   0%:              GS invocations: 0 (0/sec)
>                            GS prims: 0 (0/sec)
>                            CL invocations: 44604 (0/sec)
>                            CL prims: 47270 (0/sec)
>                            PS invocations: 3011525360 (0/sec)
>                            PS depth pass: 3010388146 (0/sec)
>
> On Fri, Feb 22, 2013 at 7:17 AM, Ratin <[email protected]> wrote:
>>
>> On Thu, Feb 21, 2013 at 5:26 PM, ykzhao <[email protected]> wrote:
>>> On Thu, 2013-02-21 at 06:30 -0700, Ratin wrote:
>>> > Awesome, I would like to see the result from HQ scaling sometime in
>>> > the future. I am just using putSurface and don't want to go through
>>> > the proc pipeline if I don't have to. Is the performance penalty
>>> > identical in both ways? Is there a way I can measure how much GPU
>>> > processing (% and such) is being utilized?
>>>
>>> They are implemented in different ways and it is difficult to compare
>>> the performance penalty. The putsurface path is based on the 3D model
>>> while the proc pipeline is based on the GPGPU model. (intel_gpu_top may
>>> help to show GPU utilization; it can be downloaded from
>>> http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/.)
>>>
>>> Will you please check whether it meets your requirement if you use the
>>> proc VPP to do the upscaling conversion and then call vaPutSurface to
>>> display it?
>>>
>>> Thanks.
>>>    Yakui
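As an aside on the de-noise + NL_ANAMORPHIC combination tested at the top of the quoted message above: in pipeline terms it amounts to one VAProcFilterParameterBuffer attached to the pipeline buffer, with the scaling type selected through filter_flags, roughly as sketched below. This assumes an existing VPP context "ctx"; the 0.5 strength is an arbitrary placeholder, and the valid range should really be queried with vaQueryVideoProcFilterCaps():

#include <string.h>
#include <va/va.h>
#include <va/va_vpp.h>

/* Attach a noise-reduction filter and pick the scaler via filter_flags. */
static void setup_denoise_and_scaling(VADisplay va_dpy, VAContextID ctx,
                                      VASurfaceID src,
                                      VABufferID *denoise_buf,
                                      VABufferID *pipeline_buf)
{
    VAProcFilterParameterBuffer denoise;
    VAProcPipelineParameterBuffer *pipeline;

    /* Generic filter parameter buffer carrying the de-noise strength. */
    memset(&denoise, 0, sizeof(denoise));
    denoise.type  = VAProcFilterNoiseReduction;
    denoise.value = 0.5f;                      /* placeholder strength */
    vaCreateBuffer(va_dpy, ctx, VAProcFilterParameterBufferType,
                   sizeof(denoise), 1, &denoise, denoise_buf);

    /* Pipeline buffer referencing the filter and selecting the scaler. */
    vaCreateBuffer(va_dpy, ctx, VAProcPipelineParameterBufferType,
                   sizeof(*pipeline), 1, NULL, pipeline_buf);
    vaMapBuffer(va_dpy, *pipeline_buf, (void **)&pipeline);
    memset(pipeline, 0, sizeof(*pipeline));
    pipeline->surface      = src;
    pipeline->filters      = denoise_buf;
    pipeline->num_filters  = 1;
    pipeline->filter_flags = VA_FILTER_SCALING_NL_ANAMORPHIC;
    vaUnmapBuffer(va_dpy, *pipeline_buf);

    /* pipeline_buf is then submitted with vaBeginPicture/vaRenderPicture/
     * vaEndPicture on the target surface, as in the earlier sketch. */
}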
>>
>> Hi Yakui, thanks for your reply. I just started looking into this; the
>> total number of filters available for me seems to be only two, and I am
>> not sure whether that is normal. I am using an HD4000; my lspci output
>> shows the following:
>>
>> d02788e046eb:/usr/local/bin# lspci
>> 00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
>> 00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
>> 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
>> 00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
>> 00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
>> 00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
>> 00:1c.3 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 4 (rev c4)
>> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
>> 00:1f.0 ISA bridge: Intel Corporation HM76 Express Chipset LPC Controller (rev 04)
>> 00:1f.2 IDE interface: Intel Corporation 7 Series Chipset Family 4-port SATA Controller [IDE mode] (rev 04)
>> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
>> 00:1f.5 IDE interface: Intel Corporation 7 Series Chipset Family 2-port SATA Controller [IDE mode] (rev 04)
>> 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 07)
>> 02:00.0 Network controller: Intel Corporation Centrino Wireless-N 2200 (rev c4)
>>
>> I will use more query code to find out what those two filters are and
>> will post the results here.
>>
>> Thanks
>>
>> Ratin
>>
>>
>> On Tue, Feb 19, 2013 at 7:24 PM, Xiang, Haihao <[email protected]> wrote:
>> >
>> > > I am using the Intel driver from the staging branch, on a Gen 3
>> > > HD4000. So other algorithms like bi-cubic are not supported?
>> >
>> > You can select another scaling method than the default one via the
>> > flag to vaPutSurface() or the filter_flags in
>> > VAProcPipelineParameterBuffer.
>> >
>> > /* Scaling flags for vaPutSurface() */
>> > #define VA_FILTER_SCALING_DEFAULT       0x00000000
>> > #define VA_FILTER_SCALING_FAST          0x00000100
>> > #define VA_FILTER_SCALING_HQ            0x00000200
>> > #define VA_FILTER_SCALING_NL_ANAMORPHIC 0x00000300
>> > #define VA_FILTER_SCALING_MASK          0x00000f00
>> >
>> > In VAProcPipelineParameterBuffer:
>> >
>> >  * - Scaling: \c VA_FILTER_SCALING_DEFAULT, \c VA_FILTER_SCALING_FAST,
>> >  *   \c VA_FILTER_SCALING_HQ, \c VA_FILTER_SCALING_NL_ANAMORPHIC.
>> >  */
>> > unsigned int filter_flags;
>> >
>> > For the Intel driver, currently only VA_FILTER_SCALING_NL_ANAMORPHIC
>> > and VA_FILTER_SCALING_DEFAULT/VA_FILTER_SCALING_FAST are supported. We
>> > will add support for VA_FILTER_SCALING_HQ.
>> >
>> > Thanks
>> > Haihao
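To make the vaPutSurface() option above concrete: the scaling flag is simply OR-ed into the flags argument. A rough sketch for the 720p-to-1080p case discussed in this thread is below; "win" is an X11 Drawable, the sizes are placeholders, and VA_FILTER_SCALING_HQ is used here even though the Intel driver did not support it yet at the time of writing:

#include <va/va.h>
#include <va/va_x11.h>

/* Display a 1280x720 surface at 1920x1080, asking for the HQ scaler. */
static VAStatus display_scaled(VADisplay va_dpy, VASurfaceID surface,
                               Drawable win)
{
    return vaPutSurface(va_dpy, surface, win,
                        0, 0, 1280, 720,    /* source rectangle      */
                        0, 0, 1920, 1080,   /* destination rectangle */
                        NULL, 0,            /* no clip rectangles    */
                        VA_FRAME_PICTURE | VA_FILTER_SCALING_HQ);
}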
>> > >
>> > > On Mon, Feb 18, 2013 at 12:11 AM, Xiang, Haihao <[email protected]> wrote:
>> > >         On Fri, 2013-02-15 at 16:18 -0800, Ratin wrote:
>> > >         > I am decoding a 720p video stream from a camera to 1080p
>> > >         > surfaces and displaying them on the screen. I am seeing
>> > >         > noticeable noise and pulsating that is directly related to
>> > >         > the I-frame interval (apparently). The lowest I-frame
>> > >         > interval I can specify for the camera is 1 second, and
>> > >         > selecting that in addition to a bitrate of 8192 kbps makes
>> > >         > it slightly better, but there is still a lot of noise. A
>> > >         > software decoded/scaled video looks all smooth.
>> > >         >
>> > >         > What I am wondering is what the default scaling algorithm
>> > >         > used in the vaapi/intel driver is, how I can specify better
>> > >         > scaling algorithms like bi-cubic etc. (and possibly the
>> > >         > strength of the deblocking filter level as well), and what
>> > >         > I can do to reduce the pulsating.
>> > >
>> > >         Which driver are you using? For Intel, it is bilinear.
>> > >
>> > >         > Any input would be much appreciated.
>> > >         >
>> > >         > Thanks
>> > >         >
>> > >         > Ratin
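Finally, regarding the "only two filters" observation in the Feb 22 message above: a minimal query of which VPP filters the driver exposes looks roughly like this, assuming a VPP context "ctx" already exists; the printed names cover only a few common filter types:

#include <stdio.h>
#include <va/va.h>
#include <va/va_vpp.h>

/* Print the VPP filter types reported by the driver. */
static void list_vpp_filters(VADisplay va_dpy, VAContextID ctx)
{
    VAProcFilterType filters[VAProcFilterCount];
    unsigned int num_filters = VAProcFilterCount;
    unsigned int i;

    if (vaQueryVideoProcFilters(va_dpy, ctx, filters, &num_filters)
            != VA_STATUS_SUCCESS)
        return;

    for (i = 0; i < num_filters; i++) {
        switch (filters[i]) {
        case VAProcFilterNoiseReduction: printf("NoiseReduction\n"); break;
        case VAProcFilterDeinterlacing:  printf("Deinterlacing\n");  break;
        case VAProcFilterSharpening:     printf("Sharpening\n");     break;
        case VAProcFilterColorBalance:   printf("ColorBalance\n");   break;
        default: printf("filter type %d\n", filters[i]);             break;
        }
    }
}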
