> -----Original Message-----
> From: Song, Ruiling
> Sent: Sunday, April 21, 2019 8:18 PM
> To: FFmpeg development discussions and patches <ffmpeg-[email protected]>
> Subject: RE: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter
>
> > -----Original Message-----
> > From: ffmpeg-devel [mailto:[email protected]] On Behalf Of Mark Thompson
> > Sent: Saturday, April 20, 2019 11:08 PM
> > To: [email protected]
> > Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter
> >
> > On 17/04/2019 03:43, Song, Ruiling wrote:
> > >> -----Original Message-----
> > >> From: ffmpeg-devel [mailto:[email protected]] On Behalf Of Mark Thompson
> > >> Sent: Wednesday, April 17, 2019 5:28 AM
> > >> To: [email protected]
> > >> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter
> > >>
> > >> On 12/04/2019 16:09, Ruiling Song wrote:
> > >>> Signed-off-by: Ruiling Song <[email protected]>
> > >>
> > >> I can't work out where the problem is, but there is something really weirdly
> > >> nondeterministic going on here.
> > >>
> > >> E.g.
> > >>
> > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-mbps-4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> > >> ...
> > >> 0, 0, 0, 1, 12441600, 8b8805818076b23ae6f80ec2b5a349d4
> > >> 0, 1, 1, 1, 12441600, 7a7fdaa083dc337cfb6af31b643f30a3
> > >> 0, 2, 2, 1, 12441600, b10ef2a1e5125cc67e262e086f8040b5
> > >> 0, 3, 3, 1, 12441600, c06b53ad90e0357e537df41b63d5b1dc
> > >> 0, 4, 4, 1, 12441600, 5aa2da07703859a3dee080847dd17d46
> > >> 0, 5, 5, 1, 12441600, 733364c6be6af825057e905a6092937d
> > >> 0, 6, 6, 1, 12441600, 47edae2dec956a582b04babb745d26b0
> > >> 0, 7, 7, 1, 12441600, 4e45fe8268df4298d06a17ab8e46c3e9
> > >> 0, 8, 8, 1, 12441600, 960d722a3f8787c9191299a114c04174
> > >> 0, 9, 9, 1, 12441600, e759c07ee4834a9cf94bfcb4128e7612
> > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-mbps-4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> > >> 0, 0, 0, 1, 12441600, 8b8805818076b23ae6f80ec2b5a349d4
> > >> [Parsed_nlmeans_opencl_2 @ 0x5557ae580d00] integral image overflow 2157538
> > >> 0, 1, 1, 1, 12441600, bce72e10a9f1118940c5a8392ad78ec3
> > >> 0, 2, 2, 1, 12441600, b10ef2a1e5125cc67e262e086f8040b5
> > >> 0, 3, 3, 1, 12441600, c06b53ad90e0357e537df41b63d5b1dc
> > >> 0, 4, 4, 1, 12441600, 5aa2da07703859a3dee080847dd17d46
> > >> 0, 5, 5, 1, 12441600, 733364c6be6af825057e905a6092937d
> > >> 0, 6, 6, 1, 12441600, 47edae2dec956a582b04babb745d26b0
> > >> 0, 7, 7, 1, 12441600, 4e45fe8268df4298d06a17ab8e46c3e9
> > >> 0, 8, 8, 1, 12441600, 960d722a3f8787c9191299a114c04174
> > >> 0, 9, 9, 1, 12441600, e759c07ee4834a9cf94bfcb4128e7612
> > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-mbps-4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> > >> 0, 0, 0, 1, 12441600, 8b8805818076b23ae6f80ec2b5a349d4
> > >> 0, 1, 1, 1, 12441600, 7a7fdaa083dc337cfb6af31b643f30a3
> > >> [Parsed_nlmeans_opencl_2 @ 0x557c51fbfe80] integral image overflow 2098545
> > >> 0, 2, 2, 1, 12441600, 68b390535adc5cfa0f8a7942c42a47ca
> > >> 0, 3, 3, 1, 12441600, c06b53ad90e0357e537df41b63d5b1dc
> > >> 0, 4, 4, 1, 12441600, 5aa2da07703859a3dee080847dd17d46
> > >> 0, 5, 5, 1, 12441600, 733364c6be6af825057e905a6092937d
> > >> 0, 6, 6, 1, 12441600, 47edae2dec956a582b04babb745d26b0
> > >> 0, 7, 7, 1, 12441600, 4e45fe8268df4298d06a17ab8e46c3e9
> > >> 0, 8, 8, 1, 12441600, 960d722a3f8787c9191299a114c04174
> > >> 0, 9, 9, 1, 12441600, e759c07ee4834a9cf94bfcb4128e7612
> > >>
> > >> Frame 1 gave an overflow on the second run, and gets a different answer, then
> > >> frame 2 in the same way on the third run? I can't characterise when this
> > >> happens, it seems to be pretty random with low probability.
> > >
> > > I tried to reproduce on my SKL and KBL, with Beignet and Neo. And didn't reproduce the issue.
> > > As I am encountering some network issue, I didn't get the video sample you provide
> > > (I am using https://4ksamples.com/ses-astra-uhd-test-2-2160p-uhdtv/ ),
> > > I can try later to download the same video as you.
> > > May be an OpenCL driver issue? I am not sure yet. So could you provide what
> > > hardware and opencl driver version you are using? So I can do some debugging if possible.
> >
> > CFL-8700 with git Beignet.
> First I want to say that Beignet never declare official support of CFL, which
> means that CFL was not fully tested.
> I guess your problem is specific to CFL, may be specific to Beignet + CFL, maybe not.
> I highly recommend you to try NEO(https://github.com/intel/compute-runtime ) which
> officially support CFL.
> If you cannot reproduce with NEO, then it would be obvious this is a bug of
> Beignet on CFL.
> I also try jellyfish sample on KBL and SKL, both Beignet and NEO, still not
> reproduce the issue.
> The Beignet was not developed or tested anymore. What's more the CFL
> support of Beignet was not tested extensively.
> I will try to find one CFL machine to have a test.

I think you are running Beignet git at the commit "Allow creating out-of-order queues with clCreateCommandQueue", right?
I got hold of one CFL machine (i5-7600k) and made only a small modification to the CMake file so that it builds against llvm-4.0.
I used the exact same command and the jellyfish video clip, and still could not reproduce the issue.
So could you test against intel-compute-runtime when you have time?
Do you have any local changes to Beignet or related software?
Which Linux kernel, libdrm and llvm versions are you using?
My guess is that events or asynchronous operations are not handled correctly in Beignet.
Could you make a small local modification and test it on your machine?
Change the call to clEnqueueWriteBuffer(ctx->command_queue, ctx->overflow, CL_TRUE,...); so that the write is blocking.
Previously I used CL_FALSE. I hope a synchronous write helps here.

Thanks!
Ruiling
> >
> > It also sometimes happens with your sample (took >10 tries to get this):
> >
> > $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i SES.Astra.UHD.Test.2.2160p.UHDTV.HEVC.x265-LiebeIst.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> > ...
> > 0, 0, 0, 1, 12441600, 3eba6db2c5f693f6b3c8646a950084bc
> > 0, 1, 1, 1, 12441600, b538be935c6bb38dbb6fdfba4ef035d1
> > 0, 2, 2, 1, 12441600, dafec46e81cb9b50609671fd4c9db645
> > 0, 3, 3, 1, 12441600, eaca33534b94031df566489dacacc9e5
> > 0, 4, 4, 1, 12441600, 5e49c45c50b36516ce53c708dd16f512
> > 0, 5, 5, 1, 12441600, 5d1be0800efd126670de20f468ae78b9
> > 0, 6, 6, 1, 12441600, f022199f0519ff884ac2f3d8655e8489
> > 0, 7, 7, 1, 12441600, df9daccf85ef00b99b4c086d890fbddc
> > 0, 8, 8, 1, 12441600, 5a5b16518fce6021569e576505277a27
> > 0, 9, 9, 1, 12441600, 095a68d27d322525e62fb182cb1b9aa1
> > ...
> > $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i SES.Astra.UHD.Test.2.2160p.UHDTV.HEVC.x265-LiebeIst.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> > ...
> > 0, 0, 0, 1, 12441600, 3eba6db2c5f693f6b3c8646a950084bc
> > 0, 1, 1, 1, 12441600, b538be935c6bb38dbb6fdfba4ef035d1
> > 0, 2, 2, 1, 12441600, dafec46e81cb9b50609671fd4c9db645
> > 0, 3, 3, 1, 12441600, eaca33534b94031df566489dacacc9e5
> > 0, 4, 4, 1, 12441600, 5e49c45c50b36516ce53c708dd16f512
> > 0, 5, 5, 1, 12441600, 5d1be0800efd126670de20f468ae78b9
> > 0, 6, 6, 1, 12441600, f022199f0519ff884ac2f3d8655e8489
> > [Parsed_nlmeans_opencl_2 @ 0x565343792d00] integral image overflow 2943427
> > 0, 7, 7, 1, 12441600, bdac59f2b6c73af4ea81e75e6e7cc598
> > 0, 8, 8, 1, 12441600, 5a5b16518fce6021569e576505277a27
> > 0, 9, 9, 1, 12441600, 095a68d27d322525e62fb182cb1b9aa1
> > ...
> >
> > I'm unable to reproduce on a Mali T760, but the probability seems to be low and
> > that platform is significantly slower / less parallel so it's possible it's
> > just much less likely to happen there.
> You can try with "nlmeans_opencl=r=5" to do a faster test.
>
> > Thanks,
> >
> > - Mark
> > _______________________________________________
> > ffmpeg-devel mailing list
> > [email protected]
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > [email protected] with subject "unsubscribe".
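P.S. To make the CL_TRUE / CL_FALSE suggestion above a bit more concrete, here is a small toy model in plain C. It is not OpenCL and not the filter code: ToyQueue, toy_write, toy_flush, toy_kernel_read and toy_reset_then_read are hypothetical names invented for this sketch, and the "runtime bug" it models is only my guess about what might be going wrong. It just shows why a blocking write of the zero value would hide an ordering problem between the counter reset and the kernel that reads the counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one device buffer and one queued write. NOT OpenCL. */
typedef struct {
    int device_value;   /* current contents of the "device" counter */
    int pending_value;  /* value carried by a queued non-blocking write */
    int has_pending;    /* 1 while that write has not been applied yet */
} ToyQueue;

/* Apply the queued write, if there is one. */
void toy_flush(ToyQueue *q)
{
    assert(q != NULL);
    if (q->has_pending) {
        q->device_value = q->pending_value;
        q->has_pending = 0;
    }
}

/* blocking != 0: the value is on the "device" when the call returns.
 * blocking == 0: the write is only queued and gets applied later. */
void toy_write(ToyQueue *q, int value, int blocking)
{
    assert(q != NULL);
    q->pending_value = value;
    q->has_pending = 1;
    if (blocking)
        toy_flush(q);
}

/* Model a kernel reading the counter.
 * runtime_keeps_order != 0: a correct in-order queue, earlier commands
 * complete before the kernel runs.
 * runtime_keeps_order == 0: the suspected bug (an assumption), the kernel
 * runs ahead of the queued write. */
int toy_kernel_read(ToyQueue *q, int runtime_keeps_order)
{
    int seen;
    assert(q != NULL);
    if (runtime_keeps_order)
        toy_flush(q);
    seen = q->device_value;
    toy_flush(q); /* a late write lands only after the kernel has read */
    return seen;
}

/* Reset a stale counter to zero, then let the "kernel" read it. */
int toy_reset_then_read(int stale, int blocking, int runtime_keeps_order)
{
    ToyQueue q = { stale, 0, 0 };
    toy_write(&q, 0, blocking);
    return toy_kernel_read(&q, runtime_keeps_order);
}
```

With a stale counter of 7, toy_reset_then_read(7, 0, 0) returns 7 (non-blocking write plus a runtime that loses the ordering), while the other three combinations of the two flags return 0. If the real problem is of this kind, switching to a blocking write should make it disappear; if it does not, the cause is somewhere else.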
