On Thu, Apr 23, 2015 at 12:20:38AM -0400, Tucker DiNapoli wrote:
> I added a new file with the sse2/avx2 code for do_a_deblock.
> I also moved the code for running vertical deblock filters into its own
> function, both to clean up the postprocess function and to make it
> easier to integrate the new sse2/avx2 versions of these filters.
> ---
>  libpostproc/postprocess_template.c | 123 +++++++---
>  libpostproc/x86/Makefile           |   1 +
>  libpostproc/x86/deblock.asm        | 454 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 545 insertions(+), 33 deletions(-)
>  create mode 100644 libpostproc/x86/deblock.asm

Putting an av_log() before the old inline asm for do_a_deblock*()
and a jump to NULL at the start of the new yasm code shows that only
the old code is executed when testing with:
./ffplay matrixbench_mpeg2.mpg -vf pp=ha/va
postproc clearly does not use the new code, so I have no idea how to
test it.
Tested on both AVX and AVX2 machines.
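
The same check can be done without crashing the process by counting calls on each path instead of jumping to NULL. The following is a standalone sketch of that idea, not libpostproc code: the flag values, the function names (deblock_old, deblock_new, select_unwired, select_wired, run_blocks) and the counters are all hypothetical, and the two "implementations" only record that they were called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU feature bits; these are NOT FFmpeg's AV_CPU_FLAG_* values. */
#define CPU_SSE2 0x1
#define CPU_AVX2 0x2

typedef void (*deblock_fn)(uint8_t *src, int step);

/* One call counter per implementation, so a test run shows which path ran. */
static int calls_old;
static int calls_new;

/* Stand-in for the old inline-asm path: only records the call. */
static void deblock_old(uint8_t *src, int step)
{
    (void)src; (void)step;
    calls_old++;
}

/* Stand-in for the new sse2/avx2 path: only records the call. */
static void deblock_new(uint8_t *src, int step)
{
    (void)src; (void)step;
    calls_new++;
}

/* Dispatch that was never hooked up: ignores the CPU flags entirely. */
static deblock_fn select_unwired(int cpu_flags)
{
    (void)cpu_flags;
    return deblock_old;
}

/* Dispatch that picks the new path when SSE2 or AVX2 is reported. */
static deblock_fn select_wired(int cpu_flags)
{
    return (cpu_flags & (CPU_SSE2 | CPU_AVX2)) ? deblock_new : deblock_old;
}

/* Run the selected filter over n dummy blocks. */
static void run_blocks(deblock_fn fn, int n)
{
    uint8_t buf[64] = {0};
    assert(fn != NULL);
    for (int i = 0; i < n; i++)
        fn(buf, 8);
}
```

With select_unwired() only calls_old increases, whatever flags are passed, which is the behaviour observed above. With select_wired() and an SSE2 or AVX2 flag set, only calls_new increases.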

Also, there is this compiler note:
In file included from libpostproc/postprocess.c:538:0:
libpostproc/postprocess_template.c: In function ‘deblock_MMX’:
libpostproc/postprocess_template.c:3414:20: note: The ABI for passing
parameters with 32-byte alignment has changed in GCC 4.6
static inline void RENAME(deblock)(uint8_t *dstBlock, int stride,
^

diff --git a/libpostproc/postprocess_template.c b/libpostproc/postprocess_template.c
index 9bff458..f98a00c 100644
--- a/libpostproc/postprocess_template.c
+++ b/libpostproc/postprocess_template.c
@@ -2649,6 +2649,7 @@ static av_always_inline void RENAME(do_a_deblock)(uint8_t *src, int step, int st
int64_t dc_mask, eq_mask, both_masks;
int64_t sums[10*8*2];
src+= step*3; // src points to begin of the 8x8 Block
+ av_log(0,0, "Old do_a_deblock\n");
//{ START_TIMER
__asm__ volatile(
"movq %0, %%mm7 \n\t"
diff --git a/libpostproc/x86/deblock.asm b/libpostproc/x86/deblock.asm
index fbee291..1aa91f5 100644
--- a/libpostproc/x86/deblock.asm
+++ b/libpostproc/x86/deblock.asm
@@ -28,6 +28,9 @@
cglobal do_a_deblock, 5, 6, 7, 22 * mmsize ;src, step, stride, ppcontext, mode
;; stride, mode arguments are unused, but kept for compatability with
;; existing c version. They will be removed eventually
+xor r0, r0
+jmp r0
+
lea r0, [r0 + r1*2]
add r0, r1
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle
_______________________________________________
ffmpeg-devel mailing list
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
