On Sun, 3 Dec 2017, Alexandra Hájková wrote:

Checkasm timings:
block size bitdepth  C       NEON
4           8 bit:    146.7   48.7
          10 bit:    146.7   52.7
8           8 bit:    430.3   84.4
          10 bit:    430.4  119.5
12          8 bit:    812.8  141.0
          10 bit:    812.8  195.0
16          8 bit:   1499.1  268.0
          10 bit:   1498.9  368.4
24          8 bit:   4394.2  574.8
          10 bit:   3696.3  804.8
32          8 bit:   5108.6  568.9
          10 bit:   4249.6  918.8
48          8 bit:  16819.6 2304.9
          10 bit:  13882.0 3178.5
64          8 bit:  13490.8 1799.5
          10 bit:  11018.5 2519.4
---
libavcodec/arm/Makefile           |   3 +-
libavcodec/arm/hevc_mc.S          | 381 ++++++++++++++++++++++++++++++++++++++
libavcodec/arm/hevcdsp_init_arm.c |  67 +++++++
3 files changed, 450 insertions(+), 1 deletion(-)
create mode 100644 libavcodec/arm/hevc_mc.S

diff --git a/libavcodec/arm/Makefile b/libavcodec/arm/Makefile
index b48745ad4..49e17ce0d 100644
--- a/libavcodec/arm/Makefile
+++ b/libavcodec/arm/Makefile
@@ -135,7 +135,8 @@ NEON-OBJS-$(CONFIG_AAC_DECODER)        += arm/aacpsdsp_neon.o           \
NEON-OBJS-$(CONFIG_APE_DECODER)        += arm/apedsp_neon.o
NEON-OBJS-$(CONFIG_DCA_DECODER)        += arm/dcadsp_neon.o             \
                                          arm/synth_filter_neon.o
-NEON-OBJS-$(CONFIG_HEVC_DECODER)       += arm/hevc_idct.o
+NEON-OBJS-$(CONFIG_HEVC_DECODER)       += arm/hevc_idct.o               \
+                                          arm/hevc_mc.o
NEON-OBJS-$(CONFIG_RV30_DECODER)       += arm/rv34dsp_neon.o
NEON-OBJS-$(CONFIG_RV40_DECODER)       += arm/rv34dsp_neon.o            \
                                          arm/rv40dsp_neon.o
diff --git a/libavcodec/arm/hevc_mc.S b/libavcodec/arm/hevc_mc.S
new file mode 100644
index 000000000..a1274ec71
--- /dev/null
+++ b/libavcodec/arm/hevc_mc.S
@@ -0,0 +1,381 @@
+/*
+ * ARM NEON optimised MC functions for HEVC decoding
+ *
+ * Copyright (c) 2017 Alexandra Hájková
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with Libav; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/arm/asm.S"
+
+.macro get_pixels4 bitdepth
+function ff_hevc_get_pixels_4_\bitdepth\()_neon, export=1
+@r0 dst, r1 dststride, r2 src, r3 srcstride
+        ldr             r12, [sp] @height
+        cmp             r12, #0
+        bxeq            lr

This needs an "it eq" before it in order to build in thumb mode.
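
For illustration, the early return with the IT instruction added might look like the sketch below. This is only a fragment of the quoted function prologue, not the final pushed code. It assumes unified syntax, where the assembler accepts the IT line in ARM mode without emitting an instruction for it.

```asm
        ldr             r12, [sp] @ height
        cmp             r12, #0
        it              eq        @ needed in thumb mode for the conditional bx below
        bxeq            lr
```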

+
+1: .if \bitdepth == 8

gas-preprocessor fails on this construct; the 1: label needs to be on a separate line. I can maybe look into working around that (I fixed one case of that issue some time ago), but here we can just adjust it before pushing.
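
As a sketch of the adjusted form, the label moves to its own line ahead of the conditional. The loop body is not quoted in this mail, so the placeholder comment and the closing .endif are assumptions for illustration only.

```asm
1:
.if \bitdepth == 8
        @ 8 bit loop body goes here (elided, not part of this sketch)
.endif
```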

The rest of it looks good enough, so I'll push it soon with these issues fixed (in all occurrences).

// Martin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
