Hi,

this patchset is the initial set of AArch64 NEON patches for the H.264 decoder. It has been tested on an iPad mini and on ARM's software simulator, and it passes FATE. Single-threaded decoding time for a random sample (720p, 133 s, 3.4 Mbit/s) decreases from 103 s to 53 s, i.e. almost twice as fast.
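
For reference, the arithmetic behind "almost twice as fast" can be sketched as below. This is only a back-of-the-envelope calculation from the three numbers quoted above; the function names are made up for illustration and nothing here comes from the patches or from a real benchmark tool.

```python
# Rough sanity check of the timing claim, using only the figures quoted above.
# speedup() and realtime_factor() are hypothetical helpers, not part of libav.

def speedup(before_s: float, after_s: float) -> float:
    """Ratio of old decode time to new decode time."""
    return before_s / after_s


def realtime_factor(clip_s: float, decode_s: float) -> float:
    """How many seconds of video are decoded per second of wall time."""
    return clip_s / decode_s


if __name__ == "__main__":
    clip_s, before_s, after_s = 133.0, 103.0, 53.0
    print(f"speedup:           {speedup(before_s, after_s):.2f}x")
    print(f"time saved:        {100.0 * (1.0 - after_s / before_s):.1f}%")
    print(f"realtime (before): {realtime_factor(clip_s, before_s):.2f}x")
    print(f"realtime (after):  {realtime_factor(clip_s, after_s):.2f}x")
```

So the decoder goes from roughly 1.3x realtime to roughly 2.5x realtime on this one sample; other samples may of course behave differently.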

It builds on Linux with gcc-4.8 and on Mac OS X with Xcode 5.1 beta3. A recent checkout of gas-preprocessor is needed for Xcode. Xcode also needs two additional patches, since clang's integrated assembler does not support all valid instructions. They are available from
http://git.jannau.net/libav.git/log/?h=2014-01-10_aarch64-ios-wip
I'm not quite sure how I want to handle that.

Clang-3.3 on Linux might work with gas as assembler; clang-3.4 fails to compile even the C code. Bug reports for the clang and Xcode issues are in preparation.

Hardware for testing is kindly provided by the VideoLAN association.

Enjoy,
Janne

 configure                                            |  33 +-
 libavcodec/aarch64/Makefile                          |  22 +
 libavcodec/aarch64/h264chroma_init_aarch64.c         |  51 +++
 libavcodec/aarch64/h264cmc_neon.S                    | 402 +++++++++++++++++++
 libavcodec/aarch64/h264dsp_init_aarch64.c            | 101 +++++
 libavcodec/aarch64/h264dsp_neon.S                    | 527 ++++++++++++++++++++++++
 libavcodec/aarch64/h264idct_neon.S                   | 408 +++++++++++++++++++
 libavcodec/aarch64/h264qpel_init_aarch64.c           | 172 ++++++++
 libavcodec/aarch64/h264qpel_neon.S                   | 934 +++++++++++++++++++++++++++++++++++++++++++
 libavcodec/aarch64/hpeldsp_init_aarch64.c            |  96 +++++
 libavcodec/aarch64/hpeldsp_neon.S                    | 421 +++++++++++++++++++
 libavcodec/aarch64/neon.S                            | 149 +++++++
 libavcodec/aarch64/neontest.c                        |  79 ++++
 libavcodec/aarch64/rv40dsp_init_aarch64.c            |  43 ++
 libavcodec/aarch64/vc1dsp_init_aarch64.c             |  47 +++
 libavcodec/h264chroma.c                              |   2 +
 libavcodec/h264chroma.h                              |   1 +
 libavcodec/h264dsp.c                                 |   1 +
 libavcodec/h264dsp.h                                 |   2 +
 libavcodec/h264qpel.c                                |   2 +
 libavcodec/h264qpel.h                                |   1 +
 libavcodec/hpeldsp.c                                 |   2 +
 libavcodec/hpeldsp.h                                 |   1 +
 libavcodec/rv34dsp.h                                 |   1 +
 libavcodec/rv40dsp.c                                 |   2 +
 libavcodec/vc1dsp.c                                  |   2 +
 libavcodec/vc1dsp.h                                  |   1 +
 libavutil/aarch64/Makefile                           |   1 +
 libavutil/aarch64/asm.S                              |  63 +++
 libavcodec/h264chroma.h => libavutil/aarch64/bswap.h |  37 +-
 libavutil/{cpu_internal.h => aarch64/cpu.c}          |  22 +-
 libavutil/{cpu_internal.h => aarch64/cpu.h}          |  22 +-
 libavutil/aarch64/neontest.h                         |  65 +++
 libavutil/bswap.h                                    |   4 +-
 libavutil/cpu.c                                      |  19 +-
 libavutil/cpu.h                                      |   1 +
 libavutil/cpu_internal.h                             |   1 +
 37 files changed, 3690 insertions(+), 48 deletions(-)

_______________________________________________
libav-devel mailing list
https://lists.libav.org/mailman/listinfo/libav-devel
