1
mirror of https://git.videolan.org/git/ffmpeg.git synced 2024-07-26 06:01:30 +02:00
Commit Graph

39 Commits

Author SHA1 Message Date
Reimar Döffinger
30f80d855b lavc/aarch64: port HEVC SIMD idct NEON
Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
available on aarch64.
For a UHD HDR (10 bit) sample video these were consuming the most time
and this optimization reduced overall decode time from 19.4s to 16.4s,
approximately 15% speedup.
Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
running on Apple M1.

Signed-off-by: Josh Dekker <josh@itanimul.li>
2021-02-18 14:11:53 +01:00
James Almer
c0683dce89 Merge commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7'
* commit '0b9a237b2386ff84a6f99716bd58fa27a1b767e7':
  hevc: Add NEON 4x4 and 8x8 IDCT

[15:12:59] <@ubitux> hevc_idct_4x4_8_c: 389.1
[15:13:00] <@ubitux> hevc_idct_4x4_8_neon: 126.6
[15:13:02] <@ubitux> our ^
[15:13:06] <@ubitux> hevc_idct_4x4_8_c: 389.3
[15:13:08] <@ubitux> hevc_idct_4x4_8_neon: 107.8
[15:13:10] <@ubitux> hevc_idct_4x4_10_c: 418.6
[15:13:12] <@ubitux> hevc_idct_4x4_10_neon: 108.1
[15:13:14] <@ubitux> libav ^
[15:13:30] <@ubitux> so yeah, we can probably trash our versions here

Merged-by: James Almer <jamrial@gmail.com>
2017-10-24 19:10:22 -03:00
Mark Thompson
4dee92f6bc hevc: Fix aligned array declarations
(cherry picked from commit d41e10c148)
2017-10-21 22:19:50 +01:00
Clément Bœsch
f6e8d54fcc Merge commit 'b0e6b3f4777910d61083976aa9fc78a1e0731aae'
* commit 'b0e6b3f4777910d61083976aa9fc78a1e0731aae':
  hevc: ppc: Add HEVC 4x4 IDCT for PowerPC

Merged-by: Clément Bœsch <u@pkh.me>
2017-04-17 13:54:51 +02:00
Alexandra Hájková
0b9a237b23 hevc: Add NEON 4x4 and 8x8 IDCT
Optimized by Martin Storsjö <martin@martin.st>.

The speedup vs C code is around 3.2-4.4x.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-03-27 22:56:23 +03:00
Clément Bœsch
d0e132bab6 Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d'
* commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
  hevc: Separate adding residual to prediction from IDCT

This commit should be a noop but isn't because of the following renames:

- transform_add  → add_residual
- transform_skip → dequant
- idct_4x4_luma  → transform_4x4_luma

Merged-by: Clément Bœsch <cboesch@gopro.com>
2017-01-31 15:31:34 +01:00
Alexandra Hajkova
b0e6b3f477 hevc: ppc: Add HEVC 4x4 IDCT for PowerPC
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-12-12 09:25:16 +01:00
Alexandra Hájková
1bd890ad17 hevc: Separate adding residual to prediction from IDCT
Based on patch 250430bf28
by Mickaël Raulet <mraulet@insa-rennes.fr>, integrated
to Libav by Josh de Kock <josh@itanimul.li>.

Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
2016-07-18 15:27:13 +02:00
Mickaël Raulet
a92fd8a062 hevc: Add DC IDCT
Integrated to Libav by Josh de Kock <josh@itanimul.li>.

Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
2016-07-18 15:27:13 +02:00
Anton Khirnov
e7078e842d hevcdsp: add x86 SIMD for MC 2015-12-05 21:11:52 +01:00
Anton Khirnov
688417399c hevcdsp: split the pred functions by width
This should allow for more efficient SIMD.
2015-12-05 21:10:41 +01:00
Anton Khirnov
818bfe7f0a hevcdsp: split the epel functions by width
This should allow for more efficient SIMD.
2015-12-05 21:09:57 +01:00
Anton Khirnov
1f821750f0 hevcdsp: split the qpel functions by width instead of by the subpixel fraction
This should allow for more efficient SIMD.

Keep the C versions as they are now, to allow the compiler to inline the
interpolation coefficients.
2015-12-05 21:08:04 +01:00
Anton Khirnov
d8ebb6157d hevcdsp: fix a function name
put_weighted_pred_avg should be put_unweighted_pred_avg, there is no
weighting there.
2015-08-21 08:46:05 +02:00
James Almer
7a806c68a6 avcodec/hevcdsp: rename sao_band_filter c functions
Signed-off-by: James Almer <jamrial@gmail.com>
2015-08-21 02:24:30 -03:00
Shivraj Patil
4efc0e6451 avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC horizontal and vertical mc functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-04-17 17:39:32 +02:00
James Almer
d5addf1555 hevcdsp: fix compilation for arm and aarch64
Also add av_cold to ff_hevcdsp_init_arm.

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-03-12 20:01:01 +01:00
Carl Eugen Hoyos
1d523ea89a lavc/hevcdsp: Fix compilation for arm with --disable-neon. 2015-03-10 12:14:16 +01:00
Seppo Tomperi
0c494114cc hevcdsp: ARM NEON optimized deblocking filter
cherry picked from commit 1b9ee47d2f43b0a029a9468233626102eb1473b8

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-02-05 22:01:52 +01:00
James Almer
042c1159fc x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3,avx2}
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.

Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U

Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips
5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips
2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips

Width 64
705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips
10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips

Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-05 15:02:27 -03:00
Seppo Tomperi
74d7faf400 hevcdsp: separated sao edge filter and pixel restore funcs
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
2015-02-04 17:52:49 -03:00
James Almer
fa3eccb4f9 x86/hevc: add ff_hevc_sao_band_filter_{8,10,12}_{sse2,avx,avx2}
Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere.
10/12bit yasm ports, refactoring and optimizations by James Almer

Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U

width 32
40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips
8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips
7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips
4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips

width 64
136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips
28576 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 16384 runs, 0 skips
26707 decicycles in ff_hevc_sao_band_filter_8_32_avx, 16384 runs, 0 skips
14387 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 16384 runs, 0 skips

Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-02-01 20:22:35 -03:00
Michael Niedermayer
706f81a2c2 Merge commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d'
* commit '1a880b2fb8456ce68eefe5902bac95fea1e6a72d':
  hevc: SSE2 and SSSE3 loop filters

Conflicts:
	libavcodec/hevcdsp.c
	libavcodec/hevcdsp.h
	libavcodec/x86/Makefile
	libavcodec/x86/hevc_deblock.asm
	libavcodec/x86/hevcdsp_init.c

See: de7b89fd43 and several others
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-27 00:20:48 +02:00
Pierre Edouard Lepere
1a880b2fb8 hevc: SSE2 and SSSE3 loop filters
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2014-07-26 15:01:01 +00:00
Mickaël Raulet
453f8eaee2 hevc/rext: add support for Range extension tools
SPS features/flags:
- transform_skip_rotation_enabled_flag
- transform_skip_context_enabled_flag
- implicit_rdpcm_enabled_flag
- explicit_rdpcm_enabled_flag
- intra_smoothing_disabled_flag
- persistent_rice_adaptation_enabled_flag

PPS features/flags:
- log2_max_transform_skip_block_size
- cross_component_prediction_enabled_flag
- chroma_qp_offset_list_enabled_flag
- diff_cu_chroma_qp_offset_depth
- chroma_qp_offset_list_len_minus1
- cb_qp_offset_list
- cr_qp_offset_list
- log2_sao_offset_scale_luma
- log2_sao_offset_scale_chroma
(cherry picked from commit 005294c5b939a23099871c6130c8a7cc331f73ee)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 14:08:20 +02:00
Mickaël Raulet
5a41999d81 hevc/rext: basic infrastructure for supporting range extension
- support for 4:2:2 and 4:4:4 up to 12 bits
- add a new profile for range extension
(cherry picked from commit d3c067fa65bbc871758d28aa07f54123430ca346)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 13:47:35 +02:00
Mickaël Raulet
250430bf28 hevc: separate residu and prediction (needed for Range Extension)
(cherry picked from commit 6b3856ef57d66f2e59ee61fd2eb5f83b6d0d7d4a)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 13:37:27 +02:00
Mickaël Raulet
1241eb8870 hevc: simplify SAO computation, delay from one row its computation
(cherry picked from commit f2c5f647cec786df26f442a85e6d685a131a50c9)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-15 13:11:33 +02:00
plepere
92cccb7bcd avcodec/hevc: new idct + asm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-06-17 13:23:36 +02:00
plepere
7a2491c436 HEVC : added assembly MC functions
pretty print x86

Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:23:36 +02:00
Mickaël Raulet
83976e40e8 hevc: C code update for new motion compensation
pretty print C

Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-06 18:22:34 +02:00
Michael Niedermayer
5410a5dc66 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  hevc: move DSP declarations from hevc.h into hevcdsp.h

Conflicts:
	libavcodec/hevc.h
	libavcodec/hevcdsp.c
	libavcodec/hevcdsp.h

See: c8dd048ab8
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-12-22 12:46:19 +01:00
Guillaume Martres
7398e0516f hevc: move DSP declarations from hevc.h into hevcdsp.h
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-12-22 03:49:11 +01:00
Michael Niedermayer
2c4f573696 libavcodec/hevc: random cosmetics to reduce diff to 064698d381
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-02 15:32:15 +01:00
Michael Niedermayer
69b3668b83 libavcodec/hevc: indention related cosmetics to reduce diff to 064698d381
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-02 14:36:59 +01:00
Michael Niedermayer
ce7f1c76bd avcodec/hevc: more whitespaces to reduce difference to 064698d381
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-02 12:33:54 +01:00
Michael Niedermayer
f578e5d937 avcodec/hevc: Adjust white-spaces to reduce difference to 064698d381
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-02 00:44:54 +01:00
Guillaume Martres
064698d381 Add HEVC decoder
Initially written by Guillaume Martres <smarter@ubuntu.com> as a GSoC
project. Further contributions by the OpenHEVC project and other
developers, namely:

Mickaël Raulet <mraulet@insa-rennes.fr>
Seppo Tomperi <seppo.tomperi@vtt.fi>
Gildas Cocherel <gildas.cocherel@laposte.net>
Khaled Jerbi <khaled_jerbi@yahoo.fr>
Wassim Hamidouche <wassim.hamidouche@insa-rennes.fr>
Vittorio Giovara <vittorio.giovara@gmail.com>
Jan Ekström <jeebjp@gmail.com>
Anton Khirnov <anton@khirnov.net>
Martin Storsjö <martin@martin.st>
Luca Barbato <lu_zero@gentoo.org>
Yusuke Nakamura <muken.the.vfrmaniac@gmail.com>
Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Diego Biurrun <diego@biurrun.de>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-10-31 20:19:59 +01:00
Guillaume Martres
c8dd048ab8 lavc: add a HEVC decoder.
Initially written by Guillaume Martres <smarter@ubuntu.com> as a GSoC
project. Further contributions by the OpenHEVC project and other
developers, namely:

Mickaël Raulet <mraulet@insa-rennes.fr>
Seppo Tomperi <seppo.tomperi@vtt.fi>
Gildas Cocherel <gildas.cocherel@laposte.net>
Khaled Jerbi <khaled_jerbi@yahoo.fr>
Wassim Hamidouche <wassim.hamidouche@insa-rennes.fr>
Vittorio Giovara <vittorio.giovara@gmail.com>
Jan Ekström <jeebjp@gmail.com>
Anton Khirnov <anton@khirnov.net>
Martin Storsjö <martin@martin.st>
Luca Barbato <lu_zero@gentoo.org>
Yusuke Nakamura <muken.the.vfrmaniac@gmail.com>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-15 22:13:02 +02:00