1
mirror of https://git.videolan.org/git/ffmpeg.git synced 2024-09-05 15:58:07 +02:00
Commit Graph

2474 Commits

Author SHA1 Message Date
Christopher Degawa
8990c5869e get_cabac_inline_x86: Don't inline if 32-bit clang on windows
Fixes https://trac.ffmpeg.org/ticket/8903

relevant https://github.com/msys2/MINGW-packages/discussions/9258

Signed-off-by: Christopher Degawa <ccom@randomderp.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2021-08-19 22:29:23 +03:00
Andreas Rheinhardt
afc95a10ac avcodec/h264dsp, h264idct: Fix lengths of array parameters
Fixes many -Warray-parameter warnings from GCC 11.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-08-08 17:44:57 +02:00
Andreas Rheinhardt
25c8507818 Remove/replace some unnecessary avcodec.h inclusions
Also remove other unnecessary headers and include headers directly while
at it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 15:29:46 +02:00
Andreas Rheinhardt
4608f7cc6a Remove unnecessary mem.h inclusions
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 14:47:57 +02:00
Andreas Rheinhardt
2c05ee092b avutil/internal, swresample/audioconvert: Remove cpu.h inclusions
These inclusions are not necessary, as cpu.h is already included
wherever it is needed (via direct inclusion or via the arch-specific
headers).
Also remove other unnecessary cpu.h inclusions from ordinary
non-headers.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-07-22 14:33:45 +02:00
Andreas Rheinhardt
7c1f347b18 avcodec: Remove deprecated old encode/decode APIs
Deprecated in commits 7fc329e2dd
and 31f6a4b4b8.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2021-04-27 10:43:12 -03:00
Andreas Rheinhardt
f3c197b129 Include attributes.h directly
Some files currently rely on libavutil/cpu.h to include it for them;
yet said file won't use include it any more after the currently
deprecated functions are removed, so include attributes.h directly.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2021-04-19 14:34:10 +02:00
Paul B Mahol
b69c91bbee avcodec/x86: add cfhdenc SIMD 2021-02-27 17:09:44 +01:00
James Almer
f1a894f9d3 avcodec: add missing FF_API_OLD_ENCDEC wrappers to xmm clobber functions
Signed-off-by: James Almer <jamrial@gmail.com>
2021-02-26 19:26:31 -03:00
Andreas Rheinhardt
585b764f95 avcodec/x86/constants: Remove unused ff_pw_17
Unused since 80944df720.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:49:03 +01:00
Andreas Rheinhardt
7825cc392a avcodec/x86/diracdsp_init: Reuse macro
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:38:12 +01:00
Andreas Rheinhardt
0f317eb8e7 avcodec/x86/diracdsp_init: Simplify macro
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:36:13 +01:00
Andreas Rheinhardt
68bd6c7dac avcodec/x86/diracdsp_init: Make functions only used here static
This allowed to remove forward declarations. Because compilers expect
declarations for all functions they encounter even when it is within
blocks disabled via "if (0 && foo)", one has to use a real #if in
ff_diracdsp_init_x86.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 09:17:40 +01:00
Andreas Rheinhardt
3a80b1ac12 avcodec/x86/diracdsp_init: Remove unused MMX functions
Unused since a1f3b18bf5, yet as nonstatic
functions the compiler can't detect this, so that these functions aren't
stripped and no warning is emitted.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-24 08:58:57 +01:00
Andreas Rheinhardt
4f3d8cb554 avcodec/cabac_functions, x86/cabac: Include stddef.h
Fixes checkheaders after 8c01eb0a31.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2021-02-04 05:17:33 +01:00
Lynne
9e05421dbe
ac3enc_fixed: drop unnecessary fixed-point DSP code 2021-01-14 01:44:20 +01:00
Anton Khirnov
e15371061d lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump
They are not properly namespaced and not intended for public use.
2021-01-01 14:14:57 +01:00
Anton Khirnov
c8c2dfbc37 lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h
That is a more appropriate place for it.
2021-01-01 14:11:01 +01:00
Andreas Rheinhardt
ead3134150 avcodec/mpegaudiodsp: Make ff_mpadsp_init() thread-safe
The only thing missing for this is to make ff_mpadsp_init_x86()
thread-safe; it currently isn't because a static table is initialized
every time ff_mpadsp_init() is called (when ARCH_X86 is true). Solve
this by initializing this table only once, namely together with the
ordinary not-arch specific tables. This also allows to reuse their AVOnce.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
2020-11-24 11:35:03 +01:00
James Almer
1a35fffaf2 x86/cfhddsp: zero extend int arguments
if taken from stack, they may have garbage in the upper bits otherwise.
Also, there are only 8 arguments, so don't attempt to load 11.

Fixes SIGSEV crashes in some targets.

Reviewed-by: durandal_1707
Signed-off-by: James Almer <jamrial@gmail.com>
2020-08-28 20:09:25 -03:00
Paul B Mahol
4aac742505 avcodec/x86/cfhddsp: try to fix build on x32 2020-08-26 23:39:58 +02:00
Paul B Mahol
389cc142fb avcodec/cfhd: add x86 SIMD
Overall speed changes for 1920x1080, yuv422p10le, 60fps from: 0.19x to 0.343x
2020-08-26 21:13:38 +02:00
James Almer
2c844c9828 x86/h264_deblock: fix warning about trailing empty parameter
Fixes part of ticket #8771

Signed-off-by: James Almer <jamrial@gmail.com>
2020-07-12 11:30:23 -03:00
Martin Storsjö
353aecbb28 pixblockdsp, avdct: Add get_pixels_unaligned
Use this in vf_spp.c, where the get_pixels operation is done on
unaligned source addresses.

Hook up the x86 (mmx and sse) versions of get_pixels to this
function pointer, as those implementations seem to support unaligned
use.

This fixes fate-filter-spp on armv7.

Signed-off-by: Martin Storsjö <martin@martin.st>
2020-05-13 13:20:08 +03:00
Linjie Fu
8b8492452d lavc/x86/hevc_add_res: Fix coeff overflow in ADD_RES_SSE_16_32_8
Fix overflow for coeff -32768 in function ADD_RES_SSE_16_32_8 with no
performance drop.(SSE2/AVX/AVX2)

./checkasm --test=hevc_add_res --bench

Mainline:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_32x32_8_sse2: 127.5
    hevc_add_res_32x32_8_avx: 127.0
    hevc_add_res_32x32_8_avx2: 86.5

Add overflow test case:
  - hevc_add_res.add_residual [FAILED]

After:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_32x32_8_sse2: 126.8
    hevc_add_res_32x32_8_avx: 128.3
    hevc_add_res_32x32_8_avx2: 86.8

Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27 10:57:40 +01:00
Linjie Fu
e9abef437f lavc/x86/hevc_add_res: Fix overflow in ADD_RES_SSE_8_8
Fix overflow for coeff -32768 in function ADD_RES_SSE_8_8 with
no performance drop.

./checkasm --test=hevc_add_res --bench

Mainline:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_8x8_8_sse2: 15.5

Add overflow test case:
  - hevc_add_res.add_residual [FAILED]

After:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_8x8_8_sse2: 15.5

Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27 10:57:40 +01:00
Linjie Fu
0da14ed09e lavc/x86/hevc_add_res: Fix overflow in ADD_RES_MMX_4_8
Fix overflow for coeff -32768 in function ADD_RES_MMX_4_8 with no
performance drop.

./checkasm --test=hevc_add_res --bench

Mainline:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_4x4_8_mmxext: 15.5

Add overflow test case:
  - hevc_add_res.add_residual [FAILED]

After:
  - hevc_add_res.add_residual [OK]
    hevc_add_res_4x4_8_mmxext: 15.0

Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2020-03-27 10:57:40 +01:00
Michael Niedermayer
24af459d1e avcodec/x86/diracdsp: Fix high bits on Windows x86_64
Found-by: james
2020-01-31 00:04:22 +01:00
Michael Niedermayer
0694b60b7b avcodec/x86/diracdsp: Fix incorrect src addressing in dequant_subband_32()
Fixes: Segfault (not reproducable with asm, which made this hard to debug)
Fixes: decoding errors
Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-01-30 18:47:21 +01:00
Peter Ross
fd17218558 vp4: prevent unaligned memory access in loop filter
VP4 applies a loop filter during motion compensation, causing the block offset
will often by unaligned. This produces a bus error on some platforms, namely
ARMv7 NEON.

This patch adds a unaligned version of the loop filter function pointer
to VP3DSPContext.

Reported-by: Mike Melanson <mike@multimedia.cx>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-10-30 10:06:38 +01:00
James Almer
1faedb9a11 x85/opusdsp: enable the functions on all FMA3 CPUs
It's not using ymm registers, so limiting it to CPUs with fast AVX
is not necessary.

Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-11 20:50:45 -03:00
James Almer
80444e23ac x86/opusdps: clear the high bits from some gprs
Fixes checkasm on systems like win64.

Reviewed-by: Lynne
Signed-off-by: James Almer <jamrial@gmail.com>
2019-09-11 20:42:31 -03:00
James Almer
58d167bcd5 avcodec/Makefile: add missing pngdsp dependency to the lscr decoder
Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-14 16:47:56 -03:00
James Almer
b41d8ab2e6 x86/v210dec: use named registers
Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03 01:20:18 -03:00
James Almer
abf1aa87ab x86/v210dec: don't reserve more xmm regs than needed
Prevents pointless register saving on win64 for the sse3 and avx
versions of the function.

Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03 01:09:50 -03:00
James Almer
b0e29357ba x86/v210dec: remove duplicate load instruction
Signed-off-by: James Almer <jamrial@gmail.com>
2019-05-03 01:08:34 -03:00
James Darnley
46f1718cd9 avcodec/x86/v210: fix operands of vpblendd used in new avx2 code
Assembly failed when using yasm rather than nasm.
2019-05-02 21:20:54 +02:00
Michael Stoner
ebd6fb23c5 libavcodec Adding ff_v210_planar_unpack AVX2
Replaced VSHUFPS with VPBLENDD to relieve port 5 bottleneck
AVX2 is 1.4x faster than AVX
2019-05-02 19:21:37 +02:00
Lynne
4b7166c9d5 x86/opusdsp: replace loads with shuffles
Has a slight speedup.
Can't be carried over to aarch64, since it has no shufps-like instruction.

Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2019-04-26 20:39:38 -03:00
Lynne
b43b8d337d x86/opusdsp: fix WIN64 return value
Signed-off-by: James Almer <jamrial@gmail.com>
2019-04-01 11:06:34 -03:00
Lynne
605e330310 x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis
58893 decicycles in deemphasis_c,  130548 runs,    524 skips
9475 decicycles in deemphasis_fma3,  130686 runs,    386 skips -> 6.21x speedup

24866 decicycles in postfilter_c,   65386 runs,    150 skips
5268 decicycles in postfilter_fma3,   65505 runs,     31 skips -> 4.72x speedup

Total decoder speedup: ~14%

Deemphasis SIMD based on the following unrolling:
const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1;
float state = coeff;

for (int i = 0; i < len; i += 4) {
    y[0] = x[0] + c1*state;
    y[1] = x[1] + c2*state + c1*x[0];
    y[2] = x[2] + c3*state + c1*x[1] + c2*x[0];
    y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0];

    state = y[3];
    y += 4;
    x += 4;
}
2019-04-01 00:22:00 +02:00
Lynne
5468c1d075 celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled
The entire function was defined away before.
2019-03-31 23:36:43 +02:00
Lynne
4a2c651620 x86/opus_dsp: rename to celt_pvq
Its only used in the encoder and in CELT's PVQ.
2019-03-31 23:35:00 +02:00
James Almer
d5d699ab6e avcodec/h264dsp: change loop filter stride argument to ptrdiff_t 2019-02-20 15:27:43 -03:00
Martin Vignali
9a22e6fa1d avcodec/proresdsp indent after prev commit 2018-12-02 12:55:35 +01:00
Martin Vignali
c097a32e93 avcodec/proresdec : rename dsp part for 10b and check dspinit for supported bits per raw sample
based on patch by Kieran Kunhya
2018-12-02 12:55:31 +01:00
Rostislav Pehlivanov
29eb1c51d7 mdct15: simplify x86 exptab permutation
Removes an unneeded copy and does the 5-point permute in-place.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2018-05-07 23:44:40 +01:00
Rostislav Pehlivanov
a72d0fb973 mdct15: simplify the fft15 x86 SIMD
Saves 1 gpr and 2 instructions and simplifies the macros a bit.

Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2018-05-07 23:27:41 +01:00
Kieran Kunhya
f9d3841ae6 mpeg4video: Add support for MPEG-4 Simple Studio Profile.
This is a profile supporting > 8-bit video and has a higher quality DCT
2018-04-02 13:06:23 +01:00
Aurelien Jacobs
f1e490b1ad sbcenc: add MMX optimizations
This was originally based on libsbc, and was fully integrated into ffmpeg.

Rough speed test:
C version:    speed= 592x
MMX version:  speed= 785x
2018-03-07 22:26:53 +01:00