ffmpeg

mirror of https://git.videolan.org/git/ffmpeg.git synced 2024-09-05 15:58:07 +02:00

Author	SHA1	Message	Date
Christopher Degawa	8990c5869e	get_cabac_inline_x86: Don't inline if 32-bit clang on windows Fixes https://trac.ffmpeg.org/ticket/8903 relevant https://github.com/msys2/MINGW-packages/discussions/9258 Signed-off-by: Christopher Degawa <ccom@randomderp.com> Signed-off-by: Martin Storsjö <martin@martin.st>	2021-08-19 22:29:23 +03:00
Andreas Rheinhardt	afc95a10ac	avcodec/h264dsp, h264idct: Fix lengths of array parameters Fixes many -Warray-parameter warnings from GCC 11. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-08-08 17:44:57 +02:00
Andreas Rheinhardt	25c8507818	Remove/replace some unnecessary avcodec.h inclusions Also remove other unnecessary headers and include headers directly while at it. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-07-22 15:29:46 +02:00
Andreas Rheinhardt	4608f7cc6a	Remove unnecessary mem.h inclusions Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-07-22 14:47:57 +02:00
Andreas Rheinhardt	2c05ee092b	avutil/internal, swresample/audioconvert: Remove cpu.h inclusions These inclusions are not necessary, as cpu.h is already included wherever it is needed (via direct inclusion or via the arch-specific headers). Also remove other unnecessary cpu.h inclusions from ordinary non-headers. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-07-22 14:33:45 +02:00
Andreas Rheinhardt	7c1f347b18	avcodec: Remove deprecated old encode/decode APIs Deprecated in commits `7fc329e2dd` and `31f6a4b4b8`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: James Almer <jamrial@gmail.com>	2021-04-27 10:43:12 -03:00
Andreas Rheinhardt	f3c197b129	Include attributes.h directly Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't include it any more after the currently deprecated functions are removed, so include attributes.h directly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2021-04-19 14:34:10 +02:00
Paul B Mahol	b69c91bbee	avcodec/x86: add cfhdenc SIMD	2021-02-27 17:09:44 +01:00
James Almer	f1a894f9d3	avcodec: add missing FF_API_OLD_ENCDEC wrappers to xmm clobber functions Signed-off-by: James Almer <jamrial@gmail.com>	2021-02-26 19:26:31 -03:00
Andreas Rheinhardt	585b764f95	avcodec/x86/constants: Remove unused ff_pw_17 Unused since `80944df720`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 09:49:03 +01:00
Andreas Rheinhardt	7825cc392a	avcodec/x86/diracdsp_init: Reuse macro Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 09:38:12 +01:00
Andreas Rheinhardt	0f317eb8e7	avcodec/x86/diracdsp_init: Simplify macro Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 09:36:13 +01:00
Andreas Rheinhardt	68bd6c7dac	avcodec/x86/diracdsp_init: Make functions only used here static This allowed to remove forward declarations. Because compilers expect declarations for all functions they encounter even when it is within blocks disabled via "if (0 && foo)", one has to use a real #if in ff_diracdsp_init_x86. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 09:17:40 +01:00
Andreas Rheinhardt	3a80b1ac12	avcodec/x86/diracdsp_init: Remove unused MMX functions Unused since `a1f3b18bf5`, yet as nonstatic functions the compiler can't detect this, so that these functions aren't stripped and no warning is emitted. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-24 08:58:57 +01:00
Andreas Rheinhardt	4f3d8cb554	avcodec/cabac_functions, x86/cabac: Include stddef.h Fixes checkheaders after `8c01eb0a31`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2021-02-04 05:17:33 +01:00
Lynne	9e05421dbe	ac3enc_fixed: drop unnecessary fixed-point DSP code	2021-01-14 01:44:20 +01:00
Anton Khirnov	e15371061d	lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump They are not properly namespaced and not intended for public use.	2021-01-01 14:14:57 +01:00
Anton Khirnov	c8c2dfbc37	lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h That is a more appropriate place for it.	2021-01-01 14:11:01 +01:00
Andreas Rheinhardt	ead3134150	avcodec/mpegaudiodsp: Make ff_mpadsp_init() thread-safe The only thing missing for this is to make ff_mpadsp_init_x86() thread-safe; it currently isn't because a static table is initialized every time ff_mpadsp_init() is called (when ARCH_X86 is true). Solve this by initializing this table only once, namely together with the ordinary not-arch specific tables. This also allows to reuse their AVOnce. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>	2020-11-24 11:35:03 +01:00
James Almer	1a35fffaf2	x86/cfhddsp: zero extend int arguments if taken from stack, they may have garbage in the upper bits otherwise. Also, there are only 8 arguments, so don't attempt to load 11. Fixes SIGSEGV crashes in some targets. Reviewed-by: durandal_1707 Signed-off-by: James Almer <jamrial@gmail.com>	2020-08-28 20:09:25 -03:00
Paul B Mahol	4aac742505	avcodec/x86/cfhddsp: try to fix build on x32	2020-08-26 23:39:58 +02:00
Paul B Mahol	389cc142fb	avcodec/cfhd: add x86 SIMD Overall speed changes for 1920x1080, yuv422p10le, 60fps from: 0.19x to 0.343x	2020-08-26 21:13:38 +02:00
James Almer	2c844c9828	x86/h264_deblock: fix warning about trailing empty parameter Fixes part of ticket #8771 Signed-off-by: James Almer <jamrial@gmail.com>	2020-07-12 11:30:23 -03:00
Martin Storsjö	353aecbb28	pixblockdsp, avdct: Add get_pixels_unaligned Use this in vf_spp.c, where the get_pixels operation is done on unaligned source addresses. Hook up the x86 (mmx and sse) versions of get_pixels to this function pointer, as those implementations seem to support unaligned use. This fixes fate-filter-spp on armv7. Signed-off-by: Martin Storsjö <martin@martin.st>	2020-05-13 13:20:08 +03:00
Linjie Fu	8b8492452d	lavc/x86/hevc_add_res: Fix coeff overflow in ADD_RES_SSE_16_32_8 Fix overflow for coeff -32768 in function ADD_RES_SSE_16_32_8 with no performance drop.(SSE2/AVX/AVX2) ./checkasm --test=hevc_add_res --bench Mainline: - hevc_add_res.add_residual [OK] hevc_add_res_32x32_8_sse2: 127.5 hevc_add_res_32x32_8_avx: 127.0 hevc_add_res_32x32_8_avx2: 86.5 Add overflow test case: - hevc_add_res.add_residual [FAILED] After: - hevc_add_res.add_residual [OK] hevc_add_res_32x32_8_sse2: 126.8 hevc_add_res_32x32_8_avx: 128.3 hevc_add_res_32x32_8_avx2: 86.8 Signed-off-by: Xu Guangxin <guangxin.xu@intel.com> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2020-03-27 10:57:40 +01:00
Linjie Fu	e9abef437f	lavc/x86/hevc_add_res: Fix overflow in ADD_RES_SSE_8_8 Fix overflow for coeff -32768 in function ADD_RES_SSE_8_8 with no performance drop. ./checkasm --test=hevc_add_res --bench Mainline: - hevc_add_res.add_residual [OK] hevc_add_res_8x8_8_sse2: 15.5 Add overflow test case: - hevc_add_res.add_residual [FAILED] After: - hevc_add_res.add_residual [OK] hevc_add_res_8x8_8_sse2: 15.5 Signed-off-by: Xu Guangxin <guangxin.xu@intel.com> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2020-03-27 10:57:40 +01:00
Linjie Fu	0da14ed09e	lavc/x86/hevc_add_res: Fix overflow in ADD_RES_MMX_4_8 Fix overflow for coeff -32768 in function ADD_RES_MMX_4_8 with no performance drop. ./checkasm --test=hevc_add_res --bench Mainline: - hevc_add_res.add_residual [OK] hevc_add_res_4x4_8_mmxext: 15.5 Add overflow test case: - hevc_add_res.add_residual [FAILED] After: - hevc_add_res.add_residual [OK] hevc_add_res_4x4_8_mmxext: 15.0 Signed-off-by: Xu Guangxin <guangxin.xu@intel.com> Signed-off-by: Linjie Fu <linjie.fu@intel.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	2020-03-27 10:57:40 +01:00
Michael Niedermayer	24af459d1e	avcodec/x86/diracdsp: Fix high bits on Windows x86_64 Found-by: james	2020-01-31 00:04:22 +01:00
Michael Niedermayer	0694b60b7b	avcodec/x86/diracdsp: Fix incorrect src addressing in dequant_subband_32() Fixes: Segfault (not reproducable with asm, which made this hard to debug) Fixes: decoding errors Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2020-01-30 18:47:21 +01:00
Peter Ross	fd17218558	vp4: prevent unaligned memory access in loop filter VP4 applies a loop filter during motion compensation, so the block offset will often be unaligned. This produces a bus error on some platforms, namely ARMv7 NEON. This patch adds an unaligned version of the loop filter function pointer to VP3DSPContext. Reported-by: Mike Melanson <mike@multimedia.cx> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2019-10-30 10:06:38 +01:00
James Almer	1faedb9a11	x85/opusdsp: enable the functions on all FMA3 CPUs It's not using ymm registers, so limiting it to CPUs with fast AVX is not necessary. Signed-off-by: James Almer <jamrial@gmail.com>	2019-09-11 20:50:45 -03:00
James Almer	80444e23ac	x86/opusdps: clear the high bits from some gprs Fixes checkasm on systems like win64. Reviewed-by: Lynne Signed-off-by: James Almer <jamrial@gmail.com>	2019-09-11 20:42:31 -03:00
James Almer	58d167bcd5	avcodec/Makefile: add missing pngdsp dependency to the lscr decoder Signed-off-by: James Almer <jamrial@gmail.com>	2019-05-14 16:47:56 -03:00
James Almer	b41d8ab2e6	x86/v210dec: use named registers Signed-off-by: James Almer <jamrial@gmail.com>	2019-05-03 01:20:18 -03:00
James Almer	abf1aa87ab	x86/v210dec: don't reserve more xmm regs than needed Prevents pointless register saving on win64 for the sse3 and avx versions of the function. Signed-off-by: James Almer <jamrial@gmail.com>	2019-05-03 01:09:50 -03:00
James Almer	b0e29357ba	x86/v210dec: remove duplicate load instruction Signed-off-by: James Almer <jamrial@gmail.com>	2019-05-03 01:08:34 -03:00
James Darnley	46f1718cd9	avcodec/x86/v210: fix operands of vpblendd used in new avx2 code Assembly failed when using yasm rather than nasm.	2019-05-02 21:20:54 +02:00
Michael Stoner	ebd6fb23c5	libavcodec Adding ff_v210_planar_unpack AVX2 Replaced VSHUFPS with VPBLENDD to relieve port 5 bottleneck AVX2 is 1.4x faster than AVX	2019-05-02 19:21:37 +02:00
Lynne	4b7166c9d5	x86/opusdsp: replace loads with shuffles Has a slight speedup. Can't be carried over to aarch64, since it has no shufps-like instruction. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2019-04-26 20:39:38 -03:00
Lynne	b43b8d337d	x86/opusdsp: fix WIN64 return value Signed-off-by: James Almer <jamrial@gmail.com>	2019-04-01 11:06:34 -03:00
Lynne	605e330310	x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis 58893 decicycles in deemphasis_c, 130548 runs, 524 skips 9475 decicycles in deemphasis_fma3, 130686 runs, 386 skips -> 6.21x speedup 24866 decicycles in postfilter_c, 65386 runs, 150 skips 5268 decicycles in postfilter_fma3, 65505 runs, 31 skips -> 4.72x speedup Total decoder speedup: ~14% Deemphasis SIMD based on the following unrolling: const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1; float state = coeff; for (int i = 0; i < len; i += 4) { y[0] = x[0] + c1*state; y[1] = x[1] + c2*state + c1*x[0]; y[2] = x[2] + c3*state + c1*x[1] + c2*x[0]; y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0]; state = y[3]; y += 4; x += 4; }	2019-04-01 00:22:00 +02:00
Lynne	5468c1d075	celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled The entire function was defined away before.	2019-03-31 23:36:43 +02:00
Lynne	4a2c651620	x86/opus_dsp: rename to celt_pvq It's only used in the encoder and in CELT's PVQ.	2019-03-31 23:35:00 +02:00
James Almer	d5d699ab6e	avcodec/h264dsp: change loop filter stride argument to ptrdiff_t	2019-02-20 15:27:43 -03:00
Martin Vignali	9a22e6fa1d	avcodec/proresdsp indent after prev commit	2018-12-02 12:55:35 +01:00
Martin Vignali	c097a32e93	avcodec/proresdec : rename dsp part for 10b and check dspinit for supported bits per raw sample based on patch by Kieran Kunhya	2018-12-02 12:55:31 +01:00
Rostislav Pehlivanov	29eb1c51d7	mdct15: simplify x86 exptab permutation Removes an unneeded copy and does the 5-point permute in-place. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	2018-05-07 23:44:40 +01:00
Rostislav Pehlivanov	a72d0fb973	mdct15: simplify the fft15 x86 SIMD Saves 1 gpr and 2 instructions and simplifies the macros a bit. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	2018-05-07 23:27:41 +01:00
Kieran Kunhya	f9d3841ae6	mpeg4video: Add support for MPEG-4 Simple Studio Profile. This is a profile supporting > 8-bit video and has a higher quality DCT	2018-04-02 13:06:23 +01:00
Aurelien Jacobs	f1e490b1ad	sbcenc: add MMX optimizations This was originally based on libsbc, and was fully integrated into ffmpeg. Rough speed test: C version: speed= 592x MMX version: speed= 785x	2018-03-07 22:26:53 +01:00

2474 Commits