videolan/x264 - x264

Commit Graph

Author	SHA1	Message	Date
Hubert Mazur	8743a46d10	pixel: Add neon sa8d implementations for 10 bit Provide arm64 neon implementation for sa8d 16x8 and 16x16 functions for 10 bit depth. Benchmarks are shown below. sa8d_8x8_c: 2914 sa8d_8x8_neon: 608 sa8d_16x16_c: 11469 sa8d_16x16_neon: 2030 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	820fb5a7d8	pixel: Add neon satd implementations for 10 bit Provide arm64 neon implementation for satd 16x8 and 16x16 functions for 10 bit depth. Benchmarks are shown below. satd_16x8_c: 4268 satd_16x8_neon: 1493 satd_16x16_c: 8382 satd_16x16_neon: 2908 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	9927ac9ae0	Add neon pixel_var2 implementation for 10 bit Provide arm64 neon implementation for pixel_var2 function for 10 bit depth. Benchmarks are shown below. var2_8x8_c: 1988 var2_8x8_neon: 505 var2_8x16_c: 3800 var2_8x16_neon: 862 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	7ae0053807	Add neon pixel_var implementation for 10 bit Provide arm64 neon implementation for pixel_var function for 10 bit depth. Benchmarks are shown below. var_8x8_c: 757 var_8x8_neon: 342 var_8x16_c: 1431 var_8x16_neon: 582 var_16x16_c: 2721 var_16x16_neon: 767 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	a87a9f89eb	pixel: Add neon ssd_nv12 implementation for 10 bit Provide arm64 neon implementation for ssd_nv12 function for 10 bit depth. Benchmarks are shown below. ssd_nv12_c: 181441 ssd_nv12_neon: 29037 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	1b59a1f3ee	pixel: Add neon satd implementations for 10 bit Provide arm64 neon implementation for satd 8x8 and 8x16 functions for 10 bit depth. Benchmarks are shown below. satd_8x8_c: 2143 satd_8x8_neon: 812 satd_8x16_c: 4228 satd_8x16_neon: 1504 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Grzegorz Bernacki	1754f6b20c	pixel: Add neon satd implementations for 10 bit Provide arm64 neon implementation for satd functions for 10 bit depth. Benchmarks are shown below. satd_4x4_c: 858 satd_4x4_neon: 712 satd_4x8_c: 1834 satd_4x8_neon: 812 satd_4x16_c: 3677 satd_4x16_neon: 1149 satd_8x4_c: 1290 satd_8x4_neon: 427 Signed-off-by: Grzegorz Bernacki <gjb@semihalf.com> Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	8fd1e5f26d	pixel: Add neon ssd implementations for 10 bit Provide arm64 neon implementation for ssd functions for 10 bit depth. Benchmarks are shown below. ssd_4x4_c: 1466 ssd_4x4_neon: 240 ssd_4x8_c: 1918 ssd_4x8_neon: 482 ssd_4x16_c: 5258 ssd_4x16_neon: 1025 ssd_8x4_c: 1291 ssd_8x4_neon: 235 ssd_8x8_c: 2431 ssd_8x8_neon: 425 ssd_8x16_c: 4635 ssd_8x16_neon: 910 ssd_16x8_c: 4198 ssd_16x8_neon: 897 ssd_16x16_c: 8549 ssd_16x16_neon: 1907 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	90b3391ee6	pixel: Add neon asd8 implementations for 10 bit Provide arm64 neon implementation for asd8 function for 10 bit depth. Benchmarks are shown below. asd8_c: 4400 asd8_neon: 857 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	8a90ffa7d1	pixel: Add neon vsad implementations for 10 bit Provide arm64 neon implementation for vsad function for 10 bit depth. Benchmarks are shown below. vsad_c: 3599 vsad_neon: 392 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	3afe3c82bc	pixel: Add neon sad_x3 implementations for 10 bit Provide arm64 neon implementations for sad_x3 functions for 10 bit depth. Benchmarks are shown below. sad_x3_4x4_c: 710 sad_x3_4x4_neon: 286 sad_x3_4x8_c: 1422 sad_x3_4x8_neon: 430 sad_x3_8x4_c: 1350 sad_x3_8x4_neon: 269 sad_x3_8x8_c: 2851 sad_x3_8x8_neon: 440 sad_x3_8x16_c: 5597 sad_x3_8x16_neon: 734 sad_x3_16x8_c: 5414 sad_x3_16x8_neon: 722 sad_x3_16x16_c: 10729 sad_x3_16x16_neon: 1288 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:45:18 +00:00
Hubert Mazur	7882a3689b	quant: Add implementation for denoise_dct function Provide arm64 neon implementation for denoise_dct function for high bit depth. Benchmarks are shown below. denoise_dct_c: 2149 denoise_dct_neon: 585 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:31:51 +00:00
Hubert Mazur	01e056712c	quant: Add neon implementations of coeff_level_run Provide arm64 neon implementations for coeff_level_run functions for high bit depth. Benchmarks are shown below. coeff_level_run4_c: 135 coeff_level_run4_neon: 155 coeff_level_run8_c: 181 coeff_level_run8_neon: 182 coeff_level_run15_c: 296 coeff_level_run15_neon: 275 coeff_level_run16_c: 305 coeff_level_run16_neon: 264 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:31:51 +00:00
Hubert Mazur	03c0e9a900	quant: Add neon implementations of coeff_last Provide arm64 neon implementations for coeff_last functions for high bit depth. Benchmarks are shown below. coeff_last4_c: 79 coeff_last4_neon: 107 coeff_last8_c: 109 coeff_last8_neon: 154 coeff_last15_c: 161 coeff_last15_neon: 135 coeff_last16_c: 160 coeff_last16_neon: 132 coeff_last64_c: 782 coeff_last64_neon: 400 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:31:51 +00:00
Hubert Mazur	7c62a144ff	quant: Add implementation for decimate64 Provide neon arm64 implementation for decimate_score64 for high bit depth. Benchmarks are shown below. decimate_score64_c: 894 decimate_score64_neon: 431 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:31:51 +00:00
Hubert Mazur	66d000d2d6	quant: Add implementation for decimate functions Provide neon arm64 implementations for decimate score functions for high bit depth. Benchmarks are shown below. decimate_score15_c: 273 decimate_score15_neon: 205 decimate_score16_c: 284 decimate_score16_neon: 208 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:31:51 +00:00
Hubert Mazur	986dd1f3b7	quant: Add implementation for dequant Provide neon arm64 implementations for dequant functions for high bit depth. Benchmarks are shown below. dequant_4x4_cqm_c: 359 dequant_4x4_cqm_neon: 225 dequant_4x4_dc_cqm_c: 344 dequant_4x4_dc_cqm_neon: 208 dequant_4x4_dc_flat_c: 348 dequant_4x4_dc_flat_neon: 210 dequant_4x4_flat_c: 362 dequant_4x4_flat_neon: 227 dequant_8x8_cqm_c: 1526 dequant_8x8_cqm_neon: 517 dequant_8x8_flat_c: 1547 dequant_8x8_flat_neon: 520 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:31:51 +00:00
Hubert Mazur	b8ea87e05c	quant: Add neon implementation of quant functions Provide arm64 neon implementations of quant functions for high bit depth. Benchmarks are shown below. quant_2x2_dc_c: 217 quant_2x2_dc_neon: 275 quant_4x4_c: 482 quant_4x4_neon: 326 quant_4x4_dc_c: 428 quant_4x4_dc_neon: 348 quant_4x4x4_c: 2508 quant_4x4x4_neon: 1027 quant_8x8_c: 2439 quant_8x8_neon: 936 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:31:51 +00:00
Hubert Mazur	cc5c343f43	mc: Add arm64 neon implementation for hpel filter Provide neon optimized implementation for mc_plane_copy function from motion compensation family for 10 bit depth. Benchmark results are shown below. hpel_filter_c: 111495 hpel_filter_neon: 37849 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	e47bede829	mc: Add arm64 neon implementation for copy funcs Provide neon optimized implementation for mc_plane_copy function from motion compensation family for 10 bit depth. Benchmark results are shown below. plane_copy_c: 2955 plane_copy_neon: 2910 plane_copy_deinterleave_c: 24056 plane_copy_deinterleave_neon: 3625 plane_copy_deinterleave_rgb_c: 19928 plane_copy_deinterleave_rgb_neon: 3941 plane_copy_interleave_c: 24399 plane_copy_interleave_neon: 4723 plane_copy_swap_c: 32269 plane_copy_swap_neon: 3211 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	df179744c9	mc: Add arm64 neon implementation for store func Provide neon optimized implementation for mc_store_interleave function from motion compensation family for 10 bit depth. Benchmark results are shown below. load_deinterleave_chroma_fenc_c: 2910 load_deinterleave_chroma_fenc_neon: 430 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	68d712065f	mc: Add arm64 neon implementation for mc_load func Provide neon optimized implementation for mc_load_deinterleave function from motion compensation family for 10 bit depth. Benchmark results are shown below. load_deinterleave_chroma_fdec_c: 2936 load_deinterleave_chroma_fdec_neon: 422 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	0a810f4f58	mc: Add arm64 neon implementation for mc_lowres Provide neon optimized implementation for mc_lowres function from motion compensation family for 10 bit depth. Benchmark results are shown below. lowres_init_c: 149446 lowres_init_neon: 13172 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	25ef883299	mc: Add arm64 neon implementation for mc_integral Provide neon optimized implementation for mc_integral functions from motion compensation family for 10 bit depth. Benchmark results are shown below. integral_init4h_c: 2651 integral_init4h_neon: 550 integral_init4v_c: 4247 integral_init4v_neon: 612 integral_init8h_c: 2544 integral_init8h_neon: 1027 integral_init8v_c: 1996 integral_init8v_neon: 245 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	7ff0f978fa	mc: Add arm64 neon implementation for mc_chroma Provide neon optimized implementation for mc_chroma functions from motion compensation family for 10 bit depth. Benchmark results are shown below. mc_chroma_2x2_c: 700 mc_chroma_2x2_neon: 478 mc_chroma_2x4_c: 1300 mc_chroma_2x4_neon: 765 mc_chroma_4x2_c: 1229 mc_chroma_4x2_neon: 483 mc_chroma_4x4_c: 2383 mc_chroma_4x4_neon: 773 mc_chroma_4x8_c: 4662 mc_chroma_4x8_neon: 1319 mc_chroma_8x4_c: 4450 mc_chroma_8x4_neon: 940 mc_chroma_8x8_c: 8797 mc_chroma_8x8_neon: 1638 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	0876120871	mc: Move mc_luma and get_ref wrappers The mc_luma and get_ref wrappers were only defined for 8 bit depth. As all required 10 bit depth helper functions exist, move them out of the if scope and make them always defined regardless of the bit depth. Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	25d5baf43d	mc: Add arm64 neon implementation for mc_weight Provide neon optimized implementation for mc_weight functions from motion compensation family for 10 bit depth. Benchmark results are shown below. weight_w4_c: 4734 weight_w4_neon: 4165 weight_w8_c: 8930 weight_w8_neon: 1620 weight_w16_c: 16939 weight_w16_neon: 2729 weight_w20_c: 20721 weight_w20_neon: 3470 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	f0b0489f19	mc: Add arm64 neon implementation for mc_copy Provide neon optimized implementation for mc_copy functions from motion compensation family for 10 bit depth. Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	bb3d83dd02	mc: Add arm64 neon implementation for pixel_avg2 Provide neon optimized implementation for pixel_avg2 functions from motion compensation family for 10 bit depth. Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	13a2488815	mc: Add arm64 neon implementation for pixel_avg Provide neon optimized implementation for pixel_avg functions from motion compensation family for 10 bit depth. Checkasm benchmarks are shown below. avg_4x2_c: 703 avg_4x2_neon: 222 avg_4x4_c: 1405 avg_4x4_neon: 516 avg_4x8_c: 2759 avg_4x8_neon: 898 avg_4x16_c: 5808 avg_4x16_neon: 1776 avg_8x4_c: 2767 avg_8x4_neon: 412 avg_8x8_c: 5559 avg_8x8_neon: 841 avg_8x16_c: 11176 avg_8x16_neon: 1668 avg_16x8_c: 10493 avg_16x8_neon: 1504 avg_16x16_c: 21116 avg_16x16_neon: 2985 Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	ba45eba390	aarch64/mc-c: Unify pixel/uint8_t usage Previously some functions from motion compensation family used uint8_t, while the others pixel definition. Unify this and change every uint8_t usage to pixel. This commit is a prerequisite to 10 bit depth support. Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Hubert Mazur	249924eaf1	mc: Add initial support for 10 bit neon support Add if/else clause in files to control which code is used. Move generic function out of 8-bit depth scope to common one for both modes. Signed-off-by: Hubert Mazur <hum@semihalf.com>	2023-10-01 15:13:40 +00:00
Martin Husemann	834c5c92db	ppc: Add x264_cpu_detect() for NetBSD/macppc The altivec instruction set detection is very similar to FreeBSD and OpenBSD, but uses slightly different sysctl selectors.	2023-10-01 17:35:48 +03:00
Anton Mitrofanov	31e19f92f0	ppc: Fix compilation on unknown OS	2023-10-01 17:28:26 +03:00
Anton Mitrofanov	a8b68ebfaa	Improve qpfile parsing resiliency	2023-04-02 15:51:50 +03:00
Anton Mitrofanov	eaa68fad9e	Fix high bit depth deinterleave of YUYV or UYVY	2023-01-28 22:11:33 +00:00
Anton Mitrofanov	cd31a90ba5	Fix compilation of only 8 or 10 bit by a non-optimizing compiler	2023-01-28 21:45:30 +03:00
Anton Mitrofanov	17df75b32e	Bump dates to 2023	2023-01-28 16:37:02 +03:00
Roger Hardiman	941cae6d1d	Add Risc-V 64 bit	2022-12-17 16:09:25 +00:00
Hubert Mazur	416e3eb2b5	aarch64: pixel: add 10bits sad functions Provide routines for sad functions for high bit depth, i.e. 10 bits. Benchmarks run on AWS Graviton 2 instances. sad_4x4_c: 583 sad_4x4_neon: 273 sad_4x8_c: 1179 sad_4x8_neon: 366 sad_4x16_c: 2121 sad_4x16_neon: 550 sad_8x4_c: 924 sad_8x4_neon: 213 sad_8x8_c: 1711 sad_8x8_neon: 316 sad_8x16_c: 3505 sad_8x16_neon: 497 sad_16x8_c: 3070 sad_16x8_neon: 635 sad_16x16_c: 6113 sad_16x16_neon: 1118 Signed-off-by: Hubert Mazur <hum@semihalf.com> Signed-off-by: Grzegorz Bernacki <gjb@semihalf.com>	2022-10-28 07:11:57 +00:00
Anton Mitrofanov	b093bbe7d9	ffms: Fix crash if stream properties changes	2022-10-06 00:12:42 +03:00
Henrik Gramner	ed0f7a6340	cli: Use space instead of newline as autocomplete delimiter On most systems any whitespace is fine, but MSYS2 wants ASCII 0x20.	2022-10-01 17:21:11 +02:00
Sergei Trofimovich	e067ab0b53	Makefile: Add missing dependency of '.depend' on 'oclobj.h' Without the change parallel build occasionally fails as: $ make --shuffle ... gcc ... -c common/opencl.c -o common/opencl-8.o ... common/opencl.c:116:10: fatal error: common/oclobj.h: No such file or directory 116 \| #include "common/oclobj.h" \| ^~~~~~~~~~~~~~~~~ Best reproducible with `make --shuffle` mode: https://savannah.gnu.org/bugs/index.php?62100 This happens because `common/oclobj.h` is an autogenerated file. Normally `.depend` would contain this autogenerated dependency. But nothing forces `common/oclobj.h` to be generated. The change moves dependency of $(GENERATED) from final binaries to `.depend` itself: .depend: $(GENERATED)	2022-09-19 22:31:01 +01:00
Anton Mitrofanov	7628a5696f	Fix memory overread in mbtree	2022-09-05 19:32:40 +00:00
Anton Mitrofanov	8bdd8b8993	CI: Fix vlc-contrib linking on macOS Use pkg-config from the custom PATH.	2022-09-01 23:17:40 +03:00
Anton Mitrofanov	f7074e12d9	CI: Migrate build runners to macOS Monterey	2022-08-31 20:06:58 +03:00
Anton Mitrofanov	baee400fa9	CI: Fix vlc-contrib processing on macos Use perl for in-place editing because sed doesn't work with symlinks.	2022-06-02 01:31:50 +03:00
Stephen Hutchinson	bfc87b7a33	configure: Allow AviSynth+ on *BSD and Haiku	2022-02-22 18:03:57 +00:00
Anton Mitrofanov	95634be643	Fix build on MIPS with AviSynth+ support	2022-02-22 20:46:39 +03:00
Anton Mitrofanov	35fe20d1ba	Replace AvxSynth with AviSynth+ on POSIX systems	2022-02-21 21:57:05 +00:00
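
Many of the commit messages above list benchmark pairs in the form `name_c: N name_neon: M`. The sketch below shows one way to turn such a message into per-function ratios. It assumes that lower numbers are better and that each `_c` entry has a matching `_neon` entry. The helper name `speedups` and the `sample` string are hypothetical and not part of x264. The numbers in `sample` are copied from the vsad and coeff_last commits above.

```python
import re


def speedups(message: str) -> dict[str, float]:
    """Return {function: c_value / neon_value} for each *_c / *_neon pair.

    Hypothetical helper for reading the log above; not part of x264.
    """
    # Collect every "identifier: integer" pair found in the message text.
    counts = {name: int(value) for name, value in re.findall(r"(\w+): (\d+)", message)}
    result = {}
    for name, c_value in counts.items():
        if name.endswith("_c"):
            base = name[: -len("_c")]
            neon_value = counts.get(base + "_neon")
            if neon_value:  # skip missing or zero entries
                result[base] = c_value / neon_value
    return result


# Numbers copied from the vsad and coeff_last commit messages above.
sample = "vsad_c: 3599 vsad_neon: 392 coeff_last4_c: 79 coeff_last4_neon: 107"
for func, ratio in speedups(sample).items():
    print(f"{func}: {ratio:.2f}x")
```

A ratio below 1 marks a case where the listed neon number is higher than the C number, as with coeff_last4 in the sample.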

3191 Commits