videolan/x264 - x264

Commit Graph

Author	SHA1	Message	Date
Ziemowit Zabawa	d8debbc6dd	Fix typos with codespell tool This includes fixing convert_method_to_flag so it recognizes "gauss" parameter properly instead of silently defaulting to bicubic.	2021-09-29 21:11:16 +00:00
Anton Mitrofanov	8e5e8340f0	Bump dates to 2021	2021-01-24 16:38:34 +03:00
Anton Mitrofanov	33f9e14746	Fix warning: comparison of integers of different signs [-Wsign-compare]	2020-04-09 15:36:22 +03:00
Anton Mitrofanov	04e6c65e6b	Bump dates to 2020	2020-02-29 22:02:01 +03:00
Henrik Gramner	ec1d32302d	Bump dates to 2019	2019-03-06 22:45:52 +03:00
Henrik Gramner	ca5408b13c	Bump dates to 2018	2018-01-17 18:31:04 +01:00
Vittorio Giovara	71ed44c731	Unify 8-bit and 10-bit CLI and libraries Add 'i_bitdepth' to x264_param_t with the corresponding '--output-depth' CLI option to set the bit depth at runtime. Drop the 'x264_bit_depth' global variable. Rather than hardcoding it to an incorrect value, it's preferable to induce a linking failure. If applications rely on this symbol this will make it more obvious where the problem is. Add Makefile rules that compile modules with different bit depths. Assembly on x86 is prefixed with the 'private_prefix' define, while all other archs modify their function prefix internally. Templatize the main C library, x86/x86_64 assembly, ARM assembly, AARCH64 assembly, PowerPC assembly, and MIPS assembly. The depth and cache CLI filters heavily depend on bit depth size, so they need to be duplicated for each value. This means having to rename these filters, and adjust the callers to use the right version. Unfortunately the threaded input CLI module inherits a common.h dependency (input/frame -> common/threadpool -> common/frame -> common/common) which is extremely complicated to address in a sensible way. Instead duplicate the module and select the appropriate one at run time. Each bitdepth needs different checkasm compilation rules, so split the main checkasm target into two executables.	2017-12-24 23:47:24 +03:00
Vittorio Giovara	8f2437d333	Drop the x264 prefix from static functions and variables	2017-12-24 23:11:30 +03:00
Vittorio Giovara	1d2420981a	arm/aarch64: Correctly prefix integral function symbols	2017-01-21 14:10:37 +01:00
Henrik Gramner	c7a2e327be	Bump dates to 2017	2017-01-21 14:10:37 +01:00
Anton Mitrofanov	b2b39dae0b	Cosmetics Also make x264_weighted_reference_duplicate() static.	2016-12-01 18:00:07 +01:00
Janne Grunau	5caef139cf	arm/aarch64: use plane_copy wrapper macros Move the macros to common/mc.h to share them across all architectures. Fixes possible buffer overreads if the width of the user supplied frames is not a multiple of 16. Reported-by: Kirill Batuzov <batuzovk@ispras.ru>	2016-09-17 15:10:14 +02:00
Janne Grunau	14a58532fe	arm: Add asm for mbtree fixed point conversion 7-8 times faster on a cortex-a53 vs. gcc-5.3. mbtree_fix8_pack_c: 44114 mbtree_fix8_pack_neon: 5805 mbtree_fix8_unpack_c: 38924 mbtree_fix8_unpack_neon: 4870	2016-06-13 22:07:00 +02:00
Henrik Gramner	d23d186552	Bump dates to 2016	2016-01-17 00:30:13 +01:00
Janne Grunau	424534537a	arm: do not fill mc_weight*_neon tabs for HIGH_BIT_DEPTH The asm is only for 8-bit and function prototypes reflect that. Avoids numerous warnings with --bit-depth=9/10.	2015-12-20 18:40:11 +01:00
Martin Storsjö	6f04b14687	arm: Implement x264_mbtree_propagate_{cost, list}_neon The cost function could be simplified to avoid having to clobber q4/q5, but this requires reordering instructions which increase the total runtime. checkasm timing Cortex-A7 A8 A9 mbtree_propagate_cost_c 63702 155835 62829 mbtree_propagate_cost_neon 17199 10454 11106 mbtree_propagate_list_c 104203 108949 84532 mbtree_propagate_list_neon 82035 78348 60410	2015-10-11 18:44:54 +02:00
Martin Storsjö	5db8b6b93a	arm: Implement x264_plane_copy_neon checkasm timing Cortex-A7 A8 A9 plane_copy_c 13124 10925 9106 plane_copy_neon 7349 5103 8945	2015-10-11 18:44:54 +02:00
Martin Storsjö	5265b927b0	arm: Implement integral_init4/8h/v_neon checkasm timing Cortex-A7 A8 A9 integral_init4h_c 10466 8590 6161 integral_init4h_neon 3021 1494 1800 integral_init4v_c 16250 13590 13628 integral_init4v_neon 3473 2073 3291 integral_init8h_c 10100 8275 5705 integral_init8h_neon 4403 2344 2751 integral_init8v_c 6403 4632 4999 integral_init8v_neon 1184 783 1306	2015-10-11 18:44:54 +02:00
Yu Xiaolei	627f891c57	NV21 input support Eliminates an extra copy when encoding Android camera preview images. Checkasm test by Janne Grunau. ARM assembly with improvements from Janne Grunau.	2015-07-25 22:52:54 +02:00
Anton Mitrofanov	d7ccd89f1b	Bump dates to 2015	2015-02-23 13:34:44 +03:00
Anton Mitrofanov	30140b34b8	Fix bugs/typos in motion compensation and cache_load Didn't affect output due to the incorrect values either not being used in the code path or producing equal results compared to the correct values. Also deduplicate hpel_ref arrays.	2014-12-13 00:34:15 +01:00
Janne Grunau	fadc4045f9	arm: use the weight_fn_t typedef for mc weight function arrays	2014-04-22 15:37:50 -07:00
Janne Grunau	644c396be9	arm: correct x264_mc_chroma_neon function declaration	2014-04-22 15:37:50 -07:00
Janne Grunau	2e96c571b8	arm: x264_store_interleave_chroma_neon store_interleave_chroma_c: 4036 store_interleave_chroma_neon: 1043	2014-04-22 15:37:49 -07:00
Janne Grunau	1576e51e52	arm: x264_plane_copy_interleave_neon plane_copy_interleave_c: 40285 plane_copy_interleave_neon: 10137	2014-04-22 15:37:49 -07:00
Janne Grunau	0016dec270	arm: x264_plane_copy_deinterleave_rgb_neon plane_copy_deinterleave_rgb_c: 31543 plane_copy_deinterleave_rgb_neon: 8312	2014-04-22 15:37:49 -07:00
Janne Grunau	5e0ca9aa4e	arm: load_deinterleave_chroma_f{dec,enc}_neon load_deinterleave_chroma_fdec_c: 4055 load_deinterleave_chroma_fdec_neon: 995 load_deinterleave_chroma_fenc_c: 4071 load_deinterleave_chroma_fenc_neon: 992	2014-04-22 15:37:48 -07:00
Janne Grunau	c9a5ae0d21	arm: x264_plane_copy_deinterleave_neon plane_copy_deinterleave_c: 42988 plane_copy_deinterleave_neon: 10184	2014-04-22 15:37:48 -07:00
Janne Grunau	2794ba5bb0	arm: add missing macro instantiation for x264_pixel_avg_4x16_neon checkasm --bench on a cortex-a9: avg_4x16_c: 8910 avg_4x16_neon: 2091	2014-04-22 15:37:48 -07:00
Henrik Gramner	807aeaaae7	Bump dates to 2014 Also update AUTHORS file and my e-mail address in the headers of various files.	2014-01-08 11:15:45 -08:00
Stefan Groenroos	3a8baa0ec6	ARM: update NEON mc_chroma to work with NV12 and re-enable it Up to 10-15% faster overall.	2013-02-26 15:13:17 -08:00
Loren Merritt	732b072ae2	Bump dates to 2013	2013-01-08 16:01:32 -08:00
Henrik Gramner	3131a19cab	Fix incorrect zero-extension assumptions in x86_64 asm Some x264 asm assumed that the high 32 bits of registers containing "int" values would be zero. This is almost always the case, and it seems to work with gcc, but it is not guaranteed by the ABI. As a result, it breaks with some other compilers, like Clang, that take advantage of this in optimizations. Accordingly, fix all x86 code by using intptr_t instead of int or using movsxd where necessary. Also add checkasm hack to detect when assembly functions incorrectly assume that 32-bit integers are zero-extended to 64-bit.	2012-03-06 10:37:53 -08:00
Hii	27a7b05b83	Bump dates to 2012	2012-02-04 07:18:13 -08:00
Fiona Glaser	9bbfc30284	Split prefetch_fenc between colorspaces Add 4:2:2 version.	2011-10-21 17:22:56 -07:00
Sean McGovern	ee9bc136e9	Bump dates to 2011	2011-01-25 12:16:24 -08:00
Oskar Arvidsson	1382552b8c	Convert X264_HIGH_BIT_DEPTH to HIGH_BIT_DEPTH Less verbose.	2010-11-19 09:47:36 -08:00
Fiona Glaser	213a99d070	Update source file headers Update dates, improve file descriptions, make things more consistent. Also add information about commercial licensing.	2010-09-18 01:30:37 -07:00
Loren Merritt	387828eda8	Convert x264 to use NV12 pixel format internally ~1% faster overall on Conroe, mostly due to improved cache locality. Also allows improved SIMD on some chroma functions (e.g. deblock). This change also extends the API to allow direct NV12 input, which should be a bit faster than YV12. This isn't currently used in the x264cli, as swscale does not have fast NV12 conversion routines, but it might be useful for other applications. Note this patch disables the chroma SIMD code for PPC and ARM until new versions are written.	2010-07-14 19:03:32 -07:00
Oskar Arvidsson	c91f43a4b0	Support for 9 and 10-bit encoding Output bit depth is specified at compile time via --bit-depth. There is currently almost no assembly code available for high-bit-depth modes, so encoding will be very slow. Input is still 8-bit only; this will change in the future. Note that very few H.264 decoders support >8 bit depth currently. Also note that the quantizer scale differs for higher bit depth. For example, for 10-bit, the quantizer (and crf) ranges from 0 to 63 instead of 0 to 51.	2010-07-04 14:47:33 -07:00
Henrik Gramner	8c02c79035	Shrink even more constant arrays	2010-05-16 22:51:12 -07:00
David Conrad	b46cec4f01	ARM NEON versions of weightp functions	2010-02-15 01:00:05 -08:00
David Conrad	aa48c1fbb7	Fix x264 compilation on Apple GCC Apple's GCC stupidly ignores the ARM ABI and doesn't give any stack alignment beyond 4.	2010-01-13 23:47:02 -05:00
David Conrad	094110915e	Fix weightp on ARM + PPC No ARM or PPC assembly yet though.	2009-11-08 20:21:52 -08:00
David Conrad	53a5772a35	Various ARM-related fixes Fix comment for mc_copy_neon. Fix memzero_aligned_neon prototype. Update NEON (i)dct_dc prototypes. Duplicate x86 behavior for global+hidden functions.	2009-11-08 20:21:47 -08:00
David Conrad	6bf21c631a	GSOC merge part 4: ARM NEON mc assembly functions prefetch, memcpy_aligned, memzero_aligned, avg, mc_luma, get_ref, mc_chroma, hpel_filter, frame_init_lowres	2009-08-24 06:00:28 -07:00

46 Commits