ffmpeg

Commit Graph

Author	SHA1	Message	Date
Timo Rothenpieler	416923346a	compat/cuda: switch from powf to __powf intrinsic The powf builtin causes crashes on older clang, so manually implement the (faster) intrinsic. The code it spawns is identical to that of nvcc.	2022-09-03 20:27:34 +02:00
Mohamed Khaled Mohamed	1a5cd79f51	avfilter: add bilateral_cuda filter GSoC 2022 Signed-off-by: Mohamed Khaled <mohamed.elbassiony00@eng-st.cu.edu.eg> Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2022-09-03 15:18:56 +02:00
Mohamed Khaled Mohamed	b1648150b2	avfilter: add chromakey_cuda filter GSoC'22 libavfilter/vf_chromakey_cuda.cu: the CUDA kernel for the filter libavfilter/vf_chromakey_cuda.c: the C side that calls the kernel and gets user input libavfilter/allfilters.c: added the filter to it libavfilter/Makefile: added the filter to it cuda/cuda_runtime.h: added two math CUDA functions that are used in the filter Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2022-07-10 17:20:15 +02:00
Timo Rothenpieler	acd3c101ef	compat/cuda: add __expf() implementation	2021-08-14 15:06:47 +02:00
Timo Rothenpieler	072788c46e	avfilter: compress CUDA PTX code if possible	2021-06-22 14:05:44 +02:00
Matt Oliver	b57037d663	compat/cuda: correct ushort4 to use ushort Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2021-02-22 17:03:52 +01:00
rcombs	eabf5e6d6b	All: update names in copyright headers	2021-01-20 01:02:56 -06:00
Timo Rothenpieler	cfdddec0c8	avfilter/scale_cuda: add lanczos algorithm	2020-11-04 01:43:21 +01:00
Timo Rothenpieler	f1d0f83712	avfilter/scale_cuda: add bicubic interpolation	2020-11-03 19:58:13 +01:00
rcombs	fb17ba86a8	compat/cuda/ptx2c: remove shell loop; fix BSD sed compat This fixes building on macOS, and improves build times dramatically there	2020-06-01 22:10:41 -05:00
Andreas Rheinhardt	b307d74fe6	compat/cuda: Change inclusion guards cuda_runtime.h as well as dynlink_loader.h used nonstandard inclusion guards with an AV_ prefix, although these files are not in an libav*/ path. So change the inclusion guards and adapt the ref file of the source fate test accordingly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2019-08-05 12:07:09 +02:00
Rodger Combs	01994c93db	build: add support for building CUDA files with clang This avoids using the CUDA SDK at all; instead, we provide a minimal reimplementation of the basic functionality that lavfi actually uses. It generates very similar code to what NVCC produces. The header contains no implementation code derived from the SDK. The function and type declarations are derived from the SDK only to the extent required to build a compatible implementation. This is generally accepted to qualify as fair use. Because this option does not require the proprietary SDK, it does not require the "--enable-nonfree" flag in configure. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2019-08-04 19:08:08 +02:00
Timo Rothenpieler	a6818d5bd0	compat/cuda/ptx2c: don't drop final newline	2019-05-24 19:23:39 +02:00
Timo Rothenpieler	27cbbbb33f	compat: remove in-tree NVidia headers External headers are no longer welcome in the ffmpeg codebase because they increase the maintenance burden. However, in the NVidia case the vanilla headers need some modifications to be usable in ffmpeg therefore we still provide them, but in a separate repository. The external headers can be found at https://git.videolan.org/?p=ffmpeg/nv-codec-headers.git Fate-source is updated because of the deleted files, and dynlink_loader.h license headers were updated with the standard FFmpeg headers. Signed-off-by: Marton Balint <cus@passwd.hu> Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2018-02-27 16:22:12 +01:00
Mark Thompson	1dc483a6f2	compat/cuda: Pass a logging context to load functions Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>	2017-11-20 15:47:05 +00:00
Ricardo Constantino	7fbc082577	compat/cuda/ptx2c: strip CR from each line Windows nvcc + cl.exe produce a .ctx file with CR+LF newlines which need to be stripped to work with gcc. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>	2017-08-30 11:20:34 +02:00
Timo Rothenpieler	f890a6d712	compat/cuda: make cuvidGetDecoderCaps optional	2017-06-01 12:39:06 +02:00
Timo Rothenpieler	88896c4619	compat/cuda/ptx2c: remove bashism and harden against arbitrary input	2017-05-15 18:54:38 +02:00
Timo Rothenpieler	f1ab71b046	build: add support for building .cu files via nvcc Original work by Yogender Gupta <ygupta@nvidia.com>	2017-05-15 11:46:50 +02:00
Timo Rothenpieler	f15129a44b	compat/cuda: fix cast warnings on windows	2017-05-09 18:38:30 +02:00
Timo Rothenpieler	17f63d98e6	compat/cuda: update cuvid/nvdec headers to Video Codec SDK 8.0.14 This raises the required minimum NVIDIA display driver versions: NVIDIA Linux display driver 378.13 or newer NVIDIA Windows display driver 378.66 or newer	2017-05-09 18:38:30 +02:00
Timo Rothenpieler	b27be563a8	compat/cuda: fix ulong size on cygwin	2017-03-01 12:08:34 +01:00
Philip Langdale	81147b5596	avcodec/cuvid: Add support for P010/P016 as an output surface format The nvidia 375.xx driver introduces support for P016 output surfaces, for 10bit and 12bit HEVC content (it's also the first driver to support hardware decoding of 12bit content). The cuvid api, as far as I can tell, only declares one output format that they appear to refer to as P016 in the driver strings. Of course, 10bit content in P016 is identical to P010, and it is useful for compatibility purposes to declare the format to be P010 to work with other components that only know how to consume P010 (and to avoid triggering swscale conversions that are lossy when they shouldn't be). For simplicity, this change does not maintain the previous ability to output dithered NV12 for 10/12 bit input video - the user will need to update their driver to decode such videos.	2016-11-22 10:09:30 -08:00
Timo Rothenpieler	d9ad18f3b4	avcodec/cuvid: use dynamically loaded CUDA/CUVID And remove the now obsolete compat headers.	2016-11-22 10:34:27 +01:00
Timo Rothenpieler	5c02d2827b	compat/cuda: add dynamic loader	2016-11-22 10:34:27 +01:00
Timo Rothenpieler	7904859fd8	compat/cuda: convert to unix line endings	2016-09-23 11:43:00 +02:00
Philip Langdale	843aff3cf7	cuvid: Use bundled headers We need to remove the dynlink fanciness and replace it with normal function prototypes and update the include paths and configure logic. We don't need to explicitly check for PICPARMS now - they're going to be there.	2016-09-22 18:38:51 -07:00
Philip Langdale	f59e10b0f4	cuvid: Add MIT licenced nvcuid headers from Video SDK 7.0 For unknown reasons, the only accurately descriptive version of cuviddec.h is in the Video SDK - the one in CUDA 7.5 lacks vp8 PICPARAMS and the vp9 struct definition is inaccurate. The CUDA 8 RC includes an ancient version of this file from many many years ago. However, the one in the Video SDK is modified to work through a dynamic link mechanism which we don't really want to use, so the next change will modify the files to just declare functions in the normal way. I've split the changes so it's clear to see what changed between the original files and ones that work for us.	2016-09-22 18:38:36 -07:00

28 Commits