Make it dynamic and never remove entries from it.
For now, this is better than possibly creating dangling pointers all
over the place in the gl_user_shader struct.
Untested.
This is now a configurable option, with tunable parameters.
I got inspiration for these algorithms off wikipedia. "simple" seems to
work pretty well, but not well enough to make it a reasonable default.
Some other notable candidates:
- Local functions (e.g. based on local contrast or gradient)
- Clamp with soft knee (linear up to a point)
- Mapping in CIE L*Ch. Map L smoothly, clamp C and h.
- Color appearance models
These will have to be implemented some other time.
Note that the parameter "peak_src" to pass_tone_map should, in
principle, be auto-detected from the SEI information of the source file
where available. This will also have to be implemented in a later
commit.
Due to the way color management in mpv worked historically, the subtitle
blending function was written to preserve the linearity of the input.
(In the past, the 3DLUT function required linear inputs)
Since the 3DLUT was refactored to accept the video color directly, the
re-linearization after blending is now virtually always redundant.
(Notably, it's also redundant when CMS is turned off, so this way of
writing the code stopped making sense a long time ago. It is a remnant
from before the pass_colormanage function was as flexible as it is now)
Currently, this relies on the user manually entering their display
brightness (since we have no way to detect this at runtime or from ICC
metadata). The default value of 250 was picked by looking at ~10 reviews
on tftcentral.co.uk and realizing they all come with around 250 cd/m^2
out of the box. (In addition, ITU-R Rec. BT.2022 supports this)
Since there is no metadata in FFmpeg to indicate usage of this TRC, the
only way to actually play HDR content currently is to set
``--vf=format=gamma=st2084``. (It could be guessed based on SEI, but
this is not implemented yet)
Incidentally, since SEI is ignored, it's currently assumed that all
content is scaled to 10,000 cd/m^2 (and hard-clipped where out of
range). I don't see this assumption changing much, though.
As an unfortunate consequence of the fact that we don't know the display
brightness, mixed with the fact that LittleCMS' parametric tone curves
are not flexible enough to support PQ, we have to build the 3DLUT
against gamma 2.2 if it's used. This might be a good thing, though,
consdering the PQ source space is probably not fantastic for
interpolation either way.
Partially addresses #2572.
This is much more readable than hard-coding magic IDs all over the file,
and removes the need for all the explanatory comments that were a direct
result of this.
This macro takes care of rotation, swizzling, integer conversion and
normalization automatically. I found the performance impact to be
nonexistant for superxbr and debanding, although rotation *did* have an
impact due to the extra matrix multiplication. (So it gets skipped where
possible)
All of the internal hooks have been rewritten to use this new mechanism,
and the prescaler hooks have finally been separated from each other.
This also means the prescale FBO kludge is no longer required.
This fixes image corruption for image formats like 0bgr, and also fixes
prescaling under rotation. (As well as other user hooks that have
orientation-dependent access)
The "raw" attributes (tex, tex_pos, pixel_size) are still un-rotated, in
case something needs them, but ideally the hooks should be rewritten to
use the new API as much as possible. The hooked texture has been renamed
from just NAME to NAME_raw to make script authors notice the change (and
also deemphasize direct texture access).
This is also a step towards getting rid of the use_integer pass.
This replaces the previous TRANSFORM by WIDTH, HEIGHT and OFFSET where
WIDTH and HEIGHT are RPN expressions. This allows for more fine-grained
control over the output size, and also makes sure that overwriting
existing textures works more cleanly.
(Also add some more useful bstr functions)
This allows users to add their own near-arbitrary hooks to the vo_opengl
processing pipeline, greatly enhancing the flexibility of user shaders.
This enables, among other things, user shaders such as CrossBilateral,
SuperRes, LumaSharpen and many more.
To make parsing the user shaders easier, shaders are now loaded as
bstrs, and the hooks are set up during video reconfig instead of on
every single frame.
These are "sequence points" where the image could be rendered out to an
FBO, hooked, and re-loaded if any such hook exists. This is perfect for
things like the current user shaders system, as well as optional effects
like unsharp masking.
Note that since we have to pick *some* FBO to store the optionally
hooked texture, we just store it in an array indexed by an increasing
counter. Since we only ever store as many as MAX_TEXTURE_HOOKS + all
internal hook points entries, this is guaranteed to be enough space.
This commit also removes some of the now unused FBOs.
The hook mechanism allows arbitrary processing stages to get dispatched
whenever certain named textures have been "finalized" by the code.
This is mostly meant to serve as a change that opens up the internal
processing in pass_read_video to user scripts, but as a side benefit all
of the code dealing with offsets and plane alignment and other such
confusing things has been rewritten.
This hook mechanism is powerful enough to cover the needs of both
debanding and prescaling (and more), so as a result they can be removed
from pass_read_video entirely and implemented through hooks.
Some avenues for optimization:
- The prescale hook is currently somewhat distributed code-wise. It might be
cleaner to split it into superxbr and NNEDI3 hooks which can each be
self-contained.
- It might be possible to move a large part of the hook code out to an
external file (including the hook definitions for debanding and
prescaling), which would be very much desired.
- Currently, some stages (chroma merging, integer conversion) will
*always* run even if unnecessary. I'm planning another series of
refactors (deferred img_tex) to allow dropping unnecessary shader
stages like these, but that's probably some ways away. In the meantime
it would be doable to re-add some of the logic to skip these stages if
we know we don't need them.
- More hook locations could be added (?)
Instead of rounding down, we round to the nearest float. This reduces
the maximum possible error introduced by this rounding operation. Also
clarify the comment.
Remove non-texture_rg compatibility from LUT sampling. OpenGL without
texture_rg support will always trigger dumb-mode, and dumb-mode does not
use LUTs. It used not to, and that was when this made sense.
Fixes broken colors with --vf=format=0bgr (but only if deband is
disabled).
0bgr means the first byte is padding, while the following three bytes
are bgr. From the vo_opengl perspective, it has 4 physical components
with 3 logical components. copy_img_tex() simply copied 3 components
from the physical representation, which means the last component (r) was
sliced off.
Fix this by not using p->color_swizzle for packed formats, and instead
let packed formats set the per-plane swizzle in texplane.swizzle. The
latter applies the swizzle as part of operation in copy_img_tex(), which
essentially moves physical to logical representations.
Unfortunately, debanding (and thus with opengl-hq defaults) is still
broken.
Make the find_plane_format function take a bit count.
This also makes the function's comment true for the first time the
function and its comment exist. (It was commented as taking bits, but
always took bytes.)
Only a few very low bit depth internal formats can be rendered to in
pure ES2 (GL_RGB565 is the "best" one).
Seems like the only potentially reasonable renderable formats in ES2
could be provided via GL_OES_rgb8_rgba8, or half-floats, so don't
bother with this at all.
Now that we know in advance whether an implementation should support a
specific format, we have more flexibility when determining which format
to use.
In particular, we can drop the roundabout ES logic.
I'm not sure if actually trying to create the FBO for probing still has
any value. But it might, so leave it for now.
Even if everything else is available, the need for first class arrays
breaks it. In theory we could fix this since we don't strictly need
them, but I guess it's not worth bothering.
Also give the misnamed have_mix variable a slightly better name.
This merges all knowledge about texture format into a central table.
Most of the work done here is actually identifying which formats exactly
are supported by OpenGL(ES) under which circumstances, and keeping this
information in the format table in a somewhat declarative way. (Although
only to the extend needed by mpv.) In particular, ES and float formats
are a horrible mess.
Again this is a big refactor that might cause regression on "obscure"
configurations.
This uses the normal autoprobing rules like "auto", but rejects anything
that isn't flagged as copying data back to system memory.
The chunk in command.c was dead code, so remove it instead of updating
it.
MSDN documents this as "Introduced in Windows 8.1.". I assume on Windows
7 this field will simply be ignored. Too bad for Windows 7 users.
Also, I'm not using D3D11_VIDEO_PROCESSOR_NOMINAL_RANGE_16_235 and
D3D11_VIDEO_PROCESSOR_NOMINAL_RANGE_0_255, because these are apparently
completely missing from the MinGW headers. (Such a damn pain.)
We don't have any reason to disable either. Both are loaded dynamically
at runtime anyway. There is also no reason why dxva2 would disappear
from libavcodec any time soon.
ANGLE is _really_ annoying to build. (Requires special toolchain and a
recent MSVC version.) This results in various issues with people
having trouble to build mpv against ANGLE (apparently linking it
against a prebuilt binary doesn't count, or using binaries from
potentially untrusted sources is not wanted).
Dynamically loading ANGLE is going to be a huge convenience. This commit
implements this, with special focus on keeping it source compatible to
a normal build with ANGLE linked at build-time.
In theory this was needed for the previous commit (but wasn't in
practice, since for hwdec the LUMINANCE_ALPHA mangling is not applied
anymore, and ANGLE uses RG textures in absence of GL_ARB_texture_rg for
whatever crazy reasons).
In practice this caused funky colors on OSX with the uyvy422 format,
which is also fixed in this commit.
This uses EGL_ANGLE_stream_producer_d3d_texture_nv12 and related
extensions to map the D3D textures coming from the hardware decoder
directly in GL.
In theory this would be trivial to achieve, but unfortunately ANGLE does
not have a mechanism to "import" D3D textures as GL textures. Instead,
an awkward mechanism via EGL_KHR_stream was implemented, which involves
at least 5 extensions and a lot of glue code. (Even worse than VAAPI EGL
interop, and very far from the simplicity you get on OSX.)
The ANGLE mechanism so far supports only the NV12 texture format, which
means 10 bit won't work. It also does not work in ES3 mode yet. For
these reasons, the "old" ID3D11VideoProcessor code is kept and used as a
fallback.
It forces es2 mode on ANGLE. Only useful for testing. Since the normal
"angle" backend already falls back to es2 if es3 does not work, this new
backend always exit when autoprobing it.
Rename gl_hwdec_driver.map_image to map_frame, and let it fill out a
struct gl_hwdec_frame describing the exact texture layout. This gives
more flexibility to what the hwdec interop can export. In particular, it
can export strange component orders/permutations and textures with
padded size. (The latter originating from cropped video.)
The way gl_hwdec_frame works is in the spirit of the rest of the
vo_opengl video processing code, which tends to put as much information
in immediate state (as part of the dataflow), instead of declaring it
globally. To some degree this duplicates the texplane and img_tex
structs, but until we somehow unify those, it's better to give the hwdec
state its own struct. The fact that changing the hwdec struct would
require changes and testing on at least 4 platform/GPU combinations
makes duplicating it almost a requirement to avoid pain later.
Make gl_hwdec_driver.reinit set the new image format and remove the
gl_hwdec.converted_imgfmt field.
Likewise, gl_hwdec.gl_texture_target is replaced with
gl_hwdec_plane.gl_target.
Split out a init_image_desc function from init_format. The latter is not
called in the hwdec case at all anymore. Setting up most of struct
texplane is also completely separate in the hwdec and normal cases.
video.c does not check whether the hwdec "mapped" image format is
supported. This should not really happen anyway, and if it does, the
hwdec interop backend must fail at creation time, so this is not an
issue.
The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and
renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or
documented where not), which makes the whole thing saner and cleaner. In
particular, thread-safety rules become less subtle and more obvious.
The new internal API makes it easier to support multiple OpenGL interop
backends. (Although this is not done yet, and it's not clear whether it
ever will.)
This also removes all the API-specific fields from mp_hwdec_ctx and
replaces them with a "ctx" field. For d3d in particular, we drop the
mp_d3d_ctx struct completely, and pass the interfaces directly.
Remove the emulation checks from vaapi.c and vdpau.c; they are
pointless, and the checks that matter are done on the VO layer.
The d3d hardware decoders might slightly change behavior: dxva2-copy
will not use the VO device anymore if the VO supports proper interop.
This pretty much assumes that any in such cases the VO will not use any
form of exclusive mode, which makes using the VO device in copy mode
unnecessary.
This is a big refactor. Some things may be untested and could be broken.
In order to honor the differences between OpenGL and Direct3D coordinate
systems, ANGLE uses a full FBO copy merely to flip the final frame
vertically. This can be avoided with the EGL_ANGLE_surface_orientation
extension.
I hope that this does what we expect it does: destroy the EGLDisplay
specific to our HDC. (Some implementations will terminate all EGL
contexts in the whole process.)
eglReleaseThread() merely calls eglMakeCurrent(0, 0, 0, 0), which is
not enough.
This commit also fixes the problem fixed with the previous commit,
but I think both changes are needed to make our API usage clean.
If ANGLE was probed before (but rejected), the ANGLE API can remain
"initialized", and eglGetCurrentDisplay() will return a non-NULL
EGLDisplay. Then if a native GL context is used, the ANGLE/EGL API will
then (apparently) keep working alongside native OpenGL API. Since GL
objects are just numbers, they'll simply fail to interact, and OpenGL
will get invalid textures. For some reason this will result in black
textures.
With VAAPI-EGL, something similar could happen in theory, but didn't in
practice.
Introduce hwdec-current and hwdec-interop properties.
Deprecate hwdec-detected, which never made a lot of sense, and which is
replaced by the new properties. hwdec-active also becomes useless, as
hwdec-current is a superset, so it's deprecated too (for now).
Cache misses are a normal and expected part of the operation of a cache.
It doesn't really make sense to show a user-visible warning for them.
To work-around this, just skip trying to open the cache if it doesn't
exist yet.
First of all, black point compensation is now on by default. This is
really rather harmless and only improves the result (where "improvement"
means "less black clipping").
Second, this adds an option to limit the ICC profile's contrast, which
helps for untagged matrix profiles that are implicitly black scaled even
in colorimetric intent. (Note that this relies on BPC being enabled to
work properly, which is why the two changes are tied together)
Third, this uses the LittleCMS built in black point estimator instead of
relying on the presence of accurate A2B tables. This also checks tags
and does some amounts of noise elimination.
If the option is unspecified and the profile is missing black point
information, print a warning instructing the user to set the option, and
fall back to 1000 otherwise.
The vdpau_mixer could fail to be recreated properly if preemption
occured at some point before playback initialization (like when using
--hwdec-preload and the opengl-cb API).
Normally, the vdpau_mixer was supposed to be marked invalid when the
components using it detect a preemption, e.g. in hwdec_vdpau.c. This one
didn't mark the vdpau_mixer as invalid if preemption was detected in
reinit(), only in map_image().
It's cleaner to detect preemption directly in the vdpau_mixer, which
ensures it's always recreated correctly.
Including initguid.h at the top of a file that uses references to GUIDs
causes the GUIDs to be declared globally with __declspec(selectany). The
'selectany' attribute tells the linker to consolidate multiple
definitions of each GUID, which would be great except that, in Cygwin
and MinGW GCC 6.1, this method of linking makes the GUIDs conflict with
the ones declared in libuuid.a.
Since initguid.h obsoletes libuuid.a in modern compilers that support
__declspec(selectany), add initguid.h to all files that use GUIDs and
remove libuuid.a from the build.
Fixes#3097