This macro takes care of rotation, swizzling, integer conversion and
normalization automatically. I found the performance impact to be
nonexistant for superxbr and debanding, although rotation *did* have an
impact due to the extra matrix multiplication. (So it gets skipped where
possible)
All of the internal hooks have been rewritten to use this new mechanism,
and the prescaler hooks have finally been separated from each other.
This also means the prescale FBO kludge is no longer required.
This fixes image corruption for image formats like 0bgr, and also fixes
prescaling under rotation. (As well as other user hooks that have
orientation-dependent access)
The "raw" attributes (tex, tex_pos, pixel_size) are still un-rotated, in
case something needs them, but ideally the hooks should be rewritten to
use the new API as much as possible. The hooked texture has been renamed
from just NAME to NAME_raw to make script authors notice the change (and
also deemphasize direct texture access).
This is also a step towards getting rid of the use_integer pass.
These are "sequence points" where the image could be rendered out to an
FBO, hooked, and re-loaded if any such hook exists. This is perfect for
things like the current user shaders system, as well as optional effects
like unsharp masking.
Note that since we have to pick *some* FBO to store the optionally
hooked texture, we just store it in an array indexed by an increasing
counter. Since we only ever store as many as MAX_TEXTURE_HOOKS + all
internal hook points entries, this is guaranteed to be enough space.
This commit also removes some of the now unused FBOs.
Remove non-texture_rg compatibility from LUT sampling. OpenGL without
texture_rg support will always trigger dumb-mode, and dumb-mode does not
use LUTs. It used not to, and that was when this made sense.
Since what we're doing is a linear blend of the four colors, we can just
do it for free by using GPU sampling.
This requires significantly fewer texture fetches and calculations to
compute the final color, making it much more efficient. The code is also
much shorter and simpler.
This commit refactors the 3DLUT loading mechanism to build the 3DLUT
against the original source characteristics of the file. This allows us,
among other things, to use a real BT.1886 profile for the source. This
also allows us to actually use perceptual mappings. Finally, this
reduces errors on standard gamut displays (where the previous 3DLUT
target of BT.2020 was unreasonably wide).
This also improves the overall accuracy of the 3DLUT due to eliminating
rounding errors where possible, and allows for more accurate use of
LUT-based ICC profiles.
The current code is somewhat more ugly than necessary, because the idea
was to implement this commit in a working state first, and then maybe
refactor the profile loading mechanism in a later commit.
Fixes#2815.
This is a pretty major rewrite of the internal texture binding
mechanic, which makes it more flexible.
In general, the difference between the old and current approaches is
that now, all texture description is held in a struct img_tex and only
explicitly bound with pass_bind. (Once bound, a texture unit is assumed
to be set in stone and no longer tied to the img_tex)
This approach makes the code inside pass_read_video significantly more
flexible and cuts down on the number of weird special cases and
spaghetti logic.
It also has some improvements, e.g. cutting down greatly on the number
of unnecessary conversion passes inside pass_read_video (which was
previously mostly done to cope with the fact that the alternative would
have resulted in a combinatorial explosion of code complexity).
Some other notable changes (and potential improvements):
- texture expansion is now *always* handled in pass_read_video, and the
colormatrix never does this anymore. (Which means the code could
probably be removed from the colormatrix generation logic, modulo some
other VOs)
- struct fbo_tex now stores both its "physical" and "logical"
(configured) size, which cuts down on the amount of width/height
baggage on some function calls
- vo_opengl can now technically support textures with different bit
depths (e.g. 10 bit luma, 8 bit chroma) - but the APIs it queries
inside img_format.c doesn't export this (nor does ffmpeg support it,
really) so the status quo of using the same tex_mul for all planes is
kept.
- dumb_mode is now only needed because of the indirect_fbo being in the
main rendering pipeline. If we reintroduce p->use_indirect and thread
a transform through the entire program this could be skipped where
unnecessary, allowing for the removal of dumb_mode. But I'm not sure
how to do this in a clean way. (Which is part of why it got introduced
to begin with)
- It would be trivial to resurrect source-shader now (it would just be
one extra 'if' inside pass_read_video).
Why was this done so stupidly, with so many complicated special cases,
before? Declare it once so the shader bits don't have to figure out where
and when to do so themselves.
Do this to make the license situation less confusing.
This change should be of no consequence, since LGPL is compatible with
GPL anyway, and making it LGPL-only does not restrict the use with GPL
code.
Additionally, the wording implies that this is allowed, and that we can
just remove the GPL part.
The recent LUT adjustment changes broke interpolation.
The concatenation of the shader stages is a bit messy, and it seems like
sampler_prelude is not a good place to add this macro. Always add the
macro to every shader instead. (While this doesn't seem too elegant,
this isn't too inelegant either, and goes these problems out of the
way.)
Define a macro to correct the coordinate for lookup texture. Cache
the corrected coordinate for 1D filter and use mix() to minimize the
performance impact.
If the sampling point is placed diagonally, the radius difference
could be as large as sqrt(2.0). And a loosened check with (radius - 1)
would potentially include pixels out of the range.
Fix the check to handle those corner case properly to avoid
unnecessary texture lookup and improve the performance a bit.
Polar scalers use 1D textures, because they're slightly faster on some
GPUs than 2D textures. But 2D textures work too, so add support for
them.
Allows using these scalers with ANGLE.
Just like commit f9a2fc59. There are probably some more such cases.
The vec2 constructor calls are probably fine, but don't bother with
confusing inconsistencies.
It's great that the new algorithm supports multiple placebo iterations
and all, but it's really not necessary and hurts performance in the
general case for the sake of the 0.1% that actually pause the screen
and look for minute differences.
Signed-off-by: wm4 <wm4@nowhere>
This turns the old scalers (inherited from MPlayer) into a pre-
processing step (after color conversion and before scaling). The code
for the "sharpen5" scaler is reused for this.
The main reason MPlayer implemented this as scalers was perhaps because
FBOs were too expensive, and making it a scaler allowed to implement
this in 1 pass. But unsharp masking is not really a scaler, and I would
guess the result is more like combining bilinear scaling and unsharp
masking.
2 things are being stupid here: Apple for requiring rectangle textures
with their IOSurface interop for no reason, and OpenGL having a
different sampler type for rectangle textures.
The removal of source-shader is a side effect, since this effectively
replaces it - and the video-reading code has been significantly
restructured to make more sense and be more readable.
This means users no longer have to constantly download and maintain a
separate deband.glsl installation alongside mpv, which was the only real
use case for source-shader that we found either way.
This is mostly to cut down somewhat on the amount of code bloat in
video.c by moving out helper functions (including scaler kernels and
color management routines) to a separate file.
It would certainly be possible to move out more functions (eg. dithering
or CMS code) with some extra effort/refactoring, but this is a start.
Signed-off-by: wm4 <wm4@nowhere>