1
mirror of https://github.com/mpv-player/mpv synced 2024-11-14 22:48:35 +01:00
Commit Graph

911 Commits

Author SHA1 Message Date
wm4
168ffbaf23 client API: more opengl_cb clarifications
Also fix a typo in ra_gl.c. Too greedy for a separate commit.
2017-08-07 19:24:25 +02:00
wm4
d45fbecbb5 vo_opengl: add another ra_format field to exclude insane formats
Generic description of pixel formats is hard. In this case, the Apple
special format for packed YUV could have been interpreted as a RGB
format with funny packing.
2017-08-07 19:18:58 +02:00
wm4
47ea771b7a vo_opengl: further GL API use separation
Move multiple GL-specific things from the renderer to other places like
vo_opengl.c, vo_opengl_cb.c, and ra_gl.c.

The vp_w/vp_h parameters to gl_video_resize() make no sense anymore, and
are implicitly part of struct fbodst.

Checking the main framebuffer depth is moved to vo_opengl.c. For
vo_opengl_cb.c it always assumes 8. The API user now has to override
this manually. The previous heuristic didn't make much sense anyway.

The only remaining dependency on GL is the hwdec stuff, which is harder
to change.
2017-08-07 19:17:28 +02:00
wm4
1adf324d8b vo_opengl: fix minor memory leak
Don't leak the buffer if glGetProgramBinary() fails.
2017-08-07 18:46:40 +02:00
Niklas Haas
bed421d483
vo_opengl: nuke ra_gl->first_run
Completely unnecessary, we can just update the uniforms immediately
after creating the program. In theory, for GLSL 4.20+, we could even
skip this, but oh well.
2017-08-07 17:47:04 +02:00
Niklas Haas
ecbb02148b vo_opengl: better formatting for enum RA_CAP
Also fixes an issue where 1 << 5 was used twice, probably because of the
terrible formatting obscuring this bug
2017-08-07 17:46:04 +02:00
Niklas Haas
01a40bb1ee vo_opengl: also support RA_VARTYPE_INT vertex attribs
No reason not to.
2017-08-07 17:46:04 +02:00
wm4
346ac1e09f vo_opengl: simplify mirroring and fix it if glBlitFramebuffer is used
The vp_w/vp_h variables and parameters were not really used anymore
(they were redundant with ra_tex w/h) - but vp_h was still used to
identify whether rendering should be done mirrored.

Simplify this by adding a fbodst struct (some bad naming), which
contains the render target texture, and some parameters how it should be
rendered to (for now only flipping). It would not be appropriate to make
this a member of ra_tex, so it's a separate struct.

Introduces a weird regression for the first frame rendered after
interpolation is toggled at runtime, but seems to work otherwise. This
is possibly due to the change that blit() now mirrors, instead of just
copying. (This is also why ra_fns.blit is changed.)

Fixes #4719.
2017-08-07 16:44:15 +02:00
wm4
41ee66d566 vo_opengl: drop pointless fbotex_init() function 2017-08-07 14:34:18 +02:00
Niklas Haas
9581fbe569 vo_opengl: generalize ra_buf to support other buffer objects
This allows us to integrate PBOs and SSBOs into the same abstraction,
with the potential to easily add UBOs if the need arises.
2017-08-07 12:46:30 +02:00
Niklas Haas
494aa0f651
vo_opengl: only mark frames as fresh if they contain a new image
When using dumb mode, we can actually redraw a frame without uploading
it. Marking this as fresh as well results in unpredictable pass
behavior, which is confusing and makes debugging harder. So mark it as a
redraw instead, in that case.
2017-08-06 02:51:11 +02:00
Niklas Haas
988d188d96
vo_opengl: drop ra_gl.h from shader_cache.c
Since the GL *gl is no longer needed for the timers, we can get rid of
the sc->gl dependency. This requires moving a utility function (which is
not GL-specific anyway) out of gl_utils.h and into utils.h
2017-08-06 00:10:22 +02:00
Niklas Haas
e5748e891f vo_opengl: measure pass_draw_osd as a whole
In the past, this always measured the per-shader execution times of the
individual OSD parts, which was thrown off because the shader was reused
anyway. (And apparently recording the OSD shader execution times was
removed completely, probably because of them being so unrealiably
anyway)

Since ra_timer no longer has the restriction of not allowing timers to
run concurrently, we can just wrap the entire OSD block inside a single
osd_timer now, and record that. (Technically, this can still be off when
using --blend-subtitles=video/yes and showing a full-screen OSD at the
same time. Maybe this can be done better?)
2017-08-06 00:10:20 +02:00
Niklas Haas
f2298f394e vo_opengl: move timers to struct ra
In order to prevent code duplication and keep the ra abstraction as
small as possible, `ra` only implements the actual timer queries,
it does not do pooling/averaging of the results. This is instead moved
to a ra-neutral struct timer_pool in utils.c.
2017-08-06 00:10:20 +02:00
wm4
56742ecdc9 vo_opengl: ra_gl: make getting GL ptr slightly less tedious 2017-08-05 17:09:25 +02:00
wm4
dddda6e4a5 vo_opengl: move GL state resetting to vo_opengl_cb
This code is pretty much for the sake of vo_opengl_cb API users. It
resets certain state that either the user or our code doesn't reset
correctly. This is somewhat outdated. With GL implicit state being
so awfully large, it seems more reasonable require that any code
restores the default state when returning to the caller. Some
exceptions are defined in opengl_cb.h.
2017-08-05 16:27:09 +02:00
wm4
333cae74ef vo_opengl: move shader handling to ra
Now all GL-specifics of shader compilation are abstracted through ra.
Of course we still have everything hardcoded to GLSL - that isn't going
to change.

Some things will probably change later - in particular, the way we pass
uniforms and textures to the shader. Currently, there is a confusing
mismatch between "primitive" uniforms like floats, and others like
textures.

Also, SSBOs are not abstracted yet.
2017-08-05 16:27:09 +02:00
wm4
f72a33d2cb vo_opengl: organize ra PBO flag slightly differently
Instead of having a mutable ra_tex field (and the only one), move the
flag to struct ra, since we have only 2 tex_upload user calls anyway,
and both want the same PBO behavior. (At first I considered making it
a RA_TEX_UPLOAD_ flag, but why bother. PBOs are a terribly GL-specific
thing, so we can't expect a reasonable abstraction of it anyway.)
2017-08-05 13:48:46 +02:00
wm4
dd096863fa vo_opengl: make OSD code use ra for textures
This requires a silly extension to ra_fns.tex_upload: since the OSD
texture can be much larger than the actual OSD image data to upload, a
mechanism for uploading only to a small part of the texture is needed.
Otherwise, we'd have to realloc/copy the data, just to pad it, and then
pay for uploading the padding too.

The RA_TEX_UPLOAD_DISCARD flag is not interpreted by GL (not sure how
you'd tell GL about this), but it clarifies the API and might be
helpful if we support other backend APIs in the future.
2017-08-05 13:44:30 +02:00
wm4
8dd4ae13ff vo_opengl: restore OSX "old" hwdec
Probably. Untested.
2017-08-05 13:09:05 +02:00
wm4
aac04c0d64 vo_opengl: split utils.c/h
Actually GL-specific parts go into gl_utils.c/h, the shader cache
(gl_sc*) into shader_cache.c/h.

No semantic changes of any kind, except that the VAO helper is made
public again as part of gl_utils.c (all while the goal for gl_utils.c
itself is to be included by GL-specific code).
2017-08-05 13:09:05 +02:00
wm4
fa4a1c4675 vo_opengl: always use GL_TRIANGLES for all primitives
Will make the ra layer _slightly_ simpler.
2017-08-05 13:09:05 +02:00
wm4
0206efa94a vo_opengl: pass ra objects during rendering instead of GL objects
Another "small" step towards removing GL dependencies from the renderer.
This commit generally passes ra_tex objects instead of GL FBO integer
IDs to various rendering functions. video.c still manually binds the
FBOs when calling shaders.

This also happens to fix a memory leak with output_fbo.
2017-08-05 13:09:05 +02:00
wm4
a796745fd2 vo_opengl: make fbotex helper use ra
Further work removing GL dependencies from the actual video renderer,
and moving them into ra backends.

Use of glInvalidateFramebuffer() falls away. I'd like to keep this, but
it's better to readd it once shader runs are in ra.
2017-08-05 13:09:05 +02:00
wm4
90b53fede6 vo_opengl: drop unused custom texture filter for FBO helper 2017-08-05 13:09:05 +02:00
Rostislav Pehlivanov
e406e81477 vo_opengl: always print when getting embedded ICC profile data
The printout in get_vid_profile() gets skipped if icc caching has
been enabled, so always print if an embedded ICC profile has been
provided.
2017-08-04 09:50:13 +01:00
Niklas Haas
fee6b287a5 vo_opengl: support embedded ICC profiles
This currently only works when using lcms-based color management
(--icc-profile-*).

In principle, we could also support using lcms even when the user has
not specified an ICC profile, by generating the profile against a fixed
reference (--target-prim/--target-trc) instead. I still might do that
some day, simply because 3dlut provides a higher quality conversion than
our simple gamut mapping does for stuff like BT.2020, and also because
it's now needed to enable embedded ICC profiles. But that would be a
separate change, so preserve the status quo for now.

(Besides, my opinion is still that you should be using an ICC profile if
you care about colors being accurate _at all_)
2017-08-03 21:48:25 +02:00
Niklas Haas
0f956f0929
vo_opengl: use GL_CLIENT_STORAGE_BIT for DR
mesa won't pick client storage unless this bit is set, and we
*absolutely* want to be using client storage for our DR PBOs.
Performance is shit on AMD otherwise. (Nvidia always uses client storage
for persistent coherent buffers whether you tell it it or not, probably
because it's way faster and nvidia doesn't trust users to figure that
out on their own)
2017-08-03 20:06:58 +02:00
wm4
7625bcc716 vo_opengl: remove unused ra_mapped_buffer.preferred_align field
It makes no sense to have this on an already created buffer.

If anything, the ra backend would have to export this as a global value
(e.g. struct ra field), so that whatever allocates the buffer can
account for the required alignment. Since this code is in vo_opengl.c in
the first place, and since GL doesn't dictate any special alignment
here, it doesn't make sense in the first place to export this. (Maybe
something like this will be required later.)
2017-08-03 18:59:43 +02:00
Niklas Haas
2bf094cd55
vo_opengl: don't hardcode texmap0 for polar compute
This was an oversight. The ID shouldn't be hard-coded here, so add it to
sampler_prelude instead.
2017-08-03 18:55:52 +02:00
Niklas Haas
5e89aed934 vo_opengl: don't precompute texcoord in global scope
Breaks on mesa for whatever reason... even though it doesn't generate a
GLSL shader compiler error

Shouldn't make a performance difference for us because we cache `pos`
anyway, and most compute shaders will probably cache all of their
samples to shmem. Might have to re-visit this when we have an actual use
case for repeated sampling inside CS though. (RAVU + anti-ringing is a
possible candidate for that)
2017-08-03 18:50:07 +02:00
Niklas Haas
83f3910398
vo_opengl: make compute shaders more flexible
This allows users to do their own custom sample writing, mainly meant to
address use cases such as RAVU. Also clean up the compute shader code a
bit.
2017-08-03 18:27:36 +02:00
wm4
e7d31d12be vo_opengl: add legend for texture format debug dump 2017-08-03 16:19:57 +02:00
wm4
1479c7bd0d vo_opengl: give special Apple name a more appropriate name
Or less appropriate, as some would argue. The new name is short for
"Apple YUV packed".

(This format is needed only for hardware decoding on rather old Apple
hardware, and a very annoying special case.)
2017-08-03 16:19:56 +02:00
wm4
ffe0526064 vo_opengl: simplify/fix user shader textures
This broke float textures, which were actually used by some shaders.
There were probably some other bugs as well.

Lots of code can be avoided by using ra_tex_params directly, so do that.

The main change is that COMPONENT/FORMAT are replaced by a single FORMAT
directive, which takes different parameters now. Due to the mess with
16/32 bit float textures, and because we want to support other APIs than
just GL in the future, it's not really clear how this should be handled,
and the nice component/type separation makes things actually harder. So
just jump the gun and use the ra_format.name names, which were
originally meant mostly for debugging. (This is probably something that
will be regretted later.)

Still only superficially tested, but seems to work.

Fixes #4708.
2017-08-03 16:19:49 +02:00
Niklas Haas
2bcf04a7bd
vo_opengl: fix constexprs on ANGLE
I hate GLES
2017-08-03 14:27:38 +02:00
Niklas Haas
8f484567fc vo_opengl: fix HLG OOTF inverse
Got the "sign" of the second multiplication wrong.
2017-08-03 14:26:35 +02:00
Niklas Haas
5e1e7d32e8
vo_opengl: generalize HDR tone mapping to gamut mapping
Since this code was already written for HDR, and is now per-channel
(because it works better for HDR as well), we can actually reuse this to
get very high quality gamut mapping without clipping. The only required
change is to move the tone mapping from before the gamut map to after
the gamut map. Additonally, we need to also account for changes in the
signal range as a result of applying the CMS when we compute ref_peak,
which is fortunately pretty easy because we only need to consider the
case of primaries mapping to themselves.

Since `HDR` no longer really makes sense as a label, rename it to
`--tone-mapping` in general. Also fits better with
`--tone-mapping-desat` etc.

Arguably we could also rename `--hdr-compute-peak`, but that option is
basically only useful for HDR content anyway because we don't need
information about the signal range for gamut mapping.

This (finally!) gives us reasonably high quality gamut mapping even in
the absence of an ICC profile / 3DLUT.
2017-08-03 12:46:57 +02:00
Niklas Haas
6074cfdfd4
vo_opengl: implement HLG OOTF inverse
Huge thanks to @rusxg for finding this solution, which was previously
believed not to exist. Of course, we still don't actually need it, but I
don't want to leave this half-implemented in case somebody does in the
future.
2017-08-03 12:05:37 +02:00
Alex Notes
bda32d99d7 cocoa: fix the support of multiple renderers (GPU switch)
So far, switching between integrated and discrete GPU would cause the
kernel to kill mpv due to an indecipherable buffer error. The technical
note TN2229 from Apple recommends to enable OpenGL Offline Renderers for
every Mac with more GPUs than displays to handle the switch between GPU.

By ordering the array from the least commonly rejected to the most,
we can sequentially remove PixelFormat attributes to fit the host.

Fixes #2371
2017-07-31 20:23:58 +02:00
wm4
53188a14bf vo_opengl: manage user shader textures with ra
Drops some features I guess, no idea if those were needed. Untested due
to lack of test cases.
2017-07-30 11:38:52 +02:00
wm4
5429dbf2a2 vo_opengl: fix dither texture filter
Should be GL_NEAREST, not GL_LINEAR.
2017-07-30 09:43:41 +02:00
wm4
ab1ffa1382 vo_opengl: manage ICC LUT texture via ra
Also move the capability check to gl_video_get_lut3d(), because it
seems more convenient (ra won't have a _CAP_EXT16).
2017-07-29 21:23:31 +02:00
wm4
37b7b32d61 vo_opengl: manage scaler LUT textures via ra
Also fix the RA_CAP_ bitmask nonsense.
2017-07-29 20:15:59 +02:00
wm4
8494fdadae vo_opengl: manage dither texture via ra
Also add some more helpers.

Fix the broken math.h include statement.

utils.c uses ra_gl.h internals, which it shouldn't, and which will be
removed again as soon as this code gets converted to ra fully.
2017-07-29 20:14:48 +02:00
wm4
0f9fcf0ed4 vo_opengl: do not use GL format conversion on texture upload
The dither texture data is created as a float array, but uploaded to a
texture with GL_R16 as internal format. We relied on GL to do the
conversion from float to uint16_t. Not all GL variants even support
this: GLES does not provide this conversion (one of the reasons why this
code has a float16 code path). Also, ra is not going to do this. So just
convert on the fly.

Still keep the float16 texture format fallback, because not all GLES
implementations provide GL_R16.

There is some possibility that we'll need to provide some kind of upload
conversion anyway for float->float16. We still rely on GL doing this
implicitly, and all GL variants support it, but with RA there might be
the need for explicit conversion. Even then, it might be best to reduce
the number of conversion cases. I'll worry about this later.
2017-07-29 20:12:43 +02:00
wm4
6fcc09ff3d vo_opengl: use ra_* for format negotiation too
Format handling via ra_* was added earlier, but the format negotiation
part was forgotten.

Actually move some aspects of it to ra_get_imgfmt_desc(). Also make sure
the unorm and float formats selected by the common format lookup
functions are linear filterable. (For OpenGL, this is implicitly
guaranteed, so it wasn't done before.) Whether these assumptions should
be checked/enforced in the ra code at all is a bit fuzzy, but with ra
being helper code only for the actual video renderer, it's probably
justified.
2017-07-29 20:11:51 +02:00
Niklas Haas
345bb193fe
vo_opengl: support loading custom user textures
Parsing the texture data as raw strings makes the textures the most
portable and self-contained. In order to facilitate different types of
shaders, the parse_user_shader interaction has been changed to instead
have it loop through blocks and call the passed functions for each valid
block parsed. This is more modular and also cleaner, with better code
separation.

Closes #4586.
2017-07-27 23:51:05 +02:00
Niklas Haas
f1af6e53f0 vo_opengl: slightly refactor user_shaders code
- Each struct tex_hook now stores multiple hooks, this allows us to
  avoid the awkward way of the current code has to add the same pass
  multiple times.

- As a consequence, SHADER_MAX_HOOKS was split up into SHADER_MAX_PASSES
  (number of tex_hooks) and SHADER_MAX_HOOKS (number of hooked textures
  per tex_hook), and both numbers decreased correspondingly.

- Instead of having a weird free() callback, we can just leverage
  talloc's recursive free behavior. The only user is the user shaders code
  anyway.
2017-07-27 23:45:17 +02:00
Niklas Haas
ea76f79e5d
vo_opengl: tone map on the maximum signal component
This actually makes sure we don't decolor due to clipping even when the
signal itself exceeds the luma by a significant factor, which was pretty
common for saturated blues (and to a lesser degree, reds) - most
noticeable in skies etc.

This prevents the turn-the-sky-cyan effect of mobius tone mapping, and
should also improve the other tone mapping modes in quality.
2017-07-27 09:40:12 +02:00
Niklas Haas
e1cc43182c
vo_opengl: fix mpgl_caps bit check
As pointed out by @bjin, this would match if _any_ of the reqs are set.
Need to test for explicit equality.
2017-07-27 00:38:54 +02:00
wm4
81851febc4 vo_opengl: start work on rendering API abstraction
This starts work on moving OpenGL-specific code out of the general
renderer code, so that we can support other other GPU APIs. This is in
a very early stage and it's only a proof of concept. It's unknown
whether this will succeed or result in other backends.

For now, the GL rendering API ("ra") and its only provider (ra_gl) does
texture creation/upload/destruction only. And it's used for the main
video texture only. All other code is still hardcoded to GL.

There is some duplication with ra_format and gl_format handling. In the
end, only the ra variants will be needed (plus the gl_format table of
course). For now, this is simpler, because for some reason lots of hwdec
code still requires the GL variants, and would have to be updated to
use the ra ones.

Currently, the video.c code accesses private ra_gl fields. In the end,
it should not do that of course, and it would not include ra_gl.h.

Probably adds bugs, but you can keep them.
2017-07-26 11:31:43 +02:00
Niklas Haas
5904eddb38
vo_opengl: describe the texture uploading mode
Be a bit more transparent here, which is especially helpful when people
are sending me screenshots of stats pages.
2017-07-26 02:42:23 +02:00
Niklas Haas
b31020b193
vo_opengl: check against shmem limits
The radius check was not strict enough, especially not for all
platforms. To fix this, actually check the hardware capabilities instead
of relying on a hard-coded maximum radius.
2017-07-26 01:54:33 +02:00
James Ross-Gowan
9875f14ad4 vo_opengl: fix image uniforms for older OpenGL
This explicitly enables the GL_ARB_shader_image_load_store extension,
which seems to fix compute shaders for Intel/GL 3.0.
2017-07-26 08:02:03 +10:00
Niklas Haas
49a648447f vo_opengl: cosmetic change 2017-07-25 20:14:03 +02:00
Niklas Haas
f2809e19f0 vo_opengl: add PRINTF_ATTRIBUTE to gl_sc_ssbo
Doesn't uncover any bugs, but apparently we're getting in the habit of
this anyway.
2017-07-25 06:35:10 +02:00
Niklas Haas
62de84cbe3
vo_opengl: kill off FBOTEX_COMPUTE again
The textures not having an FBO actually caused regressions when trying
to render the subtitles on top of this texture (--blend-subtitles),
which still relied on an FBO.

So just kill off the logic entirely. Why worry about a single FBO wasted
when we're allocating like 10 anyway.

Fixes #4657.
2017-07-25 06:32:29 +02:00
Niklas Haas
d099e037ef
vo_opengl: fix incoherent SSBO usage
According to the OpenGL spec, atomic access to SSBO variables is *not*
guaranteed to be coherent, even when reusing the same SSBO attached to
the same shader across different frames. So we actually need a
glMemoryBarrier here, at least in theory.
2017-07-25 06:11:57 +02:00
Niklas Haas
6c06e7e2a0
vo_opengl: cosmetic fix 2017-07-25 05:23:52 +02:00
Niklas Haas
cd226bdfd8 vo_opengl: fix incoherent texture usage
This bug slipped past my attention because nvidia ignores memory
barriers, but this is not necessarily always the case. Since
image_load_store is incoherent (specifically, writing to images from
compute shaders is incoherent) we need to insert a memory barrier to
make it coherent again. Since we only care about texture fetches, that's
the only barrier we need.
2017-07-25 05:22:29 +02:00
Niklas Haas
241d5ebc46
vo_opengl: adjust the rules for linearization
Two changes, compounded into one since they affect the same logic:

1. Never use linearization for HDR downscaling
2. Always use linearization for interpolation

Instead of fixing p->use_linear at the beginning of pass_render_frame,
we flip it on "dynamically" as needed. I plan on killing this
p->use_linear frame (along with other per-pass metadata) and moving them
into their own struct for tracking the "current" state of the video, but
that's a separate/upcoming refactor.

As a small bonus, reduce some code duplication in the interpolation
logic.

Fixes #4631
2017-07-24 23:26:15 +02:00
Bin Jin
13ef6bcf6f vo_opengl: enable compute shader for mesa
Mesa 17.1 supports compute shader but not full specs of OpenGL 4.3.
Change the code to detect OpenGL extension "GL_ARB_compute_shader"
rather than OpenGL version 4.3.

HDR peak detection requires SSBO, and polar scaler requires 2D array
extension. Add these extensions as requirement as well.
2017-07-25 04:07:26 +08:00
Niklas Haas
0c84ee01d5
vo_opengl: support user compute shaders
These are identical to regular fragment shader hooks, but with extra
metadata indicating the preferred block size.
2017-07-24 17:19:34 +02:00
Niklas Haas
f338ec4591 vo_opengl: implement compute shader based EWA kernel
This performs almost 50% faster on my machine (!!), from 4650μs down to
about 3176μs for ewa_lanczossharp.

It's possible we could use a similar approach to speed up the separable
scalers, although with vastly simpler code. For separable scalers we'd
also have the additional huge benefit of only needing padding in one
direction, so we could potentially use a big 256x1 kernel or something
to essentially compute an entire row at once.
2017-07-24 17:19:31 +02:00
Niklas Haas
b196cadf9f vo_opengl: support HDR peak detection
This is done via compute shaders. As a consequence, the tone mapping
algorithms had to be rewritten to compute their known constants in GLSL
(ahead of time), instead of doing it once. Didn't affect performance.

Using shmem/SSBO atomics in this way is extremely fast on nvidia, but it
might be slow on other platforms. Needs testing.

Unfortunately, setting up the SSBO still requires OpenGL calls, which
means I can't have it in video_shaders.c, where it belongs. But I'll
defer worrying about that until the backend refactor, since then I'll be
breaking up the video/video_shaders structure anyway.
2017-07-24 17:19:31 +02:00
Niklas Haas
aad6ba018a vo_opengl: support compute shaders
These can either be invoked as dispatch_compute to do a single
computation, or finish_pass_fbo (after setting compute_size_minimum) to
render to a new texture using a compute shader. To make this stuff all
work transparently, we try really, really hard to make compute shaders
as identical to fragment shaders as possible in their behavior.
2017-07-24 17:19:31 +02:00
Niklas Haas
eb54d2ad4d vo_opengl: cut down on FBOTEX_FUZZY abuse
Don't use FBOTEX_FUZZY where the FBO is sized according to
p->texture_w/h, since this changes infrequently (and when it does, we
need to reset everything anyway). No real reason to make this change
other than that it possibly prevents nasty surprises in the future, so I
feel more comfortable about it.
2017-07-24 16:41:38 +02:00
wm4
24dc91907a common, vo_opengl: add/use helper for formatted strings on the stack
Seems like I really like this C99 idiom. No reason not to generalize it
do snprintf(). Introduce mp_tprintf(), which basically this idiom to
snprintf(). This macro looks like it returns a string that was allocated
with alloca() on the caller site, except it's portable C99/C11. (And
unlike alloca(), the result is valid only within block scope.)

Use it in 2 places in the vo_opengl code. But it has the potential to
make a whole bunch of weird looking code look slightly nicer.
2017-07-24 08:12:42 +02:00
wm4
3d0f86145c vo_opengl: check format on some printf-like calls
Fix 1 incorrect use.
2017-07-24 08:08:02 +02:00
wm4
64d56114ed vo_opengl: add direct rendering support
Can be enabled via --vd-lavc-dr=yes. See manpage additions for what it
does.

This reminds of the MPlayer -dr flag, but the implementation is
completely different. It's the same basic concept: letting the decoder
render into a GPU buffer to avoid a copy. Unlike MPlayer, this doesn't
try to go through filters (libavfilter doesn't support this anyway).
Unless a filter can work in-place, DR will be silently disabled. MPlayer
had very complex semantics about buffer types and management (which
apparently nobody ever understood) and weird restrictions that mostly
limited it to mpeg2 style codecs. The mpv code does not do any of this,
and just lets the decoder allocate an arbitrary number of untyped
images. (No MPlayer code was used.)

Parts of the code based on work by atomnuker (starting point for the
generic code) and haasn (some GL definitions, some basic PBO code, and
correct fencing).
2017-07-24 04:32:55 +02:00
wm4
bbfd9b5a29 vo_opengl: osd: remove stale declaration
Was missed in the previous changes.
2017-07-23 00:02:02 +02:00
wm4
87e9cd13aa vo_opengl: add printf format checking to pass_describe() 2017-07-22 22:08:23 +02:00
wm4
95885bbaa2 vo_opengl: make VAO helper private, remove old VAO mechanism
The struct describing vertex attributes is still public, of course.
2017-07-22 21:56:23 +02:00
wm4
d3dfffdf02 vo_opengl: osd: use new VAO mechanism
In addition to using the new VAO mechanism introduced in the previous
commit, this tries to keep the OSD code self-contained. This doesn't
work all too well (because of the pass and CMS stuff), but it's still
better than before.
2017-07-22 21:55:38 +02:00
wm4
a24df94fed vo_opengl: add mechanism to create/cache VAO on the fly
This removes VAO handling from video.c. Instead the shader cache will
create the VAO as needed. The consequence is that this creates a VAO
per shader, which might be a bit wasteful, but doesn't matter anyway.
2017-07-22 21:51:32 +02:00
wm4
1c81311673 vo_opengl: osd: refactor and simplify
Reduce this to 1 draw call per OSD pass. This removes the need for some
annoying special handling regarding 3D video support (we supported
duplicating the OSD/subtitles for side-by-side 3D output etc.).

Remove the unneeded texture sampler uniform thing.
2017-07-22 21:46:59 +02:00
Niklas Haas
51014e1c03
vo_opengl: avoid constant divisions
These are apparently expensive on some drivers which are not smart
enough to turn x/42 into x*1.0/42. So, do it for them.

My great test framework says it's okay
2017-07-17 05:29:16 +02:00
Niklas Haas
44391af7df
vo_opengl: style
Use uintptr_t instead of size_t. Shouldn't matter, but is cleaner.
2017-07-16 18:32:48 +02:00
Niklas Haas
cea8c86f18
vo_opengl: use MP_ALIGN_UP instead of FFALIGN
Consistency/style
2017-07-16 17:47:06 +02:00
Niklas Haas
dead206873 vo_opengl: use glBufferSubData instead of glMapBufferRange
Performance seems pretty much unchanged but I no longer get nasty spikes
on NUMA systems, probably because glBufferSubData runs in the driver or
something.

As a simplification of the code, we also just size the PBO to always
have the full size, even for cropped textures. This seems slower but not
by relevant amounts, and only affects e.g. --vf=crop. It also slightly
increases VRAM usage for textures with big strides.

This new code path is especially nice because it no longer depends on
GL_ARB_map_buffer_range, and no longer uses any functions that can
possibly fail, thus simplifying control flow and seemingly deprecating
the manpage's claim about possible image corruption.

In theory we could also reduce NUM_PBO_BUFFERS since it doesn't seem
like we're streaming uploads anyway, but leave it in there just in
case some drivers disagree...
2017-07-16 17:46:24 +02:00
Niklas Haas
8e20ef4292
vo_opengl: update BufferData usage hints
STREAM is better than DYNAMIC because we're only using it once per
frame. As for COPY vs DRAW, that was pretty much incorrect to begin with
- but surprisngly, COPY is actually faster (sometimes significantly so,
e.g. on my NUMA system).

After testing, the best I can gather is that it has to do with the fact
that COPY requires fewer redundant memcpy()s, and also 3x reduce RAM
bandwidth (in theory).

Anyway, that bit shouldn't introduce any regressions, it's just a
documentation update. Maybe I'll change my mind about the comment again
the future, it's really hard to tell. Vulkan, please save us!
2017-07-15 23:54:20 +02:00
Niklas Haas
b93bcce5df
vo_opengl: coalesce intra-plane PBOs
Instead of allocating three PBOs and cycling through them, we allocate
one PBO that's three times as large, and cycle through the subregion
offsets.

This results in arguably simpler code and faster initialization
performance. Especially for 4K textures, initializing PBOs can take
quite some time (e.g. 180ms -> 110ms). For 1080p, it's more like 66ms ->
52ms for me.

The alignment to 4096 is completely unnecessary by spec, but we do it
anyway just for peace of mind.
2017-07-15 22:11:48 +02:00
Niklas Haas
18c74f7dfe
vo_opengl: generalize --scale-clamp etc.
This can help fight ringing without completely killing it, thus
providing a middle-ground between ringing and aliasing.
2017-07-12 19:08:58 +02:00
Niklas Haas
e186567329
vo_opengl: remove redundant gl_video_setup_hooks call
This is unnecessary to call from gl_video_resize, because the hooks only
(possibly) change when the actual vo_opengl options change. This used to
be required back when mpv still had prescaling built in, but since that
was all moved to user shaders and the code removed, this is a left-over
artifact.
2017-07-12 18:46:09 +02:00
Aman Gupta
4e93046ddf vo_opengl: fix type of glsl variable frame 2017-07-11 04:46:37 +02:00
wm4
26f56b5a5d vo_opengl: don't make assumptions about plane order
The renderer code doesn't list a fixed set of supported formats, but
supports anything that is described by AVPixFmtDescriptor and follows a
number of constraints.

Plane order is not included in those constraints. This means the planes
could be in random order, rather than what the vo_opengl renderer
happens to assume. For example, it assumes that the 4th plane is alpha,
even though alpha could be on any plane. Likewise it assumes that plane
0 was always luma, and planes 2/3 chroma. (In earlier iterations of
format support, this was guaranteed by MP_IMGFLAG_YUV_P, but this is not
used anymore.)

Explicitly set the plane semantics (enum plane_type) by the component
descriptors and the used colorspace. The behavior should mostly not
change, but it's less likely to break when FFmpeg adds new pixel
formats.
2017-07-10 17:56:43 +02:00
wm4
02468dcbde vo_opengl: hwdec_dxva2egl: probe whether ANGLE mapping works
With some newer ANGLE builds, mapping can fail with "Failed to create
EGL surface" during playback. The reason is unknown, and it might just
be an ANGLE bug. Probe whether it works at init time to avoid the
problem.
2017-07-10 15:32:09 +02:00
Niklas Haas
f76f37991f
vo_opengl: fix dumb_mode chroma transformation fixup
In commit 6eb0bbe this was changed from xs[n] to use gl_format.chroma_w
indiscriminately, which broke chroma rendering when zooming/cropping.
The solution is to only use chroma_w for chroma planes.

Fixes #4592.
2017-07-10 13:46:41 +02:00
Niklas Haas
f4178b90c7
vo_opengl: describe the remainder passes after user shaders
On optional hook points, we store to a temp FBO and then read from it
again to complete any operations that may still be left (e.g.
sigmoidization after MAIN/LINEAR).

In theory this mechanism should be reworked to avoid the temporary FBO
until the next time we actually need one - and also skip redundant
passes if we the next thing we need *is* a FBO - but both are those are
tricky. Anyway, in the meantime, at least we can label the
(semi-)redundant passes that get generated when using user shaders.
2017-07-09 08:48:29 +02:00
Niklas Haas
8c0162e762
vo_opengl: support tone-mapping-param for clip
This just indicates a fixed linear coefficient to multiply into the
signal, similar to the old option --target-brightness (but the inverse
thereof). Good for testing purposes, which is why I added it. (This also
corresponds somewhat to what zimg does)
2017-07-07 21:00:21 +02:00
Niklas Haas
7c1db05cbb
vo_opengl: rework --opengl-dumb-mode
It's now possible to request non-dumb mode as a user, even when not
using any non-dumb features. This change is mostly intended for testing,
so I can easily switch between dumb and non-dumb mode on default
settings. The default behavior is unaffected.
2017-07-07 14:46:46 +02:00
Niklas Haas
9a49a35453 vo_opengl: correct off-by-one in scale=oversample
This caused a single pixel shift to the top-left, introduced in e3e03d0f3.
2017-07-07 13:47:47 +02:00
wm4
ae4c0134ed vo_opengl: do not use vaapi-over-GLX
This backend is selected if vaapi is available, but vaapi-over-EGL is
not. This causes various issues around the forced RGB conversion, which
is done with fixed, usually incorrect parameters.

It seems the existing auto probing check is too weak, and doesn't really
prevent it from getting loaded. Fix this by adding a flag to not ever
load this during auto probing.

I'm still not deleting it, because it's useful for testing on nvidia
machines.

See #4555.
2017-07-07 12:29:29 +02:00
Niklas Haas
9c9d3e7b25
vo_opengl: prevent desat from blowing up for negative
The current algorithm blew up when the color was negative, such as the
case when downscaling with dscale=mitchell or other algorithms that
introduce negative ringing. The simplest solution is to just slightly
change the calculation to force both parameters to be in-range.
2017-07-07 11:26:30 +02:00
Niklas Haas
aa2bdec26c
vo_opengl: also expose NAME_mul for user shaders
This is exposed so that bjin/mpv-prescalers can use textureGatherOffset
for performance.

Since there are now quite a lot of parameters where it isn't quite clear
why they're all defined, add a paragraph to the man page that explains
them a bit.
2017-07-06 11:30:33 +02:00
Niklas Haas
ef43854b34
vo_opengl: prevent division by zero in shader
In theory the max() should clamp it away anyway but I believe division
by zero is UB so just avoid it altogether.
2017-07-06 05:59:08 +02:00
Niklas Haas
9e04018f92
vo_opengl: add --tone-mapping-desaturate
This helps prevent unnaturally, weirdly colorized blown out highlights
for direct images of the sunlit sky and other way-too-bright HDR
content. I was debating whether to set the default at 1.0 or 2.0, but
went with the more conservative option that preserves more detail/color.
2017-07-06 05:43:00 +02:00
Niklas Haas
6f77444f6c
vo_opengl: get rid of weird double-bind in pass_read_fbo
This logic doesn't really make sense. copy_img_tex already binds the
texture, so why would we bind it a second time? Furthermore, nothing
actually uses this return value. Must have been some left-over artifact
of a previous iteration of this function. Anyway, it's harmless, just
nonsensical. So remove it.
2017-07-05 11:22:00 +02:00
Niklas Haas
6e25934a8c vo_opengl: remove redundant left-over line
The pass_read_fbo immediately below replaces it
2017-07-05 11:21:58 +02:00
Niklas Haas
ad0d6caac7 vo_opengl: use textureGatherOffset for polar filters
This is more efficient on my machine (nvidia), but only when applied to
groups of exactly 4 texels. So we switch to the more efficient
textureGather for groups of 4. Some notes:

- textureGatherOffset seems to be faster than textureGather by a
  non-negligible amount, but for some reason, textureOffset is still
  slower than a straight-up texture
- textureGather* requires GLSL 400; and at least on nvidia, this
  requires actually allocating a GL 4.0 context.
- the code in opengl/common.c that clamped the GLSL version to 330 is
  deprecated, because the old user shader style has been removed
  completely in the meantime
- To combat the growing complexity of the polar sampling code, we drop
  the antiringing functionality from EWA shaders completely, since it
  never really worked well for EWA to begin with. (Horrific artifacting)
2017-07-05 11:21:58 +02:00
Niklas Haas
5df3576856
vo_opengl: make the pass info mechanism more robust
- change asserts to silent exits
- check all pointers before use
- move the p->pass initialization code to the right place

This should hopefully cut down on the amount of crashing by making the
code fundamentally more robust, while also fixing a concrete issue where
opengl-cb failed to initialize p->pass.
2017-07-03 17:14:06 +02:00
Niklas Haas
8854a2bef6
filter_kernels: add radius cutoff functionality
This allows filter functions to be prematurely cut off once their
contributions start becoming insignificant. This effectively prevents
wasted GPU time sampling from parts of the function that are essentially
reduced to zero by the window function, providing anywhere from a 10% to
20% speedup. (5700μs -> 4700μs for me)
2017-07-03 11:51:37 +02:00
wm4
e4bc563fd2 options: change everything again
Fucking bullshit.
2017-07-02 16:29:45 +02:00
Niklas Haas
f281ecc4c0
vo_opengl: describe vdpau reinterleaving pass
This shows up as (unknown pass) otherwise.
2017-07-01 09:45:14 +02:00
Niklas Haas
69289aec6c
vo_opengl: fix some more pass_info_reset issues
2f41c4e8 exposed some other edge cases as well. Globally resetting the
pass info was not the right way to go about it, because we don't know in
advance what the frame type is going to be - at least not with the
current code structure. (In principle, we could separately indicate the
frame type and the pass type and then only reset it on the first
actual pass_describe call, but that's annoying as well)

Also fixes a latent issue where p->pass was never initialized, which
broke the MP_DBG debugging code in some cases.
2017-07-01 04:27:09 +02:00
Niklas Haas
2f41c4e81b
vo_opengl: call pass_info_reset earlier
Omitting this call resulted in a crash when has_frame was false. But we
can just call it way earlier, because there's really no reason not to.
2017-07-01 03:32:14 +02:00
Niklas Haas
6a12b1fdc3
vo_opengl: merge uploading and rendering
Since all existing code does gl_video_upload immediately followed by
pass_render_frame, we can just move the upload into pass_render_frame
itself, which arguably makes more sense anyway.
2017-07-01 00:59:15 +02:00
Niklas Haas
dd78cc6fe7 vo_opengl: refactor vo performance subsystem
This replaces `vo-performance` by `vo-passes`, bringing with it a number
of changes and improvements:

1. mpv users can now introspect the vo_opengl passes, which is something
   that has been requested multiple times.

2. performance data is now measured per-pass, which helps both
   development and debugging.

3. since adding more passes is cheap, we can now report information for
   more passes (e.g. the blit pass, and the osd pass). Note: we also
   switch to nanosecond scale, to be able to measure these passes
   better.

4. `--user-shaders` authors can now describe their own passes, helping
   users both identify which user shaders are active at any given time
   as well as helping shader authors identify performance issues.

5. the timing data per pass is now exported as a full list of samples,
   so projects like Argon-/mpv-stats can immediately read out all of the
   samples and render a graph without having to manually poll this
   option constantly.

Due to gl_timer's design being complicated (directly reading performance
data would block, so we delay the actual read-back until the next _start
command), it's vital not to conflate different passes that might be
doing different things from one frame to another. To accomplish this,
the actual timers are stored as part of the gl_shader_cache's sc_entry,
which makes them unique for that exact shader.

Starting and stopping the time measurement is easy to unify with the
gl_sc architecture, because the existing API already relies on a
"generate, render, reset" flow, so we can just put timer_start and
timer_stop in sc_generate and sc_reset, respectively.

The ugliest thing about this code is that due to the need to keep pass
information relatively stable in between frames, we need to distinguish
between "new" and "redrawn" frames, which bloats the code somewhat and
also feels hacky and vo_opengl-specific. (But then again, this entire
thing is vo_opengl-specific)
2017-07-01 00:58:27 +02:00
wm4
f003d8ea36 d3d: UWP support for D3D11VA
For some braindead reason, Microsoft decided to prevent you from
dynamically loading system libraries. This makes portability harder.

And we're talking about portability between Microsoft OSes!
2017-06-30 18:57:37 +02:00
wm4
dd408e68ed d3d: make DXVA2 support optional
This partially reverts the change from a longer time ago to always build
DXVA2 and D3D11VA together.

To make it simpler, we change the following:
- building with ANGLE headers is now required to build D3D hwaccels
- if DXVA2 is enabled, D3D11VA is still forcibly built
- the CLI vo_opengl ANGLE backend is now under --egl-angle-win32

This is done to reduce the dependency mess slightly.
2017-06-30 18:57:37 +02:00
wm4
1ad036a2ef video: get rid of swapped packed YUV
Another legacy annoyance. The only place where packed YUV is still
important is slightly older Apple hardware or drivers, which require
it for efficient hardware decoding.
2017-06-30 18:01:29 +02:00
wm4
6eb0bbe312 vo_opengl: remove mp_imgfmt_desc and IMGFLAG_ usage
These were weird due to their past, and often undefined or ill-defined.
Time to get rid of them.
2017-06-30 17:56:42 +02:00
wm4
0c0a06140c vo_opengl: restructure format setup
Instead of setting up a weird swizzle (which is linked to how the
internal renderer code works, rather than the generic format code), add
per-component mapping to gl_imgfmt_desc.

The renderer still computes the weird swizzle, but at least it's
confined to itself. Also, it appears the hwdec backends don't need this
anymore.

It's really nice that the messy init_format() goes away too.
2017-06-30 17:07:55 +02:00
wm4
91583fccac options: change path list options, and document list options
The changes to path list options is basically getting rid of the need to
pass multiple paths to a single option. Instead, you can use the option
multiple times. The old behavior can be used by using the -set suffix
with the option.

Change some options to path lists. For example --script is now append by
default, and if you use --script-set, you need to use ":"/";" as
separator instead of ",".

--sub-paths/--audio-file-paths is a deprecated alias now, and will break
if the user tries to pass multiple paths to it. I'm assuming that if
these are used, most users will pass only 1 path anyway.

--opengl-shaders has more compatibility handling, since it's probably
rather common that users pass multiple options to it.

Also document all that in the manpage.

I'll probably regret this later, as it somewhat increases the complexity
of the option parser, rather than increasing it.
2017-06-30 16:39:36 +02:00
wm4
1dffcb0167 vo_opengl: rely on FFmpeg pixdesc a bit more
Add something that allows is to extract the component order from various
RGBA formats. In fact, also handle YUV, GBRP, and XYZ formats with this.

It introduces a new struct mp_regular_imgfmt, that hopefully will
eventually replace struct mp_imgfmt_desc. The latter is still needed by
a lot of code though, especially generic code. Also vo_opengl still uses
the old one, so this commit is sort of incomplete.

Due to its genericness, it's also possible that this commit introduces
rendering bugs, or accepts formats it shouldn't accept.
2017-06-29 20:52:05 +02:00
Niklas Haas
0af561953a
vo_opengl: unify user_shaders constants
Commit 3fb6380 was supposed to increase MAX_TEXTURE_HOOKS but instead
increased SHADER_MAX_HOOKS, since I forgot that they were separate (for
whatever reason).

To prevent this mistake from happening again, and to unify the location
in which user_shader-specific #defines are placed, get rid of the two
constants in opengl/video.c and move/reuse them from user_shaders.h
instead.

Also bump up MAX_SAVED_TEXTURES (now SHADER_MAX_SAVED) slightly as a
precaution against adding more passes to vo_opengl. I think we're
already flirting with the limit.
2017-06-28 00:01:25 +02:00
Niklas Haas
db36aa06f4
vo_opengl: tone map using only luminance information
This is even better at preventing discoloration than tone mapping on the
XYZ image. Partly inspired by the HLG OOTF. Also simplifies the way we
tone map, and moves this logic to the pass_tone_map function where it
belongs.

This also fixes what could arguably be considered a bug in the HLG
implementation when using HLG for non-BT.2020 colorspaces, which is not
permitted by spec but thinkable in theory. Although in this case, I
guess it will be arbitrary whether people use the BT.2020-normalized
luma coefficients or change it to fit the colorspace, so I guess either
way could be considered "right", depending on what people end up doing.
Either way, in lieue of standard practice, we do what makes the most
sense (to me), and hopefully others will follow.

The downside is that we upload an extra vec3 uniform even if we don't
use it, but eliminating that would be ugly.
2017-06-27 01:05:43 +02:00
Niklas Haas
7bfeb8bd26
vo_opengl: silence -Wmaybe-uninitialized false positive
These can never be uninitialized because the enum cases are exhaustive and
the fallback is in the correct order, but GCC is too dumb to understand
this.

Also explicitly initialize tex_type, because while GCC doesn't warn
about it (for some reason), maybe it will in the future.
2017-06-24 22:51:48 +02:00
Niklas Haas
3fb6380aa8
vo_opengl: bump up SHADER_MAX_HOOKS
Apparently people are running into the current limit. 64 ought to be
enough for everybody.
2017-06-24 01:55:21 +02:00
quilloss
2ba2062e5b context_dxinterop: lock rendertarget after present when swapping buffers
Moves the DXLockObjectsNV call to after PresentEx. This fixes an issue
where the presented image is a single frame late. This may be due to
DXLockObjectsNV locking the render target before StretchRect is done.
The spec indicates that the lock call should provide synchronization
for the resource, so this may be due to a driver bug.
2017-06-18 20:54:44 +02:00
Niklas Haas
2e45b8fa1a vo_opengl: implement sony s-log2 trc
Apparently this is virtually identical to Panasonic's V-Log, but using
the constants from S-Log1 and an extra scaling coefficient to make the
S-Log1 curve less limited. Whatever floats their NIH boat, I guess.

Source: https://pro.sony.com/bbsccms/assets/files/micro/dmpc/training/S-Log2_Technical_PaperV1_0.pdf
2017-06-18 20:54:44 +02:00
Niklas Haas
326e02e955 vo_opengl: implement sony s-log1 trc
Source: https://pro.sony.com/bbsccms/assets/files/mkt/cinema/solutions/slog_manual.pdf

Not 100% confident in the implementation since the values from the spec
seem to be very subtly off (~1%), but it should be close enough for
practical purposes.
2017-06-18 20:54:44 +02:00
Niklas Haas
c3f32f3a6e vo_opengl: tone map in linear XYZ instead of RGB
This preserves channel balance better and helps reduce discoloration due
to nonlinear tone mapping.

I wasn't sure whether to stuff this inside pass_color_manage or
pass_tone_map but decided for the former because adding the extra
mp_csp_prim would have made the signature of the latter longer than
80col, and also because the `mp_get_cms_matrix` below it basically does
the same thing anyway, so it doesn't look that out of place. Also why is
this justification longer than the actual description of the algorithm
and what it's good for?
2017-06-18 20:54:44 +02:00
Niklas Haas
1f3000b03c vo_opengl: implement support for OOTFs and non-display referred content
This introduces (yet another..) mp_colorspace members, an enum `light`
(for lack of a better name) which basically tells us whether we're
dealing with scene-referred or display-referred light, but also a bit
more metadata (in which way is the scene-referred light expected to be
mapped to the display?).

The addition of this parameter accomplishes two goals:

1. Allows us to actually support HLG more-or-less correctly[1]
2. Allows people playing back direct “camera” content (e.g. v-log or
   s-log2) to treat it as scene-referred instead of display-referred

[1] Even better would be to use the display-referred OOTF instead of the
idealized OOTF, but this would require either native HLG support in
LittleCMS (unlikely) or more communication between lcms.c and
video_shaders.c than I'm remotely comfortable with

That being said, in principle we could switch our usage of the BT.1886
EOTF to the BT.709 OETF instead and treat BT.709 content as being
scene-referred under application of the 709+1886 OOTF; which moves that
particular conversion from the 3dlut to the shader code; but also allows
a) users like UliZappe to turn it off and b) supporting the full HLG
OOTF in the same framework. But I think I prefer things as they are
right now.
2017-06-18 20:54:44 +02:00
Niklas Haas
fe1227883a csputils: rename HDR curves
st2084 and std-b67 are really weird names for PQ and HLG, which is what
everybody else (including e.g. the ITU-R) calls them. Follow their
example.

I decided against naming them bt2020-pq and bt2020-hlg because it's not
necessary in this case. The standard name is only used for the other
colorspaces etc. because those literally have no other names.
2017-06-18 20:54:44 +02:00
Niklas Haas
c335e84230 video: refactor HDR implementation
List of changes:

1. Kill nom_peak, since it's a pointless non-field that stores nothing
   of value and is _always_ derived from ref_white anyway.

2. Kill ref_white/--target-brightness, because the only case it really
   existed for (PQ) actually doesn't need to be this general: According
   to ITU-R BT.2100, PQ *always* assumes a reference monitor with a
   white point of 100 cd/m².

3. Improve documentation and comments surrounding this stuff.
4. Clean up some of the code in general. Move stuff where it belongs.
2017-06-18 20:48:23 +02:00
wm4
b6d0b57e85 Drop/move img_fourcc.h
This file is an leftover from when img_format.h was changed from using
the ancient FourCCs (based on Microsoft multimedia conventions) for
pixel formats to a simple enum. The remaining cases still inherently
used FourCCs for whatever reasons.

Instead of worrying about residual copyrights in this file, just move it
into code we don't want to relicense (the ancient Linux TV code). We
have to fix some other code depending on it. For the most part, we just
replace the MP_FOURCC macro with libavutil's MKTAG (although the macro
definition is exactly the same). In demux_raw, we drop some pre-defined
FourCCs, but it's not like it matters. (Instead of
--demuxer-rawvideo-format use --demuxer-rawvideo-mp-format.)
2017-06-18 15:13:45 +02:00
wm4
182bbb5917 vo_opengl: fall back to ordered dither instead of blowing up
In GLES 2 mode, we can do dither, but "fruit" dithering is still out of
the question, because it does not support any high depth textures.
(Actually we probably could use an 8 bit texture too for this, at least
with small matrix sizes, but it's still too much of a pain to convert
the data, so why bother.)

This is actually a regression; before this, forcibly enabling dumb mode
due to low GL caps actually happened to avoid this case.

Fixes #4519.
2017-06-17 13:55:07 +02:00
Niklas Haas
4e12baf3d7 vo_opengl: change default tone mapping algorithm
d8a3b10f4 was supposed to change this (as reflected in the man page and
commit message), I just forgot.
2017-06-10 13:48:35 +02:00
Niklas Haas
d8a3b10f45
vo_opengl: add new HDR tone mapping algorithm
I call it `mobius` because apparently the form f(x) = (cx+a)/(dx+b) is
called a Möbius transform, which is the algorithm this is based on. In
the extremes it becomes `reinhard` (param=0.0 and `clip` (param=1.0),
smoothly transitioning between the two depending on the parameter.

This is a useful tone mapping algorithm since the tunable mobius
transform allows the user to decide the trade-off between color accuracy
and detail preservation on a continuous scale. The default of 0.3 is
already far more accurate than `reinhard` while also being reasonably
good at preserving highlights, without suffering from the overall
brightness drop and color distortion of `hable`.

For these reasons, make this the new default. Also expand and improve
the documentation for these tone mapping functions.
2017-06-09 11:27:28 +01:00
wm4
0754cbc83e d3d: add support for new libavcodec hwaccel API
Unfortunately quite a mess, in particular due to the need to have some
compatibility with the old API. (The old API will be supported only in
short term.)
2017-06-08 21:51:25 +02:00
Philip Langdale
7424651b96 vo_opengl: hwdec_cuda: Support separate decode and display devices
In a multi GPU scenario, it may be desirable to use different GPUs
for decode and display responsibilities. For example, if a secondary
GPU has better video decoding capabilities.

In such a scenario, we need to initialise a separate context for each
GPU, and use the display context in hwdec_cuda, while passing the
decode context to avcodec.

Once that's done, the actually hand-off between the two GPUs is
transparent to us (It happens during the cuMemcpy2D operation which
copies the decoded frame from a cuda buffer to the OpenGL texture).

In the end, the bulk of the work is around introducing a new
configuration option to specify the decode device.
2017-06-03 16:41:03 +02:00
wm4
83a9b0bc48 videotoolbox: support new libavcodec API
The new API has literally no advantages (other than that we can drop
mp_vt_download_image and other things later), but it's sort-of uniform
with the other hwaccels.

"--videotoolbox-format=no" is not supported with the new API, because it
doesn't "fit in". Probably could be added later again.

The iOS code change is untested (no way to test).
2017-05-24 15:25:48 +02:00
James Ross-Gowan
6a00059850 context_angle: fix fallback to D3D9 device
This was broken in e0250b9604. In some cases, device creation will
succeed, but creating an EGL context on the device will fail. With
--angle-renderer=auto, it should try to create the context again on a
D3D9 device.

This fixes mpv in Windows Vista on VirtualBox for me.
2017-05-16 22:59:15 +10:00
wm4
2b616c0682 vo_opengl: drop TLS usage
TLS is a headache. We should avoid it if we can.

The involved mechanism is unfortunately entangled with the unfortunate
libmpv API for returning pointers to host API objects. This has to be
kept until we change the API somehow.

Practically untested out of pure laziness. I'm sure I'll get a bunch of
reports if it's broken.
2017-05-11 17:47:33 +02:00
wm4
1143f2877a d3d11: change mp_image plane pointer semantics
Until now, the texture pointer was stored in plane 1, and the texture
array index was in plane 2. Move this down to plane 0 and plane 1. This
is to align it to the new WIP D3D11 decoding API in Libav, where we
decided that there is no reason to avoid setting plane 0, and that it
would be less weird to start at plane 0.
2017-05-04 01:13:03 +02:00
wm4
2eaa43e5da vo_opengl: another attempt at removing the overlay correctly
This reverts commit 142b2f23d4, and replaces it with another try. The
previous attempt removed the overlay on every rendering, because the
normal rendering path actually unrefs the mp_image. Consequently,
unmap_current_image() was completely inappropriate for removing the
overlay.
2017-05-02 17:10:26 +02:00
wm4
142b2f23d4 vo_opengl: make sure overlays are removed on gl_video_config()
This should make vo_opengl_cb uninit remove the frame, even if the
renderer and OpenGL state remains active.
2017-04-29 15:09:40 +02:00
wm4
010c7d4992 vo_opengl: context_drm_egl: remove unnecessary include
Could be broken after the previous commit removed finding the GL include
dir.
2017-04-26 17:43:23 +02:00
wm4
f59371de21 video: drop vaapi/vdpau hw decoding support with FFmpeg 3.2
This drops support for the old libavcodec APIs. Now FFmpeg 3.3 or FFmpeg
git is required. Libav has no release with the new APIs yet, so for
Libav git as of a few weeks or months ago or so is required if you want
to use Libav.

Not much actually changes in hwdec_vaegl.c - some code is removed, but
the reindentation inflates the diff.
2017-04-23 16:07:03 +02:00
wm4
4d1eab6e55 vo_opengl: fix crash by coping temporal_dither_period for dumb mode too
Specifically, this field must never be 0 (and the option can naturally
not be 0 in any way, unless it wasn't initialized correctly).
2017-04-21 07:16:26 +02:00
wm4
bba08e38ff vo_opengl: move X11 backends before Wayland
Wayland is still too amateurish, and multiple features don't work,
including critical ones. There is no solution in sight, so prefer X11.
(Which seems to mostly work ok via xwayland.)

Once all problems are solved, the defaults can be switched back.
2017-04-16 02:35:12 +02:00
wm4
7b84297699 vo_opengl: minor cosmetics 2017-04-14 17:35:27 +02:00
wm4
759ac6cc93 vo_opengl: add option for caching shaders on disk
Mostly because of ANGLE (sadly).

The implementation became unpleasantly big, but at least it's relatively
self-contained.

I'm not sure to what degree shaders from different drivers are
compatible as in whether a driver would randomly misbehave if it's fed
a binary created by another driver. The useless binayFormat parameter
won't help it, as they can probably easily clash. As usual, OpenGL is
pretty shit here.
2017-04-08 16:43:56 +02:00
wm4
e7940ddbf3 vo_opengl: fix a confused comment 2017-04-08 16:43:56 +02:00
wm4
71caa0b79b vo_opengl: remove two unused symbols 2017-04-08 16:43:56 +02:00
wm4
eb83ee4a4a vo_opengl: add our own copy of OpenGL headers
gl_headers.h is basically header_fixes.h done consequently. It contains
all OpenGL defines (and some typedefs) we need. We don't include GL
headers provided by the system anymore.

Some care has to be taken by certain windowing APIs including all of
gl.h anyway. Then the definitions could clash. Fortunately, redefining
preprocessor symbols to the same content is allowed and ignored. Also,
redefining typedefs to the same thing is allowed in C11. Apparently the
latter is not allowed in C99, so there is an imperfect attempt to avoid
the typedefs if required API symbols are apparently present already.

The nost risky part about this are the standard typedefs and GLAPIENTRY.
The latter is different only on win32 (and at least consistently so).
The typedefs are mostly based on stdint.h typedefs, which khrplatform.h
clumsily emulates on platforms which don't have it. The biggest
difference is that we define GLsizeiptr directly to ptrdiff_t, instead
of checking for the _WIN64 symbol and defining it to long or long long.

This also typedefs GLsync to __GLsync, just like the khronos headers.
Although symbols prefixed with __ are implementation reserved, khronos
also violates this rule, and having the same definition as khronos will
avoid problems on duplicate definitions.

We can simplify the build scripts too. The ios-gl check seems a bit
wrong now (what we really want to test for is EAGLContext), but I can't
test and thus can't improve it.

cuda_dynamic.h redefined two GL symbols; just include the new headers
directly instead.
2017-04-07 15:09:27 +02:00
wm4
c9d3a79187 vo_opengl: add a generic EGL function loader function
This is pretty trivial, but also quite annoying due to details like
mismatching eglGetProcAddress() function signature (most callers just
cast the function pointer), and ARM/Linux hacks. So move them all to one
place.
2017-04-06 14:50:19 +02:00
wm4
4e6867c771 vo_opengl: fix windows build if GLES3 is detected
With the recent GLES3 header detection, and if ANGLE is in the search
path, the ANGLE headers will be used over the desktop GL ones. It
appears the ANGLE headers do not include <windows.h>, which leads to the
dxinterop code to fail building. Oops.

Fix this by including <windows.h> is dxinterop is compiled in.
2017-04-06 13:19:59 +02:00
wm4
ebecf9c2d6 vo_opengl: header_fixes.h: merge IOS GLES block
It appears we expect IOS to provide GLES 3. The IOS block contains all
symbols from the GLES block. Weirdly not all, so it's possible that some
symbols will be redefined, which is annoying, but harmless. I don't have
an iOS setup to test, otherwise it's likely that a modification of the
IOS include statements would take care of this.
2017-04-06 08:40:17 +02:00
wm4
755ce9dac5 build: replace android-gl check with a standard GLES3 check
There's no reason to make it Android specific, as it uses standard
include paths.
2017-04-06 08:35:47 +02:00
wm4
1c0bd59bc2 vo_opengl: use 16 bit textures with angle
Regression due to 03fe506. It accidentally changed the default value if
glGetTexLevelParameteriv() is not available, which is the case with
ANGLE.
2017-04-03 18:12:42 +02:00
James Ross-Gowan
439e2b43c3 vo_opengl: angle: add --angle-flip to set the present model
DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL might be buggy on some hardware.
Additionaly DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL might be supported on some
Windows 7 systems with the platform update, but it might have poor
performance. In these cases, the user might want to disable the use of
DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL swap chains with --angle-flip=no.
2017-03-26 21:50:01 +11:00
igv
16e8ecb031 vo_opengl: replace uniform variable image_size with input_size
input_size can be the size of a cropped image

Signed-off-by: wm4 <wm4@nowhere>
2017-03-25 13:40:04 +01:00
igv
6b8dadd7c1 vo_opengl: add tex_offset uniform variable to user shaders 2017-03-25 13:39:21 +01:00
igv
2aae5ce0ba vo_opengl: make size of a cropped source image available to user shaders 2017-03-25 13:39:15 +01:00
wm4
b0cbda84ed vo_opengl: add a backend start_frame callback for context_vdpau
Might be useful for other backends too. For context_vdpau, resize
handling, presentation, and handling the mapping state becomes somewhat
less awkward.
2017-03-20 13:37:47 +01:00
wm4
8fb9cc2534 vo_opengl: read framebuffer depth from actual FBO used for rendering
In some cases, such as when using the libmpv opengl-cb API, or with
certain vo_opengl backends, the main framebuffer is never accessed.
Instead, rendering is done to a FBO that acts as back buffer. This meant
an incorrect/broken bit depth could be used for dithering.

Change it to read the framebuffer depth lazily on the first render call.

Also move the main FBO field out of the GL struct to MPGLContext,
because the renderer's init function does not need to access it anymore.
2017-03-20 13:31:28 +01:00
wm4
03fe50651b vo_opengl: move some init_gl code to utility functions 2017-03-20 13:20:35 +01:00
wm4
7e4a73c8e4 vo_opengl: add a --opengl-es=force2 option
Useful for testing. Unfortunately, the nVidia EGL driver ignores this,
and returns a GLES 3.2 context anyway (which it is allowed to do). Might
still be useable with ANGLE, which will really give you a GLES 2 context
if you ask for it.
2017-03-20 04:57:51 +01:00
wm4
f8861f681f vo_opengl: properly respect dither option if dumb mode is used
When dumb mode is used (the "simple" rendering path), respect the dither
options. Options should never be ignored (except in GLESv2 mode); either
they should be respected in dumb mode, or they should disable dumb mode.
In this case, the former applies.
2017-03-20 04:46:18 +01:00
wm4
84bf6aabf0 vo_opengl: context_vdpau: resize output surfaces lazily
This actually fixes the dreaded errors during resizing. It works pretty
much like before, except each surface is reallocated before it's used.
It implies surfaces with the old size remain in the presentation queue
and will be displayed.
2017-03-19 09:26:10 +01:00
wm4
d2dc29f221 vo_opengl: context_vdpau: minor fixes
Don't assume 0 is an invalid object handle. vdpau with its weird API
design makes all objects indexes, with 0 being a perfectly valid and
common value. You need to use VDP_INVALID_HANDLE, which is not 0.

Don't crash if init fails at vdpau initialization. It's because
mp_vdpau_destroy(NULL) crashes. Simplify it.

Destroy output surface backed FBO before output surface. Also, strictly
bookkeep the map/unmap calls (and unmap surfaces before destroying the
FBO/texture). I can't see a change in the weird errors when resizing the
window, but I guess it's slightly more correct.

Add the GL_WRITE_DISCARD_NV symbol to header_fixes.h, because we might
fail compilation with headers that do not contain the vdpau extension
(well, probably doesn't matter).
2017-03-19 09:03:13 +01:00
wm4
03933f3564 vo_opengl: fix some undefined behavior
The gl_timer_last_us() function could access samples[-1]. Fix by
coercing to unsigned, so the % will put it into index [0,max). The
real value returned in this corner case doesn't mean too much, I
guess.
2017-03-18 20:22:50 +01:00
wm4
c3248369ac vo_opengl: add experimental vdpauglx backend
As the manpage says, this has no value other than adding bugs.

It uses code based on context_x11.c, and basically does very stripped
down context creation (no alpha support etc.). It uses vdpau for
display, and maps vdpau output surfaces as FBOs to render into them.

This might be good to experiment with asynchronous presentation. For
now, it presents synchronously, with a 4 frame delay (which should whack
off A/V sync). The forced 4 frame delay is probably also why interaction
feels slower.

There are some weird vdpau errors on resizing and uninit. No idea what
causes them.
2017-03-18 17:43:57 +01:00
wm4
be8c9485b6 vo_opengl: add log field to MGLContext
Should have done this 1000 years ago. Now GL backends can use mp_log
macros directly on the MPGLContext, instead of doing stupid things like
for example MP_WARN(ctx->vo, ...).
2017-03-18 17:40:57 +01:00
Philip Sequeira
a2a5fa4545 options: add M_OPT_FILE to some more file options
(Helps shell completion.)
2017-03-06 15:41:06 +01:00
Nicholas J. Kain
e226041355 filter_kernels: Keep f.radius in terms of dest/filter coords.
The existing code modifies f.radius so that it is in terms of the
filter sample radius (in the source coordinate space) and has
some small errors because of this behavior.

This commit changes f.radius so that it is always in terms of
the filter function radius (in the destination coordinate space).

The sample radius can always be derived by multiplying f.radius
by filter_scale, which is the new, more descriptive name for the
previous inv_scale.
2017-03-06 03:31:40 +00:00
Akemi
2292501533 cocoa: add option to force dedicated GPU
Fixes #3242
2017-02-27 23:53:53 +01:00
Akemi
b5ca8c41cc osx: drop support for OS X 10.7 and earlier 2017-02-27 23:53:53 +01:00
wm4
35498d5957 vo_opengl: hwdec_d3d11egl: make it work with some ANGLE DLL versions
What a fucking waste of time. It depends on with which headers you
compile as well, so the situation is worse and more confusing than
you'd think. God knows what brain fart made them change the numeric
ID without changing the extension name or any other ways to keep
ABI-compatibility and without any warning.
2017-02-27 15:40:11 +01:00
wm4
2b6ac866c0 vo_opengl: use misc/ctype.h instead of <ctype.h>
Locale-independent, and doesn't have the char vs. unsigned char problem.
(Although in this case, the code was fine, because bstr.start is
unsigned char.)
2017-02-25 14:33:27 +01:00
wm4
79272e1469 Fix two typos
They're unrelated. Sue me.
2017-02-20 08:47:17 +01:00
wm4
6e2d3d9919 vo_opengl: remove dxva2 dummy hwdec backend
This was a hack to let libmpv API users pass a d3d device to mpv. It's
not needed anymore for 2 reasons:

1. ANGLE does not have this problem
2. Even native GL via nVidia (where this failed) seems to not require
   this anymore
2017-02-20 08:39:08 +01:00
Aman Gupta
53fab20c6d vo_opengl: implement videotoolbox hwdec on iOS
Implements --hwdec=videotoolbox on iOS. Similar to hwdec_osx.c, but
using CVPixelBuffer APIs available on iOS instead of the equivalent
IOSurface APIs in macOS.
2017-02-17 11:43:24 -08:00
wm4
f02752c0d5 vo_opengl: don't crash on unsupported formats
Regression from recent refactor.
2017-02-17 19:48:29 +01:00
wm4
d8bf000d29 vo_opengl: hwdec_vaegl: use new format setup function
Plus add a helper.
2017-02-17 17:26:01 +01:00
wm4
e59e917e71 vo_opengl: hwdec_osx: use new format setup function
We can drop the custom table.

For some reason, the interop does not accept GL_RGB_RAW_422_APPLE as
internal format for GL_RGB_422_APPLE, so switch the format table to use
GL_RGB (this way both interop and real textures work the same).

Another victim of the apparent requirement of exactly matching texture
formats is kCVPixelFormatType_32BGRA. vo_opengl wants to handle this as
normal RGBA texture, with a swizzle applied in the shader.
CGLTexImageIOSurface2D() rejects this, because it wants the exact
internal format. Just drop the format, because it's useless anyway.

(Maybe this is a bit too fragile...)
2017-02-17 17:08:37 +01:00
wm4
91a2ddfdd7 vo_opengl: hwdec_cuda: use new format setup function
Gives us automatically support for all formats vo_opengl supports.
2017-02-17 16:39:10 +01:00
wm4
eda69f5333 vo_opengl: move texture mapping of pixel formats to helper function
All supported pixel formats have a specific "mapping" of CPU data to
textures. This function determines the number and the formats of these
textures. Moving it to a helper will be useful for some hardware decode
interop backends, since they all need similar things.
2017-02-17 16:28:31 +01:00
wm4
9c54b224d8 vo_opengl: handle GL_LUMINANCE_ALPHA and integer textures differently
GL_LUMINANCE_ALPHA is the only reason why per-plane swizzles exist.
Remove per-plane swizzles (again), and regrettably handle them as
special cases (again). Carry along the logical texture format (called
gl_format in some parts of the code, including the new one).

We also don't need a use_integer flag, since the new gl_format member
implies whether it's an integer texture. (Yes, the there are separate
logical GL formats for integer textures. This aspect of the OpenGL API
is hysteric at best.)

This should change nothing about actual rendering logic and GL API
usage.
2017-02-17 16:28:31 +01:00
wm4
b5eb326dcb videotoolbox: fix RGB format
Wrong colors. This didn't matter for the OpenGL interop code, because
the CV format was mapped to the correct texture format.
2017-02-17 14:14:58 +01:00
wm4
2b5577901d videotoolbox: remove weird format-negotiation between VO and decoder
Originally, there was probably some sort of intention to restrict it to
formats supported by the interop, or something. But in the end it was
overcomplicated nonsense.

In the future, we could use mp_hwdec_ctx.supported_formats or other
mechanisms to handle this in a better way.

mp_hwdec_ctx.ctx is not set to a dummy pointer - hwdec_devices_load() is
only used to detect whether to vo_opengl interop is present, and the
common hwdec code expects that the .ctx field is not NULL.

This also changes videotoolbox-copy to use --videotoolbox-format,
instead of the FFmpeg-set default.
2017-02-17 14:14:22 +01:00
wm4
1e4fd996bb videotoolbox: factor some duplicated code
The code for copying a videotoolbox surface to mp_image was duplicated
(with some minor differences - I picked the hw_videotoolbox.c version,
because it was "better"). mp_imgfmt_from_cvpixelformat() is somewhat
duplicated with the vt_formats[] table, but this will be fixed in a
later commit, and moving the function to shared code is preparation.
2017-02-17 13:32:27 +01:00
wm4
3e474f4654 vo_opengl: hwdec_vaegl: fix potentially undefined memory access 2017-02-14 13:30:48 +01:00
James Ross-Gowan
8bc335e3db vo_opengl: angle: log the device/surface implementation
This should be useful for debugging, since otherwise it's hard to tell
which implementation has been auto-detected or if any failed to init.
2017-02-12 02:40:49 +11:00
James Ross-Gowan
061b752217 vo_opengl: egl_helpers: fix for non-Windows
Whoops. Fixes #4119
2017-02-08 01:27:17 +11:00
James Ross-Gowan
e0250b9604 vo_opengl: angle: rewrite with custom swap chain
This replaces the old backend that exclusively used EGL windowing with
one that can also use ANGLE's ability to render to directly to a
texture. The advantage of this is that it allows mpv to create the swap
chain itself and this allows mpv to use a flip-mode swap chain on a HWND
(which avoids problems with DirectComposition) and to use a longer swap
chain that has six backbuffers by default (which reportedly fixes
problems with rendering 24fps video on 24Hz monitors.)

Also, "screenshot window" should now work on DXGI 1.2 and up (Windows 8
and up.)
2017-02-07 22:45:07 +11:00
James Ross-Gowan
bed94df521 vo_opengl: dxinterop: use the new SAFE_RELEASE macro 2017-01-30 00:22:30 +11:00
wm4
443d3a91d3 vaapi: remove central lock around vaapi API calls
The lock was disabled recently. This commit gets rid of the dummied out
calls. The main reason for removing it is that there is no apparent need
for it anymore, and the new FFmpeg vaapi code does not use or provide
such a lock (there are some places which we cannot control and which do
vaapi API calls, like frame destructors).
2017-01-28 18:27:30 +01:00
wm4
9980b11058 vo_opengl: egl_helpers: fix variable name
It was basically inverted. Not sure how this even happened. Hopefully
it's more an "I don't know what I was doing" instead of an "I don't know
what I am doing" case.
2017-01-26 11:33:58 +01:00
wm4
fd203ff16a options: refacactor how --opengl-dwmflush is declared
Same deal as previous commit, except this time we just readd it as lone
global option, and read it directly.
2017-01-20 14:03:34 +01:00
wm4
d890e0731c options: refactor how --opengl-dcomposition is declared
vo_opengl used to have it as sub-option, which made it very hard to pass
down option values to backends in a generic way (even if these options
were completely backend-specific). For --opengl-dcomposition we used a
VOFLAG to deal with this. Fortunately, sub-options are gone, and we can
just add it as global option.

Move the option to context_angle.c and add it as global option. I
thought about adding a mechanism to let backends declare options, which
would get magically picked up my m_config instead of having to add them
to the global option list manually (similar to VO vo_driver.options),
but decided against this complexity just for 1 or 2 backends. Likewise,
it could have been added as a single option to avoid the boilerplate of
an option struct, but then again there are probably going to be more
angle suboptions, and it's cleaner.
2017-01-20 13:40:59 +01:00
wm4
9d68d8fb0f vo_opengl, vo_opengl_cb: better hwdec interop backend selection
Introduce the --opengl-hwdec-interop option, which replaces
--hwdec-preload. The new option allows explicit selection of the interop
backend.

This is relatively complex, and I would have preferred not to add this,
but it's probably useful to debug certain problems. In exchange, the
"new" option documents that pretty much any but the simplest use of it
will not be forward compatible.
2017-01-17 15:48:56 +01:00
wm4
ff9f2c4b6e vdpau: use libavutil for surface allocation during decoding
Use the libavutil vdpau frame allocation code instead of our own "old"
code. This also uses its code for copying a video surface to normal
memory (used by vdpau-copy).

Since vdpau doesn't really have an internal pixel format, 4:2:0 can be
accessed as both nv12 and yuv420p - and libavutil prefers to report
yuv420p. The OpenGL interop has to be adjusted accordingly.

Preemption is a potential problem, but it doesn't break it more than it
already is.

This requires a bug fix to FFmpeg's vdpau code, or vdpau-copy (as well
as taking screenshots) will fail. Libav has fixed this bug ages ago.
2017-01-17 15:48:56 +01:00
wm4
8a23892d04 vo_opengl: hwdec_cuda: add yuv420p support
Because it allows easier testing of filters + hwdec.

Make the texture setup code a bit more generic so it doesn't get too
much of a mess. We also use the GL renderer utility function
gl_find_unorm_format(), which saves us additional work with OpenGL's
semi-redundant format specifiers.
2017-01-16 16:10:39 +01:00
wm4
6b00663755 vo_opengl: hwdec_cuda: export AVHWDeviceContext
So we can use it for filtering later.
2017-01-16 16:10:39 +01:00
wm4
6c28824a92 vo_opengl: hwdec_vaegl: add a lie for compatibility
EGL rendering + new decode API didn't work due to a certain libva bug
with sort-of legacy API use hitting again. It will report the wrong
vaapi pixel format. It's old code and always nv12 anyway, so stop
worrying about it.
2017-01-13 18:43:35 +01:00
wm4
812128bab7 vo_opengl, vaapi: properly probe 10 bit rendering support
There are going to be users who have a Mesa installation which do not
support 10 bit, but a GPU which can decode to 10 bit. So it's probably
better not to hardcode whether it is supported.

Introduce a more general way to signal supported formats from renderer
to decoder. Obviously this is imperfect, because it still isn't part of
proper format negotation (for example, what if there's a vavpp filter,
which accepts anything). Still slightly better than before.

I don't know any way to probe for vaapi dmabuf/EGL dmabuf support
properly (in particular testing specific formats, not just general
availability). So we stay with the current approach and try to create
and map dummy surfaces on init to probe for support. Overdo it and check
all formats that AVHWFramesConstraints reports, instead of only NV12 and
P010 surfaces.

Since we can support unknown formats now, add explicitly checks to the
EGL/dmabuf mapper code to reject unsupported formats. I also noticed
that libavutil signals support for RGB0/BGR0, but couldn't get it to
work. Remove the DRM formats that are unused/didn't work the way I tried
to use them.

With this, 10 bit decoding + rendering should work, provided you have
a capable CPU and a patched Mesa. The required Mesa patch adds support
for the R16 and GR32 formats. It was sent by a Kodi developer to the
Mesa developer mailing list and was not accepted yet.
2017-01-13 18:43:35 +01:00
wm4
88dfb9a5e7 vo_opengl: hwdec_vaegl: remove redundant vaapi surface format check
For surfaces allocated by libavutil, we assume that the sw_format (i.e.
in hw_subfmt in mp_image_params) is always correct. The API guarantees
that it explicitly sets the equivalent vaapi format on surface
allocation.

For surfaces allocated by mpv's old vaapi code, we explicitly retrieve
the format right after decoding. Unless the driver magically changes the
format asynchronously, it will still be correct once the surface reaches
the renderer.

In both cases, checking the format again is obviously redundant. In
addition, it doesn't require us to maintain a libva fourcc <-> mpfmt
table and the va_fourcc_to_imgfmt() function. This also unbreaks 10 bit
rendering support (still disabled by default).
2017-01-13 10:28:58 +01:00
wm4
6de00d10c8 vo_opengl: hwdec_vaegl: fix terminology in comment
Bad idea to call a component "pixel" - that's true only for the Y plane.
2017-01-13 10:22:11 +01:00
Mark Thompson
14010e7cf6 vo_opengl: hwdec_vaegl: DRM_FORMAT_GR16 was renamed to DRM_FORMAT_GR32
Signed-off-by: wm4 <wm4@nowhere>
2017-01-13 10:18:54 +01:00
wm4
5174add43a vo_opengl: hwdec_vaegl: add experimental P010 support
This does not work, because Mesa has no support for the proposed
DRM_FORMAT_R16 and DRM_FORMAT_GR16 formats. It's also untested of
course.

As long as video/decode/vaapi.c doesn't hand down P010 surfaces, this is
fine anyway.

This can be tested by removing the code that disables P010 output:

diff --git a/video/decode/vaapi.c b/video/decode/vaapi.c
--- a/video/decode/vaapi.c
+++ b/video/decode/vaapi.c
@@ -55,13 +55,6 @@ static int init_decoder(struct lavc_ctx *ctx, int w, int h)

     assert(!ctx->avctx->hw_frames_ctx);

-    // If we use direct rendering, disallow 10 bit - it's probably not
-    // implemented yet, and our downstream components can't deal with it.
-    if (!p->own_ctx && required_sw_format != AV_PIX_FMT_NV12) {
-        MP_WARN(ctx, "10 bit surfaces are currently supported.\n");
-        return -1;
-    }
-
2017-01-12 14:01:09 +01:00
wm4
fb4ae3c06c cuda: use libavutil functions for copying hw surfaces to memory
mp_image_hw_download() is a libavutil wrapper added in the previous
commit. We drop our own code completely, as everything is provided by
libavutil and our helper wrapper.

This breaks the screenshot code, so that has to be adjusted as well.
2017-01-12 13:59:35 +01:00
wm4
854651f4f5 drm: include <poll.h> instead of <sys/poll.h>
I'm not sure what systems have <sys/poll.h> (maybe there are historical
reasons why some would), but POSIX defines <poll.h>. Although this code
is full of highly OS specific calls (like ioctl()), there's no reason
not to use the more standard include path.
2017-01-09 16:21:28 +01:00
wm4
e0f25010c7 vo_opengl: replace 2 memsets
Cosmetic change.
2017-01-08 11:22:55 +01:00
Rostislav Pehlivanov
c17c26f404 context_wayland: do not call vo_wayland_request_frame() upon bufferswap
vo_wayland_wait_events() is going to return when its time to swap the
buffers anyway, calling request_frame() before makes no sense.

Fixes the constant high CPU usage by the compositor when mpv is paused
and the window is in view.
2017-01-07 10:29:15 +00:00
wm4
0067d1dbef vo_opengl: egl: handle potential eglChooseConfig failures
This is actually a pretty important fix. eglChooseConfig() might be the
first thing that fails when porobing for desktop GL / ES2 / ES3 support,
because EGL_RENDERABLE_TYPE is set values specific to the underlying
APIs.

Not sure how the hell this worked before. EGL 1.4 implementations
certainly could fail the call with EGL_BAD_ATTRIBUTE if
EGL_RENDERABLE_TYPE has EGL_OPENGL_ES3_BIT set. It's quite possible that
many EGL implementations tolerate invalid EGLConfig values steming from
uininitialized EGLConfig values (and eglCreateWindowSurface() even is
specified to return EGL_BAD_CONFIG error code for "not valid"
EGLConfigs).
2016-12-31 14:58:46 +01:00
wm4
5ed4119057 vo_opengl: egl: fix depth size parameter
This was accidentally flipped from 0 to 1 in a previous commit. Actually
simply remove it, because 0 is the default value for this parameter
anyway.
2016-12-30 21:45:55 +01:00
wm4
b14fc38590 vo_opengl: x11egl: fix alpha mode
The way it should (probably) work is that selecting a RGBA framebuffer
format will simply make the compositor use the alpha. It works this way
on Wayland. On X11, this is... not done. Instead, both GLX and EGL
report two FB configs, which are exactly the same, except for the
platform-specific visual. Only the latter (non-default) points to a
visual that actually has alpha. So you can't make the pure GLX and EGL
APIs select alpha mode, and you have to override manually.

Or in other words, alpha was hacked violently into X11, in a way that
doesn't really make sense for the sake of compatibility, and forces API
users to wade through metaphorical cow shit to deal with it.

To be fair, some other platforms actually also require you to enable
alpha explicitly (rather than looking at the framebuffer type), but they
skip the metaphorical cow shit step.
2016-12-30 20:04:47 +01:00
wm4
58a0c43cf4 vo_opengl: x11: move RGBA visual test to x11_common.c
So that the EGL code can use it too.

Also print the actual FB config ID, instead of nonsense. (I _think_ once
in the past a certain GLX implementation just used numeric config IDs
casted to EGLConfig - or at least that would explain this nonsense.)
2016-12-30 20:04:32 +01:00
wm4
7033a4334b vo_opengl: egl_helpers: add a way to override config selection
Preparation for the following commits. Since at least theoretically the
config selection depends on the context type (EGL_RENDERABLE_TYPE has
separate bits for ES 2, ES 3, and desktop GL), doing it any other way
would be too painful.
2016-12-30 20:03:50 +01:00
wm4
d4e7b981bf vo_opengl: egl_helpers: add a way to pass more options
For X11 garbage we have to pass some annoying parameters to EGL context
creation. Add some sort of extensible API, so that adding a new
parameter doesn't break all callers. We still want to keep it as a
single function, because it's so nice isolating all the EGL nonsense API
boilerplate like this. (Did I mention yet that X11 and EGL are garbage?)

Also somewhat simplifies the vo_flags mess in the helper internals.
2016-12-30 20:03:17 +01:00
Niklas Haas
22a22322cb vo_opengl: partially fix rotation for 4:2:2 content
The chroma alignment renormalization code forgot to account for the fact
that the chroma subsampling ratio has to be rotated.

Unfortunately, doing it this way seems to have somewhat broken the
chroma offset rotation logic for odd-sized subsampled image files. While
this is a bug, it's much, much less noticeable, so it's not nearly as
important as the bug this change fixes. Either way, a future patch needs
to still revise this logic, ideally by redesigning the entire rotation
mechanism.
2016-12-28 15:10:48 +01:00
Philip Langdale
83c5f704e7 vo_opengl: hwdec_cuda: Don't include hwcontext headers
After various simplifications, these includes simply aren't needed
now.
2016-12-04 20:41:42 +01:00
wm4
09238a9bb5 vo_opengl: don't rely on viewport to contain window dimensions
Apparently we don't always set the viewport to window dimensions
anymore, e.g. if nothing is actually rendered. This means the viewport
can contain old values.

The window screenshot code uses the viewport values to guess the default
framebuffer dimensions. With --force-window --idle --no-osc (which draws
nothing and issues a glClear() command only), taking a screenshot would
yield an image with the wrong size and possibly garbage in it. Fix this
by explicitly passing the currently known window dimensions. Abusing the
values stored in the viewport was questionable anyway.
2016-12-02 15:26:45 +01:00
wm4
1a2319f3e4 options: remove deprecated sub-option handling for --vo and --ao
Long planned. Leads to some sanity.

There still are some rather gross things. Especially g_groups is ugly,
and a hack that can hopefully be removed. (There is a plan for it, but
whether it's implemented depends on how much energy is left.)
2016-11-25 21:17:25 +01:00
pavelxdd
98a257b3a8 angle_dynamic: silence warnings during compilation
If Angle is statically linked there were some warnings during compilation.

Fixes #3834
2016-11-25 09:49:49 +01:00
Philip Langdale
48a7c4be3a vo_opengl: hwdec_cuda: Prefix cuda symbols to avoid collisions
We want to avoid causing problems if libmpv is used in an application
that links cuda, or if the libav* libraries are linked with cuda,
as might happen if the scale_npp filter is used.
2016-11-24 20:15:57 +01:00
wm4
5aab17f833 vo_opengl: hwdec_cuda: make some init errors verbose
Improves autoprobe behavior. This is equivalent to other hwdec interop
wrappers. If CUDA is just not available, it should remain silent.
2016-11-24 18:14:55 +01:00
pavelxdd
e89382f5f6 vo_opengl: hwdec_cuda: fix crash when trying to use hwdec=cuda if cuda SDK is not present
If CUDA SDK wasn't installed, mpv crashed immediately with the message "Failed to load CUDA symbols"
2016-11-24 14:42:30 +01:00
Philip Langdale
7eacaf51f8 vo_opengl/cuda_dynamic: Use explicit cast to silence warnings on windows
Fixes #3834
2016-11-24 11:40:52 +01:00
Philip Langdale
3abb6f1fef wscript: Fix cuda test to actually work when cuda SDK is not present
The test ended up failing if cuda.h wasn't present, even if cuda.h
isn't used during the actual build.

This test is attempting to establish if the ffmpeg being built
against has dynlink_cuda support. While it might theoretically be
possible to build against the older normally-linked-cuda version
of ffmpeg, it seems more trouble than it's worth.
2016-11-23 20:48:26 +01:00
wm4
f696975fe3 angle_dynamic: minor simplification
Remove the inverted condition by swapping if branches.
2016-11-23 16:02:28 +01:00
Martin Herkt
f9668f5596
Support linking ANGLE 2016-11-23 04:09:16 +01:00
wm4
6b3d682f4e vo_opengl: hwdec_d3d11egl: fix ANGLE fallback define
This was a typo in the extensiuon spec and was probably always broken.
Could have led to broken builds when used with ancient ANGLE headers
(or possibly generic EGL headers).
2016-11-23 01:09:19 +01:00
Philip Langdale
f5e82d5ed3 vo_opengl: hwdec_cuda: Use dynamic loading for cuda functions
This change applies the pattern used in ffmpeg to dynamically load
cuda, to avoid requiring the CUDA SDK at build time.
2016-11-23 01:07:26 +01:00
Philip Langdale
585c5c34f1 vo_opengl: hwdec_cuda: Support P016 output surfaces
The latest 375.xx nvidia drivers add support for P016 output
surfaces. In combination with an ffmpeg change to return those
surfaces, we can display them.

The bulk of the work is related to knowing which format you're
dealing with at the right time. Once you know, it's straight forward.
2016-11-22 20:19:58 +01:00
Philip Sequeira
cf85191cb7 vo_opengl: blend against background color for --alpha=blend
Do it after color management, etc. so that it matches the color drawn in
the margins.
2016-11-13 18:21:27 +01:00
wm4
6a06e6002b vo_opengl: fix --blend-subtitles handling
The intention was that if --blend-subtitles is enabled, the frame should
always be re-rendered instead of using e.g. a cached scaled frame. The
reason is that subtitles can change anyway, e.g. if you pause and change
subtitle size and such.

On the other hand, if the frame is marked as repeated, it should always
use the cached copy. Actually "simplify" this and drop the cache only if
playback is paused (which frame->still indicates indirectly).

Also see PR #3773.
2016-11-07 22:49:24 +01:00
wm4
de72cb2c33 vo_opengl: fix redrawing with hardware decoding
unmap_current_image() is called after rendering. This essentially
invalidates the textures, so we can't assume that the image is still
present.

Also see PR #3773.
2016-11-07 22:49:16 +01:00
Niklas Haas
654721c27b filter_kernels: add ability to taper kernels/windows
This allows us to define the tukey window (and other tapered windows).

Also add a missing option definition for `wblur` while we're at it, to
make testing out window-related stuff easier.
2016-11-01 16:25:40 +01:00
wm4
17733bd5b8 vo_opengl: make frame reupload logic more robust
It's not that easy to decide whether a frame needs to be
reuploaded/rerendered. Using unique frame IDs for input makes it
slightly easier and more robust. This also removes the use of video PTS
in the interpolation path.

This should also avoid reuploading the video frame if it's just redrawn
in paused mode, or when using OSD/subtitles in cover art mode.
2016-11-01 16:25:40 +01:00
wm4
9ea9bdf130 vo_opengl: context_rpi: fix stdatomic usage
atomic_bool is not supported with e.g. atomic_fetch_and.

Fixes #3699. Untested.
2016-10-21 17:35:48 +02:00
wm4
202f695398 vo_opengl: partially re-enable glFlush() calls
It turns out the glFlush() call really helps in some cases, though only
in audio timing mode (where we render, then wait for a while, then
display the frame). Add a --opengl-early-flush=auto mode, which does
exactly that.

It's unclear whether this is fine on OSX (strange things going on
there), but it should be.

See #3670.
2016-10-21 17:23:26 +02:00
Dmitrij D. Czarkoff
ee2ba599e7 build: don't rely on "__thread" being always available with GCC
Thread-local storage in GCC is platform-specific, and some platforms that
are otherwise perfectly capable of running mpv may lack TLS support in GCC.

This change adds a test for GCC variant of TLS and relies on its result
instead of assumption.

Provided that LLVM's `__thread` support is similar to GCC, the test is
called "GCC/LLVM TLS".

Signed-off-by: wm4 <wm4@nowhere>
2016-10-20 17:51:57 +02:00
Aman Gupta
4bd3e51fbe opengl: compile against iOS OpenGLES implementation 2016-10-20 17:45:25 +02:00
rr-
403f489f6c vo_drm: change CLI options + refactors
- Change connector selection to accept human readable names (such as
  eDP-1, HDMI-A-2) rather than arbitrary numbers.
- Change GPU selection to accept GPU number rather than device paths.
- Merge connector and GPU selection into one --drm-connector.
- Add support for --drm-connector=help.
- Add support for --drm-* in EGL backend.
- Refactor KMS; reduce state sharing across drm_common.
2016-10-07 00:22:23 +02:00
Akemi
e543853a7f cocoa: add glFlush() to cocoa backend
The glFlush() call was made optional recently
since it's not needed in most cases. On OSX though
this is needed since we removed kCGLPFADoubleBuffer
from the context creation, so the glFlush() call
was added to the cocoa backend only.
The CGLFlushDrawable() call can be safely removed
since it only does something when a double
buffered context is used. Also fixes a small typo.
Fixes #3627.
2016-10-06 19:50:25 +02:00
wm4
53798b6465 vo_opengl: apply --opengl-early-flush in dumb mode too
In "dumb mode" (where most features are disabled and which only performs
some basic rendering) we explicitly copy a set of whitelisted options,
and leave all the other options at their default values. Add the new
--opengl-early-flush option to this whitelist. Also remove an option
field accidentally added in the commit adding --opengl-early-flush.
2016-10-05 20:35:00 +02:00
wm4
6789f9b094 vo_opengl: disable glFlush() by default, and add an option to enable it
It seems this can cause issues with certain platforms, so better to
disable it by default. The original reason for this isn't overly
justified, and display-sync mode should get rid of the need for it
anyway.

The new option is meant for testing, and will probably be removed if
nobody comes up and reports that enabling the option actually improves
anything.
2016-10-05 12:21:34 +02:00
wm4
202f14aa29 vo_opengl: hwdec_rpi: fix NULL pointer deref in certain cases
If a client API user provides the MPGetNativeDisplay callback, but
returns NULL for "MPV_RPI_WINDOW", this would crash.
2016-10-04 16:36:20 +02:00
rr-
1648ff8a0f vo_drm: refactor getting display fps
Reduces code duplication between OpenGL backend and DRM VO.

(The control() for OpenGL backend isn't sufficiently similar to the
VO's control() to consider merging it as a whole - I extracted only the
FPS code.)
2016-10-04 13:23:11 +02:00
wm4
486b3ce6f8 vo_opengl: minor simplification
The extra gl_transform_trans() has no apparent use.
2016-10-01 16:12:03 +02:00
wm4
82231fd74d vo_opengl: attempt to fix chroma offset under rotation and flipping
Other than being overly convoluted, this seems to make sense to me.
Except that to get the "rot" transform I have to set flip=true, which
makes no sense at all to me.
2016-10-01 16:07:51 +02:00
wm4
052584c9e2 vo_opengl: add debugging options for testing with padded textures 2016-10-01 12:09:18 +02:00
wm4
52fea2f909 vo_opengl: partially fix dumb-mode cropping with rotation
Combining rotation and cropping didn't work. It was just completely
broken.

I'm still not sure if this is correct. Chroma positioning seems to be
broken on rotation. There might also be a problem with non-mod-2 frame
sizes. Still, strictly an improvement for both rotated and non-rotated
rendering modes.

Also, this could probably be written in a more elegant way.
2016-09-30 22:19:01 +02:00
wm4
33c24b07e4 vo_opengl: vaegl: log more debugging infos 2016-09-30 14:36:42 +02:00
wm4
5f547e57e3 vo_opengl: rpi: remove dumb comment
It's not even true anymore.
2016-09-30 14:28:55 +02:00