1
mirror of https://github.com/mpv-player/mpv synced 2024-11-14 22:48:35 +01:00
Commit Graph

47244 Commits

Author SHA1 Message Date
wm4
d12264acc0 vo_libmpv: always create ctx->dispatch
Preparation for the next commit. Until now, it was only needed if DR was
involved. One reason for not always creating it was that you normally
must not use it if advanced_control is not enabled. This is why e.g.
VOCTRL_SCREENSHOT now checks for that variable; it still can't use
ctx->dispatch if the render API user did not enable it.
2019-09-20 14:53:21 +02:00
wnoun
35da5a4d8e render api: fix use-after-free
render api needs to wait for vo to be destroyed before frees the context.
The purpose of kill_cb is to wake up render api after vo is destroyed,
but uninit did that before kill_cb, so kill_cb tries using the freed
memory. Remove kill_cb to fix the issue as uninit is able to do the
work.
2019-09-20 13:54:17 +02:00
Cameron Cawley
db09d77e46 rpi: Update for modern systems 2019-09-20 11:39:06 +02:00
wm4
a46f83e982 Coypright: update list of files 2019-09-20 11:38:16 +02:00
wm4
257534ed18 vo: remove unused equalizer control remains
Equalizer control was redone in 03cf150ff3 (over 2 years
ago). Ever since, the equalizer control structs and the GET voctrl have
been unused. Only the SET voctrl is still used as notification mechanism
(actually a bad hack to avoid some further option change handling
complexity).

Remove the unused parts.
2019-09-20 00:45:17 +02:00
wm4
8e5cd62dca oml_sync: fix typo in comment
I think... Also reword another part of the text.
2019-09-20 00:32:29 +02:00
wm4
7000d91cf8 vf_fingerprint: use aligned_alloc instead of posix_memalign
I was assuming posix_memalign was the most portable function to use, but
MinGW does not provide it for some reason. Switch to C11 aligned_alloc()
which someone suggested was provided by MinGW (but actually isn't,
someone probably confused it with the incompatible _aligned_malloc),
and add a configure check.

Even though it turned out that MinGW doesn't provide it, the function
is slightly more elegant than posix_memalign(), so stay with it.
2019-09-19 23:09:02 +02:00
wm4
d75bdf070f demux_lavf: document intentional FFmpeg API violation
This field is documented as internal, so an API user should not
access it. However, this is the only way to get some read statistics
without replacing FFmpeg's entire HLS demuxer. (Using custom I/O as
workaround doesn't work: the HLS code uses some weird internal APIs
that cannot be provided by FFmpeg API users; I even made the author
of the relevant patch to provide a public API, but which was shot
down by another FFmpeg developer. So I take this as my right to
access this field.)

Mention this explicitly, as it affects ABI and API compatibility, and
I don't want that anyone claims this was a "mistake". Add some
explanations.
2019-09-19 20:37:05 +02:00
wm4
389f1b0ef3 packet: fix theoretical UB if called on "empty" packets
In theory, a 0 size allocation could have made it memset() on a NULL
pointer (with a non-0 size, which makes it crash in addition to
theoretical UB).

This should never happen, since even packets with size 0 should have an
associated allocation, as FFmpeg currently does. But avoiding this makes
the API slightly more orthogonal and less tricky, I guess.
2019-09-19 20:37:05 +02:00
wm4
9a7a6958ca Revert "demux/packet: fix demux_packet_shorten"
This reverts commit 95636c65e7.

This change shouldn't be needed, and in fact it's wrong. The FFmpeg API
function could do anything it wants with the packet, including changing
the packet data pointer. Likewise, it's not guaranteed that the
referenced packet's fields mirror the current state of the mpv packet
struct (the AVPacket is only kept for the AVBuffer and the side data
stuff).
2019-09-19 20:37:05 +02:00
wm4
e60bcf0d98 client API: fix some comments 2019-09-19 20:37:05 +02:00
wm4
0cd8ba72fe sd_lavc: support scaling for bitmap subtitles 2019-09-19 20:37:05 +02:00
wm4
4c5df406a8 demux: fix another incorrect BOF cache flag issue 2019-09-19 20:37:05 +02:00
wm4
380033f4a2 f_swscale: fix a typo 2019-09-19 20:37:05 +02:00
wm4
6b7ecb30d8 manpage: input.rst: fix a typo 2019-09-19 20:37:05 +02:00
wm4
9cfeafa89e video: add vf_fingerprint and a skip-logo script
skip-logo.lua is just what I wanted to have. Explanations are on the top
of that file. As usual, all documentation threatens to remove this stuff
all the time, since this stuff is just for me, and unlike a normal user
I can afford the luxuary of hacking the shit directly into the player.

vf_fingerprint is needed to support this script. It needs to scale down
video frames as part of its operation. For that, it uses zimg. zimg is
much faster than libswscale and generates more correct output. (The
filter includes a runtime fallback, but it doesn't even work because
libswscale fucks up and can't do YUV->Gray with range adjustment.)

Note on the algorithm: seems almost too simple, but was suggested to me.
It seems to be pretty effective, although long time experience with
false positives is missing. At first I wanted to use dHash [1][2], which
is also pretty simple and effective, but might actually be worse than
the implemented mechanism. dHash has the advantage that the fingerprint
is smaller. But exact matching is too unreliable, and you'd still need
to determine the number of different bits for fuzzier comparison. So
there wasn't really a reason to use it.

[1] https://pypi.org/project/dhash/
[2] http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
2019-09-19 20:37:05 +02:00
wm4
da612acacd command: make vf-metadata/af-metadata somewhat observable
Until now they weren't observable and never reported any updates. Apply
a shitty hack to make them mostly-observable. It relies on the "idle"
event, which is basically triggered on every frame displayed, or
similar. This can lead to property change notifications not being sent
quickly enough.

The cleaner solution would be adding a notification mechanisms from
filters, but I'm too lazy for that.
2019-09-19 20:37:05 +02:00
wm4
a05b847879 command: make vf-metadata/af-metadata not query metadata twice
For simplicity, these properties usually query the metadata from the
filter twice, even if it's not technically needed at all. The reason for
this is mostly the horrible (and legacy) sub-path access (which is why
tag_property() is so complex).

But for simple cases, we can easily avoid double querying, so do that.
The benefit is performance (well, won't matter), and supporting filters
that reset information on query (for later).
2019-09-19 20:37:05 +02:00
wm4
e1157cb6e8 video: generally try to align image data on 64 bytes
Generally, using x86 SIMD efficiently (or crash-free) requires aligning
all data on boundaries of 16, 32, or 64 (depending on instruction set
used). 64 bytes is needed or AVX-512, 32 for old AVX, 16 for SSE. Both
FFmpeg and zimg usually require aligned data for this reason.

FFmpeg is very unclear about alignment. Yes, it requires you to align
data pointers and strides. No, it doesn't tell you how much, except
sometimes (libavcodec has a legacy-looking avcodec_align_dimensions2()
API function, that requires a heavy-weight AVCodecContext as argument).

Sometimes, FFmpeg will take a shit on YOUR and ITS OWN alignment. For
example, vf_crop will randomly reduce alignment of data pointers,
depending on the crop parameters. On the other hand, some libavfilter
filters or libavcodec encoders may randomly crash if they get the wrong
alignment. I have no idea how this thing works at all.

FFmpeg usually doesn't seem to signal alignment internal anywhere, and
usually leaves it to av_malloc() etc. to allocate with proper alignment.
libavutil/mem.c currently has a ALIGN define, which is set to 64 if
FFmpeg is built with AVX-512 support, or as low as 16 if built without
any AVX support. The really funny thing is that a normal FFmpeg build
will e.g. align tiny string allocations to 64 bytes, even if the machine
does not support AVX at all.

For zimg use (in a later commit), we also want guaranteed alignment.
Modern x86 should actually not be much slower at unaligned accesses, but
that doesn't help. zimg's dumb intrinsic code apparently randomly
chooses between aligned or unaligned accesses (depending on compiler, I
guess), and on some CPUs these can even cause crashes. So just treat the
requirement to align as a fact of life.

All this means that we should probably make sure our own allocations are
64 bit aligned. This still doesn't guarantee alignment in all cases, but
it's slightly better than before.

This also makes me wonder whether we should always override libavcodec's
buffer pool, just so we have a guaranteed alignment. Currently, we only
do that if --vd-lavc-dr is used (and if that actually works). On the
other hand, it always uses DR on my machine, so who cares.
2019-09-19 20:37:05 +02:00
wm4
36dd2348a1 ta: destroy/free children in reverse order
This matters when talloc allocations set destructors. Before this
commit, destructors were called in the same order as they were added to
the parent allocations. Now it happens in reverse order.

I think this makes more sense. It's reasonable to assume that an
allocation that was added later may depend on any of the previous
allocations, so later additions should be destroyed first. (Of course
other orders are entirely possible too.)

Hopefully this doesn't fix or break anything, but I can't be sure (about
either of those). It's risky. (Then why do it?)

The destructor of a parent allocation is called before its children. It
makes sense and must stay this way, because in most cases, the
destructor wants to access the children.

This is a reason why I don't really like talloc (it wasn't my idea to
use talloc, is my excuse). Quite possible that destructors should be
removed from talloc entirely. Actually, this project should probably be
rewritten in Rust (or a better language), but that would be even more of
a pain; also, I think this is just the right level of suffering and
punishment.
2019-09-19 20:37:05 +02:00
wm4
0edccfd820 m_config: add assertion to a specific case
It seems using multiple prefixes for an option isn't supported out of
laziness (and shouldn't, because what the fuck). So assert() on this.

(Unfortunately this prefix nonsense is still needed. Especially AO and
VO options use this through the options_prefix field.)
2019-09-19 20:37:05 +02:00
wm4
ae797a21ec command: don't add deprecated CLI aliases to property list
A dumb thing that the cursed property-option bridge accidentally did.

Normal deprecated options on the other hand are fine in the property
list, because they're wanted for compatibility.
2019-09-19 20:37:05 +02:00
wm4
2f5dbaa832 options: deprecate --stream-record
It's inadequate for most uses. There are better mechanisms.
2019-09-19 20:37:05 +02:00
wm4
ad20f808af builtin.conf: add clarifications
Just in case a confused user or developer finds this file and wonders
why it's "incomplete".
2019-09-19 20:37:05 +02:00
wm4
8ffd1073a2 m_config: remove m_config_create_shadow
A previous commit changed m_config so that it always creates the shadow
thing, and the function's only remaining purpose was to initialize
mpv_global. It makes much more sense to do that at the caller, and it's
only 1 line of code too.
2019-09-19 20:37:05 +02:00
wm4
cd3394f039 m_config: further minor simplifications
Now m_config_shadow is fully independent from m_config (except for the
fact that m_config is still involved in its creation).
2019-09-19 20:37:05 +02:00
wm4
4e6ede3c53 m_config: simplify some minor crap
m_config has a m_config_option array, that is used for all option
access. The code maintaining shadow copies also tried to make use of it,
and did so by "cleverly" assigning each m_sub_options run a slice of
that array. But actually it's much simpler to, you know, directly access
the damn options.

This helps separation m_config and the general option code slightly.

Still seems to work after a superficial test, good enough.
2019-09-19 20:37:05 +02:00
wm4
059262c746 m_config: move group list to internal context
This is good because a private thing is not so public anymore, and it's
also preparation for further changes.

Some tricky memory management issues: m_config_data (i.e. config->data)
now depends on m_config_shadow, instead of m_config. In particular,
free_option_data() accesses the m_config_shadow.groups array. Obviously
it must be freed before m_config_shadow.
2019-09-19 20:37:05 +02:00
wm4
98627d3a32 io: remove Windows tmpfile() emulation
Unused now. The old stream cache used it, but it was removed.

On a side note, the demuxer cache uses mp_mkostemps(). It looks like our
Windows open() emulation handles this correctly by using CREATE_NEW, so
no functionality gets lost by the "new" approach. On the other hand, the
demuxer cache does not set FILE_FLAG_DELETE_ON_CLOSE, but instead tries
to delete the file after opening (POSIX style), which probably won't
work on Windows. But I'm not sure how to make it use the DELETE_ON_CLOSE
flag, so whatever.
2019-09-19 20:37:05 +02:00
wm4
8576374faa m_config: add/move some comments
Move the comments documenting exported functions to the header. It looks
like the header is the preferred place for that (although I don't really
appreciate headers where you lose the overview because of all the
documentation comments). Add comments to some undocumented prototypes.
2019-09-19 20:37:05 +02:00
wm4
b37a9685ad m_config: remove an unused function
This was one of those "shouldn't exist" type of functions that could
access internals that were supposed to be isolated away, but some code
needed to access it anyway.

It looks like the last use of it went away in 2016, shortly after it was
introduced.
2019-09-19 20:37:05 +02:00
wm4
e265c07547 vo: fix missed option updates under rare circumstances
Dear diary,

today I fixed a shitty bug that was all my fault because I made a
horrible mess. (Except it was a horrible mess before I even touched
this shit, but let's not blame others.)

Sometimes, updates to VO option that control video sizing (like panscan)
didn't update the screen correctly. They were delayed until the next
option change or so.

It turns out that if the option update happens at the "same" time as a
VOCTRL, update_opts() doesn't actually notify the vo_driver of the
change. This in turn happened because run_control() called
m_config_cache_update(). The latter function returns true if the options
changed since the last call, and update_opts() also calls it (on the
same config cache) for the same purpose. The update_opts() call, which
is triggered by a third mechanism, comes later, but the cache update
call will return false (as it should). Basically, given the config API,
you can't act differently on multiple update calls and expect it to
work. The skipped handling in update_opts() meant that the notification
required to apply the changed option wasn't run.

Fix this by simply calling update_opts() directly instead. Now there's
only 1 m_config_cache_update() call on this specific instance. Fix the
call in run_reconfig() too, so the previous sentence isn't a lie (but it
probably doesn't make a difference in practice due to certain details).

I'm not sure how I even ran into this sort-of race condition. The VOCTRL
that messed up the option update was VOCTRL_UPDATE_PLAYBACK_STATE, which
happens semi-regularly.

Why this config cache shit and all the other shit? Rediscovering this
crap wasn't pleasant. It's a bunch of hacks that became necessary when
the ancient MPlayer architecture made it hard to move the VO to a
separate thread.

All the VO code typically accesses vo->opts (whose fields all used to be
global variables in MPlayer). The frontend changes these on user input.
Putting locking around all the options would be a nightmare, and keeping
a copy of the options in the thread was much simpler. You need a way to
propagate option changes, notify the thread, and update the local copy
too. And the result of these thoughts was the config cache mechanism.

In this specific case, the relevant cache update call in update_opts()
triggers a VOCTRL_SET_PANSCAN to the VO driver, which isn't related to
its former function anymore. Instead, it causes the VO driver to update
the video sizing/placing options, which the generic VO code can't do.
(Mostly because the VO driver includes the windowing stuff and is
responsible for resizing etc. itself.)

VOCTRLs sent by the frontend are even worse. MPlayer had no real runtime
option change mechanism. Some options were vaguely duplicated by
properties, so you could effectively change those options at runtime.
Each of these options had its own VOCTRL, which still exist today, e.g.
VOCTRL_FULLSCREEN, or VOCTRL_ONTOP. I tried to make all options runtime
changeable, and to unify properties with options. But I couldn't be
bothered with updating all VO drivers to listen to option changes
directly, because that would be pretty tedious. So the property code is
still all there and sends the old VOCTRLs. But of course you need to
sync up the options, which is why the run_control() code did that.

(Unrelated: VO_EVENT_FULLSCREEN_STATE is the worst shithack of them all.
Currently, only the frontend can actually write to options (for awful
reasons), so if the fullscreen state changes due to outside interaction,
the VO driver can't update the corresponding option fields. So the VO
notifies the frontend with said VO_EVENT_, and the frontend then sends
VOCTRL_GET_FULLSCREEN, and updates the global copy of the option with
the value returned by that. I still like to think the situation is not
that bad considering the monstrous effort of converting single-threaded
code that had hundreds of options in global variables to multi-threaded
code with no global variables at all.)
2019-09-19 20:37:05 +02:00
wm4
82f2613ade command, demux: add AB-loop keyframe cache align command
Helper for the ab-loop-dump-cache command, see manpage additions.

This is kind of shit. Not only is this a very "special" feature, but it
also vomits more messy code into the big and already bloated demux.c,
and the implementation is sort of duplicated with the dump-cache code.
(Except it's different.) In addition, the results sort of depend what a
video player would do with the dump-cache output, or what the user wants
(for example, a user might be more interested in the range of output
audio, instead of the video).

But hey, I don't actually need to justify it. I'm only justifying it for
fun.
2019-09-19 20:37:05 +02:00
wm4
1dd0b2fe34 command: shuffle cache-dump start message
This is better?
2019-09-19 20:37:05 +02:00
wm4
f7678575a5 recorder: always mux all packets on discont/close
This is the muxer used by all 3 stream recording features (why are there
so many?). It tried hard to avoid writing broken files. In particular,
it buffered packets until it new there was a keyframe packet (which, in
mpv's/FFmpeg's definition, mean seek points from which decoding can
resume), or final EOF. The danger that was probably considered here was
that due to video frame reordering, not muxing some trailing, missing
packets of a keyframe range could lead to broken decoding or skipped
frames, so better discard packets belonging to an incomplete range.
Sounds like a good idea so far.

Unfortunately, this will drop an entire keyframe range even if the
current packet run is complete and mp_recorder_mark_discontinuity() is
called, simply because recorder.c can not know that the next packet
would have been a keyframe.

It seems better to mux all packets to avoid losing valid data, even if
it means that sometimes packets/frames will be missing from the file. It
benefits especially the dump-cache command, which will call the function
to signal a discontinuity after every range. Before this commit, it
discarded the last packets, even if they were perfectly fine.

(An alternative solution for dump-cache would have been a second
discontinuity marker function, that communicates that the current packet
range is complete. But this commit's solution is simpler and overall
more robust, at the danger of producing more semi-broken files.)

This may make some of the complex buffering/waiting logic in recorder.c
pointless.

Untested (in this final form).
2019-09-19 20:37:05 +02:00
wm4
9d97c4d814 manpage: mention that there's a Lua API for async commands
But don't tell the reader which those APIs are. Hope the user will just
search for "async" in the Lua section (lua.rst). But of course, nobody
will ever care about anything related to this.
2019-09-19 20:37:05 +02:00
wm4
023b5964b0 demux, command: add a third stream recording mechanism
That's right, and it's probably not the end of it. I'll just claim that
I have no idea how to create a proper user interface for this, so I'm
creating multiple partially-orthogonal, of which some may work better in
each of its special use cases.

Until now, there was --record-file. You get relatively good control
about what is muxed, and it can use the cache. But it sucks that it's
bound to playback. If you pause while it's set, muxing stops. If you
seek while it's set, the output will be sort-of trashed, and that's by
design.

Then --stream-record was added. This is a bit better (especially for
live streams), but you can't really control well when muxing stops or
ends. In particular, it can't use the cache (it just dumps whatever the
underlying demuxer returns).

Today, the idea is that the user should just be able to select a time
range to dump to a file, and it should not affected by the user seeking
around in the cache. In addition, the stream may still be running, so
there's some need to continue dumping, even if it's redundant to
--stream-record.

One notable thing is that it uses the async command shit. Not sure
whether this is a good idea. Maybe not, but whatever. Also, a user can
always use the "async" prefix to pretend it doesn't.

Much of this was barely tested (especially the reinterleaving crap),
let's just hope it mostly works. I'm sure you can tolerate the one or
other crash?
2019-09-19 20:37:05 +02:00
wm4
226e050b83 demux: move packet cache reading to a function
Useful for a following commit.
2019-09-19 20:37:05 +02:00
wm4
c43846f2cb screenshot: move message showing to common code
The screenshot command has this weird behavior that it shows messages
both on terminal and OSD by default, but that a command prefix can be
used to disable the OSD message.

Move this mechanism to common code, and make this available to other
commands too (although as of this commit only the screenshot commands
use it).

This gets rid of the weird screenshot_ctx.osd field too, which was sort
of set on a command, and sometimes inconsistently restored after the
command.
2019-09-19 20:37:05 +02:00
wm4
b55b9cb98c demux: move a seek helper to a separate function
It makes some slight sense and helps with one of the following commits.

Also rename that other function to make it sound less similar to
find_seek_target().
2019-09-19 20:37:05 +02:00
wm4
7893ab5a7e demux: minor simplification for backward cache size option
Always set max_bytes_bw to 0 if seekable cache is disabled, instead at
the place of its use. This is the only use of it, so the commit should
not change any behavior.

(Alternatively, this could drop the max_bytes_bw variable, use the
option directly, and keep the old code that resets it on use of the
cache is disabled.)
2019-09-19 20:37:05 +02:00
wm4
5c0a626dee demux: allow backward cache to use unused forward cache
Until now, the following could happen: if you set a 1GB forward cache,
and a 1GB backward cache, and you opened a 2GB file, it would prune away
the data cached at the start as playback progressed past the 50% mark.

With this commit, nothing gets pruned, because the total memory usage
will still be 2GB, which equals the total allowed memory usage of 1GB +
1GB.

There are no explicit buffers (every packet is malloc'ed and put into a
linked list), so it all comes down to buffer size computations. Both
reader and prune code use these sizes to decide whether a new packet
should be read / an old packet discarded. So just add the remaining free
"space" from the forward buffer to the available backward buffer. Still
respect if the back buffer is set to 0 (e.g. unseekable cache where it
doesn't make sense to keep old packets).

We need to make sure that the forward buffer can always append, as long
as the forward buffer doesn't exceed the set size, even if the back
buffer "borrows" free space from it. For this reason, always keep 1 byte
free, which is enough to allow it to read a new packet. Also, it's now
necessary to call pruning when adding a packet, to get back "borrowed"
space that may need to be free'd up after a packet has been added.

I refrained from doing the same for forward caching (making forward
cache use unused backward cache). This would work, but has a
disadvantage. Assume playback starts paused. Demuxing will stop once the
total allowed low total cache size is reached. When unpausing, the
forward buffer will slowly move to the back buffer. That alone will not
change the total buffer size, so demuxing remains stopped. Playback
would need to pass over data of the size of the back buffer until
demuxing resume; consider this unacceptable. Live playback would break
(or rather, would not resume in unintuitive ways), even normal streaming
may break if the server invalidates the URL due to inactivity. As an
alternative implementation, you could prune the back buffer immediately,
so the forward buffer can grow, but then the back buffer would never
grow. Also makes no sense.

As far as the user interface is concerned, the idea is that the limits
on their own aren't really meaningful, the purpose is merely to vaguely
restrict the cache memory usage. There could be just a single option to
set the total allowed memory usage, but the separate backward cache
controls the default ratio of backward/forward cache sizes. From that
perspective, it doesn't matter if the backward cache uses more of the
total buffer than assigned, if the forward buffer is complete.
2019-09-19 20:37:05 +02:00
wm4
e6911f82a5 demux: don't clobber internal demuxer EOF state in cache seeks
The last_eof field is the last known EOF state from the underlying
demuxer. Normally, seeks reset it, because obviously if seek back into
the middle of the file, you don't want last_eof to have a "wrong" value
for a short time window (until a packet is read, which would reset the
field to its correct value).

This shouldn't happen during cache seeks, because you don't touch the
underlying demuxer state.

At first, I made this change because some other work in progress
required it. It turned out that it was unnecessary, but keep the change
anyway, since it's still correct and makes the logic cleaner.
2019-09-19 20:37:05 +02:00
wm4
9e1945d307 win_state: silence a valgrind warning
m_geometry_apply() will read and modify the dummy variable. It's not
actually used for anything, but valgrind will still warn against
uninitialized data. I'm not sure whether this was UB, but in any case
it's annoying when running valgrind.
2019-09-19 20:37:05 +02:00
wm4
11027e99f2 packet: change memory estimation heuristics
Determining how much memory something uses is very hard, especially in
high level code (yes we call code using malloc high level). There's no
way to get an exact amount, especially since the malloc arena is shared
with the entire process anyway. So the demuxer packet cache tries to get
by with an estimate using a number of rough guesses.

It seems this wasn't quite good. In some ways, it was too optimistic, in
others it seemed to account for too much data. Try to get it closer to
what malloc and ta probably do. In particular, talloc adds some
singificant overhead (using talloc for mass-data was a mistake, and it's
even my fault). The result appears to match better with measured memory
usage. This is still extremely dependent on malloc implementation and so
on.

The effect is that you may need to adjust the demuxer cache limits to
cache as much data as it did before this commit. In any case, seems to
be better for me.
2019-09-19 20:37:05 +02:00
wm4
3cea180cc0 packet: free some unnecessary memory in disk cache case
If the disk cache is used, the AVPacket is not used anymore and is
completely deallocated when the packet is written to disk. As a minor
bug, the AVPacket allocation itself was not freed (although it wasn't a
memory leak, since talloc still automatically freed it when the entire
demux_packet was freed). For very large caches, this could easily add up
to over hundred MB, so actually free the unneeded allocation.
2019-09-19 20:37:05 +02:00
wm4
a8be8fe4f4 vd_lavc: put vaapi before vdpau in autoprobe order
--hwdec=auto-copy was preferring vdpau over vaapi. In the HEVC 10 bit
case, this also led to hardware decoding not being enabled. (Probably
because the probing can't start over after enabling hw decoding fails at
runtime, or something like that.)

Possible that this subtly breaks on some setups. You can't always win.
2019-09-19 20:37:05 +02:00
wm4
a17337910c vo_gpu: hwdec_vaegl: silence confusing message during probing
During probing on a system with AMD GPU, mpv used to output the
following messages if hardware decoding was enabled:

[ffmpeg] AVHWFramesContext: Failed to create surface: 2 (resource
allocation failed).
[ffmpeg] AVHWFramesContext: Unable to allocate a surface from internal
buffer pool.

This commit removed the message, with hopefully no other side effects.
Long explanations follow, better don't read them, it's just tedious
drivel about the details. People should learn to write concise commit
messages, not drone on and on endlessly all while they have no fucking
point.

The code probes supported hardware pixel format, and checks whether they
can be mapped as textures. av_hwdevice_get_hwframe_constraints() returns
a list of hardware pixel formats in the valid_sw_formats field (the "sw"
means software, but they're still hardware pixel formats, makes sense).
This contained the format yuv420p, even though this is not a valid
hardware format. Trying to create a surface of this type results in VA
surface creation failure, upon which FFmpeg prints the error messages
above. We'd be fine with this, except FFmpeg has a global log callback,
and there's no way to suppress these messages without creating other
issues.

It turns out that FFmpeg's vaapi implementation returns all formats from
vaQueryImageFormats() if no "hwconfig" is provided. This list includes
yuv420p, which is probably supported for surface upload/download, but
not as native format. Following FFmpeg's logic, it should not appear in
the valid_sw_formats list, because formats for transfers are returned by
another roundabout API.

Idiotically, there doesn't seem to be any vaapi call that determines
whether a format is a valid surface format. All mechanisms to do this
are bound to a VAConfigID (= video codec or video processor), all while
the actual surface creation API strangely does not take a VAConfigID (a
big WTF).

Also, calling the vaCreateSurfaces() API ourselves for probing is out of
the question, because that functions is utterly and idiotically complex.
Look at the FFmpeg code and how much effort it requires to setup a
complete set of attributes - we can't duplicate this.

So the only way left to do this is the most idiotic and tedious way:
enumerating all VAProfile (and VAEntrypoints) to create all possible
VAConfigIDs. Each of the VAConfigIDs is associated with a list of
formats, which FFmpeg can return (by passing the ID along with the
"hwconfig"), and which is probed separately.

Note that VAConfigID actually refers to a dynamic instance of something,
and creating a VAConfigID takes not only the VAProfile and the
VAEntrypoint, but also an arbitrary attribute array. In theory, this
means our attempt to get to know all possible configurations cannot
work, but in practice this attribute array seems to be pointless for
decoding and video processing, and FFmpeg doesn't use it (though the
encoding path does use it). This probably just makes it _barely_ OK to
do it this way.

Could we discard all this probing shit, and somehow do it another way?
Probably not. The EGL API for mapping surfaces doesn't even seem to
provide a way to enumerate supported formats, we may not even know
whether DRM/dmabuf interop is actually supported (AFAIR the EGL
extensions are present even if they don't work), nor do we know whether
the VAAPI driver supports this interop (not sure). So actually trying is
the only way.

Further, mpv initializes the decoder on a another thread, where you
can't just access OpenGL state. This suckage is mostly to be blamed on
OpenGL itself and its crazy thread boundedness. In theory, this could be
done anyway (see how software decoding "direct rendering" tries to get
around this). But to make it worse, the decoder never cares about the
list of supported formats determined by this code; instead,
f_autoconvert.c tries to deal with it and insert a video processor
(well, good luck with this crap, I bet it doesn't even work). So this
whole endeavor might be pointless, other than the fact that failed
probing can disable use of vaapi (which is correct and necessary). But
if you have a shovel, you don't use it to smash the flat end on the heap
of shit that's piled up before you, or do you?

While this method probably works, it's still orgasmically tedious. It
was tedious before: we had to create a real surface, create a GL
texture, map the surface with it, then destroy everything again. But the
added code is tedious on its own. Highlights include the need to malloc
a FFmpeg struct just to pass a single damn integer, the need to
enumerate "entrypoints" for each VA profile, even though all profiles
have exactly 1 entrypoint, and the kind of obnoxious way how vaapi
requires you to preallocate arrays for returned things, even they could
for example reasonably be returned as immutable arrays or have some
other simpler API.

The main grand fuckup is of course that vaapi requires a VAConfigID to
query surface properties, but not for creating surfaces. This
awkwardness even affected the FFmpeg API design, which has a "hwconfig"
concept that is only used by vaapi (vaapi is only 1 out of 10 hardware
decoding APIs supported by the FFmpeg hwcontext stuff). Maybe I'm just
missing something. It's as if vaapi required setting radioactive shit on
fire. Look how clean the native D3D11 code is instead. (Even the ANGLE
code manages to avoid being this fucked up. Or the VDPAU code, despite
supporting multiple mapping methods.)

Another only barely related change is that the valid_sw_formats field
can be NULL, and the API explicitly documents this. Technically, the mpv
code was buggy for not checking this, although until now the FFmpeg
implementation so far could not return it when we still passed NULL for
the hwconfig parameter.
2019-09-19 20:37:05 +02:00
wm4
c42a21c9fa vo_gpu: hwdec_vaegl: refactor format probing
No functional changes, just preparation for the next commit. Split the
probing into multiple functions. Prepare for the yet unused possibility
to pass AVVAAPIHWConfig to probing. try_format_pixfmt() now assumes it
can be called multiple times with the same format, so it filters the
format.

The format probing is now something like O(n^2) for n formats, but n
will most likely remain something under 50 or so.
2019-09-19 20:37:05 +02:00
wm4
f7d9365eb9 m_config: fix typo in comment 2019-09-19 20:37:05 +02:00