mpv/demux/packet.h

93 lines
3.0 KiB
C
Raw Permalink Normal View History

/*
* This file is part of mpv.
*
* mpv is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* mpv is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with mpv. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef MPLAYER_DEMUX_PACKET_H
#define MPLAYER_DEMUX_PACKET_H
#include <stdbool.h>
#include <stddef.h>
#include <inttypes.h>
// Holds one packet/frame/whatever
typedef struct demux_packet {
double pts;
double dts;
double duration;
2016-02-15 20:39:17 +01:00
int64_t pos; // position in source file byte stream
demux: add a on-disk cache Somewhat similar to the old --cache-file, except for the demuxer cache. Instead of keeping packet data in memory, it's written to disk and read back when needed. The idea is to reduce main memory usage, while allowing fast seeking in large cached network streams (especially live streams). Keeping the packet metadata on disk would be rather hard (would use mmap or so, or rewrite the entire demux.c packet queue handling), and since it's relatively small, just keep it in memory. Also for simplicity, the disk cache is append-only. If you're watching really long livestreams, and need pruning, you're probably out of luck. This still could be improved by trying to free unused blocks with fallocate(), but since we're writing multiple streams in an interleaved manner, this is slightly hard. Some rather gross ugliness in packet.h: we want to store the file position of the cached data somewhere, but on 32 bit architectures, we don't have any usable 64 bit members for this, just the buf/len fields, which add up to 64 bit - so the shitty union aliases this memory. Error paths untested. Side data (the complicated part of trying to serialize ffmpeg packets) untested. Stream recording had to be adjusted. Some minor details change due to this, but probably nothing important. The change in attempt_range_joining() is because packets in cache have no valid len field. It was a useful check (heuristically finding broken cases), but not a necessary one. Various other approaches were tried. It would be interesting to list them and to mention the pros and cons, but I don't feel like it.
2019-06-13 19:10:32 +02:00
union {
// Normally valid for packets.
struct {
unsigned char *buffer;
size_t len;
};
// Used if is_cached==true, special uses only.
struct {
uint64_t pos;
} cached_data;
};
int stream; // source stream index (typically sh_stream.index)
2016-02-15 20:39:17 +01:00
bool keyframe;
Implement backwards playback See manpage additions. This is a huge hack. You can bet there are shit tons of bugs. It's literally forcing square pegs into round holes. Hopefully, the manpage wall of text makes it clear enough that the whole shit can easily crash and burn. (Although it shouldn't literally crash. That would be a bug. It possibly _could_ start a fire by entering some sort of endless loop, not a literal one, just something where it tries to do work without making progress.) (Some obvious bugs I simply ignored for this initial version, but there's a number of potential bugs I can't even imagine. Normal playback should remain completely unaffected, though.) How this works is also described in the manpage. Basically, we demux in reverse, then we decode in reverse, then we render in reverse. The decoding part is the simplest: just reorder the decoder output. This weirdly integrates with the timeline/ordered chapter code, which also has special requirements on feeding the packets to the decoder in a non-straightforward way (it doesn't conflict, although a bugmessmass breaks correct slicing of segments, so EDL/ordered chapter playback is broken in backward direction). Backward demuxing is pretty involved. In theory, it could be much easier: simply iterating the usual demuxer output backward. But this just doesn't fit into our code, so there's a cthulhu nightmare of shit. To be specific, each stream (audio, video) is reversed separately. At least this means we can do backward playback within cached content (for example, you could play backwards in a live stream; on that note, it disables prefetching, which would lead to losing new live video, but this could be avoided). The fuckmess also meant that I didn't bother trying to support subtitles. Subtitles are a problem because they're "sparse" streams. They need to be "passively" demuxed: you don't try to read a subtitle packet, you demux audio and video, and then look whether there was a subtitle packet. This means to get subtitles for a time range, you need to know that you demuxed video and audio over this range, which becomes pretty messy when you demux audio and video backwards separately. Backward display is the most weird (and potentially buggy) part. To avoid that we need to touch a LOT of timing code, we negate all timestamps. The basic idea is that due to the navigation, all comparisons and subtractions of timestamps keep working, and you don't need to touch every single of them to "reverse" them. E.g.: bool before = pts_a < pts_b; would need to be: bool before = forward ? pts_a < pts_b : pts_a > pts_b; or: bool before = pts_a * dir < pts_b * dir; or if you, as it's implemented now, just do this after decoding: pts_a *= dir; pts_b *= dir; and then in the normal timing/renderer code: bool before = pts_a < pts_b; Consequently, we don't need many changes in the latter code. But some assumptions inhererently true for forward playback may have been broken anyway. What is mainly needed is fixing places where values are passed between positive and negative "domains". For example, seeking and timestamp user display always uses positive timestamps. The main mess is that it's not obvious which domain a given variable should or does use. Well, in my tests with a single file, it suddenly started to work when I did this. I'm honestly surprised that it did, and that I didn't have to change a single line in the timing code past decoder (just something minor to make external/cached text subtitles display). I committed it immediately while avoiding thinking about it. But there really likely are subtle problems of all sorts. As far as I'm aware, gstreamer also supports backward playback. When I looked at this years ago, I couldn't find a way to actually try this, and I didn't revisit it now. Back then I also read talk slides from the person who implemented it, and I'm not sure if and which ideas I might have taken from it. It's possible that the timestamp reversal is inspired by it, but I didn't check. (I think it claimed that it could avoid large changes by changing a sign?) VapourSynth has some sort of reverse function, which provides a backward view on a video. The function itself is trivial to implement, as VapourSynth aims to provide random access to video by frame numbers (so you just request decreasing frame numbers). From what I remember, it wasn't exactly fluid, but it worked. It's implemented by creating an index, and seeking to the target on demand, and a bunch of caching. mpv could use it, but it would either require using VapourSynth as demuxer and decoder for everything, or replacing the current file every time something is supposed to be played backwards. FFmpeg's libavfilter has reversal filters for audio and video. These require buffering the entire media data of the file, and don't really fit into mpv's architecture. It could be used by playing a libavfilter graph that also demuxes, but that's like VapourSynth but worse.
2019-05-18 02:10:51 +02:00
// backward playback
demux: add a on-disk cache Somewhat similar to the old --cache-file, except for the demuxer cache. Instead of keeping packet data in memory, it's written to disk and read back when needed. The idea is to reduce main memory usage, while allowing fast seeking in large cached network streams (especially live streams). Keeping the packet metadata on disk would be rather hard (would use mmap or so, or rewrite the entire demux.c packet queue handling), and since it's relatively small, just keep it in memory. Also for simplicity, the disk cache is append-only. If you're watching really long livestreams, and need pruning, you're probably out of luck. This still could be improved by trying to free unused blocks with fallocate(), but since we're writing multiple streams in an interleaved manner, this is slightly hard. Some rather gross ugliness in packet.h: we want to store the file position of the cached data somewhere, but on 32 bit architectures, we don't have any usable 64 bit members for this, just the buf/len fields, which add up to 64 bit - so the shitty union aliases this memory. Error paths untested. Side data (the complicated part of trying to serialize ffmpeg packets) untested. Stream recording had to be adjusted. Some minor details change due to this, but probably nothing important. The change in attempt_range_joining() is because packets in cache have no valid len field. It was a useful check (heuristically finding broken cases), but not a necessary one. Various other approaches were tried. It would be interesting to list them and to mention the pros and cons, but I don't feel like it.
2019-06-13 19:10:32 +02:00
bool back_restart : 1; // restart point (reverse and return previous frames)
bool back_preroll : 1; // initial discarded frame for smooth decoder reinit
// If true, cached_data is valid, while buffer/len are not.
bool is_cached : 1;
Implement backwards playback See manpage additions. This is a huge hack. You can bet there are shit tons of bugs. It's literally forcing square pegs into round holes. Hopefully, the manpage wall of text makes it clear enough that the whole shit can easily crash and burn. (Although it shouldn't literally crash. That would be a bug. It possibly _could_ start a fire by entering some sort of endless loop, not a literal one, just something where it tries to do work without making progress.) (Some obvious bugs I simply ignored for this initial version, but there's a number of potential bugs I can't even imagine. Normal playback should remain completely unaffected, though.) How this works is also described in the manpage. Basically, we demux in reverse, then we decode in reverse, then we render in reverse. The decoding part is the simplest: just reorder the decoder output. This weirdly integrates with the timeline/ordered chapter code, which also has special requirements on feeding the packets to the decoder in a non-straightforward way (it doesn't conflict, although a bugmessmass breaks correct slicing of segments, so EDL/ordered chapter playback is broken in backward direction). Backward demuxing is pretty involved. In theory, it could be much easier: simply iterating the usual demuxer output backward. But this just doesn't fit into our code, so there's a cthulhu nightmare of shit. To be specific, each stream (audio, video) is reversed separately. At least this means we can do backward playback within cached content (for example, you could play backwards in a live stream; on that note, it disables prefetching, which would lead to losing new live video, but this could be avoided). The fuckmess also meant that I didn't bother trying to support subtitles. Subtitles are a problem because they're "sparse" streams. They need to be "passively" demuxed: you don't try to read a subtitle packet, you demux audio and video, and then look whether there was a subtitle packet. This means to get subtitles for a time range, you need to know that you demuxed video and audio over this range, which becomes pretty messy when you demux audio and video backwards separately. Backward display is the most weird (and potentially buggy) part. To avoid that we need to touch a LOT of timing code, we negate all timestamps. The basic idea is that due to the navigation, all comparisons and subtractions of timestamps keep working, and you don't need to touch every single of them to "reverse" them. E.g.: bool before = pts_a < pts_b; would need to be: bool before = forward ? pts_a < pts_b : pts_a > pts_b; or: bool before = pts_a * dir < pts_b * dir; or if you, as it's implemented now, just do this after decoding: pts_a *= dir; pts_b *= dir; and then in the normal timing/renderer code: bool before = pts_a < pts_b; Consequently, we don't need many changes in the latter code. But some assumptions inhererently true for forward playback may have been broken anyway. What is mainly needed is fixing places where values are passed between positive and negative "domains". For example, seeking and timestamp user display always uses positive timestamps. The main mess is that it's not obvious which domain a given variable should or does use. Well, in my tests with a single file, it suddenly started to work when I did this. I'm honestly surprised that it did, and that I didn't have to change a single line in the timing code past decoder (just something minor to make external/cached text subtitles display). I committed it immediately while avoiding thinking about it. But there really likely are subtle problems of all sorts. As far as I'm aware, gstreamer also supports backward playback. When I looked at this years ago, I couldn't find a way to actually try this, and I didn't revisit it now. Back then I also read talk slides from the person who implemented it, and I'm not sure if and which ideas I might have taken from it. It's possible that the timestamp reversal is inspired by it, but I didn't check. (I think it claimed that it could avoid large changes by changing a sign?) VapourSynth has some sort of reverse function, which provides a backward view on a video. The function itself is trivial to implement, as VapourSynth aims to provide random access to video by frame numbers (so you just request decreasing frame numbers). From what I remember, it wasn't exactly fluid, but it worked. It's implemented by creating an index, and seeking to the target on demand, and a bunch of caching. mpv could use it, but it would either require using VapourSynth as demuxer and decoder for everything, or replacing the current file every time something is supposed to be played backwards. FFmpeg's libavfilter has reversal filters for audio and video. These require buffering the entire media data of the file, and don't really fit into mpv's architecture. It could be used by playing a libavfilter graph that also demuxes, but that's like VapourSynth but worse.
2019-05-18 02:10:51 +02:00
Rewrite ordered chapters and timeline stuff This uses a different method to piece segments together. The old approach basically changes to a new file (with a new start offset) any time a segment ends. This meant waiting for audio/video end on segment end, and then changing to the new segment all at once. It had a very weird impact on the playback core, and some things (like truly gapless segment transitions, or frame backstepping) just didn't work. The new approach adds the demux_timeline pseudo-demuxer, which presents an uniform packet stream from the many segments. This is pretty similar to how ordered chapters are implemented everywhere else. It also reminds of the FFmpeg concat pseudo-demuxer. The "pure" version of this approach doesn't work though. Segments can actually have different codec configurations (different extradata), and subtitles are most likely broken too. (Subtitles have multiple corner cases which break the pure stream-concatenation approach completely.) To counter this, we do two things: - Reinit the decoder with each segment. We go as far as allowing concatenating files with completely different codecs for the sake of EDL (which also uses the timeline infrastructure). A "lighter" approach would try to make use of decoder mechanism to update e.g. the extradata, but that seems fragile. - Clip decoded data to segment boundaries. This is equivalent to normal playback core mechanisms like hr-seek, but now the playback core doesn't need to care about these things. These two mechanisms are equivalent to what happened in the old implementation, except they don't happen in the playback core anymore. In other words, the playback core is completely relieved from timeline implementation details. (Which honestly is exactly what I'm trying to do here. I don't think ordered chapter behavior deserves improvement, even if it's bad - but I want to get it out from the playback core.) There is code duplication between audio and video decoder common code. This is awful and could be shareable - but this will happen later. Note that the audio path has some code to clip audio frames for the purpose of codec preroll/gapless handling, but it's not shared as sharing it would cause more pain than it would help.
2016-02-15 21:04:07 +01:00
// segmentation (ordered chapters, EDL)
bool segmented;
struct mp_codec_params *codec; // set to non-NULL iff segmented is set
double start, end; // set to non-NOPTS iff segmented is set
Rewrite ordered chapters and timeline stuff This uses a different method to piece segments together. The old approach basically changes to a new file (with a new start offset) any time a segment ends. This meant waiting for audio/video end on segment end, and then changing to the new segment all at once. It had a very weird impact on the playback core, and some things (like truly gapless segment transitions, or frame backstepping) just didn't work. The new approach adds the demux_timeline pseudo-demuxer, which presents an uniform packet stream from the many segments. This is pretty similar to how ordered chapters are implemented everywhere else. It also reminds of the FFmpeg concat pseudo-demuxer. The "pure" version of this approach doesn't work though. Segments can actually have different codec configurations (different extradata), and subtitles are most likely broken too. (Subtitles have multiple corner cases which break the pure stream-concatenation approach completely.) To counter this, we do two things: - Reinit the decoder with each segment. We go as far as allowing concatenating files with completely different codecs for the sake of EDL (which also uses the timeline infrastructure). A "lighter" approach would try to make use of decoder mechanism to update e.g. the extradata, but that seems fragile. - Clip decoded data to segment boundaries. This is equivalent to normal playback core mechanisms like hr-seek, but now the playback core doesn't need to care about these things. These two mechanisms are equivalent to what happened in the old implementation, except they don't happen in the playback core anymore. In other words, the playback core is completely relieved from timeline implementation details. (Which honestly is exactly what I'm trying to do here. I don't think ordered chapter behavior deserves improvement, even if it's bad - but I want to get it out from the playback core.) There is code duplication between audio and video decoder common code. This is awful and could be shareable - but this will happen later. Note that the audio path has some code to clip audio frames for the purpose of codec preroll/gapless handling, but it's not shared as sharing it would cause more pain than it would help.
2016-02-15 21:04:07 +01:00
// subtitles only
bool animated;
bool seen;
int seen_pos;
double sub_duration;
2016-02-15 20:39:17 +01:00
// private
struct demux_packet *next;
2016-02-15 20:39:17 +01:00
struct AVPacket *avpacket; // keep the buffer allocation and sidedata
demux: don't loop over all packets to find forward buffered size on seek The size of all forward buffered packets is used to control maximum buffering. Until now, this size was incrementally adjusted, but had to be recomputed on seeks within the cache. Doing this was actually pretty expensive. It iterates over a linked list of separate memory allocations (which are probably spread all over the heap due to the allocation behavior), and the demux_packet_estimate_total_size() call touches a lot of further memory locations. I guess this affects the cache rather negatively. In an unscientific test, the recompute_buffers() function (which contained this loop) was responsible for roughly half of the time seeking took. Replace this with a way that computes the buffered size between 2 packets in constant times. The demux_packet.cum_pos field contains the summed sizes of all previous packets, so subtracting cum_pos between two packets yields the size of all packets in between. We can do this because we never remove packets from the middle of the queue. We only add packets to the end, or remove packets at the beginning. The tail_cum_pos field is needed because we don't store the end position of a packet, so the last packet's position would be unknown. We could recompute the "estimated" packet size, or store the estimated size in the packet struct, but I just didn't like this. This also removes the cached fw_bytes fields. It's slightly nicer to just recompute them when needed. Maintaining them incrementally was annoying. total_size stays though, since recomputing it isn't that cheap (would need to loop over all ranges every time). I'm always using uint64_t for sizes. This is certainly needed (a stream could easily burn through more than 4GB of data, even if much less of that is cached). The actual cached amount should always fit into size_t, so it's casted to size_t for printfs (yes, I hate the way you specify stdint.h types in printfs, the less I have to use that crap, the better).
2019-06-01 21:56:24 +02:00
uint64_t cum_pos; // demux.c internal: cumulative size until _start_ of pkt
} demux_packet_t;
struct AVBufferRef;
struct demux_packet *new_demux_packet(size_t len);
struct demux_packet *new_demux_packet_from_avpacket(struct AVPacket *avpkt);
struct demux_packet *new_demux_packet_from(void *data, size_t len);
struct demux_packet *new_demux_packet_from_buf(struct AVBufferRef *buf);
void demux_packet_shorten(struct demux_packet *dp, size_t len);
void free_demux_packet(struct demux_packet *dp);
struct demux_packet *demux_copy_packet(struct demux_packet *dp);
size_t demux_packet_estimate_total_size(struct demux_packet *dp);
void demux_packet_copy_attribs(struct demux_packet *dst, struct demux_packet *src);
int demux_packet_set_padding(struct demux_packet *dp, int start, int end);
int demux_packet_add_blockadditional(struct demux_packet *dp, uint64_t id,
void *data, size_t size);
demux: add a on-disk cache Somewhat similar to the old --cache-file, except for the demuxer cache. Instead of keeping packet data in memory, it's written to disk and read back when needed. The idea is to reduce main memory usage, while allowing fast seeking in large cached network streams (especially live streams). Keeping the packet metadata on disk would be rather hard (would use mmap or so, or rewrite the entire demux.c packet queue handling), and since it's relatively small, just keep it in memory. Also for simplicity, the disk cache is append-only. If you're watching really long livestreams, and need pruning, you're probably out of luck. This still could be improved by trying to free unused blocks with fallocate(), but since we're writing multiple streams in an interleaved manner, this is slightly hard. Some rather gross ugliness in packet.h: we want to store the file position of the cached data somewhere, but on 32 bit architectures, we don't have any usable 64 bit members for this, just the buf/len fields, which add up to 64 bit - so the shitty union aliases this memory. Error paths untested. Side data (the complicated part of trying to serialize ffmpeg packets) untested. Stream recording had to be adjusted. Some minor details change due to this, but probably nothing important. The change in attempt_range_joining() is because packets in cache have no valid len field. It was a useful check (heuristically finding broken cases), but not a necessary one. Various other approaches were tried. It would be interesting to list them and to mention the pros and cons, but I don't feel like it.
2019-06-13 19:10:32 +02:00
void demux_packet_unref_contents(struct demux_packet *dp);
#endif /* MPLAYER_DEMUX_PACKET_H */