1
mirror of https://github.com/mpv-player/mpv synced 2024-11-03 03:19:24 +01:00
Commit Graph

39 Commits

Author SHA1 Message Date
wm4
74c11f0c84 sub: detect charset in demuxer
Slightly simpler, and removes the need to pre-read all subtitle packets.

This still does the subtitle charset conversion on the packet level
(instead converting when parsing the file), so in theory this still
could provide a way to change the charset at runtime. But maybe even
this should be removed, as FFmpeg is somewhat likely to get its own
charset detection and conversion mechanism in the future. (Would have
to keep the subtitle file in memory to allow changing the charset on
the fly, I guess.)
2015-12-17 01:17:23 +01:00
wm4
384b13c5fd demux_libass: remove this demuxer
This loaded external .ass files via libass. libavformat's .ass reader is
now good enough, so use that instead.

Apparently libavformat still doesn't support fonts embedded into text
.ass files, but support for this has been accidentally broken in mpv for
a while anyway. (And only 1 person complained.)
2015-11-11 21:28:20 +01:00
wm4
d60270ed3d sub: fix --sub-codepage UTF-8 with fallback
Fixes e.g --sub-codepage=utf8:gb18030 if the subtitle us UTF-8.

This was broken in commit e5d31808.

Also log the detected charset in verbose mode.
2015-09-01 13:37:45 +02:00
wm4
e5d3180889 charset_conv: use our own UTF-8 check with ENCA only
Some charsets can look like valid UTF-8, but aren't UTF-8. One example
is ISO-2022-JP. While ENCA apparently likes to get misdetect real UTF-8,
this is not the case with uchardet. uchardet can detect ISO-2022-JP
correctly, but didn't even get to try, because our own UTF-8 check
succeeded. So run the UTF-8 check when using ENCA only.

Fixes #2195.
2015-08-04 18:58:58 +02:00
Jehan
e7897dfb9b charset_conv: "auto" encoding detection now uses uchardet.
If mpv is not built with uchardet, "enca" is still the fallback default
encoding detection.
2015-08-04 17:51:00 +02:00
wm4
bbbb2d9723 charset_conv: fix switched parameters
Fixes #2186.
2015-08-02 18:21:22 +02:00
wm4
a74914a057 charset_conv: add uchardet support
For now, it needs to be explicitly selected. ENCA is still the default.

This assumes uchardet returns iconv names. This doesn't seem to be
always the case, and the result are lots of iconv errors. So
explicitly check for this situation, and print a warning if it
occurs. It's entirely possible that uchardet support is actually
useless, because names are not necessarily iconv-compatible (but
uchardet doesn't seem to document whether it attempts to return
iconv-compatible names if possible).

Fixes #908.
2015-08-02 00:03:29 +02:00
wm4
11f2be2bcc charset_conv: make it possible to return an allocated string as guess
uchardet is written in C++, and thus doesn't appreciate the value of
using static strings, and internally stores the guessed charset as
allocated std::string. Add a minimal hack to deal with this. (I don't
appreciate that the code is potentially harder to understand by
returning either a static or allocated string, but I do appreciate for
not having to litter the existing code with strdups.)
2015-08-01 23:49:37 +02:00
wm4
92b9d75d72 threads: use utility+POSIX functions instead of weird wrappers
There is not much of a reason to have these wrappers around. Use POSIX
standard functions directly, and use a separate utility function to take
care of the timespec calculations. (Course POSIX for using this weird
format for time values.)
2015-05-11 23:44:36 +02:00
wm4
f77e3cbf0c json: fix UTF-8 handling
We escape only characters below 32, plus " and \. UTF-8 should be apssed
through verbatim. Since char can be signed (and usually is), the check
broke and happened to escape UTF-8 encoded bytes too. This broke UTF-8
completely.

Note that we don't check for broken or invalid UTF-8, such as described
both in the client API and IPC docs.

Fixes #1874.
2015-04-28 09:58:28 +02:00
Marcin Kurczewski
f43017bfe9 Update license headers
Signed-off-by: wm4 <wm4@nowhere>
2015-04-13 12:10:01 +02:00
wm4
4a7c6aaedf bstr: fix possible undefined behavior with length 0 strings
BSTR_P() passes the string length and start pointer to printf-like
functions. If the lenfth is 0, the pointer can be NULL, but we're
actually still not allowed to pass a NULL pointer in any case.

This is mostly a technically, because nobody in their right mind would
attempt to specifically break such cases. But it's still undefined
behavior, and some libcs might be strict about this.
2015-01-12 14:43:52 +01:00
wm4
86b521f7df Silence some Coverity warnings
None of this really matters.
2014-11-21 09:59:58 +01:00
wm4
9c456ab58f bstr: don't call memcpy(..., NULL, 0)
This is clearly not allowed, although it's not a problem on most libcs.

Found by Coverity.
2014-11-21 05:18:11 +01:00
wm4
38420eb49e json: handle >\\"< fragments correctly
It assumed that any >\"< sequence was an escape for >"<, but that is not
the case with JSON such as >{"ducks":"\\"}<. In this case, the second
>\< is obviously not starting an escape.
2014-10-21 02:59:30 +02:00
wm4
5548c75e55 lua: expose JSON parser
The JSON parser was introduced for the IPC protocol, but I guess it's
useful here too.

The motivation for this commit is the same as with 8e4fa5fc (again).
2014-10-19 05:51:37 +02:00
wm4
c01151e0bf misc: add JSON parser 2014-10-17 20:46:31 +02:00
James Ross-Gowan
476fc65b0f bstr: check strings before memcmp/strncasecmp
bstr.start can be NULL when bstr.len is 0, so don't call memcmp or
strncasecmp if that's the case. Passing NULL to string functions is
invalid C, even when the length is 0, and it causes Windows to raise an
invalid parameter error.

Should fix #1155
2014-10-07 08:45:27 +02:00
wm4
68ff8a0484 Move compat/ and bstr/ directory contents somewhere else
bstr.c doesn't really deserve its own directory, and compat had just
a few files, most of which may as well be in osdep. There isn't really
any justification for these extra directories, so get rid of them.

The compat/libav.h was empty - just delete it. We changed our approach
to API compatibility, and will likely not need it anymore.
2014-08-29 12:31:52 +02:00
wm4
559fe1daac Add Plan 9-style barriers
Plan 9 has a very interesting synchronization mechanism, the
rendezvous() call. A good property of this is that you don't need to
explicitly initialize and destroy a barrier object, unlike as with e.g.
POSIX barriers (which are mandatory to begin with). Upon "meeting", they
can exchange a value.

This mechanism will be nice to synchronize certain stages of
initialization between threads in the following commit.

Unlike Plan 9 rendezvous(), this is not implemented with a hashtable,
because that would require additional effort (especially if you want to
make it actually scele). Unlike the Plan 9 variant, we use intptr_t
instead of void* as type for the value, because I expect that we will be
mostly passing a status code as value and not a pointer. Converting an
integer to void* requires two cast (because the integer needs to be
intptr_t), the other way around it's only one cast.

We don't particularly care about performance in this case either. It's
simply not important for our use-case. So a simple linked list is used
for waiters, and on wakeup, all waiters are temporarily woken up.
2014-07-26 20:29:48 +02:00
wm4
aa1a383342 sub: add detection via BOM
Useful for Windows stuff. Actually, ENCA support should catch this, but,
well, whatever, everyone seems to hate ENCA.

Detection with BOM is trivial, although it needs some hackery to
integrate it with the existing autodetection support. For one, change
the default value of --sub-codepage to make this easier.

Probably fixes issue #937 (the second part).
2014-07-22 23:40:48 +02:00
wm4
f8c2dd1b78 build: include <strings.h> for strcasecmp()
It happens to work without strings.h on glibc or with _GNU_SOURCE, but
the POSIX standard requires including <strings.h>.

Hopefully fixes OSX build.
2014-07-10 08:29:32 +02:00
wm4
9a210ca2d5 Audit and replace all ctype.h uses
Something like "char *s = ...; isdigit(s[0]);" triggers undefined
behavior, because char can be signed, and thus s[0] can be a negative
value. The is*() functions require unsigned char _or_ EOF. EOF is a
special value outside of unsigned char range, thus the argument to the
is*() functions can't be a char.

This undefined behavior can actually trigger crashes if the
implementation of these functions e.g. uses lookup tables, which are
then indexed with out-of-range values.

Replace all <ctype.h> uses with our own custom mp_is*() functions added
with misc/ctype.h. As a bonus, these functions are locale-independent.
(Although currently, we _require_ C locale for other reasons.)
2014-07-01 23:11:08 +02:00
wm4
d87a84ca89 ring: use a different type for read/write pointers
uint_least32_t could be larger than uint32_t, so the return values of
mp_ring_get_wpos/rpos must be adjusted. Actually just use unsigned long
as type instead, because that is less awkward than uint_least32_t.
2014-05-30 02:15:25 +02:00
wm4
690ea07b7a ring: implement drain in terms of read
I think this makes it easier to reason about it and avoids duplicate
logic.
2014-05-29 02:24:05 +02:00
Paweł Forysiuk
f289060259 Fix gcc 4.7 warning about shadowing talloc_parent in mp_dispact_queue 2014-05-28 22:44:43 +02:00
wm4
8e7cf4bc99 atomics: switch to C11 stdatomic.h
In my opinion, we shouldn't use atomics at all, but ok.

This switches the mpv code to use C11 stdatomic.h, and for compilers
that don't support stdatomic.h yet, we emulate the subset used by mpv
using the builtins commonly provided by gcc and clang.

This supersedes an earlier similar attempt by Kovensky. That attempt
unfortunately relied on a big copypasted freebsd header (which also
depended on much more highly compiler-specific functionality, defined
reserved symbols, etc.), so it had to be NIH'ed.

Some issues:
- C11 says default initialization of atomics "produces a valid state",
  but it's not sure whether the stored value is really 0. But we rely on
  this.
- I'm pretty sure our use of the __atomic... builtins is/was incorrect.
  We don't use atomic load/store intrinsics, and access stuff directly.
- Our wrapper actually does stricter typechecking than the stdatomic.h
  implementation by gcc 4.9. We make the atomic types incompatible with
  normal types by wrapping them into structs. (The FreeBSD wrapper does
  the same.)
- I couldn't test on MinGW.
2014-05-21 02:21:18 +02:00
wm4
f47a4fc3d9 threads: use mpv time for mpthread_cond_timedwait wrapper
Use the time as returned by mp_time_us() for mpthread_cond_timedwait(),
instead of calculating the struct timespec value based on a timeout.
This (probably) makes it easier to wait for a specific deadline.
2014-05-18 19:20:32 +02:00
wm4
5c3dd6402a dispatch: document some guarantees
The here documented guarantee might simplify code using this mechanism
a lot, because it becomes unnecessary to invent a separate mechanism to
make the mp_dispatch_queue_process loop exit after processing a dispatch
callback. (Instead, the dispatch callback can set a flag, and the caller
of mp_dispatch_queue_process can check it.)
2014-04-25 08:35:02 +02:00
wm4
f5df78b3fc dispatch: wakeup only if needed on mp_dispatch_resume()
The wakeup is needed to make mp_dispatch_queue_process() return if
suspension is not needed anymore - which is only the case when the
request count reaches 0.

The assertion added with this commit always has/had to be true.
2014-04-24 01:03:05 +02:00
wm4
26723b32a9 dispatch: improve documentation comments 2014-04-23 21:16:52 +02:00
wm4
cd10af4db6 threads: fix function name
Closer to the corresponding standard function pthread_cond_timedwait.
2014-04-23 21:16:52 +02:00
wm4
80ff94131b dispatch: implement timeout
Note that this mechanism is similarly "unreliable" as for example
pthread_cond_timedwait(). Trying to target the exact wait time will just
make it more complex.

The main use case for this is for threads which either use the dispatch
centrally and want mp_dispatch_queue_process to do a blocking wait for
new work, or threads which have to implement timers. For the former,
anything is fine, as long as they don't have to do active waiting for
new works. For the former, callers are better off recalculating their
deadline after every call.
2014-04-23 21:16:52 +02:00
wm4
29b7260398 dispatch: use a real lock for mp_dispatch_lock()
This is much simpler, leaves fairness isues etc. to the operating
system, and will work better with threading-related debugging tools.

The "trick" to this is that the lock can be acquired and held only while
the queue is in suspend mode. This way we don't need to make sure the
lock is held outside of mp_dispatch_queue_process, which would be quite
messy to get right, because it would have to be in locked state by
default.
2014-04-23 21:16:52 +02:00
wm4
ed7e7e2eb4 dispatch: fix broken locking
mp_dispatch_queue_process() releases the queue->lock mutex while
processing a dispatch callback. But this allowed mp_dispatch_lock() to
grab the "logical" lock represented by queue->locked. Grabbing the
logical lock is not a problem in itself, but it can't be allowed to
happen while the callback is still running.

Fix this by claiming the logical lock while the dispatch callback is
processed. Also make sure that the thread calling mp_dispatch_lock() is
woken up properly.

Fortunately, this didn't matter, because the locking function is unused.
2014-04-23 21:16:52 +02:00
wm4
ff4028f3bf dispatch: wakeup target thread when locking/suspending
Without this, it could happen that both the caller thread and the target
thread sleep.
2014-04-23 21:16:52 +02:00
wm4
2b26517ef7 dispatch: move into its own source file
This was part of osdep/threads.c out of laziness. But it doesn't contain
anything OS dependent. Note that the rest of threads.c actually isn't
all that OS dependent either (just some minor ifdeffery to work around
the lack of clock_gettime() on OSX).
2014-04-23 21:16:51 +02:00
wm4
33c8fd789d charset_conv: mp_msg conversions 2013-12-21 21:43:16 +01:00
wm4
0112143fda Split mpvcore/ into common/, misc/, bstr/ 2013-12-17 02:39:45 +01:00