mirror of
https://code.videolan.org/videolan/vlc
synced 2024-09-28 23:09:59 +02:00
257 lines
8.9 KiB
Markdown
257 lines
8.9 KiB
Markdown
|
# Executor
|
||
|
|
||
|
## Overview
|
||
|
|
||
|
The executor API allows to submit _runnables_ to be executed from background
|
||
|
threads.
|
||
|
|
||
|
The runnable instances are allocated and freed by the caller. It is the
|
||
|
responsibility of the caller to guarantee that the runnable is valid until
|
||
|
the task is canceled or complete.
|
||
|
|
||
|
The caller is expected to (but not forced to) embed the runnable into its
|
||
|
custom task structure:
|
||
|
|
||
|
```c
|
||
|
struct my_task
|
||
|
{
|
||
|
/* custom data */
|
||
|
int num;
|
||
|
char *str;
|
||
|
|
||
|
/* embedded runnable */
|
||
|
struct vlc_runnable runnable;
|
||
|
};
|
||
|
|
||
|
void Run(void *userdata)
|
||
|
{
|
||
|
// ...
|
||
|
}
|
||
|
```
|
||
|
|
||
|
and submit it as follow:
|
||
|
|
||
|
```c
|
||
|
struct my_task *task = ...;
|
||
|
|
||
|
task->runnable.run = Run;
|
||
|
task->runnable.userdata = task; /* userdata passed to Run() */
|
||
|
|
||
|
vlc_executor_Submit(executor, &task->runnable);
|
||
|
```
|
||
|
|
||
|
Since the task is allocated by the caller, vlc_executor_Submit() may not fail
|
||
|
(it returns void).
|
||
|
|
||
|
The cancelation of a submitted runnable may be requested. It succeeds only if
|
||
|
the runnable is not started yet.
|
||
|
|
||
|
```c
|
||
|
VLC_API void
|
||
|
vlc_executor_Submit(vlc_executor_t *executor, struct vlc_runnable *runnable);
|
||
|
|
||
|
VLC_API bool
|
||
|
vlc_executor_Cancel(vlc_executor_t *executor, struct vlc_runnable *runnable);
|
||
|
```
|
||
|
|
||
|
|
||
|
## Design discussions
|
||
|
|
||
|
The design of this API is the result of many discussions, especially on the
|
||
|
mailing-list:
|
||
|
- <https://mailman.videolan.org/pipermail/vlc-devel/2020-August/136696.html>
|
||
|
- <https://mailman.videolan.org/pipermail/vlc-devel/2020-September/136944.html>
|
||
|
- <https://mailman.videolan.org/pipermail/vlc-devel/2020-September/137188.html>
|
||
|
|
||
|
This documentation aims to explain the rationale for each decision.
|
||
|
|
||
|
|
||
|
### Runnable allocation
|
||
|
|
||
|
The `struct vlc_runnable` must be provided by the client, and the executor uses
|
||
|
only this instance to represent its task (it does not allocate per-task private
|
||
|
data).
|
||
|
|
||
|
Firstly, note that this allows `vlc_executor_Submit()` to return `void`, so no
|
||
|
error handling is necessary. It's minor, but convenient.
|
||
|
|
||
|
An attractive alternative would be to return a new allocated runnable on
|
||
|
submission, which could be used for cancelation:
|
||
|
|
||
|
```c
|
||
|
struct vlc_runnable *
|
||
|
vlc_executor_Submit(vlc_executor_t *, void (*run)(void *), void *userdata);
|
||
|
|
||
|
void vlc_executor_Cancel(vlc_executor_t *, struct vlc_runnable *);
|
||
|
```
|
||
|
|
||
|
But this is inherently racy: a naive implementation could return a pointer to a
|
||
|
`vlc_runnable` already freed. Indeed, `vlc_executor_Submit()` must post the
|
||
|
runnable to some pending queue, and in theory it may be executed by a background
|
||
|
thread even before the function returns.
|
||
|
|
||
|
For example, if the executor frees the runnable as soon as its execution is
|
||
|
complete, this simple code is incorrect:
|
||
|
|
||
|
```c
|
||
|
struct my_task
|
||
|
{
|
||
|
char *some_custom_data;
|
||
|
struct vlc_runnable *runnable; /* keep a pointer to cancel it */
|
||
|
};
|
||
|
|
||
|
void Run(void *userdata)
|
||
|
{
|
||
|
struct my_task *task = userdata;
|
||
|
/* (task->runnable may be uninitialized here) */
|
||
|
}
|
||
|
```
|
||
|
|
||
|
```c
|
||
|
struct my_task *task = ...;
|
||
|
task->runnable = vlc_executor_Submit(executor, Run, task);
|
||
|
/* task->runnable may be already freed here */
|
||
|
vlc_executor_Cancel(executor, task->runnable); /* boom, use-after-free */
|
||
|
```
|
||
|
|
||
|
To avoid the use-after-free, the runnable ownership must be shared. This makes
|
||
|
the API more complex:
|
||
|
|
||
|
```c
|
||
|
struct vlc_runnable_t *runnable = vlc_executor_Submit(executor, Run, task);
|
||
|
vlc_runnable_Release(runnable); /* don't need the runnable anymore */
|
||
|
```
|
||
|
|
||
|
However, this would still be insufficient: the task resources must typically be
|
||
|
released at the end of the `run()` callback, but at this point
|
||
|
`vlc_executor_Submit()` may not have returned yet, so the runnable instance
|
||
|
would not be known by the user.
|
||
|
|
||
|
Since in practice, the client needs its own task structure (to store the
|
||
|
parameters, state and result of the execution), so it must allocate it and pass
|
||
|
it to the executor (as userdata) anyway. Therefore, it's simpler if the task
|
||
|
already embed the runnable.
|
||
|
|
||
|
Note that it is not mandatory to embed the runnable into the client task
|
||
|
structure (it's just convenient). The runnable could be anywhere, for example on
|
||
|
the stack (provided that it lives long enough).
|
||
|
|
||
|
This design also has disadvantages. In particular, it forces the runnable queue
|
||
|
to be implemented as an intrusive list. As a consequence, the public `struct
|
||
|
vlc_runnable` must include an executor-private field (`struct vlc_list node`),
|
||
|
and the same runnable may not be queued twice (it is not possible to put the
|
||
|
same item twice in the same intrusive list).
|
||
|
|
||
|
Therefore, a client must always explicitly create a new runnable for each
|
||
|
execution. In practice, it should not be a problem though, since it must often
|
||
|
create a new custom task for the execution state anyway.
|
||
|
|
||
|
In addition, further extensions of the executor API are constrained by the fact
|
||
|
that it could not allocate and store per-task data (other than the intrusive
|
||
|
list node).
|
||
|
|
||
|
|
||
|
### Cancelation
|
||
|
|
||
|
It is important to be able to cancel and interrupt a running task.
|
||
|
|
||
|
However, in C, the interruption mechanism is very specific to the concrete task
|
||
|
implementation, so it could not be provided by the executor: it has to be
|
||
|
provided by the user. And since the user is also at the origin of the
|
||
|
cancelation request (which could lead to the interruption), the executor just
|
||
|
let the user handle interruption manually.
|
||
|
|
||
|
Since it does not handle interruption, it could not provide a generic timeout
|
||
|
mechanism either (see below). The user has to handle timeout manually.
|
||
|
|
||
|
It just provides a way to cancel a queued task not started yet:
|
||
|
|
||
|
```c
|
||
|
VLC_API bool
|
||
|
vlc_executor_Cancel(vlc_executor_t *executor, struct vlc_runnable *runnable);
|
||
|
```
|
||
|
|
||
|
The runnable instance is created and destroyed by the user, so there is no
|
||
|
possible use-after-free race condition.
|
||
|
|
||
|
Since the runnable is queued via an intrusive list, it can be removed in O(1).
|
||
|
|
||
|
The tricky part is that it must indicate to the user if the runnable has
|
||
|
actually been dequeued or not (i.e. if it has already been taken by a thread to
|
||
|
be run). Indeed, the user must be able to know if the `run()` callback will be
|
||
|
executed or not, in order to release the task resources correctly in all cases,
|
||
|
without race conditions. This adds some complexity for the user.
|
||
|
|
||
|
For the implementation, the problem is that even if we can remove an item in an
|
||
|
intrusive list in O(1), there is a priori no way to know immediately if the item
|
||
|
was in a specific list or not. To circumvent this problem, when an item is
|
||
|
dequeued, the executor resets the list nodes to `NULL`. This allows to store 1
|
||
|
bit of information indicating whether the item has been dequeued or not.
|
||
|
|
||
|
Note that the use-after-free race condition on cancel could alternatively be
|
||
|
solved by passing an optional user-provided `void *id`:
|
||
|
|
||
|
```c
|
||
|
struct vlc_runnable
|
||
|
{
|
||
|
void (*run)(void *userdata)
|
||
|
void *userdata;
|
||
|
struct vlc_list node;
|
||
|
|
||
|
/* add a user-provided id (may be NULL) used for cancelation */
|
||
|
void *id;
|
||
|
}
|
||
|
|
||
|
void vlc_executor_Submit(vlc_executor_t *executor, struct vlc_runnable *);
|
||
|
|
||
|
void vlc_executor_Cancel(vlc_executor_t *executor, void *id);
|
||
|
```
|
||
|
|
||
|
This would solve the problem because the user owns the `id`, so it is guaranteed
|
||
|
to exist on cancel. This is also practical, because the user could just pass a
|
||
|
pointer to its custom task structure.
|
||
|
|
||
|
An additional advantage is that a higher-level API (like the preparser) could
|
||
|
directly benefit from the cancelation API, without tracking its submitted tasks
|
||
|
to find the matching runnable.
|
||
|
|
||
|
But it would only make sense if the user was not already forced to track its
|
||
|
submitted tasks, to be able to cancel them on deletion. Indeed, since the
|
||
|
executor must report for each task if it has been dequeued or not, it is not
|
||
|
possible to provide a function to cancel all tasks at once. Therefore, the user
|
||
|
has to keep track of submitted tasks, which is inconvenient.
|
||
|
|
||
|
|
||
|
### Timeout
|
||
|
|
||
|
A timeout implementation is often very specific to the concrete task
|
||
|
implementation (for example, calling `vlc_cond_timedwait()` instead of
|
||
|
`vlc_cond_wait()`), so it makes sense to let the user implement it.
|
||
|
|
||
|
However, we could see the timeout as a cancelation where the deadline is known
|
||
|
in advance. Since the user must already implement the interruption on its own,
|
||
|
it could have been used to provide a timeout mechanism "for free" (for the
|
||
|
user).
|
||
|
|
||
|
We decided against it, for several reasons:
|
||
|
- in theory, the user may want to do something different on timeout and
|
||
|
cancelation (it may for example consider that one is an error and the other
|
||
|
is not);
|
||
|
- this would require more callbacks and would complexify the executor API;
|
||
|
- if it's necessary, the user could just use a separate component (timer) to
|
||
|
trigger the interruption.
|
||
|
|
||
|
|
||
|
## Conclusion
|
||
|
|
||
|
The resulting executor is a "minimal" API, simple and general: it just handles
|
||
|
execution.
|
||
|
|
||
|
As a drawback, it does not help for interruption, cancelation and timeout, so
|
||
|
the user has more work to do on its own. For example, it must always track its
|
||
|
submitted tasks, and implement boilerplate to correctly handle cancelation and
|
||
|
interruption on deletion.
|
||
|
|
||
|
But I can't think of an executor API design in C with all the desirable
|
||
|
properties. Any design I considered sucked one way or another. :/
|