Compare commits

23 Commits

Author | SHA1 | Message | Date
Ventilaar | 30ea647ca9 | Ok, long time no commit. I dont know what ive changed, pray it works | 2024-10-15 15:48:09 +02:00
    Some checks failed:
    Update worker server / build-and-publish (release): Successful in 15s
    Generate docker image / build-and-publish (release): Failing after 25s
Ventilaar | a7c640a8cf | Fix search error, add tombstone | 2024-05-04 22:49:50 +02:00
Ventilaar | f6da232164 | Rename functions | 2024-04-21 00:31:25 +02:00
Ventilaar | 1d5934275c | Handle websub added messages to queue | 2024-04-21 00:26:00 +02:00
Ventilaar | 72af6b6126 | Handle mass websub subscriptions with added statistics. General cleanup | 2024-04-18 23:36:45 +02:00
Ventilaar | 8bf8e08af3 | Forgot admin imports | 2024-04-18 00:59:46 +02:00
Ventilaar | 236b56915b | Handle WebSub endpoint renewing. Basic code for XML parsing (not implemented yet) | 2024-04-18 00:56:22 +02:00
Ventilaar | ac0243a783 | Quick key rename title_slug | 2024-04-17 12:24:14 +02:00
Ventilaar | bb78c97d52 | Do not store websub posted raw data as str | 2024-04-10 11:25:05 +02:00
Ventilaar | 7ccb827a9c | hotfix the hotfix of the hotfix | 2024-04-09 13:01:23 +02:00
Ventilaar | 9c0e4fb63c | Hotfix the websub hotfix. Add button to easily monitor websub callbacks. Clean stuck websub requests after 3 days | 2024-04-09 12:56:57 +02:00
Ventilaar | 75d42ad3cd | Websub callback domain hotfix | 2024-04-09 12:16:47 +02:00
Ventilaar | 4fa0ee2c68 | Hotfix channel sorting | 2024-04-09 12:11:14 +02:00
Ventilaar | 7e06c8673b | Update PyJWT requirement | 2024-04-06 23:27:18 +02:00
Ventilaar | 96565e9e2b | Add small time difference leeway | 2024-04-06 23:23:32 +02:00
Ventilaar | f90b0bdc42 | Secure OIDC login and cleanup | 2024-04-06 22:57:46 +02:00
Ventilaar | 1be9729720 | fix startup when oidc provider is not setup | 2024-04-02 18:49:06 +02:00
Ventilaar | 1918a03e05 | add recently added view | 2024-04-02 18:42:56 +02:00
Ventilaar | ed4f8b03eb | Update readmy to reflect current status of project | 2024-03-30 22:46:03 +01:00
Ventilaar | 7266a437d1 | forgot about the orphaned view | 2024-03-22 00:13:41 +01:00
Ventilaar | 360b80343f | Quick change to load slugified files | 2024-03-22 00:10:41 +01:00
Ventilaar | 45348d2cf5 | Sort orphaned videos by added date, add queue functionality | 2024-03-21 15:22:56 +01:00
Ventilaar | e80318fc6b | hotfix caching websub | 2024-03-20 23:11:38 +01:00
30 changed files with 1317 additions and 346 deletions

View File

@@ -1,4 +1,4 @@
name: Generate release
name: Generate docker image
on:
release:
@@ -22,13 +22,4 @@ jobs:
uses: docker/build-push-action@v5
with:
push: true
tags: git.ventilaar.nl/ventilaar/ayta:latest
- name: Update worker server
uses: appleboy/ssh-action@v1.0.3
with:
host: 192.168.66.109
username: root
key: ${{ secrets.SERVER_KEY }}
port: 22
script: /root/update_worker.sh
tags: git.ventilaar.nl/ventilaar/ayta:latest

View File

@@ -0,0 +1,18 @@
name: Update worker server
on:
release:
types: [published]
jobs:
build-and-publish:
runs-on: ubuntu-latest
steps:
- name: Update worker server
uses: appleboy/ssh-action@v1.0.3
with:
host: 192.168.66.109
username: root
key: ${{ secrets.SERVER_KEY }}
port: 22
script: /root/update_worker.sh

View File

@@ -3,27 +3,67 @@
This project will be awesome, only if I invest enough time. This software will replace my
current cronjob yt-dlp archive service.
Partially inspired by [hobune](https://github.com/rebane2001/hobune). While that project is amazing by it's own, it's just not scaleable.
Partially inspired by [hobune](https://github.com/rebane2001/hobune). While that project is amazingby it's own, it's just not scaleable.
## The idea
The new setup will either be fully running in flask, including the task that checks the
youtube channels every x hours. Or Flask will be used as the gui frontend, and a separate
script will do the channel archiving. I have not decided yet.
What currently works is that the gui frontend calls to a separate database while a cronjob
handles the downloading of new videos from a list of channels.
Having over 250k videos, scaling the current cronjob yt-dlp archive task is just really hard. Filetypes change, things get partially downloaded and such.
Partially yt-dlp is to blame because it's a package that needs to change all the time. But with this some changes are not accounted for.
yt-dlp will still do the downloads. But a flask frontend will be developed to make all downloaded videos easily indexable.
For it to be quick (unlike hobune) a database has to be implemented. This could get solved by a static site generator type of software, but that is not my choice.
The whole software package will use postgresql as a data backend and celery as background tasks.
Current development, however, uses mongodb just because it's easy.
## How it works currently(legacy)
In the legacy folder you will find files that are currently in my archive project. How it works is
that I have a cronjob running every 6 hours which then runs yt-dlp with a config file. In that config
that I have a cronjob running every 24 hours which then runs yt-dlp with a config file. In that config
file a channel list contains all the channels that yt-dlp needs to update. If a new video has been
uploaded, yt-dlp will automatically download a 720p version of the video, all subtitles at that time
(rip community captions, will not forget you) and a json file with all the rest of the metadata. Oh
and also the thumbnail.
This works. But it is very slow and uses lots of "API" calls to youtube, which will sometimes get
the IP blocked. This needs to be overhauled.
the IP blocked. This is why full channel upload pages are not downloaded anymore; I have limited it to the first 50 videos.
## Goals
Some goals have been set up which will prioritise functionality for the software package.
The starting status is that info.json files of videos are loaded into the mongodb database on which flask
will generate a page for channels and videos to load. But this has major limitations which will not be described right now
but will be reflected in the goals.
### Stage 1
Tasks which have to be finished before the GUI frontend is usable as a manager and user, in no particular order.
- [x] Have videos and channels listed on a page
- [x] Have a secured admin page where the database can be managed
- [x] Have working video streaming
- [x] CI/CD pipeline for quicker deployment
- [x] Add caching to speed up pages
- [x] Add ratelimiting for expensive pages
- [x] Ability to show cronjob logs to easily troubleshoot
### Stage 2
Extra functionality for further development of features.
- [x] Fix video titles on disk with slugs
- [x] Working search functionality
- [x] Video reporting functionality
- [x] Ability (for external applications) to queue up video ids for download
- [x] Add websub requesting and receiving ability. (not fully usable yet without celery tasks)
- [x] OIDC or Webauthn logins instead of static argon2 passwords
### Stage 3
Mainly focused on retiring the cronjob based scripts and moving it to celery based tasks
- [ ] manage videos by ID's instead of per channel basis
- [ ] download videos from queue
- [ ] Manage websub callbacks
### Stage 4
Mongodb finally has its limitations.
- [ ] Migrate to postgresql
### Stage ...
Since this is my flagship software which I have developed more features will be added.
It may take some time since this is just a hobby for me. And I'm not a programmer by title.
## Things learned
### Video playlists
@@ -50,26 +90,22 @@ If you swap the channel name to channel id. The folders will never change.
### Storage structure
The following folder structure is pretty nice for using static scripts. The one drawback
is that you can't search for video id's or titles. Because the search takes too long.
This is mainly why we need a new system using a database.
```
./videos/{channel_id}/{upload_date}/{video_id}/video_title.mp4
```
For the new system using a blob like storage will be key. I had the following in mind. It will be an independent
random key and not the YouTube video ID because I have noticed that multiple real videos exist under the same key by
uploaders who replace old videos.
This is mainly why we need a new system using a database mainly for search.
The following structure is easily scalable and usable in an object storage format.
```
-| data
| - videos
| - 128bit_random_id.mp4
| - subtitles
| - same_random_id_EN.srt
| - same_random_id_DE.srt
| - thumbnails
| - 128bit_random_id.jpg
./videos/{channel_id}/{video_id}/video-title-slug-format.info.json
```
## API things learned
### YouTube push notifications in API form exist
Using the pubsubhubbub service provided by Google we will implement downloading videos based on uploads.
The API is based on WebSub, which is well documented.
The hub will give xml+atom notifications when a video is uploaded by a channel and when a video is deleted.
The goal is to download a video when a notification gets through, and run a full channel sync when a video is deleted.
This will be next to periodic full channel polling to download videos which the hub has not notified us about.
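A rough sketch of how such a notification body could be turned into an "added" or "deleted" event, using only the standard library. The function name `parse_notification`, the returned tuple shape and the namespace URIs are assumptions made for illustration (the URIs are taken from how YouTube's feeds are commonly seen, not from this project), so treat it as a starting point rather than the parser used here.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are an assumption based on YouTube's Atom feeds.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "yt": "http://www.youtube.com/xml/schemas/2015",
    "at": "http://purl.org/atompub/tombstones/1.0",
}


def parse_notification(body):
    """Return (state, channel_id, video_id), or None if the body is unusable."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None

    # Upload or update: <entry> carrying <yt:videoId> and <yt:channelId>
    entry = root.find("atom:entry", NS)
    if entry is not None:
        video_id = entry.findtext("yt:videoId", namespaces=NS)
        channel_id = entry.findtext("yt:channelId", namespaces=NS)
        if video_id and channel_id:
            return ("added", channel_id, video_id)
        return None

    # Deletion: <at:deleted-entry ref="yt:video:VIDEO_ID">
    deleted = root.find("at:deleted-entry", NS)
    if deleted is not None:
        ref = deleted.get("ref", "")
        prefix = "yt:video:"
        if ref.startswith(prefix):
            # The channel is not read here; a full sync would look it up.
            return ("deleted", None, ref[len(prefix):])

    return None
```

Since the body comes from the network, swapping in a hardened XML parser may be worth considering before relying on this.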
### Etag is useful
When we will call the api for 50 items in a playlist we also get an etag back.
This is a sort of hash of the returned data.
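A small sketch of how a stored etag could be used to skip unchanged playlists. It assumes the API honours the standard HTTP `If-None-Match` header and answers `304 Not Modified` when nothing changed; both helper names are made up for this example.

```python
def conditional_headers(cached_etag):
    """Headers for a request that should only return data if it changed."""
    if not cached_etag:
        return {}
    return {"If-None-Match": cached_etag}


def playlist_changed(status_code, response_etag, cached_etag):
    """Decide whether the 50 returned items need to be processed again."""
    # 304 means the server confirmed our cached copy is still current
    if status_code == 304:
        return False
    # Otherwise compare the hash of the returned data with the stored one
    return response_etag != cached_etag
```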

View File

@@ -1,45 +1,55 @@
def create_app(test_config=None):
import os, secrets
from flask import Flask
from ayta.extensions import limiter, caching, celery_init_app
from ayta.extensions import limiter, caching, celery_init_app, oidc
from werkzeug.middleware.proxy_fix import ProxyFix
from . import filters
config = {'MONGO_CONNECTION': os.environ.get('AYTA_MONGOCONNECTION', 'mongodb://root:example@192.168.66.140:27017'),
'S3_CONNECTION': os.environ.get('AYTA_S3CONNECTION', '192.168.66.111:9001'),
'S3_ACCESSKEY': os.environ.get('AYTA_S3ACCESSKEY', 'lnUiGClFVXVuZbsr'),
'S3_SECRETKEY': os.environ.get('AYTA_S3SECRETKEY', 'Qz9NG7rpcOWdK2WL'),
'CACHE_TYPE': os.environ.get('AYTA_CACHETYPE', 'SimpleCache'),
'OIDC_PROVIDER': os.environ.get('AYTA_OIDC_PROVIDER', 'https://auth.ventilaar.nl'),
'OIDC_ID': os.environ.get('AYTA_OIDC_ID', 'ayta'),
'CACHE_DEFAULT_TIMEOUT': int(os.environ.get('AYTA_CACHETIMEOUT', 6)),
'SECRET_KEY': os.environ.get('AYTA_SECRETKEY', secrets.token_hex(32)),
'DEBUG': bool(os.environ.get('AYTA_DEBUG', False)),
'DOMAIN': os.environ.get('AYTA_DOMAIN', 'testing.mashallah.nl'),
'CELERY': dict(broker_url=str(os.environ.get('AYTA_CELERYBROKER', 'amqp://guest:guest@192.168.66.140:5672/')),
task_ignore_result=True,)
'DOMAIN': os.environ.get('AYTA_DOMAIN', 'https://testing.mashallah.nl'),
'CELERY': {'broker_url': str(os.environ.get('AYTA_CELERYBROKER', 'amqp://guest:guest@192.168.66.140:5672/'))}
}
# Static Flask configuration options
config['CELERY']['task_ignore_result'] = True
config['CACHE_TYPE'] = 'SimpleCache'
config['SECRET_KEY'] = secrets.token_bytes(32)
# Celery Periodic tasks
config['CELERY']['beat_schedule'] = {}
config['CELERY']['beat_schedule']['Renew WebSub endpoints'] = {'task': 'ayta.tasks.websub_renew_expiring', 'schedule': 4000}
config['CELERY']['beat_schedule']['Process WebSub data'] = {'task': 'ayta.tasks.websub_process_data', 'schedule': 100}
app = Flask(__name__)
app.config.from_mapping(config)
limiter.init_app(app)
caching.init_app(app)
oidc.init_app(app)
celery_init_app(app)
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1)
app.jinja_env.filters['pretty_duration'] = filters.pretty_duration
app.jinja_env.filters['pretty_time'] = filters.pretty_time
app.jinja_env.filters['current_time'] = filters.current_time
app.jinja_env.filters['epoch_time'] = filters.epoch_time
app.jinja_env.filters['epoch_date'] = filters.epoch_date
from .blueprints import watch
from .blueprints import index
from .blueprints import admin
from .blueprints import search
from .blueprints import channel
from .blueprints import auth
from .blueprints import websub
from .blueprints import api
app.register_blueprint(watch.bp)
app.register_blueprint(index.bp)
@@ -47,6 +57,6 @@ def create_app(test_config=None):
app.register_blueprint(search.bp)
app.register_blueprint(channel.bp)
app.register_blueprint(auth.bp)
app.register_blueprint(websub.bp)
app.register_blueprint(api.bp)
return app
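
One detail in the configuration above: `bool(os.environ.get('AYTA_DEBUG', False))` is true for any non-empty string, so `AYTA_DEBUG=False` or `AYTA_DEBUG=0` would still switch debug mode on. A minimal sketch of a stricter parser follows; `env_flag` and the accepted spellings are assumptions for illustration and are not part of this commit range.

```python
import os

# Spellings treated as "on"; everything else counts as off (assumed list)
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=os.environ):
    """Read a boolean switch from the environment."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES
```

With this, the config entry could read `'DEBUG': env_flag('AYTA_DEBUG')`.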

View File

@@ -1,11 +1,10 @@
from flask import Blueprint, render_template, request, redirect, url_for, flash
from flask import Blueprint, render_template, request, redirect, url_for, flash, current_app
from ..nosql import get_nosql
from ..s3 import get_s3
from ..dlp import checkChannelId, getChannelInfo
from ..decorators import login_required
from ..tasks import subscribe_websub_callback, unsubscribe_websub_callback
from ..tasks import test_sleep, websub_subscribe_callback, websub_unsubscribe_callback, video_download
from datetime import datetime
import requests
from secrets import token_urlsafe
bp = Blueprint('admin', __name__, url_prefix='/admin')
@@ -72,15 +71,15 @@ def channel(channelId):
value = request.form.get('value', None)
if task == 'subscribe-websub':
task = subscribe_websub_callback.delay(channelId)
task = websub_subscribe_callback.delay(channelId)
flash(f"Started task {task.id}")
return redirect(url_for('admin.channel', channelId=channelId))
if task == 'update-value':
if key == 'active':
if key in ['active', 'websub']:
value = True if value else False
if key == 'added_date':
if key in ['added_date']:
value = datetime.strptime(value, '%Y-%m-%d')
get_nosql().update_channel_key(channelId, key, value)
@@ -110,29 +109,41 @@ def run(runId):
@bp.route('/websub', methods=['GET', 'POST'])
@login_required
def websub():
render = {}
if request.method == 'POST':
task = request.form.get('task', None)
value = request.form.get('value', None)
if task == 'unsubscribe':
channelId = get_nosql().websub_getCallback(value).get('channel')
task = unsubscribe_websub_callback.delay(value, channelId)
task = websub_unsubscribe_callback.delay(value)
flash(f"Started task {task.id}")
return redirect(url_for('admin.websub'))
elif task == 'clean-retired':
get_nosql().websub_cleanRetired()
return redirect(url_for('admin.websub'))
elif task == 'unsubscribe-callbacks':
for callbackId in get_nosql().websub_getCallbacks():
websub_unsubscribe_callback.delay(callbackId)
flash(f"Started unsubscribe tasks for all callbacks")
return redirect(url_for('admin.websub'))
elif task == 'subscribe-channels':
for channelId in get_nosql().list_all_channels(websub=True):
websub_subscribe_callback.delay(channelId)
flash(f'Started subscribe tasks for activated channels')
return redirect(url_for('admin.websub'))
callbackIds = get_nosql().websub_getCallbacks()
callbacks = {}
render['stats'] = get_nosql().websub_statistics()
for callbackId in callbackIds:
callbacks[callbackId] = get_nosql().websub_getCallback(callbackId)
return render_template('admin/websub.html', callbacks=callbacks)
return render_template('admin/websub.html', callbacks=callbacks, render=render)
@bp.route('/reports', methods=['GET', 'POST'])
@login_required
@@ -150,8 +161,96 @@ def reports():
return render_template('admin/reports.html', reports=reports)
@bp.route('/files', methods=['GET', 'POST'])
@bp.route('/queue', methods=['GET', 'POST'])
@login_required
def files():
run = get_s3().list_objects()
return str(run)
def queue():
if request.method == 'POST':
task = request.form.get('task', None)
value = request.form.get('value', None)
if task == 'add-endpoint':
description = request.form.get('description', None)
if not description or len(description) <= 7:
flash('Description must be at least 8 characters long')
if value and len(value) >= 12:
get_nosql().queue_newEndpoint(value, description)
flash(f'Created endpoint ID: {value}')
else:
value = token_urlsafe(16)
get_nosql().queue_newEndpoint(value, description)
flash(f'Created endpoint ID: {value}')
elif task == 'retire':
get_nosql().queue_retireEndpoint(value)
flash(f'Endpoint retired: {value}')
elif task == 'clean-retired':
get_nosql().queue_cleanRetired()
flash(f'Cleaned retired endpoints')
elif task == 'manual-queue':
direct = request.form.get('direct', None)
if direct:
task = video_download.delay(value)
flash(f"Started task {task.id}")
else:
get_nosql().queue_insertQueue(value, 'webui')
flash(f'Added to queue: {value}')
elif task == 'delete-queue':
get_nosql().queue_deleteQueue(value)
flash(f'Deleted from queue: {value}')
elif task == 'empty-queue':
get_nosql().queue_emptyQueue()
flash(f'Queue has been emptied')
return redirect(url_for('admin.queue'))
endpoints = get_nosql().queue_getEndpoints()
queue = get_nosql().queue_getQueue()
return render_template('admin/queue.html', endpoints=endpoints, queue=queue)
@bp.route('/users', methods=['GET', 'POST'])
@login_required
def users():
if request.method == 'POST':
task = request.form.get('task', None)
value = request.form.get('value', None)
if task == 'add-user':
alias = request.form.get('alias', None)
description = request.form.get('description', None)
if value is None or alias is None:
flash('Missing fields')
return redirect(url_for('admin.users'))
doc_id = get_nosql().add_user(value, alias, description)
flash(f'User added: {doc_id}')
return redirect(url_for('admin.users'))
if task == 'delete-user':
get_nosql().delete_user(value)
flash(f'User deleted: {value}')
return redirect(url_for('admin.users'))
users = get_nosql().list_all_users()
return render_template('admin/users.html', users=users)
@bp.route('/workers', methods=['GET', 'POST'])
#@login_required
def workers():
if request.method == 'POST':
task = request.form.get('task', None)
if task == 'test-sleep':
test_sleep.delay()
celery = current_app.extensions.get('celery')
tasks = celery.control.inspect().active()
return render_template('admin/workers.html', tasks=tasks)

View File

@@ -2,12 +2,12 @@ from flask import Blueprint, render_template, request, redirect, url_for, flash,
from ..nosql import get_nosql
from ..extensions import caching, caching_unless
bp = Blueprint('websub', __name__, url_prefix='/websub')
import re
@bp.route('/c/<cap>', methods=['GET', 'POST'])
# Caching GET requests should be safe since this endpoint is used as a capability URL
@caching.cached(unless=caching_unless)
def callback(cap):
bp = Blueprint('api', __name__, url_prefix='/api')
@bp.route('/websub/<cap>', methods=['GET', 'POST'])
def websub(cap):
if request.method == 'GET':
topic = request.args.get('hub.topic')
challenge = request.args.get('hub.challenge')
@@ -33,8 +33,34 @@ def callback(cap):
return challenge
if get_nosql().websub_existsCallback(cap):
if not get_nosql().websub_savePost(cap, str(request.data)):
if not get_nosql().websub_savePost(cap, request.data):
return abort(500)
return '', 202
return abort(404)
return abort(404)
@bp.route('/queue/<cap>', methods=['POST'])
def queue(cap):
# if endpoint does not exist
if not get_nosql().queue_isActive(cap):
return abort(404)
videoId = request.form.get('v')
# if request is not valid
if not videoId:
return abort(400)
# if requested string is not correct
if not re.match(r"^[a-zA-Z0-9_-]{11}$", videoId):
return abort(422)
# if given string is already in the archive
if get_nosql().check_exists(videoId):
return abort(409)
# try to insert
if get_nosql().queue_insertQueue(videoId, cap):
return '', 202
else:
return abort(409)
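
A note on the ID check in `queue()`: with `re.match`, a pattern ending in `$` also accepts a string that has one trailing newline, so `"dQw4w9WgXcQ\n"` would pass. A sketch of a stricter check using `re.fullmatch` is below; `is_video_id` is a hypothetical helper name, and the 11-character alphabet is copied from the pattern above.

```python
import re

# Same alphabet and length as the pattern in queue(); no anchors needed
VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def is_video_id(value):
    """True only if the whole string is an 11-character video ID."""
    return isinstance(value, str) and VIDEO_ID.fullmatch(value) is not None
```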

View File

@@ -1,10 +1,8 @@
from flask import Blueprint, redirect, url_for, render_template, request, session, flash, current_app
from ..extensions import limiter, caching, caching_unless
from flask import Blueprint, redirect, url_for, render_template, request, session, flash, current_app, redirect
from ..extensions import limiter, caching, caching_unless, oidc
from ..nosql import get_nosql
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError
corr = '$argon2id$v=19$m=65536,t=3,p=4$XzX9K2MKRrGWEf/0iHf2AA$m6Q/aHoj1/uct+8a00QTS5xVWnANeMPKVUg4P822sbM'
from time import sleep
bp = Blueprint('auth', __name__, url_prefix='/auth')
@@ -27,7 +25,7 @@ def login():
password = request.form.get('password', None)
if current_app.config.get('DEBUG'):
session['username'] = 'admin'
session['username'] = 'DEBUG ADMIN'
flash('You have been logged in')
return redirect(request.args.get('next', url_for('admin.base')))
@@ -35,19 +33,40 @@ def login():
flash('Password was empty')
return redirect(url_for('auth.login'))
try:
ph = PasswordHasher()
if ph.verify(corr, password):
session['username'] = 'admin'
flash('You have been logged in')
return redirect(request.args.get('next', url_for('admin.base')))
except VerifyMismatchError:
flash('Wrong password')
return redirect(url_for('auth.login'))
except:
flash('Something went wrong')
return redirect(url_for('auth.login'))
return render_template('login.html')
sleep(0.3)
flash('Wrong password')
return redirect(url_for('auth.login'))
return render_template('login.html')
@bp.route('/oidc', methods=['GET'])
def start_oidc():
return redirect(oidc.generate_redirect(), code=302)
@bp.route('/callback', methods=['POST'])
@limiter.limit('30 per day', override_defaults=False)
@caching.cached(unless=caching_unless)
def callback():
state = request.form.get('state', None)
id_token = request.form.get('id_token', None)
if request.form.get('error', None):
return f'We got an error from the authentication provider with the message: {request.form.get("error_description", None)}', 400
if state is None or id_token is None:
return 'Request error', 400
if not oidc.state_check(state):
return 'CSRF Error, state is not valid', 400
sub = oidc.check_bearer(id_token)
if not sub:
return f'Invalid JWT token we got: {id_token}', 400
if not get_nosql().get_user(sub):
return f'Authentication successful, but you are not allowed to access authenticated pages. Please report this ID to the administrators if you want access: {sub}', 403
session['username'] = sub
flash('You have been logged in')
return redirect(request.args.get('next', url_for('admin.base')))

View File

@@ -1,6 +1,5 @@
from flask import Blueprint, render_template, flash, url_for, redirect
from ..nosql import get_nosql
from ..s3 import get_s3
from ..extensions import caching, caching_unless
bp = Blueprint('channel', __name__, url_prefix='/channel')
@@ -8,12 +7,15 @@ bp = Blueprint('channel', __name__, url_prefix='/channel')
@bp.route('')
@caching.cached(unless=caching_unless)
def base():
channels = {}
channels = []
channelIds = get_nosql().list_all_channels()
for channelId in channelIds:
channels[channelId] = get_nosql().get_channel_info(channelId)
channels[channelId]['video_count'] = get_nosql().get_channel_videos_count(channelId)
channel = get_nosql().get_channel_info(channelId)
channel['video_count'] = get_nosql().get_channel_videos_count(channelId)
channels.append(channel)
channels = sorted(channels, key=lambda x: x.get('added_date'), reverse=True)
return render_template('channel/index.html', channels=channels)
@@ -32,7 +34,7 @@ def channel(channelId):
for videoId in videoIds:
videos.append(get_nosql().get_video_info(videoId, limited=True))
videos = sorted(videos, key=lambda x: x.get('upload_date'), reverse=True)
videos = sorted(videos, key=lambda x: x.get('upload_date', '19700101'), reverse=True)
return render_template('channel/channel.html', channel=channelInfo, videos=videos)
@@ -41,8 +43,23 @@ def channel(channelId):
def orphaned():
videoIds = get_nosql().get_orphaned_videos()
videos = {}
videos = []
for videoId in videoIds:
videos[videoId] = get_nosql().get_video_info(videoId, limited=True)
videos.append(get_nosql().get_video_info(videoId, limited=True))
videos = sorted(videos, key=lambda x: x.get('epoch', 0), reverse=True)
return render_template('channel/orphaned.html', videos=videos)
return render_template('channel/orphaned.html', videos=videos)
@bp.route('/recent')
@caching.cached(unless=caching_unless)
def recent():
videoIds = get_nosql().get_recent_videos()
videos = []
for videoId in videoIds:
videos.append(get_nosql().get_video_info(videoId, limited=True))
videos = sorted(videos, key=lambda x: x.get('epoch', 0), reverse=True)
return render_template('channel/recent.html', videos=videos)

View File

@@ -36,4 +36,9 @@ def base():
render['info'] = get_nosql().get_video_info(vGet)
render['params'] = request.args.get('v')
if render['info'].get('_status') != 'available':
flash(render['info'].get('_status_description', 'Video unavailable because of technical errors. Come back later.'))
return redirect(url_for('index.base'))
return render_template('watch/index.html', render=render)

View File

@@ -3,10 +3,12 @@ from flask_limiter.util import get_remote_address
from flask_caching import Cache
from flask import Flask, request, session
from celery import Celery, Task
from .oidc import OIDC
from flask import Flask, request, session
def celery_init_app(app: Flask) -> Celery:
class FlaskTask(Task):
def __call__(self, *args: object, **kwargs: object) -> object:
@@ -46,3 +48,4 @@ limiter = Limiter(
caching = Cache()
oidc = OIDC()

View File

@@ -16,9 +16,15 @@ def pretty_time(time):
except:
return time # return given time
def epoch_time(time):
def epoch_date(epoch):
try:
return datetime.fromtimestamp(time).strftime('%d %b %Y')
return datetime.fromtimestamp(epoch).strftime('%d %b %Y')
except:
return None
def epoch_time(epoch):
try:
return datetime.fromtimestamp(epoch).strftime('%d %b %Y %H:%M:%S')
except:
return None

File diff suppressed because it is too large

162
ayta/oidc.py Normal file
View File

@@ -0,0 +1,162 @@
class OIDC():
"""
This function class is nothing more than a nonce and state store for security in the authentication mechanism.
Additionally this class provides the function to generate redirect url's and check bearer tokens on their validity as well as caching jwt signing keys.
Fairly barebones and should be 100% secure. (famous last words)
This is made for form posted JWT's. While not the most secure it is the most easy way to implement. Moving on to a code based solution might be preferred in the future.
"""
def __init__(self, app=None):
self.states = {}
self.nonces = {}
if app is not None:
self.init_app(app)
def init_app(self, app):
import requests
import jwt
config = app.config.copy()
self.client_id = config['OIDC_ID']
self.provider = config['OIDC_PROVIDER']
self.domain = config['DOMAIN']
self.window = 120 # the time window to allow states and nonces in seconds
# Authentication provider url must be HTTPS and end on a TLD
if self.provider[:8] != 'https://' or self.provider[-1] == '/':
print('Incorrect OIDC provider URI', flush=True)
exit()
# Get the provider configuration endpoints
configuration = requests.get(f'{self.provider}/.well-known/openid-configuration').json()
jwks_uri = configuration.get('jwks_uri')
self.authorize_uri = configuration.get('authorization_endpoint')
# Start the JWKS management client, it will load the keys and maintain them
self.jwks_manager = jwt.PyJWKClient(jwks_uri)
#######################################################
def state_maintenance(self):
from datetime import datetime
# Current time minus the acceptable window
pivot = datetime.now().timestamp() - self.window
# List with expired states
expired_states = [state for state, timestamp in self.states.items() if timestamp <= pivot]
# Remove expired states from store
for state in expired_states:
del self.states[state]
def state_gen(self):
import secrets
from datetime import datetime
# Clean state store first
self.state_maintenance()
# Generate token and paired timestamp
state = secrets.token_urlsafe(8)
timestamp = datetime.now().timestamp()
# Add token to the state store
self.states[state] = timestamp
# Return the state
return state
def state_check(self, state):
# Clean state store first
self.state_maintenance()
# If given state is actively stored
if state in self.states:
# Delete state and return True
del self.states[state]
return True
# Given state is not stored
return False
#######################################################
# Same code as above but a different store for nonces #
#######################################################
def nonce_maintenance(self):
from datetime import datetime
pivot = datetime.now().timestamp() - self.window
expired_nonces = [nonce for nonce, timestamp in self.nonces.items() if timestamp <= pivot]
for nonce in expired_nonces:
del self.nonces[nonce]
def nonce_gen(self):
import secrets
from datetime import datetime
self.nonce_maintenance()
nonce = secrets.token_urlsafe(8)
timestamp = datetime.now().timestamp()
self.nonces[nonce] = timestamp
return nonce
def nonce_check(self, nonce):
self.nonce_maintenance()
if nonce in self.nonces:
del self.nonces[nonce]
return True
return False
#######################################################
def generate_redirect(self):
return str(f'{self.authorize_uri}'
'?response_mode=form_post&response_type=id_token&scope=openid'
f'&redirect_uri={self.domain}/auth/callback'
f'&client_id={self.client_id}'
f'&nonce={self.nonce_gen()}'
f'&state={self.state_gen()}')
def check_bearer(self, token):
import jwt
# Test given JWT
try:
# Get the signed public key from the token
signing_key = self.jwks_manager.get_signing_key_from_jwt(token).key
# Try to decode the token, this will also check the validity in these points:
# 1. Token is signed by expected keys
# 2. Token is issued by the expected provider
# 3. Expected parameters are really in the token
# 4. Token is really intended for us
# 5. Token is still valid (with 5 sec margin)
decoded = jwt.decode(token, signing_key,
algorithms=jwt.algorithms.get_default_algorithms(),
issuer=self.provider,
require=['aud', 'client_id', 'exp', 'iat', 'iss', 'rat', 'sub'],
audience=self.client_id,
leeway=5)
# Any exception (invalid JWT, invalid formatting etc...) must return False
except Exception as e:
print(e, flush=True)
return False
# Double check if given token is really requested by us by matching the nonce in the signed key
if not self.nonce_check(decoded.get('nonce', None)):
return False
# Return the unique user identifier
return decoded.get('sub', False)
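
`generate_redirect` joins the query string by hand, so `redirect_uri` goes out with its `://` and `/` unescaped. Providers often tolerate that, but a sketch with `urllib.parse.urlencode` avoids depending on it. The standalone function and its parameters are assumptions for illustration; in the class, the values would come from `self.authorize_uri`, `self.domain`, `self.client_id`, `nonce_gen()` and `state_gen()`.

```python
from urllib.parse import urlencode


def build_authorize_url(authorize_uri, domain, client_id, nonce, state):
    """Build the authorization redirect with percent-encoded parameters."""
    query = urlencode({
        "response_mode": "form_post",
        "response_type": "id_token",
        "scope": "openid",
        "redirect_uri": f"{domain}/auth/callback",
        "client_id": client_id,
        "nonce": nonce,
        "state": state,
    })
    return f"{authorize_uri}?{query}"
```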

View File

@@ -1,50 +0,0 @@
from minio import Minio
from minio.error import S3Error
from flask import current_app
from flask import g
##########################################
# SETUP FLASK #
##########################################
def get_s3():
"""Connect to the application's configured database. The connection is unique for each request and will be reused if this is called again."""
if "s3" not in g:
g.s3 = Mineral(current_app.config["S3_CONNECTION"], current_app.config["S3_ACCESSKEY"], current_app.config["S3_SECRETKEY"])
return g.s3
def close_s3(e=None):
"""If this request connected to the database, close the connection."""
s3 = g.pop("s3", None)
if s3 is not None:
s3.close()
def init_app(app):
"""Register database functions with the Flask app. This is called by the application factory."""
app.teardown_appcontext(close_s3)
#app.cli.add_command(init_db_command)
##########################################
# ORM #
##########################################
class Mineral:
def __init__(self, location, access, secret):
try:
self.client = Minio(location, access_key=access, secret_key=secret, secure=False)
except S3Error as exc:
print('Minio connection error ', exc)
def list_objects(self, bucket='ytarchive'):
ret = self.client.list_objects(bucket, '')
rett = []
for r in ret:
print(r.object_name, flush=True)
rett.append(r)
return rett

View File

@@ -1,38 +1,77 @@
from celery import shared_task
from flask import current_app
##########################################
# CELERY TASKS #
##########################################
@shared_task()
def subscribe_websub_callback(channelId):
def test_sleep(time=60):
from time import sleep
sleep(time)
return True
@shared_task()
def video_download(videoId):
"""
I do not want to deal with the quirks of native yt-dlp in python, hence the subprocess.
"""
import subprocess
process = subprocess.run(['/usr/local/bin/yt-dlp', '--config-location', '/var/www/archive.ventilaar.net/goodstuff/config_video.conf', '--', f'https://www.youtube.com/watch?v={videoId}'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
if process.returncode != 0:
return False
return True
@shared_task()
def websub_subscribe_callback(channelId):
import requests
from .nosql import get_nosql
callbackId = get_nosql().websub_newCallback(channelId)
# check if a callback already exists for channel
answer = get_nosql().websub_existsCallback(channelId, channel=True)
if not answer:
callbackId = get_nosql().websub_newCallback(channelId)
else:
callbackId = answer
url = 'https://pubsubhubbub.appspot.com/subscribe'
data = {
'hub.callback': f'https://{current_app.config["DOMAIN"]}/websub/c/{callbackId}',
'hub.callback': f'{current_app.config["DOMAIN"]}/api/websub/{callbackId}',
'hub.topic': f'https://www.youtube.com/xml/feeds/videos.xml?channel_id={channelId}',
'hub.verify': 'async',
'hub.mode': 'subscribe',
'hub.verify_token': '',
'hub.secret': '',
'hub.lease_numbers': '86400',
'hub.lease_seconds': '432000',
}
get_nosql().websub_requestingCallback(callbackId)
response = requests.post(url, data=data)
if response.status_code == 202:
return True
# maybe handle errors?
return False
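
The form fields sent to the hub can be built and checked without any network access. The following is a minimal sketch, not the project's code: `build_subscription_form` and the `archive.example` domain are hypothetical names, and it assumes that the `DOMAIN` config value already includes the scheme, as the new `hub.callback` line suggests. It uses `hub.lease_seconds`, which is the field name defined by the WebSub specification.

```python
from urllib.parse import urlencode

HUB_URL = 'https://pubsubhubbub.appspot.com/subscribe'
TOPIC_URL = 'https://www.youtube.com/xml/feeds/videos.xml?channel_id={}'


def build_subscription_form(domain, callback_id, channel_id,
                            mode='subscribe', lease_seconds=432000):
    """Return the form fields for a WebSub subscribe or unsubscribe request.

    Hypothetical helper: `domain` is assumed to include the scheme,
    for example 'https://archive.example'.
    """
    if mode not in ('subscribe', 'unsubscribe'):
        raise ValueError(f'unsupported hub.mode: {mode}')
    form = {
        'hub.callback': f'{domain}/api/websub/{callback_id}',
        'hub.topic': TOPIC_URL.format(channel_id),
        'hub.verify': 'async',
        'hub.mode': mode,
    }
    if mode == 'subscribe':
        # The lease is only meaningful when subscribing (432000 s = 5 days).
        form['hub.lease_seconds'] = str(lease_seconds)
    return form


if __name__ == '__main__':
    form = build_subscription_form('https://archive.example', 'abc123', 'UCexample')
    # This is the body that requests.post(HUB_URL, data=form) would send.
    print(urlencode(form))
```

Keeping the form construction in one function means the subscribe and unsubscribe tasks cannot drift apart in their callback and topic URLs.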
@shared_task()
def unsubscribe_websub_callback(callbackId, channelId):
def websub_unsubscribe_callback(callbackId):
import requests
from .nosql import get_nosql
answer = get_nosql().websub_existsCallback(callbackId)
if not answer:
return False
channelId = get_nosql().websub_getCallback(callbackId).get('channel')
url = 'https://pubsubhubbub.appspot.com/subscribe'
data = {'hub.callback': f'https://{current_app.config["DOMAIN"]}/websub/c/{callbackId}',
data = {'hub.callback': f'{current_app.config["DOMAIN"]}/api/websub/{callbackId}',
'hub.topic': f'https://www.youtube.com/xml/feeds/videos.xml?channel_id={channelId}',
'hub.verify': 'async',
'hub.mode': 'unsubscribe'
@@ -44,4 +83,101 @@ def unsubscribe_websub_callback(callbackId, channelId):
if response.status_code == 202:
return True
return False
# maybe handle errors?
return False
@shared_task()
def websub_process_data():
from .nosql import get_nosql
while True:
blob = get_nosql().websub_getFirstPostData()
if not blob:
break
_id, data = blob
parsed = do_parse_data(data)
if parsed:
state, channelId, videoId = parsed
if state == 'added':
if not get_nosql().check_exists(videoId): # if video not exists
get_nosql().queue_insertQueue(videoId, 'WebSub')
# note for future me
# the websub notifications report ALL videos, including shorts and livestreams
# so if you are going to work on individual video downloading make sure you filter them!
elif state == 'removed':
# we currently do not do anything with removed videos
# but the idea is to trigger a full channel mirror in case a creator started to mass delete videos
pass
get_nosql().websub_deletePostProcessing(_id)
@shared_task()
def websub_renew_expiring(hours=6):
from .nosql import get_nosql
from datetime import datetime, timedelta
count = 0
for callbackId in get_nosql().websub_getCallbacks():
data = get_nosql().websub_getCallback(callbackId)
if data.get('status') not in ['active']: # callback not active
continue
pivot = datetime.utcnow() + timedelta(hours=hours) # hours past now
expires = data.get('activation_time') + timedelta(seconds=data.get('lease')) # callback expires at
if pivot <= expires: # expiration happens after n hours from now
continue # skip callback
# expiration happens within n hours
websub_subscribe_callback.delay(data.get('channel'))
# limit amount of subscribe requests to spread out the requests over time
# with an expiration pivot of 6h and a maximum validity of 5 days we can currently handle 3072 channels
count = count + 1
if count >= 256:
break
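
The renewal decision above boils down to one comparison between the lease end and a pivot a few hours ahead. The following is a minimal sketch of that check in isolation. `expires_within` is a hypothetical helper, and it assumes timezone-aware datetimes, whereas the task above uses naive values from `datetime.utcnow()`. Mixing naive and aware values raises a `TypeError`, so the stored `activation_time` would have to be aware as well.

```python
from datetime import datetime, timedelta, timezone


def expires_within(activation_time, lease_seconds, hours=6, now=None):
    """Return True when the lease ends less than `hours` from `now`.

    Hypothetical helper mirroring the pivot check in websub_renew_expiring:
    a callback is skipped when pivot <= expires, so it is renewed otherwise.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    expires = activation_time + timedelta(seconds=lease_seconds)
    pivot = now + timedelta(hours=hours)
    return expires < pivot


if __name__ == '__main__':
    now = datetime(2024, 4, 18, 12, 0, tzinfo=timezone.utc)
    # Activated almost five days ago: the 5-day lease ends in one hour.
    old = now - timedelta(days=5) + timedelta(hours=1)
    print(expires_within(old, 432000, now=now))
    # Activated just now: the lease ends in five days.
    print(expires_within(now, 432000, now=now))
```

Passing `now` explicitly keeps the check deterministic, which makes the boundary case (lease ending exactly at the pivot) easy to test.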
##########################################
# TASK MODULES #
##########################################
def do_parse_data(data):
import xml.etree.ElementTree as ET
data = data.decode('utf-8')
try:
root = ET.fromstring(data)
except ET.ParseError:
print('Not XML')
return False
yt = any(child.tag.startswith('{http://www.youtube.com/xml/schemas/2015}') for child in root.iter())
at = any(child.tag.startswith('{http://purl.org/atompub/tombstones/1.0}') for child in root.iter())
if yt and not at:
# Video published
state = 'added'
ns = {'yt': 'http://www.youtube.com/xml/schemas/2015', '': 'http://www.w3.org/2005/Atom'}
entry = root.find('.//{http://www.w3.org/2005/Atom}entry')
videoId = entry.find('./yt:videoId', ns).text
channelId = entry.find('./yt:channelId', ns).text
elif not yt and at:
# Video hidden
state = 'removed'
ns = {'at': 'http://purl.org/atompub/tombstones/1.0', '': 'http://www.w3.org/2005/Atom'}
deleted_entry = root.find('.//{http://purl.org/atompub/tombstones/1.0}deleted-entry')
videoId = deleted_entry.attrib['ref'].split(':')[-1]
channelId = deleted_entry.find('./at:by/uri', ns).text.split('/')[-1]
else:
print('Unknown xml')
return False
return (state, channelId, videoId)
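
To see what the parser above is expected to return, it helps to run the same logic against small sample payloads. The following is a minimal sketch under stated assumptions: `parse_notification` is a hypothetical variant of `do_parse_data`, the two payloads are hand-written and only modeled on the structure the code above expects (they are not captured hub traffic), and the IDs are placeholders. Unlike the original it returns `None` instead of `False` and guards against a missing entry element.

```python
import xml.etree.ElementTree as ET

NS = {
    'atom': 'http://www.w3.org/2005/Atom',
    'yt': 'http://www.youtube.com/xml/schemas/2015',
    'at': 'http://purl.org/atompub/tombstones/1.0',
}

# Hand-written sample payloads with placeholder IDs (assumptions, see above).
ADDED = (
    b'<feed xmlns:yt="http://www.youtube.com/xml/schemas/2015" '
    b'xmlns="http://www.w3.org/2005/Atom"><entry>'
    b'<yt:videoId>VIDEOID0001</yt:videoId>'
    b'<yt:channelId>UCexamplechannel</yt:channelId>'
    b'</entry></feed>'
)
REMOVED = (
    b'<feed xmlns:at="http://purl.org/atompub/tombstones/1.0" '
    b'xmlns="http://www.w3.org/2005/Atom">'
    b'<at:deleted-entry ref="yt:video:VIDEOID0001"><at:by>'
    b'<uri>https://www.youtube.com/channel/UCexamplechannel</uri>'
    b'</at:by></at:deleted-entry></feed>'
)


def parse_notification(raw):
    """Return (state, channelId, videoId), or None for unusable input."""
    try:
        root = ET.fromstring(raw)
    except ET.ParseError:
        return None
    entry = root.find('.//atom:entry', NS)
    if entry is not None:
        video = entry.findtext('yt:videoId', namespaces=NS)
        channel = entry.findtext('yt:channelId', namespaces=NS)
        if video and channel:
            return ('added', channel, video)
    deleted = root.find('.//at:deleted-entry', NS)
    if deleted is not None:
        ref = deleted.get('ref', '')
        uri = deleted.findtext('at:by/atom:uri', namespaces=NS)
        if ref and uri:
            # The IDs are the last segments of the ref and the channel URI.
            return ('removed', uri.rsplit('/', 1)[-1], ref.rsplit(':', 1)[-1])
    return None


if __name__ == '__main__':
    print(parse_notification(ADDED))
    print(parse_notification(REMOVED))
    print(parse_notification(b'not xml'))
```

Using explicit prefixes for every namespace avoids relying on the empty-prefix entry in the namespace map, which only works on newer Python versions.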

View File

@@ -19,7 +19,7 @@
{% for item in channelInfo %}
<form method="POST">
<div class="input-field">
<span class="supporting-text">{{ item }}</span>
<span class="supporting-text mb-2">{{ item }}</span>
<input class="validate" type="text" value="{{ item }}" name="key" hidden>
</div>

View File

@@ -11,59 +11,89 @@
<div class="divider"></div>
<div class="row">
<div class="col s12">
<h5>Global channel options</h5>
<h5>Global channel options</h5>
</div>
</div>
<div class="row">
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.system') }}">
<div class="card black-text">
<a href="{{ url_for('admin.system') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">System</span>
<p class="grey-text">Internal system settings</p>
<p class="grey-text">Internal system settings</p>
</div>
</div>
</a>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.channels') }}">
<div class="card black-text">
<a href="{{ url_for('admin.channels') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Channels</span>
<p class="grey-text">Manage channels in the system</p>
<p class="grey-text">Manage channels in the system</p>
</div>
</div>
</a>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.runs') }}">
<div class="card black-text">
<a href="{{ url_for('admin.runs') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Archive runs</span>
<p class="grey-text">Look at the cron run logs</p>
<p class="grey-text">Look at the cron run logs</p>
</div>
</div>
</a>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.websub') }}">
<div class="card black-text">
<a href="{{ url_for('admin.websub') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">WebSub</span>
<p class="grey-text">Edit WebSub YouTube links</p>
<p class="grey-text">Edit WebSub YouTube links</p>
</div>
</div>
</a>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.reports') }}">
<div class="card black-text">
<a href="{{ url_for('admin.reports') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Reports</span>
<p class="grey-text">View user reports</p>
<p class="grey-text">View user reports</p>
</div>
</div>
</a>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.queue') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Queue</span>
<p class="grey-text">Video download queue and API access</p>
</div>
</div>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.users') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Users</span>
<p class="grey-text">Authenticated users</p>
</div>
</div>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.workers') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Workers</span>
<p class="grey-text">Worker and task management</p>
</div>
</div>
</a>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,163 @@
{% extends 'material_base.html' %}
{% block title %}Queue administration page{% endblock %}
{% block description %}Queue administration page of the AYTA system{% endblock %}
{% block content %}
<div class="row">
<div class="col s12">
<h4>Queue administration page</h4>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s12">
<h5>Options</h5>
</div>
</div>
<div class="row">
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Direct actions</span>
<form class="mt-4" method="post" onsubmit="return confirm('Are you sure?');">
<button class="btn mb-2 red" type="submit" name="task" value="empty-queue">Empty Queue</button>
<br>
<span class="supporting-text">Removes all queued ids</span>
</form>
<form class="mt-4" method="post" onsubmit="return confirm('Are you sure?');">
<button class="btn mb-2" type="submit" name="task" value="clean-retired">Clean retired</button>
<br>
<span class="supporting-text">Prunes all deactivated endpoints, but keeps those from the last 3 days</span>
</form>
</div>
</div>
</div>
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Create new endpoint</span>
<form method="post">
<div class="row">
<div class="col s12 input-field">
<input placeholder="Custom endpoint" name="value" type="text" class="validate" minlength="12">
<span class="supporting-text">Leaving this empty will create a random secure string</span>
</div>
<div class="col s12 input-field">
<input placeholder="Description" name="description" type="text" class="validate" minlength="8" maxlength="64" required>
<span class="supporting-text">Description for the endpoint for better administration</span>
</div>
<button class="btn mt-4" type="submit" name="task" value="add-endpoint">Create</button>
</div>
</form>
</div>
</div>
</div>
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Queue manually</span>
<form method="post">
<div class="row">
<div class="col s12 input-field">
<input placeholder="YouTube video ID" name="value" type="text" class="validate" minlength="11" maxlength="11" required>
<span class="supporting-text">Must be a valid YouTube video ID</span>
</div>
<div class="col s12 mt-5 input-field">
<div class="switch">
<label>Queue<input type="checkbox" value="direct" name="direct"><span class="lever"></span>Direct</label>
<span class="supporting-text">Queue up or start directly</span>
</div>
</div>
<button class="btn mt-4" type="submit" name="task" value="manual-queue">Queue</button>
</div>
</form>
</div>
</div>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Registered endpoints</h5>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
<label for="filter_query">Filter results</label>
</div>
</div>
<div class="row">
<div class="col s12">
<table class="striped highlight responsive-table">
<thead>
<tr>
<th>Actions</th>
<th>id</th>
<th>description</th>
<th>status</th>
<th>created_time</th>
<th>retired_time</th>
</tr>
</thead>
<tbody>
{% for endpoint in endpoints %}
<tr class="filterable">
<td>
<form method="post">
<input type="text" value="{{ endpoint.get('id') }}" name="value" hidden>
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="retire" title="Retire endpoint" {% if endpoint.get('status') != 'active' %}disabled{% endif %}>🗑️</button>
</form>
</td>
<td>{{ endpoint.get('id') }}</td>
<td>{{ endpoint.get('description') }}</td>
<td>{{ endpoint.get('status') }}</td>
<td>{{ endpoint.get('created_time') }}</td>
<td>{{ endpoint.get('retired_time') }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Queued IDs</h5>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
<label for="filter_query">Filter results</label>
</div>
</div>
<div class="row">
<div class="col s12">
<table class="striped highlight responsive-table">
<thead>
<tr>
<th>Actions</th>
<th>id</th>
<th>endpoint</th>
<th>status</th>
<th>created_time</th>
</tr>
</thead>
<tbody>
{% for id in queue %}
<tr class="filterable">
<td>
<form method="post">
<input type="text" value="{{ id.get('id') }}" name="value" hidden>
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="delete-queue" title="Delete from queue" {% if id.get('status') != 'queued' %}disabled{% endif %}>🗑️</button>
</form>
</td>
<td>{{ id.get('id') }}</td>
<td>{{ id.get('endpoint') }}</td>
<td>{{ id.get('status') }}</td>
<td>{{ id.get('created_time') }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,82 @@
{% extends 'material_base.html' %}
{% block title %}Users administration page{% endblock %}
{% block description %}Users administration page of the AYTA system{% endblock %}
{% block content %}
<div class="row">
<div class="col s12 l11">
<h4>Users administration page</h4>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>All users</h5>
</div>
</div>
<div class="row">
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Authorize new user</span>
<form method="post">
<div class="row">
<div class="col s12 input-field">
<input placeholder="sub" name="value" type="text" class="validate" required>
<span class="supporting-text">Unique identifier</span>
</div>
<div class="col s12 input-field">
<input placeholder="Alias" name="alias" type="text" class="validate" required>
<span class="supporting-text">Name of the user</span>
</div>
<div class="col s12 input-field">
<input placeholder="Description" name="description" type="text" class="validate">
<span class="supporting-text">Additional information</span>
</div>
<button class="btn mt-4" type="submit" name="task" value="add-user">Create</button>
</div>
</form>
</div>
</div>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Registered users</h5>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
<label for="filter_query">Filter results</label>
</div>
</div>
<div class="row">
<div class="col s12">
<table class="striped highlight responsive-table">
<thead>
<tr>
<th>Actions</th>
<th>sub</th>
<th>Alias</th>
<th>Description</th>
</tr>
</thead>
<tbody>
{% for user in users %}
<tr class="filterable">
<td>
<form method="post">
<input type="text" value="{{ user.get('sub') }}" name="value" hidden>
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="delete-user" title="Delete user">🗑️</button>
</form>
</td>
<td>{{ user.get('sub') }}</td>
<td>{{ user.get('alias') }}</td>
<td>{{ user.get('description') }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
{% endblock %}

View File

@@ -4,14 +4,9 @@
{% block content %}
<div class="row">
<div class="col s12 l11">
<div class="col s12">
<h4>WebSub administration page</h4>
</div>
<div class="col s12 l1 m-5">
<form method="POST">
<input title="Prunes all retired callbacks, but keeps last 3 days" type="submit" value="clean-retired" name="task">
</form>
</div>
</div>
<div class="divider"></div>
<div class="row">
@@ -19,6 +14,43 @@
<h5>WebSub options</h5>
</div>
</div>
<div class="row">
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Direct actions</span>
<form method="post" onsubmit="return confirm('Are you sure?');">
<button class="btn mb-2 green" type="submit" name="task" value="subscribe-channels">Subscribe channels</button>
<br>
<span class="supporting-text">Send WebSub subscription request for all activated channels. (This will renew existing ones as well)</span>
</form>
<form class="mt-4" method="post" onsubmit="return confirm('Are you sure?');">
<button class="btn mb-2 red" type="submit" name="task" value="unsubscribe-callbacks">Unsubscribe channels</button>
<br>
<span class="supporting-text">Send WebSub unsubscription request for all activated endpoints. (This will only unsubscribe, not disable)</span>
</form>
<form class="mt-4" method="post" onsubmit="return confirm('Are you sure?');">
<button class="btn mb-2" type="submit" name="task" value="clean-retired">Clean retired</button>
<br>
<span class="supporting-text">Prunes all retired callbacks, but keeps those from the last day</span>
</form>
</div>
</div>
</div>
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Statistics</span>
<h6>Unprocessed callback datapoints</h6>
<p>{{ render['stats']['unprocessed_data'] }}</p>
<h6>Active callbacks</h6>
<p>{{ render['stats']['active_callbacks'] }}</p>
<h6>Something</h6>
<p>Blah</p>
</div>
</div>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
@@ -50,6 +82,7 @@
{% for callback in callbacks %}
<tr class="filterable">
<td>
<a target="_blank" rel="noopener noreferrer" href="https://pubsubhubbub.appspot.com/subscription-details?hub.callback={{ config['DOMAIN'] }}/api/websub/{{ callbacks[callback].get('id') }}&hub.topic=https://www.youtube.com/xml/feeds/videos.xml?channel_id={{ callbacks[callback].get('channel') }}"><button class="btn-small waves-effect waves-light" title="Information on Pubsubhubbub (external link)"></button></a>
<form method="post">
<input type="text" value="{{ callbacks[callback].get('id') }}" name="value" hidden>
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="unsubscribe" title="Send unsubscribe request to hub" {% if callbacks[callback].get('status') != 'active' %}disabled{% endif %}>🗑️</button>

View File

@@ -0,0 +1,47 @@
{% extends 'material_base.html' %}
{% block title %}Workers administration page{% endblock %}
{% block description %}Workers administration page of the AYTA system{% endblock %}
{% block content %}
<div class="row">
<div class="col s12">
<h4>Workers administration page</h4>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s12">
<h5>Options</h5>
</div>
</div>
<form method="POST">
<input title="test-sleep" type="submit" value="test-sleep" name="task">
</form>
<div class="divider"></div>
<div class="row">
<div class="col s12">
<h6>Current workers</h6>
{% for worker in tasks %}
<span>{{ worker }}</span>
<table class="striped highlight responsive-table" style=" border: 1px solid black;">
<thead>
<tr>
<th>ID</th>
<th>Task</th>
<th>Time started</th>
</tr>
</thead>
<tbody>
{% for task in tasks[worker] %}
<tr>
<td>{{ task.get('id') }}</td>
<td>{{ task.get('type') }}</td>
<td>{{ task.get('time_start')|epoch_time }}</td>
</tr>
{% endfor %}
</tbody>
</table>
{% endfor %}
</div>
</div>
{% endblock %}

View File

@@ -25,7 +25,7 @@
<div class="card medium black-text">
<a href="{{ url_for('watch.base') }}?v={{ video.get('id') }}">
<div class="card-image">
<img loading="lazy" src="https://archive.ventilaar.net/videos/automatic/{{ video.get('channel_id') }}/{{ video.get('id') }}/{{ video.get('title') }}.jpg">
<img loading="lazy" src="https://archive.ventilaar.net/videos/automatic/{{ video.get('channel_id') }}/{{ video.get('id') }}/{{ video.get('_title_slug') }}.jpg">
</div>
</a>
<div class="card-content activator">

View File

@@ -19,7 +19,17 @@
</div>
</div>
<div class="row">
<div class="col s12 m-4 filterable">
<div class="col s6 m-4">
<a href="{{ url_for('channel.recent') }}">
<div class="card black-text">
<div class="card-content center">
<span class="card-title">Recent videos</span>
<p class="grey-text">The last videos to have been added to the archive</p>
</div>
</div>
</a>
</div>
<div class="col s6 m-4">
<a href="{{ url_for('channel.orphaned') }}">
<div class="card black-text">
<div class="card-content center">
@@ -31,12 +41,12 @@
</div>
{% for channel in channels %}
<div class="col s6 l4 m-4 filterable">
<a href="{{ url_for('channel.channel', channelId=channel) }}">
<a href="{{ url_for('channel.channel', channelId=channel.get('id')) }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">{{ channels[channel].get('original_name') }}</span>
<p class="grey-text">{{ channels[channel].get('id') }}</p>
<p><b>Added:</b> {{ channels[channel].get('added_date')|pretty_time }} | <b>Active:</b> {{ channels[channel].get('active') }} | <b>Videos:</b> {{ channels[channel].get('video_count') }}</p>
<span class="card-title">{{ channel.get('original_name') }}</span>
<p class="grey-text">{{ channel.get('id') }}</p>
<p><b>Added:</b> {{ channel.get('added_date')|pretty_time }} | <b>Active:</b> {{ channel.get('active') }} | <b>Videos:</b> {{ channel.get('video_count') }}</p>
</div>
</div>
</a>

View File

@@ -5,14 +5,14 @@
{% block content %}
<div class="row">
<div class="col s12">
<h4>Channels listing page</h4>
<h4>Videos listing page</h4>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Orphaned videos</h5>
<p>Videos in the archive which do not have a permanent channel linked. There is a high chance that the videos are manually downloaded.</p>
<p>Videos in the archive which do not have a permanent channel linked. There is a high chance that the videos are manually downloaded. Sorted by last added.</p>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
@@ -23,18 +23,19 @@
{% for video in videos %}
<div class="col s6 l4 m-4 filterable">
<div class="card medium black-text">
<a href="{{ url_for('watch.base') }}?v={{ video }}">
<a href="{{ url_for('watch.base') }}?v={{ video.get('id') }}">
<div class="card-image">
<img loading="lazy" src="https://archive.ventilaar.net/videos/automatic/{{ videos[video].get('channel_id') }}/{{ videos[video].get('id') }}/{{ videos[video].get('title') }}.jpg">
<img loading="lazy" src="https://archive.ventilaar.net/videos/automatic/{{ video.get('channel_id') }}/{{ video.get('id') }}/{{ video.get('_title_slug') }}.jpg">
</div>
</a>
<div class="card-content activator">
<span class="card-title">{{ videos[video].get('title') }}</span>
<p class="grey-text">{{ videos[video].get('id') }} | {{ videos[video].get('upload_date')|pretty_time }}</p>
<span class="card-title">{{ video.get('title') }}</span>
<p><b>{{ video.get('uploader') }}</b></p>
<p class="grey-text">{{ video.get('id') }} | {{ video.get('upload_date')|pretty_time }}</p>
</div>
<div class="card-reveal">
<span class="card-title truncate">{{ videos[video].get('title') }}</span>
<p style="white-space: pre-wrap;">{{ videos[video].get('description') }}</p>
<span class="card-title truncate">{{ video.get('title') }}</span>
<p style="white-space: pre-wrap;">{{ video.get('description') }}</p>
</div>
</div>
</div>

View File

@@ -0,0 +1,44 @@
{% extends 'material_base.html' %}
{% block title %}Recent videos{% endblock %}
{% block description %}The last videos to have been added to the archive{% endblock %}
{% block content %}
<div class="row">
<div class="col s12">
<h4>Videos listing page</h4>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Recent videos</h5>
<p>The last 99 videos to have been added to the archive.</p>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
<label for="filter_query">Filter results</label>
</div>
</div>
<div class="row">
{% for video in videos %}
<div class="col s6 l4 m-4 filterable">
<div class="card medium black-text">
<a href="{{ url_for('watch.base') }}?v={{ video.get('id') }}">
<div class="card-image">
<img loading="lazy" src="https://archive.ventilaar.net/videos/automatic/{{ video.get('channel_id') }}/{{ video.get('id') }}/{{ video.get('_title_slug') }}.jpg">
</div>
</a>
<div class="card-content activator">
<span class="card-title">{{ video.get('title') }}</span>
<p><b>{{ video.get('uploader') }}</b></p>
<p class="grey-text">{{ video.get('id') }} | {{ video.get('upload_date')|pretty_time }}</p>
</div>
<div class="card-reveal">
<span class="card-title truncate">{{ video.get('title') }}</span>
<p style="white-space: pre-wrap;">{{ video.get('description') }}</p>
</div>
</div>
</div>
{% endfor %}
</div>
{% endblock %}

View File

@@ -43,6 +43,10 @@
<a href="{{ url_for('channel.channel', channelId='UCzGdxkzULCa9RlD-Q2EZPXQ') }}"><span class="title">Kalashnikov Group</span></a>
<p>Reason: This account has been terminated for a violation of YouTube's Terms of Service.</p>
</li>
<li class="collection-item">
<a href="{{ url_for('channel.channel', channelId='UCtfg1tENiu3SgGMZVduFmTg') }}"><span class="title">FiberNinja</span></a>
<p>Reason: This channel was removed because it violated our Community Guidelines.</p>
</li>
</ul>
</div>
</div>

View File

@@ -4,7 +4,7 @@
{% block content %}
<div class="row">
<div class="col s12 l3">
<div class="col s12 l3 mr-4">
<h4>pls login</h4>
<form method="post">
<div class="input-field">
@@ -12,10 +12,9 @@
</div>
<button class="btn mt-4" type="submit" name="action" value="login">Login</button>
</form>
<div class="divider"></div>
<a href="{{ url_for('auth.start_oidc') }}"><button class="btn mt-4 green">Login with OIDC</button></a>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s12 l3">
<p>This is a WEBP-free archive</p>
<img class="responsive-img" src="{{ url_for('static', filename='img/fuck_webp.png') }}">

View File

@@ -3,15 +3,15 @@
<head>
<link rel="icon" type="image/x-icon" href="/favicon.ico">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='css/materialize.min.css') }}">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='css/custom.css') }}">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='css/custom.css') }}">
<script src="{{ url_for('static', filename='js/jquery-3.7.1.slim.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/materialize.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/custom.js') }}"></script>
<script src="{{ url_for('static', filename='js/materialize.min.js') }}"></script>
<script src="{{ url_for('static', filename='js/custom.js') }}"></script>
<title>{% block title %}{% endblock %} | AYTA</title>
<meta charset="UTF-8">
<meta name="description" content="{% block description %}{% endblock %}">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
{% block opengraph %}{% endblock %}
<meta name="viewport" content="width=device-width, initial-scale=1.0">
{% block opengraph %}{% endblock %}
</head>
<body class="grey lighten-2">
<header>
@@ -20,45 +20,49 @@
<ul id="nav-mobile" class="left">
<li><a href="{{ url_for('channel.base') }}">Channels</a></li>
<li><a href="{{ url_for('admin.base') }}">Admin</a></li>
{% if config.get('DEBUG') %}<li><span class="new badge mt-5" data-badge-caption="True">Debug mode is</span></li>{% endif %}
</ul>
<a href="{{ url_for('index.base') }}" class="brand-logo center">AYTA</a>
<ul id="nav-mobile" class="right">
{% if 'username' in session %}<li><a href="{{ url_for('auth.logout') }}"><span class="new badge" data-badge-caption="{{ session.username }}">Logged in as</span></a></li>{% endif %}
<a href="{{ url_for('index.base') }}" class="brand-logo center">AYTA</a>
<ul id="nav-mobile" class="right">
<li><a href="{{ url_for('search.base') }}">Search</a></li>
<li><a href="{{ url_for('index.help') }}">Help</a></li>
</ul>
</div>
</nav>
</header>
<main>
{% with messages = get_flashed_messages() %}
{% if messages %}
</header>
<main>
{% with messages = get_flashed_messages() %}
{% if messages %}
{% for message in messages %}
<script>M.toast({text: '{{ message }}', displayLength: 5000, outDuration: 999, inDuration: 666})</script>
{% endfor %}
{% endif %}
{% endwith %}
<div class="container">
<noscript>Hey there, while I built this application with minimal JavaScript usage in mind, the experience would be much better if you enabled it!</noscript>
{% block content %}{% endblock %}
</div>
</main>
<script>M.toast({text: '{{ message }}', displayLength: 5000, outDuration: 999, inDuration: 666})</script>
<noscript>A message appeared, but JavaScript is not enabled: {{ message }}</noscript>
{% endfor %}
{% endif %}
{% endwith %}
<div class="container">
<noscript>Hey there, while I built this application with minimal JavaScript usage in mind, the experience would be much better if you enabled it!</noscript>
{% block content %}{% endblock %}
</div>
</main>
<footer class="page-footer deep-orange">
<div class="container">
<div class="row">
<div class="s12 l6">
<div class="s12 l6 mr-4">
<h5>Awesome YouTube Archive</h5>
<p>A custom content management system for archived YouTube videos!</p>
<p>A custom content management system for archived YouTube videos.</p>
</div>
<div class="s12 l6">
<span class="new badge" data-badge-caption="{{ null|current_time }}">Page generated on</span>
<div class="s12 l6">
<h6>Still in development, slowly...</h6>
<h6>This is not a streaming website! Videos may buffer (a lot)!</h6>
<h6>This is not a streaming website! Videos may buffer (a lot)!</h6>
<div class="section mb-4">
<span class="new badge" data-badge-caption="{{ null|current_time }}">Page generated on</span>
{% if config.get('DEBUG') %}<span class="new badge" data-badge-caption="True">Debug mode is</span>{% endif %}
{% if 'username' in session %}<a href="{{ url_for('auth.logout') }}"><span class="new badge" data-badge-caption="{{ session.username }}">Logged in as</span></a>{% endif %}
</div>
</div>
</div>
</div>
</footer>
<script>M.AutoInit();</script>
<script>M.AutoInit();</script>
</body>
</html>

View File

@@ -7,95 +7,85 @@
<meta property="og:type" content="website" />
<meta property="og:url" content="{{ url_for('watch.base') }}?v={{ render.get('info').get('id') }}" />
<meta property="og:image" content="https://archive.ventilaar.net/videos/automatic/{{ render.get('info').get('channel_id') }}/{{ render.get('info').get('id') }}/{{ render.get('info').get('title') }}.jpg" />
<meta property="og:description" content="{{ render.get('info').get('description')|truncate(100) }}" />
<meta property="og:description" content="{{ render.get('info').get('description', '')|truncate(100) }}" />
{% endblock %}
{% block content %}
<div class="row">
<div class="col s12">
<h4>{{ render.get('info').get('title') }}</h4>
</div>
<div class="col s3">
<p><b>Video by:</b> <a href="{{ url_for('channel.channel', channelId=render.get('info').get('channel_id')) }}">{{ render.get('info').get('uploader') }}</a></p>
</div>
<div class="col s3">
<p><b>Upload date:</b> {{ render.get('info').get('upload_date')|pretty_time }}</p>
</div>
<div class="col s3">
<p><b>Archive date:</b> {{ render.get('info').get('epoch')|epoch_time }}</p>
</div>
<div class="col s3">
<p><b>Video length:</b> {{ render.get('info').get('duration')|pretty_duration }}</p>
</div>
</div>
<div class="row">
<div class="col s12 center-align">
<div class="col s12 mt-4 center-align">
<video controls class="responsive-video">
<source src="https://archive.ventilaar.net/videos/automatic/{{ render.get('info').get('channel_id') }}/{{ render.get('info').get('id') }}/{{ render.get('info').get('title') }}.mp4">
<source src="https://archive.ventilaar.net/videos/automatic/{{ render.get('info').get('channel_id') }}/{{ render.get('info').get('id') }}/{{ render.get('info').get('title') }}.webm">
<source src="https://archive.ventilaar.net/videos/automatic/{{ render.get('info').get('channel_id') }}/{{ render.get('info').get('id') }}/{{ render.get('info').get('_title_slug') }}.mp4">
<source src="https://archive.ventilaar.net/videos/automatic/{{ render.get('info').get('channel_id') }}/{{ render.get('info').get('id') }}/{{ render.get('info').get('_title_slug') }}.webm">
Your browser does not support the video tag.
</video>
</div>
</div>
<div class="row">
<div class="col s12 l9 center-align mr-4">
<div class="section">
<div class="row">
<div class="col s12 m3">
<p><a href="https://youtu.be/{{ render.get('info').get('id') }}" target="_blank" rel="noopener noreferrer">▶️ Watch on YouTube</a></p>
</div>
<div class="col s12 m3">
<p><a href="https://archive.ventilaar.net/videos/automatic/{{ render.get('info').get('channel_id') }}/{{ render.get('info').get('id') }}/">🗄️ Source files</a></p>
</div>
<div class="col s12 m3">
<p>Sample text</p>
</div>
<div class="col s12 m3 input-field">
<form method="post">
<select id="report" name="reason">
<option value="" disabled selected></option>
<option value="auto-video">Auto/Video Problems</option>
<option value="metadata">Incorrect metadata</option>
<option value="illegal">Illegal video</option>
</select>
<label for="report">Report a problem</label>
<button for="report" class="btn" type="submit" name="action">Submit report</button>
</form>
</div>
</div>
</div>
<div class="divider"></div>
<div class="section">
<div class="col s12 l9 mr-4">
<h5>{{ render.get('info').get('title') }}</h5>
</div>
<div class="col s12 l3">
<p><b>Video by:</b> <a href="{{ url_for('channel.channel', channelId=render.get('info').get('channel_id')) }}">{{ render.get('info').get('uploader') }}</a></p>
<p><b>Upload date:</b> {{ render.get('info').get('upload_date')|pretty_time }}</p>
<p><b>Archive date:</b> {{ render.get('info').get('epoch')|epoch_date }}</p>
<p><b>Video length:</b> {{ render.get('info').get('duration')|pretty_duration }}</p>
</div>
<div class="col s4 l3 center-align">
<p><a href="https://youtu.be/{{ render.get('info').get('id') }}" target="_blank" rel="noopener noreferrer">▶️ Watch on YouTube</a></p>
</div>
<div class="col s4 l3 center-align">
<p><a href="https://archive.ventilaar.net/videos/automatic/{{ render.get('info').get('channel_id') }}/{{ render.get('info').get('id') }}/">🗄️ Source files</a></p>
</div>
<div class="col s4 l3 center-align">
<p></p>
</div>
<div class="col s12 l3 center-align input-field">
<form method="post">
<select id="report" name="reason">
<option value="" disabled selected></option>
<option value="auto-video">Auto/Video Problems</option>
<option value="metadata">Incorrect metadata</option>
<option value="illegal">Illegal video</option>
</select>
<label for="report">Report a problem</label>
<button for="report" class="btn mt-4" type="submit" name="action">Submit report</button>
</form>
</div>
</div>
<div class="divider mt-4"></div>
<div class="row">
<div class="col s12 l9 mr-4">
<div class="section center-align">
<h5>Description</h5>
<p style="white-space: pre-wrap;" class="left-align">{{ render.get('info').get('description') }}</p>
</div>
<div class="divider"></div>
<div class="section input-field">
<div class="section center-align input-field">
<h5>Full info JSON dump</h5>
<textarea readonly class="materialize-textarea grey lighten">{{ render.get('info') }}</textarea>
<textarea readonly class="materialize-textarea grey lighten">{{ render.get('info') }}</textarea>
</div>
</div>
<div class="col s12 l3 ml-4">
<div class="section">
{% if render.get('info').get('categories') %}
{% if render.get('info').get('categories') %}
<h5>Categories</h5>
<ul class="collection">
{% for category in render.get('info').get('categories') %}
<ul class="collection">
{% for category in render.get('info').get('categories') %}
<li class="collection-item">{{ category }}</li>
{% endfor %}
{% endfor %}
</ul>
{% endif %}
{% endif %}
</div>
<div class="divider"></div>
<div class="section">
{% if render.get('info').get('tags') %}
{% if render.get('info').get('tags') %}
<h5>Tags</h5>
<ul class="collection">
{% for tag in render.get('info').get('tags') %}
{% for tag in render.get('info').get('tags') %}
<li class="collection-item">{{ tag }}</li>
{% endfor %}
{% endfor %}
</ul>
{% endif %}
{% endif %}
</div>
</div>
</div>

@@ -2,13 +2,10 @@
flask
flask-caching
flask-login
flask-oidc
flask-limiter
minio
pymongo
yt-dlp
argon2-cffi
gunicorn
celery
sqlalchemy
sqlalchemy
pyjwt[crypto]