amazing-ytdlp-archive
Compare commits: 2be13ba1fb...v0.6.9 (51 commits)
| SHA1 |
|---|
| 1186d236f2 |
| 5a4726ac10 |
| 46bde82d32 |
| 6c681d6b07 |
| 0d5d233e90 |
| 548a4860fc |
| da333ab4f6 |
| f2b01033ea |
| 49f0ea7481 |
| f1287a4212 |
| 30ea647ca9 |
| a7c640a8cf |
| f6da232164 |
| 1d5934275c |
| 72af6b6126 |
| 8bf8e08af3 |
| 236b56915b |
| ac0243a783 |
| bb78c97d52 |
| 7ccb827a9c |
| 9c0e4fb63c |
| 75d42ad3cd |
| 4fa0ee2c68 |
| 7e06c8673b |
| 96565e9e2b |
| f90b0bdc42 |
| 1be9729720 |
| 1918a03e05 |
| ed4f8b03eb |
| 7266a437d1 |
| 360b80343f |
| 45348d2cf5 |
| e80318fc6b |
| 69bf7026dd |
| e264a346a5 |
| c50116b942 |
| 970fd1fa0f |
| c71bd547ca |
| 2dbae35e4e |
| cd06c86b1a |
| fe60b3d981 |
| 4eeb72082c |
| dcca91fef1 |
| 5bf7d5f25c |
| dffd04078a |
| cb82a50dc4 |
| 7e4d872566 |
| 7f6dff2b7a |
| 08e94449ed |
| 5c910b2bca |
| afd07334c5 |
.dockerignore: 8 lines changed, normal file
@@ -0,0 +1,8 @@
+# Ignore everything
+**
+
+# Add required files and folders
+!ayta
+!README.md
+!LICENCE
+!requirements.txt
@@ -1,9 +1,8 @@
-name: Generate release
+name: Generate docker image

 on:
-  push:
-    tags:
-      - 'v*'
+  release:
+    types: [published]

 jobs:
   build-and-publish:
.gitea/workflows/workers-tasks.yaml: 18 lines changed, normal file
@@ -0,0 +1,18 @@
+name: Update worker server
+
+on:
+  release:
+    types: [published]
+
+jobs:
+  build-and-publish:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Update worker server
+        uses: appleboy/ssh-action@v1.0.3
+        with:
+          host: 192.168.66.109
+          username: root
+          key: ${{ secrets.SERVER_KEY }}
+          port: 22
+          script: /root/update_worker.sh
@@ -1,6 +1,7 @@
-FROM python:3-alpine
+FROM python:3.12-alpine
 WORKDIR /app
-COPY . /app
+COPY requirements.txt /app
 RUN pip install --no-cache-dir -r requirements.txt
+COPY . /app
 EXPOSE 8000
-CMD ["gunicorn", "--bind", "0.0.0.0:8000", "ayta:create_app"]
+CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "1", "ayta:create_app()"]
README.md: 84 lines changed
@@ -3,27 +3,67 @@
 This project will be awesome, only if I invest enough time. This software will replace my
 current cronjob yt-dlp archive service.

-Partially inspired by [hobune](https://github.com/rebane2001/hobune). While that project is amazing by it's own, it's just not scaleable.
+Partially inspired by [hobune](https://github.com/rebane2001/hobune). While that project is amazingby it's own, it's just not scaleable.

 ## The idea
-The new setup will either be fully running in flask, including the task that checks the
-youtube channels every x hours. Or Flask will be used as the gui frontend, and a seperate
-script will do the channel archiving. I have not desided yet.
-
-What currently works is that the gui frontend calls to a seperate database while a cronjob
-handles the downloading of new videos from a list of channels.
+Having over 350k videos, scaling the current cronjob yt-dlp archive task is just really hard. Filetypes change, things get partially downloaded and such.
+Partially yt-dlp is to blame because it's a package that needs to change all the time. But with this some changes are not accounted for.
+yt-dlp will still do the downloads. But a flask frontend will be developed to make all downloaded videos easily indexable.
+For it to be quick (unlike hobune) a database has to be implemented. This could get solved by a static site generator type of software, but that is not my choice.
+
+The whole software package will use postgresql as a data backend and celery as background tasks.
+Currently development however is using mongodb just because it's easy.

 ## How it works currently(legacy)
 In the legacy folder you will find files that are currently in my archive project. How it works is
-that I have a cronjob running every 6 hours what then runs yt-dlp with a config file. In that config
+that I have a cronjob running every 24 hours what then runs yt-dlp with a config file. In that config
 file a channel list contains all the channels that yt-dlp needs to update. If a new video has been
 uploaded, yt-dlp will automatically download a 720p version of the video, all subtitles at that time
 (rip community captions, will not forget you) and a json file with all the rest of the metadata. Oh
 and also the thumbnail.

 This works. But is very slow and uses lots of "API" calls to youtube, which will sometimes will get
-the IP blocked. This needs to be overhauled.
+the IP blocked. This is why full channel upload pages are not downloaded anymore, I have limited to first 50 videos.

+## Goals
+Some goals have been set up which will prioritise functionality for the software package.
+The starting status is that info.json files of videos are loaded into the mongodb database on which flask
+will generate a page for channels and videos to load. But this has major limitations which will not be described right now
+but will be reflected in the goals.
+
+### Stage 1
+Tasks which have to be finished before the GUI frontend is usable as a manager and user in no perticular order.
+- [x] Have videos and channels listed on a page
+- [x] Have a secured admin page where the database can be managed
+- [x] Have working video streaming
+- [x] CI/CD pipeline for quicker deployment
+- [x] Add caching to speed up pages
+- [x] Add ratelimiting for expensive pages
+- [x] Ability to show cronjob logs to easily troubleshoot
+
+### Stage 2
+Extra functionality for further development of features.
+- [x] Fix video titles on disk with slugs
+- [x] Working search functionality
+- [x] Video reporting functionality
+- [x] Ability (for external applications) to queue up video ids for download
+- [x] Add websub requesting and receiving ability. (not fully usable yet without celery tasks)
+- [x] OIDC or Webauthn logins instead of static argon2 passwords
+
+### Stage 3
+Mainly focused on retiring the cronjob based scripts and moving it to celery based tasks
+- [ ] manage videos by ID's instead of per channel basis
+- [ ] download videos from queue
+- [x] Manage websub callbacks
+
+### Stage 4
+Mongodb finally has it's limitations.
+- [ ] Migrate to postgresql
+
+### Stage ...
+Since this is my flagship software which I have developed more features will be added.
+It may take some time since this is just a hobby for me. And I'm not a programmer by title.
+

 ## Things learned
 ### Video playlists
@@ -50,26 +90,22 @@ If you swap the channel name to channel id. The folders will never change.
 ### Storage structure
 The following folder structure is pretty nice for using static scripts. The one drawback
 is that you can't search for video id's or titles. Because the search takes too long.
-This is mainly why we need a new system using a database.
-```
-./videos/{channel_id}/{upload_date}/{video_id}/video_title.mp4
-```
-For the new system using a blob like storage will be key. I had the following in mind. It will be an independant
-random key and not the YouTube video ID because I have notices that multiple real videos exist under the same key by
-uploaders who replace old videos.
+This is mainly why we need a new system using a database mainly for search.

+The following structure is easily scaleable and usable in a object storage format.
 ```
--| data
-| - videos
-| - 128bit_random_id.mp4
-| - subtitles
-| - same_random_id_EN.srt
-| - same_random_id_DE.srt
-| - thumbnails
-| - 128bit_random_id.jpg
+./videos/{channel_id}/{video_id}/video-title-slug-format.info.json
 ```

 ## API things learned
+### YouTube push notifications in API form exist
+Using the pubsubhubbub service provided by Google we will implement downloading videos based on uploads.
+The API is based on WebSub which is greatly documented.
+
+The hub will give xml+atom notifications when a video is uploaded by a channel and when a video is deleted.
+The goal is to download a video when a notification gets trough, and run a full channel sync when a video is deleted.
+This will be next to periodic full channel polling to download videos which the hub has not notified us about.
+
 ### Etag is useful
 When we will call the api for 50 items in a playlist we also get an etag back.
 This is a sort of hash of the returned data.
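The ETag note in the README diff above suggests conditional polling: remember the ETag of the last response and let the server say "nothing changed". A minimal sketch, assuming the API honours an `If-None-Match` request header and answers `304 Not Modified` for unchanged data. `EtagCache`, the `fetch` callable, the URL, and the fake response are all hypothetical stand-ins for a real HTTP client.

```python
from typing import Callable, Dict, Optional, Tuple

# (status_code, etag, body): a stand-in for a real HTTP response (assumption)
Response = Tuple[int, Optional[str], Optional[dict]]


class EtagCache:
    """Remember the last ETag and body per URL and skip unchanged responses."""

    def __init__(self, fetch: Callable[[str, Dict[str, str]], Response]):
        self._fetch = fetch
        self._store: Dict[str, Tuple[str, dict]] = {}

    def get(self, url: str):
        """Return (body, changed) for the given URL."""
        headers = {}
        if url in self._store:
            # Ask the server to answer 304 if the data still matches this ETag
            headers['If-None-Match'] = self._store[url][0]

        status, etag, body = self._fetch(url, headers)

        if status == 304 and url in self._store:
            # Nothing changed: reuse the stored body
            return self._store[url][1], False

        if etag is not None and body is not None:
            self._store[url] = (etag, body)
        return body, True


def fake_fetch(url, headers):
    """Hypothetical server whose data never changes."""
    current = 'abc123'
    if headers.get('If-None-Match') == current:
        return 304, current, None
    return 200, current, {'items': ['video1', 'video2']}


cache = EtagCache(fake_fetch)
first_body, first_changed = cache.get('https://example.invalid/playlistItems')
second_body, second_changed = cache.get('https://example.invalid/playlistItems')
```

With this shape, a periodic channel poll would only need to process the 50 returned items when `changed` is true.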
@@ -1,37 +1,55 @@
-import os
-import secrets
-from flask import Flask
-from ayta.extensions import limiter, caching
-
-from . import filters
-
 def create_app(test_config=None):
+    import os, secrets
+    from flask import Flask
+    from ayta.extensions import limiter, caching, celery_init_app, oidc
+    from werkzeug.middleware.proxy_fix import ProxyFix
+
+    from . import filters
+
     config = {'MONGO_CONNECTION': os.environ.get('AYTA_MONGOCONNECTION', 'mongodb://root:example@192.168.66.140:27017'),
-              'S3_CONNECTION': os.environ.get('AYTA_S3CONNECTION', '192.168.66.111:9001'),
-              'S3_ACCESSKEY': os.environ.get('AYTA_S3ACCESSKEY', 'lnUiGClFVXVuZbsr'),
-              'S3_SECRETKEY': os.environ.get('AYTA_S3SECRETKEY', 'Qz9NG7rpcOWdK2WL'),
-              'CACHE_TYPE': os.environ.get('AYTA_CACHETYPE', 'SimpleCache'),
-              'CACHE_DEFAULT_TIMEOUT': os.environ.get('AYTA_CACHETIMEOUT', 5),
-              'SECRET_KEY': os.environ.get('AYTA_SECRETKEY', secrets.token_hex(32)),
-              'DEBUG': os.environ.get('AYTA_DEBUG', True)
+              'OIDC_PROVIDER': os.environ.get('AYTA_OIDC_PROVIDER', 'https://auth.ventilaar.nl'),
+              'OIDC_ID': os.environ.get('AYTA_OIDC_ID', 'ayta'),
+              'CACHE_DEFAULT_TIMEOUT': int(os.environ.get('AYTA_CACHETIMEOUT', 6)),
+              'DEBUG': bool(os.environ.get('AYTA_DEBUG', False)),
+              'DOMAIN': os.environ.get('AYTA_DOMAIN', 'https://testing.mashallah.nl'),
+              'CELERY': {'broker_url': str(os.environ.get('AYTA_CELERYBROKER', 'amqp://guest:guest@192.168.66.140:5672/'))}
              }

+    # Static Flask configuration options
+
+    config['CELERY']['task_ignore_result'] = True
+    config['CACHE_TYPE'] = 'SimpleCache'
+    config['SECRET_KEY'] = secrets.token_bytes(32)
+
+    # Celery Periodic tasks
+
+    config['CELERY']['beat_schedule'] = {}
+    config['CELERY']['beat_schedule']['Renew WebSub endpoints'] = {'task': 'ayta.tasks.websub_renew_expiring', 'schedule': 4000}
+    config['CELERY']['beat_schedule']['Process WebSub data'] = {'task': 'ayta.tasks.websub_process_data', 'schedule': 100}
+
     app = Flask(__name__)
     app.config.from_mapping(config)

     limiter.init_app(app)
     caching.init_app(app)
+    oidc.init_app(app)
+    celery_init_app(app)
+
+    app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1)

     app.jinja_env.filters['pretty_duration'] = filters.pretty_duration
     app.jinja_env.filters['pretty_time'] = filters.pretty_time
-    app.jinja_env.filters['current_time'] = filters.current_time

+    app.jinja_env.filters['epoch_time'] = filters.epoch_time
+    app.jinja_env.filters['epoch_date'] = filters.epoch_date
+
     from .blueprints import watch
     from .blueprints import index
     from .blueprints import admin
     from .blueprints import search
     from .blueprints import channel
     from .blueprints import auth
+    from .blueprints import api

     app.register_blueprint(watch.bp)
     app.register_blueprint(index.bp)
@@ -39,7 +57,6 @@ def create_app(test_config=None):
     app.register_blueprint(search.bp)
     app.register_blueprint(channel.bp)
     app.register_blueprint(auth.bp)
-
-    app.add_url_rule("/", endpoint="base")
+    app.register_blueprint(api.bp)

     return app
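One thing to watch in the config mapping above: `bool()` of any non-empty string is `True`, so setting `AYTA_DEBUG=False` in the environment would still switch debug mode on. A minimal sketch of stricter parsing follows. `env_flag` and its accepted spellings are hypothetical and not part of the project.

```python
import os

# Spellings treated as "on"; everything else counts as "off" (assumption)
TRUE_VALUES = {'1', 'true', 'yes', 'on'}


def env_flag(name: str, default: bool = False, environ=os.environ) -> bool:
    """Read a boolean flag from the environment without the bool('False') trap."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


# The pitfall: a non-empty string is always truthy
pitfall = bool('False')

debug_off = env_flag('AYTA_DEBUG', environ={'AYTA_DEBUG': 'False'})
debug_on = env_flag('AYTA_DEBUG', environ={'AYTA_DEBUG': 'true'})
debug_default = env_flag('AYTA_DEBUG', environ={})
```

The `environ` parameter only exists so the helper can be exercised without touching the real process environment.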
File diff suppressed because it is too large
ayta/blueprints/api.py: 66 lines changed, normal file
@@ -0,0 +1,66 @@
+from flask import Blueprint, render_template, request, redirect, url_for, flash, abort
+from ..nosql import get_nosql
+from ..extensions import caching, caching_unless
+
+import re
+
+bp = Blueprint('api', __name__, url_prefix='/api')
+
+@bp.route('/websub/<cap>', methods=['GET', 'POST'])
+def websub(cap):
+    if request.method == 'GET':
+        topic = request.args.get('hub.topic')
+        challenge = request.args.get('hub.challenge')
+        mode = request.args.get('hub.mode')
+        lease_seconds = request.args.get('hub.lease_seconds')
+
+        if mode not in ['subscribe', 'unsubscribe']:
+            return abort(400)
+
+        if not get_nosql().websub_existsCallback(cap):
+            return abort(404)
+
+        if mode == 'unsubscribe':
+            get_nosql().websub_retireCallback(cap)
+            return challenge
+
+        if not all([topic, challenge, mode, lease_seconds]):
+            return abort(400)
+
+        if not get_nosql().websub_activateCallback(cap, lease_seconds):
+            return abort(500)
+
+        return challenge
+
+    if get_nosql().websub_existsCallback(cap):
+        if not get_nosql().websub_savePost(cap, request.data):
+            return abort(500)
+        return '', 202
+
+    return abort(404)
+
+@bp.route('/queue/<cap>', methods=['POST'])
+def queue(cap):
+    # if endpoint does not exist
+    if not get_nosql().queue_isActive(cap):
+        return abort(404)
+
+    videoId = request.form.get('v')
+
+    # if request is not valid
+    if not videoId:
+        return abort(400)
+
+    # if requested string is not correct
+    if not re.match(r"^[a-zA-Z0-9_-]{11}$", videoId):
+        return abort(422)
+
+    # if given string is already in the archive
+    if get_nosql().check_exists(videoId):
+        return abort(409)
+
+    # try to insert
+    if get_nosql().queue_insertQueue(videoId, cap):
+        return '', 202
+    else:
+        return abort(409)
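The `websub` route above only stores the raw POST body, so a later task still has to turn it into video IDs. A minimal sketch of that parsing step follows, assuming the hub sends an Atom feed with `yt:videoId` and `yt:channelId` elements for uploads and an `at:deleted-entry` tombstone for deletions. The namespace URIs, `parse_notification`, and both sample payloads are assumptions made for illustration and are not taken from this repository.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for YouTube push notifications
NS = {
    'atom': 'http://www.w3.org/2005/Atom',
    'yt': 'http://www.youtube.com/xml/schemas/2015',
    'at': 'http://purl.org/atompub/tombstones/1.0',
}


def parse_notification(raw: bytes) -> list:
    """Turn one stored notification body into a list of upload/delete events."""
    root = ET.fromstring(raw)
    events = []

    # Uploads (and updates) arrive as regular Atom entries
    for entry in root.findall('atom:entry', NS):
        events.append({
            'type': 'upload',
            'video_id': entry.findtext('yt:videoId', namespaces=NS),
            'channel_id': entry.findtext('yt:channelId', namespaces=NS),
        })

    # Deletions arrive as tombstones with a ref like "yt:video:<id>"
    for deleted in root.findall('at:deleted-entry', NS):
        ref = deleted.get('ref', '')
        events.append({
            'type': 'delete',
            'video_id': ref.rsplit(':', 1)[-1] or None,
            'channel_id': None,
        })

    return events


# Made-up sample payloads
UPLOAD = b"""<feed xmlns:yt="http://www.youtube.com/xml/schemas/2015" xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <yt:videoId>AAAAAAAAAAA</yt:videoId>
    <yt:channelId>UC0000000000000000000000</yt:channelId>
  </entry>
</feed>"""

DELETE = b"""<feed xmlns:at="http://purl.org/atompub/tombstones/1.0" xmlns="http://www.w3.org/2005/Atom">
  <at:deleted-entry ref="yt:video:AAAAAAAAAAA" when="2024-01-01T00:00:00+00:00"/>
</feed>"""

upload_events = parse_notification(UPLOAD)
delete_events = parse_notification(DELETE)
```

Since the body comes from the network, a hardened XML parser would be worth considering before feeding it to `xml.etree` in production.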
@@ -1,10 +1,8 @@
-from flask import Blueprint, redirect, url_for, render_template, request, session, flash, current_app
-from ..extensions import limiter, caching, caching_only_get
+from flask import Blueprint, redirect, url_for, render_template, request, session, flash, current_app, redirect
+from ..extensions import limiter, caching, caching_unless, oidc
 from ..nosql import get_nosql

-from argon2 import PasswordHasher
-from argon2.exceptions import VerifyMismatchError

-corr = '$argon2id$v=19$m=64,t=3,p=4$YmY5RTV0bU9tRkx3Q0FvUw$VfPI6BowKvsO4pI1aRslXfbigerssHrHQnQNDhgR8Og'
+from time import sleep

 bp = Blueprint('auth', __name__, url_prefix='/auth')
@@ -17,33 +15,58 @@ def base():
 def logout():
     session.pop('username', None)
     flash('You have been logged out')
-    return redirect(url_for('index.base'))
+    return redirect(url_for('auth.login'))

 @bp.route('/login', methods=['GET', 'POST'])
 @limiter.limit('10 per day', override_defaults=False)
-@caching.cached(unless=caching_only_get)
+@caching.cached(unless=caching_unless)
 def login():
     if request.method == 'POST':
         password = request.form.get('password', None)

         if current_app.config.get('DEBUG'):
-            session['username'] = 'admin'
+            session['username'] = 'DEBUG ADMIN'
             flash('You have been logged in')
-            return redirect(url_for('admin.base'))
+            return redirect(request.args.get('next', url_for('admin.base')))

         if not password:
             flash('Password was empty')
-            return 'password required!'
+            return redirect(url_for('auth.login'))

-        try:
-            ph = PasswordHasher()
-            if ph.verify(corr, password):
-                session['username'] = 'admin'
-                flash('You have been logged in')
-                return redirect(url_for('admin.base'))
-        except VerifyMismatchError:
-            flash('Wrong password')
-        except:
-            flash('Something went wrong')
-
-    return render_template('login.html')
+        sleep(0.3)
+        flash('Wrong password')
+        return redirect(url_for('auth.login'))
+
+    return render_template('login.html')
+
+@bp.route('/oidc', methods=['GET'])
+def start_oidc():
+    return redirect(oidc.generate_redirect(), code=302)
+
+@bp.route('/callback', methods=['POST'])
+@limiter.limit('30 per day', override_defaults=False)
+@caching.cached(unless=caching_unless)
+def callback():
+    state = request.form.get('state', None)
+    id_token = request.form.get('id_token', None)
+
+    if request.form.get('error', None):
+        return f'We got an error from the authentication provider with the message: {request.form.get("error_description", None)}', 400
+
+    if state is None or id_token is None:
+        return 'Request error', 400
+
+    if not oidc.state_check(state):
+        return 'CSRF Error, state is not valid', 400
+
+    sub = oidc.check_bearer(id_token)
+
+    if not sub:
+        return f'Invalid JWT token we got: {id_token}', 400
+
+    if not get_nosql().get_user(sub):
+        return f'Authentication successful, but you are not allowed to access authenticated pages. Please report this ID to the administrators if you want access: {sub}', 403
+
+    session['username'] = sub
+    flash('You have been logged in')
+    return redirect(request.args.get('next', url_for('admin.base')))
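Both `login` and `callback` above redirect to whatever `next` contains, so a crafted link could send a freshly logged-in user to another site. A minimal sketch of a guard that only accepts same-site relative paths follows. `is_safe_next` and `pick_redirect` are hypothetical helpers, and the rule "relative paths only" is an assumption about what the application needs.

```python
from urllib.parse import urlparse


def is_safe_next(target) -> bool:
    """Accept only same-site relative paths such as '/admin/channels'."""
    if not target or not isinstance(target, str):
        return False

    # '//host' is protocol-relative, and browsers may treat '\' like '/'
    if target.startswith('//') or '\\' in target:
        return False

    parts = urlparse(target)
    return parts.scheme == '' and parts.netloc == '' and target.startswith('/')


def pick_redirect(next_param, fallback: str) -> str:
    """Return the requested target if it is safe, otherwise the fallback."""
    return next_param if is_safe_next(next_param) else fallback


safe = pick_redirect('/admin/channels', '/admin')
absolute = pick_redirect('https://evil.example/', '/admin')
protocol_relative = pick_redirect('//evil.example', '/admin')
missing = pick_redirect(None, '/admin')
```

In the routes, the fallback would be `url_for('admin.base')`, as it already is in the diff.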
@@ -1,34 +1,65 @@
-from flask import Blueprint, render_template
+from flask import Blueprint, render_template, flash, url_for, redirect
 from ..nosql import get_nosql
-from ..s3 import get_s3
-from ..extensions import caching
+from ..extensions import caching, caching_unless

 bp = Blueprint('channel', __name__, url_prefix='/channel')

 @bp.route('')
-@caching.cached()
+@caching.cached(unless=caching_unless)
 def base():
-    channels = {}
+    channels = []
     channelIds = get_nosql().list_all_channels()

     for channelId in channelIds:
-        channels[channelId] = get_nosql().get_channel_info(channelId)
-        channels[channelId]['video_count'] = get_nosql().get_channel_videos_count(channelId)
+        channel = get_nosql().get_channel_info(channelId)
+        channel['video_count'] = get_nosql().get_channel_videos_count(channelId)
+        channels.append(channel)
+
+    channels = sorted(channels, key=lambda x: x.get('added_date'), reverse=True)

     return render_template('channel/index.html', channels=channels)

 @bp.route('/<channelId>')
-@caching.cached()
+@caching.cached(unless=caching_unless)
 def channel(channelId):
     channelInfo = get_nosql().get_channel_info(channelId)

     if not channelInfo:
-        return 'That channel ID does not exist in the system'
+        flash('That channel ID does not exist in the system')
+        return redirect(url_for('channel.base'))

     videoIds = get_nosql().get_channel_videoIds(channelId)

-    videos = {}
+    videos = []
     for videoId in videoIds:
-        videos[videoId] = get_nosql().get_video_info(videoId, limited=True)
+        videos.append(get_nosql().get_video_info(videoId, limited=True))
+
+    videos = sorted(videos, key=lambda x: x.get('upload_date', '19700101'), reverse=True)

-    return render_template('channel/channel.html', channel=channelInfo, videos=videos)
+    return render_template('channel/channel.html', channel=channelInfo, videos=videos)
+
+@bp.route('/orphaned')
+@caching.cached(unless=caching_unless)
+def orphaned():
+    videoIds = get_nosql().get_orphaned_videos()
+
+    videos = []
+    for videoId in videoIds:
+        videos.append(get_nosql().get_video_info(videoId, limited=True))
+
+    videos = sorted(videos, key=lambda x: x.get('epoch', 0), reverse=True)
+
+    return render_template('channel/orphaned.html', videos=videos)
+
+@bp.route('/recent')
+@caching.cached(unless=caching_unless)
+def recent():
+    videoIds = get_nosql().get_recent_videos()
+
+    videos = []
+    for videoId in videoIds:
+        videos.append(get_nosql().get_video_info(videoId, limited=True))
+
+    videos = sorted(videos, key=lambda x: x.get('epoch', 0), reverse=True)
+
+    return render_template('channel/recent.html', videos=videos)
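The channel index above sorts on `x.get('added_date')` without a default. If some channel documents lack that field, Python would try to compare `None` with a date and raise `TypeError`. A minimal sketch of a sort that puts such entries last follows. `newest_first` and the sample documents are hypothetical, and it is an assumption that the field can be missing at all.

```python
from datetime import datetime


def newest_first(items, field):
    """Sort dicts by a field, newest first, keeping entries without it at the end."""
    present = [item for item in items if item.get(field) is not None]
    missing = [item for item in items if item.get(field) is None]
    return sorted(present, key=lambda item: item[field], reverse=True) + missing


# Made-up channel documents
channels = [
    {'id': 'old', 'added_date': datetime(2023, 1, 1)},
    {'id': 'unknown'},
    {'id': 'new', 'added_date': datetime(2024, 6, 1)},
]

ordered_ids = [channel['id'] for channel in newest_first(channels, 'added_date')]
```

The video lists avoid the problem by passing defaults (`'19700101'` and `0`) to `get`, which is the other common way to handle it.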
@@ -1,14 +1,19 @@
-from flask import Blueprint, render_template
-from ..extensions import caching
+from flask import Blueprint, render_template, send_from_directory
+from ..extensions import caching, caching_unless

 bp = Blueprint('index', __name__, url_prefix='/')

 @bp.route('', methods=['GET'])
-@caching.cached()
+@caching.cached(unless=caching_unless)
 def base():
     return render_template('index.html')

 @bp.route('help', methods=['GET'])
-@caching.cached()
+@caching.cached(unless=caching_unless)
 def help():
-    return render_template('index.html')
+    return render_template('help.html')
+
+@bp.route('robots.txt', methods=['GET'])
+@caching.cached(unless=caching_unless)
+def robots():
+    return render_template('robots.txt')
@@ -1,10 +1,26 @@
-from flask import Blueprint, render_template
+from flask import Blueprint, render_template, request, flash, redirect, url_for
 from ..nosql import get_nosql
-from ..extensions import caching
+from ..extensions import limiter, caching, caching_unless

 bp = Blueprint('search', __name__, url_prefix='/search')

-@bp.route('')
-@caching.cached()
+@bp.route('', methods=['GET', 'POST'])
+@limiter.limit('50 per day', override_defaults=False)
+@caching.cached(unless=caching_unless)
 def base():
-    return render_template('search/index.html', stats=get_nosql().gen_stats())
+    if request.method == 'POST':
+        task = request.form.get('task')
+
+        if task == 'search':
+            query = request.form.get('query')
+
+            if not 3 <= len(query) <= 64:
+                flash('Query too short or too long, must be between 3 and 64')
+                return redirect(url_for('search.base'))
+
+            results = get_nosql().search_videos(query)
+
+            return render_template('search/index.html', results=results, query=query)
+
+
+    return render_template('search/index.html', stats=get_nosql().gen_stats())
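In the search route above, `request.form.get('query')` returns `None` when the form field is absent, and `len(None)` would raise `TypeError` instead of flashing the length message. A minimal sketch of a validation helper covering that case follows. `validate_query` is hypothetical, and stripping surrounding whitespace is an extra assumption that the original code does not make.

```python
def validate_query(raw, minimum=3, maximum=64):
    """Return the cleaned query, or None if it is missing or out of bounds."""
    if raw is None:
        return None

    query = raw.strip()  # assumption: surrounding whitespace is not meaningful
    if not minimum <= len(query) <= maximum:
        return None

    return query


ok = validate_query('yt-dlp archive')
too_short = validate_query('ab')
absent = validate_query(None)
too_long = validate_query('x' * 65)
```

The route would flash the existing message and redirect whenever the helper returns `None`.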
@@ -1,19 +1,44 @@
-from flask import Blueprint, render_template, request
+from flask import Blueprint, render_template, request, flash, redirect, url_for
 from ..nosql import get_nosql
-from ..extensions import caching, caching_v_parameter
+from ..extensions import caching, caching_v_parameter, caching_unless

 bp = Blueprint('watch', __name__, url_prefix='/watch')

-@bp.route('', methods=['GET'])
-@caching.cached(make_cache_key=caching_v_parameter)
+@bp.route('', methods=['GET', 'POST'])
+@caching.cached(make_cache_key=caching_v_parameter, unless=caching_unless)
 def base():
-    render = {}
-
     vGet = request.args.get('v')

+    if not vGet:
+        flash('Thats not how it works pal')
+        return redirect(url_for('index.base'))
+
     if not get_nosql().check_exists(vGet):
-        return render_template('watch/404.html')
+        flash('The requested video is not in the archive')
+        return redirect(url_for('index.base'))

+    render = {}
+
+    if request.method == 'POST':
+        reason = request.form.get('reason')
+
+        if reason not in ['auto-video', 'metadata', 'illegal']:
+            flash('Invalid report reason')
+            return redirect(url_for('watch.base', v=vGet))
+        else:
+            reportId = get_nosql().insert_report(vGet, reason)
+            if reportId:
+                flash(f'Report has been received: {reportId}')
+                return redirect(url_for('watch.base', v=vGet))
+            else:
+                flash('Something went wrong with reporting')
+                return redirect(url_for('watch.base', v=vGet))
+
     render['info'] = get_nosql().get_video_info(vGet)
     render['params'] = request.args.get('v')
+
+    if render['info'].get('_status') != 'available':
+        flash(render['info'].get('_status_description', 'Video unavailable because of technical errors. Come back later.'))
+        return redirect(url_for('index.base'))
+
     return render_template('watch/index.html', render=render)
@@ -2,10 +2,10 @@ import yt_dlp


 def checkChannelId(channelId):
-    if len(channelId) < 24: # channelId lengths are 24 characters
+    if len(channelId) <= 23: # channelId lengths are 24 characters
         return False

-    if len(channelId) > 25: # But some are 25, idk why
+    if len(channelId) >= 26: # But some are 25, idk why
         return False

     if channelId[0:2] not in ['UC', 'UU']:
@@ -3,23 +3,49 @@ from flask_limiter.util import get_remote_address

 from flask_caching import Cache

-from flask import request
+from celery import Celery, Task

+from .oidc import OIDC

-def caching_only_get(*args, **kwargs):
-    if request.method == 'GET':
-        return False
+from flask import Flask, request, session
+
+def celery_init_app(app: Flask) -> Celery:
+    class FlaskTask(Task):
+        def __call__(self, *args: object, **kwargs: object) -> object:
+            with app.app_context():
+                return self.run(*args, **kwargs)
+
+    celery_app = Celery(app.name, task_cls=FlaskTask)
+    celery_app.config_from_object(app.config["CELERY"])
+    celery_app.set_default()
+    app.extensions["celery"] = celery_app
+    return celery_app
+
+def caching_unless(*args, **kwargs):
+    # if it is not a get request
+    if request.method != 'GET':
+        return True
+
+    # if username is defined in session cookie
+    if session.get('username'):
+        return True
+
+    # in the case that a user is not logged in but a message needs to be flashed, do not cache page
+    if session.get('_flashes'):
+        return True

-    return True
+    # do cache page
+    return False

 def caching_v_parameter(*args, **kwargs):
     return request.args.get('v')

-
 limiter = Limiter(
     get_remote_address,
-    default_limits=['1000 per day', '100 per hour'],
+    default_limits=['86400 per day', '3600 per hour'],
     storage_uri="memory://",
 )

-caching = Cache()
+caching = Cache()
+
+oidc = OIDC()
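The rule encoded by `caching_unless` above can be restated without Flask's `request` and `session` proxies, which makes it easy to check each branch in isolation. A minimal sketch follows. `should_skip_cache` is a hypothetical stand-in that takes the method and session as plain arguments, with a plain dict playing the role of the session.

```python
def should_skip_cache(method: str, session: dict) -> bool:
    """Return True when the response must not be served from or stored in the cache."""
    # Only GET responses are safe to cache
    if method != 'GET':
        return True

    # Logged-in users get personalised pages
    if session.get('username'):
        return True

    # Pending flash messages would otherwise be baked into the cached page
    if session.get('_flashes'):
        return True

    return False


anonymous_get = should_skip_cache('GET', {})
form_post = should_skip_cache('POST', {})
logged_in = should_skip_cache('GET', {'username': 'someone'})
pending_flash = should_skip_cache('GET', {'_flashes': [('message', 'hi')]})
```

Only the anonymous GET without pending messages ends up cacheable, which matches the comments in the diff.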
@@ -1,18 +1,34 @@
 from datetime import datetime

 def pretty_duration(seconds):
-    minutes, seconds = divmod(seconds, 60)
-    return f'{minutes}:{seconds} minutes'
+    if seconds is None:
+        return None
+
+    minutes, seconds = divmod(seconds, 60) # split total seconds to minute and remaining seconds
+    return f'{minutes}:{seconds:>02} minutes' # return mm:ss format including padding in seconds

 def pretty_time(time):
     try:
-        return time.strftime('%d %b %Y')
+        return time.strftime('%d %b %Y') # try to return pretty time if given object is datetime
     except:
         try:
-            return datetime.strptime(time, '%Y%m%d').strftime('%d %b %Y')
+            return datetime.strptime(time, '%Y%m%d').strftime('%d %b %Y') # try to parse str and give back pretty time
         except:
-            return time
+            return time # return given time

-
-def current_time(null):
-    return datetime.utcnow().isoformat(sep=" ", timespec="seconds")
+def epoch_date(epoch):
+    try:
+        return datetime.fromtimestamp(epoch).strftime('%d %b %Y')
+    except:
+        return None
+
+def epoch_time(epoch):
+    try:
+        return datetime.fromtimestamp(epoch).strftime('%d %b %Y %H:%M:%S')
+    except:
+        return None
+
+def current_time(null=None, object=False):
+    if object:
+        return datetime.utcnow().replace(microsecond=0)
+    return datetime.utcnow().isoformat(sep=" ", timespec="seconds") # return time in iso format without milliseconds
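The padding added to `pretty_duration` above fixes outputs like `2:5`, but videos over an hour still come out as a large minute count. A minimal sketch follows, showing the new behaviour next to a variant with an hours field. `pretty_duration_hms` is hypothetical and not part of the diff, and it assumes whole or truncatable second counts.

```python
def pretty_duration(seconds):
    """The new version from the diff, reproduced for comparison."""
    if seconds is None:
        return None

    minutes, seconds = divmod(seconds, 60)
    return f'{minutes}:{seconds:>02} minutes'


def pretty_duration_hms(seconds):
    """Hypothetical variant that adds an hours field when needed."""
    if seconds is None:
        return None

    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)

    if hours:
        return f'{hours}:{minutes:02}:{secs:02}'
    return f'{minutes}:{secs:02}'


short = pretty_duration(125)            # '2:05 minutes'
long_old = pretty_duration(3725)        # '62:05 minutes'
long_new = pretty_duration_hms(3725)    # '1:02:05'
```

Whether the hours form is wanted depends on how the templates present durations, so this is only one possible direction.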
ayta/nosql.py: 434 lines changed (file diff suppressed because it is too large)
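The `OIDC` class in `ayta/oidc.py` below keeps two near-identical in-memory stores, one for states and one for nonces, each with its own maintenance, generate, and check methods. A minimal sketch of a single reusable store with the same single-use and expiry behaviour follows. `TokenStore` and its injectable `clock` are hypothetical, and like the original it assumes one process holds all tokens, which is consistent with the Dockerfile in this compare running gunicorn with a single worker.

```python
import secrets
import time


class TokenStore:
    """Single-use random tokens that expire after a time window."""

    def __init__(self, window: float = 120, clock=time.monotonic):
        self.window = window      # seconds a token stays valid
        self._clock = clock       # injectable so expiry can be tested
        self._tokens = {}         # token -> creation timestamp

    def _purge(self):
        """Drop every token older than the window."""
        pivot = self._clock() - self.window
        for token in [t for t, ts in self._tokens.items() if ts <= pivot]:
            del self._tokens[token]

    def generate(self) -> str:
        self._purge()
        token = secrets.token_urlsafe(8)
        self._tokens[token] = self._clock()
        return token

    def check(self, token: str) -> bool:
        """Return True once for a live token, then forget it."""
        self._purge()
        return self._tokens.pop(token, None) is not None


# Fake clock so the expiry path can be shown without waiting
now = [0.0]
states = TokenStore(window=120, clock=lambda: now[0])

state = states.generate()
first_use = states.check(state)
second_use = states.check(state)

stale = states.generate()
now[0] = 121.0
after_expiry = states.check(stale)
```

With this shape, the class would hold `self.states = TokenStore()` and `self.nonces = TokenStore()` instead of six mirrored methods.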
162
ayta/oidc.py
Normal file
162
ayta/oidc.py
Normal file
@@ -0,0 +1,162 @@
|
||||
class OIDC():
|
||||
"""
|
||||
This function class is nothing more than a nonce and state store for security in the authentication mechanism.
|
||||
Additionally this class provides the function to generate redirect url's and check bearer tokens on their validity as well as caching jwt signing keys.
|
||||
Fairly barebones and should be 100% secure. (famous last words)
|
||||
This is made for form posted JWT's. While not the most secure it is the most easy way to implement. Moving on to a code based solution might be preferred in the future.
|
||||
"""
|
||||
def __init__(self, app=None):
|
||||
self.states = {}
|
||||
self.nonces = {}
|
||||
|
||||
if app is not None:
|
||||
self.init_app(app)
|
||||
|
||||
def init_app(self, app):
|
||||
import requests
|
||||
import jwt
|
||||
|
||||
config = app.config.copy()
|
||||
|
||||
self.client_id = config['OIDC_ID']
|
||||
self.provider = config['OIDC_PROVIDER']
|
||||
self.domain = config['DOMAIN']
|
||||
self.window = 120 # the time window to allow states and nonces in seconds
|
||||
|
||||
# Authentication provider url must be HTTPS and end on a TLD
|
||||
if self.provider[:8] != 'https://' or self.provider[-1] == '/':
|
||||
print('Incorrect OIDC provider URI', flush=True)
|
||||
exit()
|
||||
|
||||
# Get the provider configuration endpoints
|
||||
configuration = requests.get(f'{self.provider}/.well-known/openid-configuration').json()
|
||||
|
||||
jwks_uri = configuration.get('jwks_uri')
|
||||
self.authorize_uri = configuration.get('authorization_endpoint')
|
||||
|
||||
# Start the JWKS management client, it will load the keys and maintain them
|
||||
self.jwks_manager = jwt.PyJWKClient(jwks_uri)
|
||||
|
||||
#######################################################
|
||||
|
||||
def state_maintenance(self):
|
||||
from datetime import datetime
|
||||
|
||||
# Current time minus the acceptable window
|
||||
pivot = datetime.now().timestamp() - self.window
|
||||
|
||||
# List with expired states
|
||||
expired_states = [state for state, timestamp in self.states.items() if timestamp <= pivot]
|
||||
|
||||
# Remove expired states from store
|
||||
for state in expired_states:
|
||||
del self.states[state]
|
||||
|
||||
def state_gen(self):
|
||||
import secrets
|
||||
from datetime import datetime
|
||||
|
||||
# Clean state store first
|
||||
self.state_maintenance()
|
||||
|
||||
# Generate token and paired timestamp
|
||||
state = secrets.token_urlsafe(8)
|
||||
timestamp = datetime.now().timestamp()
|
||||
|
||||
# Add token to the state store
|
||||
self.states[state] = timestamp
|
||||
|
||||
# Return the state
|
||||
return state
|
||||
|
||||
def state_check(self, state):
|
||||
# Clean state store first
|
||||
self.state_maintenance()
|
||||
|
||||
# If given state is actively stored
|
||||
if state in self.states:
|
||||
# Delete state and return True
|
||||
del self.states[state]
|
||||
return True
|
||||
|
||||
# Given state is not stored
|
||||
return False
|
||||
|
||||
#######################################################
|
||||
# Same code as above but a different store for nonces #
|
||||
#######################################################
|
||||
|
||||
def nonce_maintenance(self):
|
||||
from datetime import datetime
|
||||
|
||||
pivot = datetime.now().timestamp() - self.window
|
||||
|
||||
expired_nonces = [nonce for nonce, timestamp in self.nonces.items() if timestamp <= pivot]
|
||||
|
||||
for nonce in expired_nonces:
|
||||
del self.nonces[nonce]
|
||||
|
||||
def nonce_gen(self):
|
||||
import secrets
|
||||
from datetime import datetime
|
||||
|
||||
self.nonce_maintenance()
|
||||
|
||||
nonce = secrets.token_urlsafe(8)
|
||||
timestamp = datetime.now().timestamp()
|
||||
|
||||
self.nonces[nonce] = timestamp
|
||||
|
||||
return nonce
|
||||
|
||||
def nonce_check(self, nonce):
|
||||
self.nonce_maintenance()
|
||||
|
||||
if nonce in self.nonces:
|
||||
del self.nonces[nonce]
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
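Review note: the state store and the nonce store above share the same three steps (prune, generate, check). A minimal sketch of one shared helper follows, assuming a single in-process store is acceptable. The class name `ExpiringTokenStore` and the injectable `clock` parameter are hypothetical and not part of this diff.

```python
import secrets
import time


class ExpiringTokenStore:
    """One-time tokens that expire after `window` seconds (hypothetical helper)."""

    def __init__(self, window=120, clock=time.monotonic):
        self.window = window
        self.clock = clock   # injectable so the expiry logic can be tested
        self.tokens = {}     # token -> creation time

    def _prune(self):
        # Drop every token that is older than the allowed window
        pivot = self.clock() - self.window
        self.tokens = {t: ts for t, ts in self.tokens.items() if ts > pivot}

    def generate(self):
        self._prune()
        token = secrets.token_urlsafe(8)
        self.tokens[token] = self.clock()
        return token

    def check(self, token):
        self._prune()
        # pop() makes every token single use: a second check returns False
        return self.tokens.pop(token, None) is not None
```

With such a helper the class could hold two instances, one for states and one for nonces, instead of two copies of the same code. Like the dictionaries in the diff, this only works when all requests are served by the same process.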
#######################################################
|
||||
|
||||
def generate_redirect(self):
|
||||
return str(f'{self.authorize_uri}'
|
||||
'?response_mode=form_post&response_type=id_token&scope=openid'
|
||||
f'&redirect_uri={self.domain}/auth/callback'
|
||||
f'&client_id={self.client_id}'
|
||||
f'&nonce={self.nonce_gen()}'
|
||||
f'&state={self.state_gen()}')
|
||||
|
||||
def check_bearer(self, token):
|
||||
import jwt
|
||||
|
||||
# Test given JWT
|
||||
try:
|
||||
# Get the public signing key that matches the token's key ID from the provider's JWKS
|
||||
signing_key = self.jwks_manager.get_signing_key_from_jwt(token).key
|
||||
|
||||
# Try to decode the token, this will also check the validity in these points:
|
||||
# 1. Token is signed by expected keys
|
||||
# 2. Token is issued by the expected provider
|
||||
# 3. Expected parameters are really in the token
|
||||
# 4. Token is really intended for us
|
||||
# 5. Token is still valid (with 5 sec margin)
|
||||
decoded = jwt.decode(token, signing_key,
|
||||
algorithms=jwt.algorithms.get_default_algorithms(),
|
||||
issuer=self.provider,
|
||||
require=['aud', 'client_id', 'exp', 'iat', 'iss', 'rat', 'sub'],
|
||||
audience=self.client_id,
|
||||
leeway=5)
|
||||
|
||||
# Any exception (invalid JWT, invalid formatting etc...) must return False
|
||||
except Exception as e:
|
||||
print(e, flush=True)
|
||||
return False
|
||||
|
||||
# Double check that the given token was really requested by us by matching the nonce in the signed token
|
||||
if not self.nonce_check(decoded.get('nonce', None)):
|
||||
return False
|
||||
|
||||
# Return the unique user identifier
|
||||
return decoded.get('sub', False)
|
||||
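Review note: generate_redirect() concatenates the query string by hand, so a DOMAIN or client ID containing reserved characters would not be percent-encoded. A minimal sketch using the standard library's urlencode follows. The function name `build_redirect` and its parameters are hypothetical; the parameter values mirror the ones in the diff.

```python
from urllib.parse import urlencode


def build_redirect(authorize_uri, domain, client_id, nonce, state):
    """Assemble the authorization redirect with percent-encoded parameters."""
    params = {
        'response_mode': 'form_post',
        'response_type': 'id_token',
        'scope': 'openid',
        'redirect_uri': f'{domain}/auth/callback',
        'client_id': client_id,
        'nonce': nonce,
        'state': state,
    }
    # urlencode() escapes reserved characters such as ':' and '/' in each value
    return f'{authorize_uri}?{urlencode(params)}'
```

Whether the provider accepts an encoded redirect_uri identical to the registered one is something to verify against the provider in use.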
50
ayta/s3.py
@@ -1,50 +0,0 @@
|
||||
from minio import Minio
|
||||
from minio.error import S3Error
|
||||
|
||||
from flask import current_app
|
||||
from flask import g
|
||||
|
||||
##########################################
|
||||
# SETUP FLASK #
|
||||
##########################################
|
||||
|
||||
def get_s3():
|
||||
"""Connect to the application's configured database. The connection is unique for each request and will be reused if this is called again."""
|
||||
if "s3" not in g:
|
||||
g.s3 = Mineral(current_app.config["S3_CONNECTION"], current_app.config["S3_ACCESSKEY"], current_app.config["S3_SECRETKEY"])
|
||||
|
||||
return g.s3
|
||||
|
||||
|
||||
def close_s3(e=None):
|
||||
"""If this request connected to the database, close the connection."""
|
||||
s3 = g.pop("s3", None)
|
||||
|
||||
if s3 is not None:
|
||||
s3.close()
|
||||
|
||||
def init_app(app):
|
||||
"""Register database functions with the Flask app. This is called by the application factory."""
|
||||
app.teardown_appcontext(close_s3)
|
||||
#app.cli.add_command(init_db_command)
|
||||
|
||||
##########################################
|
||||
# ORM #
|
||||
##########################################
|
||||
|
||||
class Mineral:
|
||||
def __init__(self, location, access, secret):
|
||||
try:
|
||||
self.client = Minio(location, access_key=access, secret_key=secret, secure=False)
|
||||
except S3Error as exc:
|
||||
print('Minio connection error ', exc)
|
||||
|
||||
def list_objects(self, bucket='ytarchive'):
|
||||
ret = self.client.list_objects(bucket, '')
|
||||
rett = []
|
||||
|
||||
for r in ret:
|
||||
print(r.object_name, flush=True)
|
||||
rett.append(r)
|
||||
|
||||
return rett
|
||||
Binary file not shown.
|
Before Width: | Height: | Size: 318 B |
Binary file not shown.
BIN
ayta/static/img/fuck_webp.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 401 KiB |
BIN
ayta/static/img/hate_speech.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 14 KiB |
|
Before Width: | Height: | Size: 1.1 KiB After Width: | Height: | Size: 1.1 KiB |
@@ -1,44 +0,0 @@
|
||||
function channelSort() {
|
||||
const sortOption = document.querySelector(".sort").value;
|
||||
const [sortBy, direction] = sortOption.split("-");
|
||||
const isInt = sortBy !== "search";
|
||||
const container = document.querySelector(".channels.flex-grid");
|
||||
[...container.children]
|
||||
.sort((a,b)=>{
|
||||
const dir = direction ? 1 : -1;
|
||||
let valA = a.dataset[sortBy];
|
||||
let valB = b.dataset[sortBy];
|
||||
if (isInt) {
|
||||
valA = parseInt(valA);
|
||||
valB = parseInt(valB);
|
||||
}
|
||||
return (valA>valB?1:-1)*dir;
|
||||
})
|
||||
.forEach(node=>container.appendChild(node));
|
||||
}
|
||||
|
||||
function channelSearch() {
|
||||
let searchTerm = document.querySelector(".search").value.toLowerCase();
|
||||
const allowedClasses = [];
|
||||
const filteredClasses = [];
|
||||
|
||||
document.querySelectorAll('.searchable').forEach((e) => {
|
||||
let filtered = false;
|
||||
for (const c of allowedClasses) {
|
||||
if (!e.querySelector(`.${c}`)) filtered = true;
|
||||
}
|
||||
for (const c of filteredClasses) {
|
||||
if (e.querySelector(`.${c}`)) filtered = true;
|
||||
}
|
||||
|
||||
if (!filtered && (searchTerm === "" || e.dataset.search.toLowerCase().includes(searchTerm))) {
|
||||
e.classList.remove("hide");
|
||||
} else {
|
||||
e.classList.add("hide");
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
window.addEventListener("load", () => {
|
||||
channelSearch();
|
||||
});
|
||||
File diff suppressed because it is too large
202
ayta/tasks.py
Normal file
@@ -0,0 +1,202 @@
|
||||
from celery import shared_task
|
||||
from flask import current_app
|
||||
|
||||
##########################################
|
||||
# CELERY TASKS #
|
||||
##########################################
|
||||
|
||||
@shared_task()
|
||||
def test_sleep(time=60):
|
||||
from time import sleep
|
||||
sleep(time)
|
||||
return True
|
||||
|
||||
@shared_task()
|
||||
def video_download(videoId):
|
||||
"""
|
||||
I do not want to deal with the quirks of native yt-dlp in python, hence the subprocess.
|
||||
"""
|
||||
import subprocess
|
||||
|
||||
process = subprocess.run(['/usr/local/bin/yt-dlp', '--config-location', '/var/www/archive.ventilaar.net/goodstuff/config_video.conf', '--', f'https://www.youtube.com/watch?v={videoId}'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
|
||||
|
||||
if process.returncode != 0:
|
||||
return False
|
||||
return True
|
||||
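Review note: video_download() already passes `--` before the URL, which keeps a hostile ID from being read as a yt-dlp option. An extra format check before queueing or downloading could still reject garbage early. The sketch below assumes video IDs are exactly 11 characters from the URL-safe base64 alphabet; that matches the 11-character limit used by the queue form in this diff but is an observation, not a documented guarantee. The helper name `is_video_id` is hypothetical.

```python
import re

# Assumption: 11 characters drawn from A-Z, a-z, 0-9, '_' and '-'
VIDEO_ID = re.compile(r'[A-Za-z0-9_-]{11}')


def is_video_id(value):
    """Return True only when the whole value looks like a video ID."""
    return isinstance(value, str) and VIDEO_ID.fullmatch(value) is not None
```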
|
||||
@shared_task()
|
||||
def video_queue():
|
||||
"""
|
||||
Gets the oldest video ID from the queue and runs video_download() on it.
|
||||
"""
|
||||
from .nosql import get_nosql
|
||||
|
||||
videoId = get_nosql().queue_getNext()
|
||||
|
||||
if videoId:
|
||||
videoId = videoId['id']
|
||||
else:
|
||||
return None
|
||||
|
||||
if video_download(videoId):
|
||||
get_nosql().queue_deleteQueue(videoId)
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
@shared_task()
|
||||
def websub_subscribe_callback(channelId):
|
||||
import requests
|
||||
from .nosql import get_nosql
|
||||
|
||||
# check if a callback already exists for channel
|
||||
answer = get_nosql().websub_existsCallback(channelId, channel=True)
|
||||
|
||||
if not answer:
|
||||
callbackId = get_nosql().websub_newCallback(channelId)
|
||||
else:
|
||||
callbackId = answer
|
||||
|
||||
url = 'https://pubsubhubbub.appspot.com/subscribe'
|
||||
data = {
|
||||
'hub.callback': f'{current_app.config["DOMAIN"]}/api/websub/{callbackId}',
|
||||
'hub.topic': f'https://www.youtube.com/xml/feeds/videos.xml?channel_id={channelId}',
|
||||
'hub.verify': 'async',
|
||||
'hub.mode': 'subscribe',
|
||||
'hub.verify_token': '',
|
||||
'hub.secret': '',
|
||||
'hub.lease_seconds': '432000',
|
||||
}
|
||||
|
||||
get_nosql().websub_requestingCallback(callbackId)
|
||||
response = requests.post(url, data=data)
|
||||
if response.status_code == 202:
|
||||
return True
|
||||
|
||||
# maybe handle errors?
|
||||
|
||||
return False
|
||||
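Review note: the subscribe payload can be built by a small pure function, which makes it easy to check in isolation. The WebSub (PubSubHubbub) specification names the lease parameter `hub.lease_seconds`, and 432000 seconds equals 5 days. The function name `build_subscribe_data` and its parameters are hypothetical; the empty verify token and secret from the diff are left out here as an assumption that they are optional.

```python
def build_subscribe_data(domain, callback_id, channel_id, lease_seconds=432000):
    """Form fields for a WebSub subscribe request to the YouTube feed hub."""
    return {
        'hub.callback': f'{domain}/api/websub/{callback_id}',
        'hub.topic': f'https://www.youtube.com/xml/feeds/videos.xml?channel_id={channel_id}',
        'hub.verify': 'async',
        'hub.mode': 'subscribe',
        # Requested lease duration in seconds; the hub may grant a different value
        'hub.lease_seconds': str(lease_seconds),
    }
```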
|
||||
@shared_task()
|
||||
def websub_unsubscribe_callback(callbackId):
|
||||
import requests
|
||||
from .nosql import get_nosql
|
||||
|
||||
answer = get_nosql().websub_existsCallback(callbackId)
|
||||
|
||||
if not answer:
|
||||
return False
|
||||
|
||||
channelId = get_nosql().websub_getCallback(callbackId).get('channel')
|
||||
|
||||
url = 'https://pubsubhubbub.appspot.com/subscribe'
|
||||
data = {'hub.callback': f'{current_app.config["DOMAIN"]}/api/websub/{callbackId}',
|
||||
'hub.topic': f'https://www.youtube.com/xml/feeds/videos.xml?channel_id={channelId}',
|
||||
'hub.verify': 'async',
|
||||
'hub.mode': 'unsubscribe'
|
||||
}
|
||||
|
||||
get_nosql().websub_retiringCallback(callbackId)
|
||||
response = requests.post(url, data=data)
|
||||
|
||||
if response.status_code == 202:
|
||||
return True
|
||||
|
||||
# maybe handle errors?
|
||||
|
||||
return False
|
||||
|
||||
@shared_task()
|
||||
def websub_process_data():
|
||||
from .nosql import get_nosql
|
||||
|
||||
while True:
|
||||
blob = get_nosql().websub_getFirstPostData()
|
||||
if not blob:
|
||||
break
|
||||
|
||||
_id, data = blob
|
||||
|
||||
parsed = do_parse_data(data)
|
||||
if parsed:
|
||||
state, channelId, videoId = parsed
|
||||
|
||||
if state == 'added':
|
||||
if not get_nosql().check_exists(videoId): # only queue videos that are not archived yet
|
||||
get_nosql().queue_insertQueue(videoId, 'WebSub')
|
||||
# note for future me
|
||||
# the websub notifications report ALL videos, including shorts and livestreams
|
||||
# so if you are going to work on individual video downloading make sure you filter them!
|
||||
|
||||
elif state == 'removed':
|
||||
# we currently do not do anything with removed videos
|
||||
# but the idea is to trigger a full channel mirror in case a creator started to mass delete videos
|
||||
pass
|
||||
|
||||
get_nosql().websub_deletePostProcessing(_id)
|
||||
|
||||
@shared_task()
|
||||
def websub_renew_expiring(hours=6):
|
||||
from .nosql import get_nosql
|
||||
from datetime import datetime, timedelta
|
||||
|
||||
count = 0
|
||||
|
||||
for callbackId in get_nosql().websub_getCallbacks():
|
||||
data = get_nosql().websub_getCallback(callbackId)
|
||||
|
||||
if data.get('status') not in ['active']: # callback not active
|
||||
continue
|
||||
|
||||
pivot = datetime.utcnow() + timedelta(hours=hours) # hours past now
|
||||
expires = data.get('activation_time') + timedelta(seconds=data.get('lease')) # callback expires at
|
||||
|
||||
if pivot <= expires: # expiration happens more than n hours from now
|
||||
continue # skip callback
|
||||
|
||||
# expiration happens within n hours
|
||||
websub_subscribe_callback.delay(data.get('channel'))
|
||||
|
||||
# limit amount of subscribe requests to spread out the requests over time
|
||||
# with an expiration pivot of 6h and a maximum validity of 5 days we can currently handle 3072 channels
|
||||
count = count + 1
|
||||
if count >= 256:
|
||||
break
|
||||
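Review note: the expiry comparison in websub_renew_expiring() can be isolated into a pure function. The sketch below assumes timezone-aware UTC datetimes, which differs from the naive `datetime.utcnow()` used in the diff; mixing aware and naive values raises a TypeError, so the stored `activation_time` would have to be aware as well. The function name `needs_renewal` and the `now` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(activation_time, lease_seconds, hours=6, now=None):
    """True when the lease ends within `hours` from `now` (aware UTC datetimes)."""
    now = now or datetime.now(timezone.utc)
    expires = activation_time + timedelta(seconds=lease_seconds)
    # Same rule as the diff: skip while pivot <= expires, renew otherwise
    return expires < now + timedelta(hours=hours)
```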
|
||||
##########################################
|
||||
# TASK MODULES #
|
||||
##########################################
|
||||
|
||||
def do_parse_data(data):
|
||||
import xml.etree.ElementTree as ET
|
||||
|
||||
data = data.decode('utf-8')
|
||||
|
||||
try:
|
||||
root = ET.fromstring(data)
|
||||
except ET.ParseError:
|
||||
print('Not XML')
|
||||
return False
|
||||
|
||||
yt = any(child.tag.startswith('{http://www.youtube.com/xml/schemas/2015}') for child in root.iter())
|
||||
at = any(child.tag.startswith('{http://purl.org/atompub/tombstones/1.0}') for child in root.iter())
|
||||
|
||||
if yt and not at:
|
||||
# Video published
|
||||
state = 'added'
|
||||
ns = {'yt': 'http://www.youtube.com/xml/schemas/2015', '': 'http://www.w3.org/2005/Atom'}
|
||||
entry = root.find('.//{http://www.w3.org/2005/Atom}entry')
|
||||
videoId = entry.find('./yt:videoId', ns).text
|
||||
channelId = entry.find('./yt:channelId', ns).text
|
||||
elif not yt and at:
|
||||
# Video hidden
|
||||
state = 'removed'
|
||||
ns = {'at': 'http://purl.org/atompub/tombstones/1.0', '': 'http://www.w3.org/2005/Atom'}
|
||||
deleted_entry = root.find('.//{http://purl.org/atompub/tombstones/1.0}deleted-entry')
|
||||
videoId = deleted_entry.attrib['ref'].split(':')[-1]
|
||||
channelId = deleted_entry.find('./at:by/uri', ns).text.split('/')[-1]
|
||||
else:
|
||||
print('Unknown xml')
|
||||
return False
|
||||
|
||||
return (state, channelId, videoId)
|
||||
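Review note: do_parse_data() calls `.find()` on the entry without checking for None, so a feed that carries the YouTube namespace but no entry would raise AttributeError inside the task. A minimal defensive sketch follows. It selects by element rather than by namespace presence, which is a swapped-in approach and not the logic of the diff; the function name `parse_notification` and the `atom` prefix are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    'atom': 'http://www.w3.org/2005/Atom',
    'yt': 'http://www.youtube.com/xml/schemas/2015',
    'at': 'http://purl.org/atompub/tombstones/1.0',
}


def parse_notification(data):
    """Return (state, channel_id, video_id), or None for unusable input."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return None

    # Published or updated video: an Atom entry with yt:videoId and yt:channelId
    entry = root.find('.//atom:entry', NS)
    if entry is not None:
        video = entry.findtext('yt:videoId', namespaces=NS)
        channel = entry.findtext('yt:channelId', namespaces=NS)
        return ('added', channel, video) if video and channel else None

    # Removed video: a tombstone whose ref ends in the video ID
    deleted = root.find('.//at:deleted-entry', NS)
    if deleted is not None:
        ref = deleted.get('ref', '')
        uri = deleted.findtext('at:by/atom:uri', default='', namespaces=NS)
        video, channel = ref.split(':')[-1], uri.split('/')[-1]
        return ('removed', channel, video) if video and channel else None

    return None
```

The tuple order matches the one returned by do_parse_data(), so the caller's unpacking would stay the same; only the failure value changes from False to None.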
@@ -1,20 +1,25 @@
|
||||
{% extends 'material_base.html' %}
|
||||
{% block title %}Channel administration page | AYTA{% endblock %}
|
||||
{% block title %}Channel administration page{% endblock %}
|
||||
{% block description %}Channel administration page of the AYTA system{% endblock %}
|
||||
|
||||
{% block content %}
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
<div class="col s12 l11">
|
||||
<h4>{{ channelInfo.original_name }} administration page</h4>
|
||||
<p>The update actions below directly apply to the database!</p>
|
||||
</div>
|
||||
<div class="col s12 l1 m-5">
|
||||
<form method="POST">
|
||||
<input title="Requests callback URL from youtube API" type="submit" value="subscribe-websub" name="task">
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row">
|
||||
<div class="col s12 l4">
|
||||
{% for item in channelInfo %}
|
||||
<form method="POST">
|
||||
<div class="input-field">
|
||||
<span class="supporting-text">{{ item }}</span>
|
||||
<span class="supporting-text mb-2">{{ item }}</span>
|
||||
<input class="validate" type="text" value="{{ item }}" name="key" hidden>
|
||||
</div>
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
{% extends 'material_base.html' %}
|
||||
{% block title %}Channels administration page | AYTA{% endblock %}
|
||||
{% block title %}Channels administration page{% endblock %}
|
||||
{% block description %}Channels administration page of the AYTA system{% endblock %}
|
||||
|
||||
{% block content %}
|
||||
@@ -43,30 +43,6 @@
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col s12 l4 m-4">
|
||||
<div class="card green">
|
||||
<div class="card-content white-text">
|
||||
<span class="card-title">Placeholder</span>
|
||||
<p>I am a very simple card. I am good at containing small bits of information. I am convenient because I require little markup to use effectively.</p>
|
||||
</div>
|
||||
<div class="card-action">
|
||||
<a href="#">This is a link</a>
|
||||
<a href="#">This is a link</a>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col s12 l4 m-4">
|
||||
<div class="card green">
|
||||
<div class="card-content white-text">
|
||||
<span class="card-title">Placeholder</span>
|
||||
<p>I am a very simple card. I am good at containing small bits of information. I am convenient because I require little markup to use effectively.</p>
|
||||
</div>
|
||||
<div class="card-action">
|
||||
<a href="#">This is a link</a>
|
||||
<a href="#">This is a link</a>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="divider"></div>
|
||||
<div class="row">
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
{% extends 'material_base.html' %}
|
||||
{% block title %}Administration page | AYTA{% endblock %}
|
||||
{% block title %}Administration page{% endblock %}
|
||||
{% block description %}Administration page of the AYTA system{% endblock %}
|
||||
|
||||
{% block content %}
|
||||
@@ -11,39 +11,89 @@
|
||||
<div class="divider"></div>
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
<h5>Global channel options</h5>
|
||||
<h5>Global channel options</h5>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row">
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.channels') }}">
|
||||
<div class="card black-text">
|
||||
<a href="{{ url_for('admin.system') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">System</span>
|
||||
<p class="grey-text">Internal system settings</p>
|
||||
<p class="grey-text">Internal system settings</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.channels') }}">
|
||||
<div class="card black-text">
|
||||
<a href="{{ url_for('admin.channels') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Channels</span>
|
||||
<p class="grey-text">Manage channels in the system</p>
|
||||
<p class="grey-text">Manage channels in the system</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.runs') }}">
|
||||
<div class="card black-text">
|
||||
<a href="{{ url_for('admin.runs') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Archive runs</span>
|
||||
<p class="grey-text">Look at the cron run logs</p>
|
||||
<p class="grey-text">Look at the cron run logs</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.websub') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">WebSub</span>
|
||||
<p class="grey-text">Edit WebSub YouTube links</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.reports') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Reports</span>
|
||||
<p class="grey-text">View user reports</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.queue') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Queue</span>
|
||||
<p class="grey-text">Video download queue and API access</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.users') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Users</span>
|
||||
<p class="grey-text">Authenticated users</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col s6 l4 m-4">
|
||||
<a href="{{ url_for('admin.workers') }}">
|
||||
<div class="card black-text">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Workers</span>
|
||||
<p class="grey-text">Worker and task management</p>
|
||||
</div>
|
||||
</div>
|
||||
</a>
|
||||
</div>
|
||||
</div>
|
||||
{% endblock %}
|
||||
172
ayta/templates/admin/queue.html
Normal file
@@ -0,0 +1,172 @@
|
||||
{% extends 'material_base.html' %}
|
||||
{% block title %}Queue administration page{% endblock %}
|
||||
{% block description %}Queue administration page of the AYTA system{% endblock %}
|
||||
|
||||
{% block content %}
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
<h4>Queue administration page</h4>
|
||||
</div>
|
||||
</div>
|
||||
<div class="divider"></div>
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
<h5>Options</h5>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row">
|
||||
<div class="col s12 l4 m-4">
|
||||
<div class="card">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Direct actions</span>
|
||||
<form class="mt-4" method="post" onsubmit="return confirm('Are you sure?');">
|
||||
<button class="btn mb-2 red" type="submit" name="task" value="empty-queue">Empty Queue</button>
|
||||
<br>
|
||||
<span class="supporting-text">Removes all queued ids</span>
|
||||
</form>
|
||||
<form class="mt-4" method="post" onsubmit="return confirm('Are you sure?');">
|
||||
<button class="btn mb-2" type="submit" name="task" value="clean-retired">Clean retired</button>
|
||||
<br>
|
||||
<span class="supporting-text">Prunes all deactivated endpoints, but keeps last 3 days</span>
|
||||
</form>
|
||||
<form class="mt-4" method="post" onsubmit="return confirm('Are you sure?');">
|
||||
<button class="btn mb-2 green" type="submit" name="task" value="queue-run-once">Download oldest queued</button>
|
||||
<br>
|
||||
<span class="supporting-text">Will download the oldest queued video ID</span>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col s12 l4 m-4">
|
||||
<div class="card">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Create new endpoint</span>
|
||||
<form method="post">
|
||||
<div class="row">
|
||||
<div class="col s12 input-field">
|
||||
<input placeholder="Custom endpoint" name="value" type="text" class="validate" minlength="12">
|
||||
<span class="supporting-text">Leaving this empty will create a random secure string</span>
|
||||
</div>
|
||||
<div class="col s12 input-field">
|
||||
<input placeholder="Description" name="description" type="text" class="validate" minlength="8" maxlength="64" required>
|
||||
<span class="supporting-text">Description for the endpoint for better administration</span>
|
||||
</div>
|
||||
<button class="btn mt-4" type="submit" name="task" value="add-endpoint">Create</button>
|
||||
</div>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col s12 l4 m-4">
|
||||
<div class="card">
|
||||
<div class="card-content">
|
||||
<span class="card-title">Queue manually</span>
|
||||
<form method="post">
|
||||
<div class="row">
|
||||
<div class="col s12 input-field">
|
||||
<input placeholder="Youtube video ID" name="value" type="text" class="validate" minlength="11" maxlength="11" required>
|
||||
<span class="supporting-text">Must be a valid Youtube video ID</span>
|
||||
</div>
|
||||
<div class="col s12 mt-5 input-field">
|
||||
<div class="switch">
|
||||
<label>Queue<input type="checkbox" value="direct" name="direct"><span class="lever"></span>Direct</label>
|
||||
<span class="supporting-text">Queue up or start directly</span>
|
||||
</div>
|
||||
</div>
|
||||
<button class="btn mt-4" type="submit" name="task" value="manual-queue">Queue</button>
|
||||
</div>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="divider"></div>
|
||||
<div class="row">
|
||||
<div class="col s6 l9">
|
||||
<h5>Registered endpoints</h5>
|
||||
</div>
|
||||
<div class="col s6 l3 m-4 input-field">
|
||||
<input id="filter_query" type="text">
|
||||
<label for="filter_query">Filter results</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
<table class="striped highlight responsive-table">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Actions</th>
|
||||
<th>id</th>
|
||||
<th>description</th>
|
||||
<th>status</th>
|
||||
<th>created_time</th>
|
||||
<th>retired_time</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{% for endpoint in endpoints %}
|
||||
<tr class="filterable">
|
||||
<td>
|
||||
<form method="post">
|
||||
<input type="text" value="{{ endpoint.get('id') }}" name="value" hidden>
|
||||
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="retire" title="Retire endpoint" {% if endpoint.get('status') != 'active' %}disabled{% endif %}>🗑️</button>
|
||||
</form>
|
||||
</td>
|
||||
<td>{{ endpoint.get('id') }}</td>
|
||||
<td>{{ endpoint.get('description') }}</td>
|
||||
<td>{{ endpoint.get('status') }}</td>
|
||||
<td>{{ endpoint.get('created_time') }}</td>
|
||||
<td>{{ endpoint.get('retired_time') }}</td>
|
||||
</tr>
|
||||
{% endfor %}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="divider"></div>
|
||||
<div class="row">
|
||||
<div class="col s6 l9">
|
||||
<h5>Queued IDs</h5>
|
||||
</div>
|
||||
<div class="col s6 l3 m-4 input-field">
|
||||
<input id="filter_query" type="text">
|
||||
<label for="filter_query">Filter results</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
<table class="striped highlight responsive-table">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Actions</th>
|
||||
<th>id</th>
|
||||
<th>endpoint</th>
|
||||
<th>status</th>
|
||||
<th>created_time</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{% for id in queue %}
|
||||
<tr class="filterable">
|
||||
<td>
|
||||
<form method="post">
|
||||
<input type="text" value="{{ id.get('id') }}" name="value" hidden>
|
||||
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="delete-queue" title="Delete from queue" {% if id.get('status') != 'queued' %}disabled{% endif %}>🗑️</button>
|
||||
</form>
|
||||
<form method="post">
|
||||
<input type="text" value="{{ id.get('id') }}" name="value" hidden>
|
||||
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="run-download" title="Run download task" disabled>⏩</button>
|
||||
<!-- This function will not work until the download queue and video download process are rewritten -->
|
||||
</form>
|
||||
</td>
|
||||
<td>{{ id.get('id') }}</td>
|
||||
<td>{{ id.get('endpoint') }}</td>
|
||||
<td>{{ id.get('status') }}</td>
|
||||
<td>{{ id.get('created_time') }}</td>
|
||||
</tr>
|
||||
{% endfor %}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
{% endblock %}
|
||||
71
ayta/templates/admin/reports.html
Normal file
@@ -0,0 +1,71 @@
|
||||
{% extends 'material_base.html' %}
|
||||
{% block title %}Reports administration page{% endblock %}
|
||||
{% block description %}Reports administration page of the AYTA system{% endblock %}
|
||||
|
||||
{% block content %}
|
||||
<div class="row">
|
||||
<div class="col s12 l11">
|
||||
<h4>Reports administration page</h4>
|
||||
</div>
|
||||
<div class="col s12 l1 m-5">
|
||||
<form method="POST">
|
||||
<input title="Prunes all closed reports, but keeps last 30 days" type="submit" value="clean-closed" name="task">
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
<div class="divider"></div>
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
<h5>Report options</h5>
|
||||
</div>
|
||||
</div>
|
||||
<div class="divider"></div>
|
||||
<div class="row">
|
||||
<div class="col s6 l9">
|
||||
<h5>All reports</h5>
|
||||
</div>
|
||||
<div class="col s6 l3 m-4 input-field">
|
||||
<input id="filter_query" type="text">
|
||||
<label for="filter_query">Filter results</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row">
|
||||
<div class="col s12">
|
||||
{% if reports is not defined %}
|
||||
<p>No reports!</p>
|
||||
{% else %}
|
||||
<table class="striped highlight responsive-table">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Actions</th>
|
||||
<th>_id</th>
|
||||
<th>videoId</th>
|
||||
<th>status</th>
|
||||
<th>reason</th>
|
||||
<th>reporting_time</th>
|
||||
<th>closing_time</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{% for report in reports %}
|
||||
<tr class="filterable">
|
||||
<td>
|
||||
<form method="post">
|
||||
<input type="text" value="{{ report.get('_id') }}" name="value" hidden>
|
||||
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="close" title="Close the report" {% if report.get('status') != 'open' %}disabled{% endif %}>✅</button>
|
||||
</form>
|
||||
</td>
|
||||
<td>{{ report.get('_id') }}</td>
|
||||
<td><a href="{{ url_for('watch.base') }}?v={{ report.get('videoId') }}">{{ report.get('videoId') }}</a></td>
|
||||
<td>{{ report.get('status') }}</td>
|
||||
<td>{{ report.get('reason') }}</td>
|
||||
<td>{{ report.get('reporting_time') }}</td>
|
||||
<td>{{ report.get('closing_time') }}</td>
|
||||
</tr>
|
||||
{% endfor %}
|
||||
</tbody>
|
||||
</table>
|
||||
{% endif %}
|
||||
</div>
|
||||
</div>
|
||||
{% endblock %}
|
||||
Some files were not shown because too many files have changed in this diff