Compare commits

..

25 Commits

Author SHA1 Message Date
Ventilaar
1be9729720 fix startup when oidc provider is not setup 2024-04-02 18:49:06 +02:00
Ventilaar
1918a03e05 add recently added view 2024-04-02 18:42:56 +02:00
Ventilaar
ed4f8b03eb Update readmy to reflect current status of project 2024-03-30 22:46:03 +01:00
Ventilaar
7266a437d1 forgot about the orphaned view 2024-03-22 00:13:41 +01:00
Ventilaar
360b80343f Quick change to load slugified files 2024-03-22 00:10:41 +01:00
Ventilaar
45348d2cf5 Sort orphaned videos by added date, add queue functionality 2024-03-21 15:22:56 +01:00
Ventilaar
e80318fc6b hotfix caching websub 2024-03-20 23:11:38 +01:00
Ventilaar
69bf7026dd many things changed 2024-03-20 22:44:02 +01:00
Ventilaar
e264a346a5 Fix limiting by proxy, fix search result sorting, sort channels by descending upload date
All checks were successful
Generate release / build-and-publish (push) Successful in 21s
2024-03-15 00:24:47 +01:00
Ventilaar
c50116b942 Add basic search, fix orphaned thumbnails. Merged manual playlist to automatic video collection(orphaned)
All checks were successful
Generate release / build-and-publish (push) Successful in 20s
2024-03-13 14:20:05 +01:00
Ventilaar
970fd1fa0f Add base WebSub support (not finished). Add orphaned videos view. Implement video reporting and managing. Some small changes
All checks were successful
Generate release / build-and-publish (push) Successful in 1m1s
2024-03-13 00:13:57 +01:00
Ventilaar
c71bd547ca Merge branch 'master' of https://git.ventilaar.nl/ventilaar/amazing-ytdlp-archive
All checks were successful
Generate release / build-and-publish (push) Successful in 21s
2024-03-06 14:06:09 +01:00
Ventilaar
2dbae35e4e Rebased runs admin page. Fixed title. 404 is also good now 2024-03-06 14:00:31 +01:00
cd06c86b1a A bit faster container building
All checks were successful
Generate release / build-and-publish (push) Successful in 50s
2024-02-29 20:41:53 +01:00
Ventilaar
fe60b3d981 TBH I don't know what I changed
All checks were successful
Generate release / build-and-publish (push) Successful in 48s
2024-02-29 20:39:30 +01:00
Ventilaar
4eeb72082c ready for release
All checks were successful
Generate release / build-and-publish (push) Successful in 47s
2024-02-29 00:35:18 +01:00
Ventilaar
dcca91fef1 sorry
All checks were successful
Generate release / build-and-publish (push) Successful in 44s
2024-02-28 23:44:54 +01:00
Ventilaar
5bf7d5f25c optimize docker
Some checks failed
Generate release / build-and-publish (push) Failing after 11s
2024-02-28 23:43:38 +01:00
Ventilaar
dffd04078a forgot my password lmao
All checks were successful
Generate release / build-and-publish (push) Successful in 47s
2024-02-28 23:33:27 +01:00
Ventilaar
cb82a50dc4 Add debug message
All checks were successful
Generate release / build-and-publish (push) Successful in 46s
2024-02-28 23:16:23 +01:00
Ventilaar
7e4d872566 Fix debug flag
All checks were successful
Generate release / build-and-publish (push) Successful in 44s
2024-02-28 23:06:01 +01:00
Ventilaar
7f6dff2b7a Pls run
All checks were successful
Generate release / build-and-publish (push) Successful in 44s
2024-02-28 22:59:28 +01:00
Ventilaar
08e94449ed Add missing required package
All checks were successful
Generate release / build-and-publish (push) Successful in 44s
2024-02-28 22:49:35 +01:00
Ventilaar
5c910b2bca Merge branch 'master' of https://git.ventilaar.nl/ventilaar/amazing-ytdlp-archive
All checks were successful
Generate release / build-and-publish (push) Successful in 43s
2024-02-28 22:46:38 +01:00
Ventilaar
afd07334c5 Edit the way the app starts 2024-02-28 22:46:31 +01:00
73 changed files with 1502 additions and 1755 deletions

8
.dockerignore Normal file
View File

@@ -0,0 +1,8 @@
# Ignore everything
**
# Add required files and folders
!ayta
!README.md
!LICENCE
!requirements.txt

View File

@@ -1,9 +1,8 @@
name: Generate release
on:
push:
tags:
- 'v*'
release:
types: [published]
jobs:
build-and-publish:
@@ -23,4 +22,13 @@ jobs:
uses: docker/build-push-action@v5
with:
push: true
tags: git.ventilaar.nl/ventilaar/ayta:latest
tags: git.ventilaar.nl/ventilaar/ayta:latest
- name: Update worker server
uses: appleboy/ssh-action@v1.0.3
with:
host: 192.168.66.109
username: root
key: ${{ secrets.SERVER_KEY }}
port: 22
script: /root/update_worker.sh

View File

@@ -1,6 +1,7 @@
FROM python:3-alpine
WORKDIR /app
COPY . /app
COPY requirements.txt /app
RUN pip install --no-cache-dir -r requirements.txt
COPY . /app
EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "ayta:create_app"]
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "ayta:create_app()"]

View File

@@ -3,27 +3,67 @@
This project will be awesome, only if I invest enough time. This software will replace my
current cronjob yt-dlp archive service.
Partially inspired by [hobune](https://github.com/rebane2001/hobune). While that project is amazing by it's own, it's just not scaleable.
Partially inspired by [hobune](https://github.com/rebane2001/hobune). While that project is amazingby it's own, it's just not scaleable.
## The idea
The new setup will either be fully running in flask, including the task that checks the
youtube channels every x hours. Or Flask will be used as the gui frontend, and a seperate
script will do the channel archiving. I have not desided yet.
What currently works is that the gui frontend calls to a seperate database while a cronjob
handles the downloading of new videos from a list of channels.
Having over 250k videos, scaling the current cronjob yt-dlp archive task is just really hard. Filetypes change, things get partially downloaded and such.
Partially yt-dlp is to blame because it's a package that needs to change all the time. But with this some changes are not accounted for.
yt-dlp will still do the downloads. But a flask frontend will be developed to make all downloaded videos easily indexable.
For it to be quick (unlike hobune) a database has to be implemented. This could get solved by a static site generator type of software, but that is not my choice.
The whole software package will use postgresql as a data backend and celery as background tasks.
Current development, however, uses MongoDB simply because it is easy.
## How it works currently(legacy)
In the legacy folder you will find files that are currently in my archive project. How it works is
that I have a cronjob running every 6 hours what then runs yt-dlp with a config file. In that config
that I have a cronjob running every 24 hours what then runs yt-dlp with a config file. In that config
file a channel list contains all the channels that yt-dlp needs to update. If a new video has been
uploaded, yt-dlp will automatically download a 720p version of the video, all subtitles at that time
(rip community captions, will not forget you) and a json file with all the rest of the metadata. Oh
and also the thumbnail.
This works, but it is very slow and uses lots of "API" calls to youtube, which will sometimes get
the IP blocked. This needs to be overhauled.
the IP blocked. This is why full channel upload pages are not downloaded anymore; I have limited it to the first 50 videos.
## Goals
Some goals have been set up which will prioritise functionality for the software package.
The starting status is that info.json files of videos are loaded into the mongodb database on which flask
will generate a page for channels and videos to load. But this has major limitations which will not be described right now
but will be reflected in the goals.
### Stage 1
Tasks which have to be finished before the GUI frontend is usable for both managers and users, in no particular order.
- [x] Have videos and channels listed on a page
- [x] Have a secured admin page where the database can be managed
- [x] Have working video streaming
- [x] CI/CD pipeline for quicker deployment
- [x] Add caching to speed up pages
- [x] Add ratelimiting for expensive pages
- [x] Ability to show cronjob logs to easily troubleshoot
### Stage 2
Extra functionality for further development of features.
- [x] Fix video titles on disk with slugs
- [x] Working search functionality
- [x] Video reporting functionality
- [x] Ability (for external applications) to queue up video ids for download
- [x] Add websub requesting and receiving ability. (not fully usable yet without celery tasks)
- [ ] OIDC or Webauthn logins instead of static argon2 passwords
### Stage 3
Mainly focused on retiring the cronjob based scripts and moving it to celery based tasks
- [ ] Manage videos by ID instead of on a per-channel basis
- [ ] Download videos from queue
- [ ] Manage websub callbacks
### Stage 4
MongoDB eventually runs into its limitations.
- [ ] Migrate to postgresql
### Stage ...
Since this is my flagship software, more features will be added over time.
It may take a while, since this is just a hobby for me and I'm not a programmer by title.
## Things learned
### Video playlists
@@ -50,26 +90,22 @@ If you swap the channel name to channel id. The folders will never change.
### Storage structure
The following folder structure is pretty nice for using static scripts. The one drawback
is that you can't search for video id's or titles. Because the search takes too long.
This is mainly why we need a new system using a database.
```
./videos/{channel_id}/{upload_date}/{video_id}/video_title.mp4
```
For the new system using a blob like storage will be key. I had the following in mind. It will be an independant
random key and not the YouTube video ID because I have notices that multiple real videos exist under the same key by
uploaders who replace old videos.
This is why we need a new system using a database, mainly for search.
The following structure is easily scalable and usable in an object storage format.
```
-| data
| - videos
| - 128bit_random_id.mp4
| - subtitles
| - same_random_id_EN.srt
| - same_random_id_DE.srt
| - thumbnails
| - 128bit_random_id.jpg
./videos/{channel_id}/{video_id}/video-title-slug-format.info.json
```
## API things learned
### YouTube push notifications in API form exist
Using the pubsubhubbub service provided by Google, we will implement downloading videos based on uploads.
The API is based on WebSub, which is well documented.
The hub will give xml+atom notifications when a video is uploaded by a channel and when a video is deleted.
The goal is to download a video when a notification comes through, and run a full channel sync when a video is deleted.
This will run alongside periodic full channel polling to download videos which the hub has not notified us about.
### Etag is useful
When we will call the api for 50 items in a playlist we also get an etag back.
This is a sort of hash of the returned data.
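
A minimal sketch of how the ETag note above could be used, assuming the API honours standard HTTP conditional requests (`If-None-Match` answered with `304 Not Modified`). The helper names and the in-memory `etag_cache` are hypothetical and not part of the project.

```python
# Hypothetical helpers, not part of AYTA: remember the last ETag per playlist
# and only reprocess the returned items when the ETag differs.
etag_cache = {}


def conditional_headers(playlist_id):
    """Build request headers that let the server answer 304 for unchanged data."""
    etag = etag_cache.get(playlist_id)
    return {"If-None-Match": etag} if etag else {}


def playlist_changed(playlist_id, status_code, response_etag):
    """Return True when the playlist page needs to be processed again."""
    if status_code == 304:  # Not Modified: the cached data is still current
        return False
    changed = etag_cache.get(playlist_id) != response_etag
    etag_cache[playlist_id] = response_etag
    return changed
```

With this shape, an unchanged playlist costs one cheap request instead of a full re-parse of 50 items.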

View File

@@ -1,30 +1,42 @@
import os
import secrets
from flask import Flask
from ayta.extensions import limiter, caching
from . import filters
def create_app(test_config=None):
import os, secrets
from flask import Flask
from ayta.extensions import limiter, caching, celery_init_app, oidc
from werkzeug.middleware.proxy_fix import ProxyFix
from . import filters
config = {'MONGO_CONNECTION': os.environ.get('AYTA_MONGOCONNECTION', 'mongodb://root:example@192.168.66.140:27017'),
'S3_CONNECTION': os.environ.get('AYTA_S3CONNECTION', '192.168.66.111:9001'),
'S3_ACCESSKEY': os.environ.get('AYTA_S3ACCESSKEY', 'lnUiGClFVXVuZbsr'),
'S3_SECRETKEY': os.environ.get('AYTA_S3SECRETKEY', 'Qz9NG7rpcOWdK2WL'),
'OIDC_CLIENT_SECRETS': os.environ.get('AYTA_OIDC_PATH', None),
'CACHE_TYPE': os.environ.get('AYTA_CACHETYPE', 'SimpleCache'),
'CACHE_DEFAULT_TIMEOUT': os.environ.get('AYTA_CACHETIMEOUT', 5),
'CACHE_DEFAULT_TIMEOUT': int(os.environ.get('AYTA_CACHETIMEOUT', 6)),
'SECRET_KEY': os.environ.get('AYTA_SECRETKEY', secrets.token_hex(32)),
'DEBUG': os.environ.get('AYTA_DEBUG', True)
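'DEBUG': os.environ.get('AYTA_DEBUG', 'false').strip().lower() in ('1', 'true', 'yes', 'on'),  # bool('false') would be True, so parse the string explicitly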
'DOMAIN': os.environ.get('AYTA_DOMAIN', 'testing.mashallah.nl'),
'CELERY': dict(broker_url=str(os.environ.get('AYTA_CELERYBROKER', 'amqp://guest:guest@192.168.66.140:5672/')),
task_ignore_result=True,)
}
# Static configuration settings, do not change
config['OIDC_CALLBACK_ROUTE'] = '/api/oidc/callback' # why is this extension not using it? maybe I should implement OIDC myself?
app = Flask(__name__)
app.config.from_mapping(config)
limiter.init_app(app)
caching.init_app(app)
celery_init_app(app)
if app.config['OIDC_CLIENT_SECRETS']:
oidc.init_app(app)
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1)
app.jinja_env.filters['pretty_duration'] = filters.pretty_duration
app.jinja_env.filters['pretty_time'] = filters.pretty_time
app.jinja_env.filters['current_time'] = filters.current_time
app.jinja_env.filters['epoch_time'] = filters.epoch_time
from .blueprints import watch
from .blueprints import index
@@ -32,6 +44,7 @@ def create_app(test_config=None):
from .blueprints import search
from .blueprints import channel
from .blueprints import auth
from .blueprints import api
app.register_blueprint(watch.bp)
app.register_blueprint(index.bp)
@@ -39,7 +52,6 @@ def create_app(test_config=None):
app.register_blueprint(search.bp)
app.register_blueprint(channel.bp)
app.register_blueprint(auth.bp)
app.add_url_rule("/", endpoint="base")
app.register_blueprint(api.bp)
return app
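
Environment variables always arrive as strings, so flags such as `AYTA_DEBUG` need explicit parsing; `bool()` on any non-empty string, including `"false"`, is `True`. This matters here because the login view skips the password check when `DEBUG` is set. A minimal sketch of a helper, where the name `env_flag` and the accepted spellings are assumptions rather than project code:

```python
import os


def env_flag(name, default=False):
    """Interpret an environment variable as a boolean flag."""
    raw = os.environ.get(name)
    if raw is None:
        return default  # variable not set at all
    # Only a small set of explicit spellings counts as "on".
    return raw.strip().lower() in ("1", "true", "yes", "on")
```

In the config mapping this could be used as `'DEBUG': env_flag('AYTA_DEBUG')`.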

View File

@@ -1,9 +1,11 @@
from flask import Blueprint, render_template, request, redirect, url_for
from flask import Blueprint, render_template, request, redirect, url_for, flash
from ..nosql import get_nosql
from ..s3 import get_s3
from ..dlp import checkChannelId, getChannelInfo
from ..decorators import login_required
from ..tasks import subscribe_websub_callback, unsubscribe_websub_callback
from datetime import datetime
from secrets import token_urlsafe
bp = Blueprint('admin', __name__, url_prefix='/admin')
@@ -12,6 +14,16 @@ bp = Blueprint('admin', __name__, url_prefix='/admin')
def base():
return render_template('admin/index.html')
@bp.route('/system', methods=['GET', 'POST'])
@login_required
def system():
if request.method == 'POST':
task = request.form.get('task', None)
if task == 'update-value':
pass
return render_template('admin/system.html')
@bp.route('/channel', methods=['GET', 'POST'])
@login_required
def channels():
@@ -31,7 +43,8 @@ def channels():
channelId, originalName = getChannelInfo(channelId, ('channel_id', 'uploader'))
if not get_nosql().insert_new_channel(channelId, originalName, addedDate):
return 'Error inserting new channel, you probably made a mistake somewhere'
flash('Error inserting new channel, you probably made a mistake somewhere')
return redirect(url_for('admin.channels'))
return redirect(url_for('admin.channel', channelId=channelId))
@@ -47,11 +60,22 @@ def channels():
@bp.route('/channel/<channelId>', methods=['GET', 'POST'])
@login_required
def channel(channelId):
channelInfo = get_nosql().get_channel_info(channelId)
if not channelInfo:
flash('That channel ID does not exist in the system')
return redirect(url_for('admin.channels'))
if request.method == 'POST':
task = request.form.get('task', None)
key = request.form.get('key', None)
value = request.form.get('value', None)
if task == 'subscribe-websub':
task = subscribe_websub_callback.delay(channelId)
flash(f"Started task {task.id}")
return redirect(url_for('admin.channel', channelId=channelId))
if task == 'update-value':
if key == 'active':
value = True if value else False
@@ -60,27 +84,19 @@ def channel(channelId):
value = datetime.strptime(value, '%Y-%m-%d')
get_nosql().update_channel_key(channelId, key, value)
channelInfo = get_nosql().get_channel_info(channelId)
if not channelInfo:
return 'That channel ID does not exist in the system'
#if channelInfo.get('added_date'):
# channelInfo['added_date'] = channelInfo['added_date'].strftime("%Y-%m-%d")
return redirect(url_for('admin.channel', channelId=channelId))
return render_template('admin/channel.html', channelInfo=channelInfo)
@bp.route('/runs', methods=['GET', 'POST'])
@bp.route('/run', methods=['GET', 'POST'])
@login_required
def runs():
if request.method == 'POST':
task = request.form.get('task', None)
if task == 'clean_runs':
get_nosql().clean_runs()
else:
pass
return redirect(url_for('admin.runs'))
runs = reversed(list(get_nosql().get_runs()))
return render_template('admin/runs.html', runs=runs)
@@ -91,6 +107,93 @@ def run(runId):
run = get_nosql().get_run(runId)
return render_template('admin/run.html', run=run)
@bp.route('/websub', methods=['GET', 'POST'])
@login_required
def websub():
if request.method == 'POST':
task = request.form.get('task', None)
value = request.form.get('value', None)
if task == 'unsubscribe':
channelId = get_nosql().websub_getCallback(value).get('channel')
task = unsubscribe_websub_callback.delay(value, channelId)
flash(f"Started task {task.id}")
return redirect(url_for('admin.websub'))
elif task == 'clean-retired':
get_nosql().websub_cleanRetired()
return redirect(url_for('admin.websub'))
callbackIds = get_nosql().websub_getCallbacks()
callbacks = {}
for callbackId in callbackIds:
callbacks[callbackId] = get_nosql().websub_getCallback(callbackId)
return render_template('admin/websub.html', callbacks=callbacks)
@bp.route('/reports', methods=['GET', 'POST'])
@login_required
def reports():
if request.method == 'POST':
task = request.form.get('task', None)
value = request.form.get('value', None)
if task == 'close':
get_nosql().close_report(value)
flash(f'Report closed {value}')
return redirect(url_for('admin.reports'))
reports = get_nosql().list_reports()
return render_template('admin/reports.html', reports=reports)
@bp.route('/posters', methods=['GET', 'POST'])
@login_required
def posters():
if request.method == 'POST':
task = request.form.get('task', None)
value = request.form.get('value', None)
if task == 'add-endpoint':
description = request.form.get('description', None)
if not description or len(description) <= 7:
flash('Description must be at least 8 characters long')
return redirect(url_for('admin.posters'))  # stop here instead of creating the endpoint anyway
if value and len(value) >= 12:
get_nosql().poster_newEndpoint(value, description)
flash(f'Created endpoint ID: {value}')
else:
value = token_urlsafe(16)
get_nosql().poster_newEndpoint(value, description)
flash(f'Created endpoint ID: {value}')
elif task == 'retire':
get_nosql().poster_retireEndpoint(value)
flash(f'Endpoint retired: {value}')
elif task == 'clean-retired':
get_nosql().poster_cleanRetired()
flash(f'Cleaned retired endpoints')
elif task == 'manual-queue':
get_nosql().poster_insertQueue('manual', value)
flash(f'Added to queue: {value}')
elif task == 'delete-queue':
get_nosql().poster_deleteQueue(value)
flash(f'Deleted from queue: {value}')
return redirect(url_for('admin.posters'))
endpoints = get_nosql().poster_getEndpoints()
queue = get_nosql().poster_getQueue()
return render_template('admin/posters.html', endpoints=endpoints, queue=queue)
@bp.route('/files', methods=['GET', 'POST'])
@login_required
def files():

66
ayta/blueprints/api.py Normal file
View File

@@ -0,0 +1,66 @@
from flask import Blueprint, render_template, request, redirect, url_for, flash, abort
from ..nosql import get_nosql
from ..extensions import caching, caching_unless
import re
bp = Blueprint('api', __name__, url_prefix='/api')
@bp.route('/websub/<cap>', methods=['GET', 'POST'])
def websub(cap):
if request.method == 'GET':
topic = request.args.get('hub.topic')
challenge = request.args.get('hub.challenge')
mode = request.args.get('hub.mode')
lease_seconds = request.args.get('hub.lease_seconds')
if mode not in ['subscribe', 'unsubscribe']:
return abort(400)
if not get_nosql().websub_existsCallback(cap):
return abort(404)
if mode == 'unsubscribe':
get_nosql().websub_retireCallback(cap)
return challenge
if not all([topic, challenge, mode, lease_seconds]):
return abort(400)
if not get_nosql().websub_activateCallback(cap, lease_seconds):
return abort(500)
return challenge
if get_nosql().websub_existsCallback(cap):
if not get_nosql().websub_savePost(cap, str(request.data)):
return abort(500)
return '', 202
return abort(404)
@bp.route('/poster/<cap>', methods=['POST'])
def poster(cap):
# if endpoint does not exist
if not get_nosql().poster_isActive(cap):
return abort(404)
videoId = request.form.get('v')
# if request is not valid
if not videoId:
return abort(400)
# if requested string is not correct
if not re.match(r"^[a-zA-Z0-9_-]{11}$", videoId):
return abort(422)
# if given string is already in the archive
if get_nosql().check_exists(videoId):
return abort(409)
# try to insert
if get_nosql().poster_insertQueue(cap, videoId):
return '', 202
else:
return abort(409)
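
The poster endpoint validates the submitted video ID with `re.match` and a `^...$` pattern. In Python, `$` also matches just before a trailing newline, so `re.fullmatch` is the stricter choice. A minimal sketch; the helper name `is_valid_video_id` is hypothetical:

```python
import re

# Same character class and length as the pattern used in the endpoint.
VIDEO_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]{11}")


def is_valid_video_id(video_id):
    """Return True only if the whole string is an 11-character video ID."""
    if not isinstance(video_id, str):
        return False
    return VIDEO_ID_PATTERN.fullmatch(video_id) is not None
```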

View File

@@ -1,10 +1,10 @@
from flask import Blueprint, redirect, url_for, render_template, request, session, flash, current_app
from ..extensions import limiter, caching, caching_only_get
from ..extensions import limiter, caching, caching_unless
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError
corr = '$argon2id$v=19$m=64,t=3,p=4$YmY5RTV0bU9tRkx3Q0FvUw$VfPI6BowKvsO4pI1aRslXfbigerssHrHQnQNDhgR8Og'
corr = '$argon2id$v=19$m=65536,t=3,p=4$XzX9K2MKRrGWEf/0iHf2AA$m6Q/aHoj1/uct+8a00QTS5xVWnANeMPKVUg4P822sbM'
bp = Blueprint('auth', __name__, url_prefix='/auth')
@@ -17,11 +17,11 @@ def base():
def logout():
session.pop('username', None)
flash('You have been logged out')
return redirect(url_for('index.base'))
return redirect(url_for('auth.login'))
@bp.route('/login', methods=['GET', 'POST'])
@limiter.limit('10 per day', override_defaults=False)
@caching.cached(unless=caching_only_get)
@caching.cached(unless=caching_unless)
def login():
if request.method == 'POST':
password = request.form.get('password', None)
@@ -29,21 +29,25 @@ def login():
if current_app.config.get('DEBUG'):
session['username'] = 'admin'
flash('You have been logged in')
return redirect(url_for('admin.base'))
return redirect(request.args.get('next', url_for('admin.base')))
if not password:
flash('Password was empty')
return 'password required!'
return redirect(url_for('auth.login'))
try:
ph = PasswordHasher()
if ph.verify(corr, password):
session['username'] = 'admin'
flash('You have been logged in')
return redirect(url_for('admin.base'))
return redirect(request.args.get('next', url_for('admin.base')))
except VerifyMismatchError:
flash('Wrong password')
return redirect(url_for('auth.login'))
except:
flash('Something went wrong')
return redirect(url_for('auth.login'))
return render_template('login.html')
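
The login view redirects to whatever the `next` query parameter contains, which allows redirects to other sites. A minimal sketch of a same-host check using only the standard library; the function name is hypothetical and the check is not exhaustive (browser quirks such as backslash handling are not covered):

```python
from urllib.parse import urljoin, urlparse


def is_safe_redirect_target(target, host_url):
    """Return True only for targets that stay on the same host as host_url."""
    if not target:
        return False
    base = urlparse(host_url)
    # Resolve relative targets against the site URL before inspecting them.
    resolved = urlparse(urljoin(host_url, target))
    return resolved.scheme in ("http", "https") and resolved.netloc == base.netloc
```

In the view this could be called with `request.args.get('next')` and `request.host_url`, falling back to `url_for('admin.base')` when the check fails.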

View File

@@ -1,34 +1,66 @@
from flask import Blueprint, render_template
from flask import Blueprint, render_template, flash, url_for, redirect
from ..nosql import get_nosql
from ..s3 import get_s3
from ..extensions import caching
from ..extensions import caching, caching_unless
bp = Blueprint('channel', __name__, url_prefix='/channel')
@bp.route('')
@caching.cached()
@caching.cached(unless=caching_unless)
def base():
channels = {}
channels = []
channelIds = get_nosql().list_all_channels()
for channelId in channelIds:
channels[channelId] = get_nosql().get_channel_info(channelId)
channels[channelId]['video_count'] = get_nosql().get_channel_videos_count(channelId)
channel = get_nosql().get_channel_info(channelId)
channel['video_count'] = get_nosql().get_channel_videos_count(channelId)
channels.append(channel)
channels = sorted(channels, key=lambda x: x.get('added_date'), reverse=True)
return render_template('channel/index.html', channels=channels)
@bp.route('/<channelId>')
@caching.cached()
@caching.cached(unless=caching_unless)
def channel(channelId):
channelInfo = get_nosql().get_channel_info(channelId)
if not channelInfo:
return 'That channel ID does not exist in the system'
flash('That channel ID does not exist in the system')
return redirect(url_for('channel.base'))
videoIds = get_nosql().get_channel_videoIds(channelId)
videos = {}
videos = []
for videoId in videoIds:
videos[videoId] = get_nosql().get_video_info(videoId, limited=True)
videos.append(get_nosql().get_video_info(videoId, limited=True))
videos = sorted(videos, key=lambda x: x.get('upload_date'), reverse=True)
return render_template('channel/channel.html', channel=channelInfo, videos=videos)
return render_template('channel/channel.html', channel=channelInfo, videos=videos)
@bp.route('/orphaned')
@caching.cached(unless=caching_unless)
def orphaned():
videoIds = get_nosql().get_orphaned_videos()
videos = []
for videoId in videoIds:
videos.append(get_nosql().get_video_info(videoId, limited=True))
videos = sorted(videos, key=lambda x: x.get('epoch', 0), reverse=True)
return render_template('channel/orphaned.html', videos=videos)
@bp.route('/recent')
@caching.cached(unless=caching_unless)
def recent():
videoIds = get_nosql().get_recent_videos()
videos = []
for videoId in videoIds:
videos.append(get_nosql().get_video_info(videoId, limited=True))
videos = sorted(videos, key=lambda x: x.get('epoch', 0), reverse=True)
return render_template('channel/recent.html', videos=videos)
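
The channel views sort with `x.get('upload_date')` or `x.get('added_date')` as the key. If some documents lack the field, `sorted` raises a `TypeError` in Python 3 because `None` cannot be compared with strings or datetimes. A minimal sketch of a tolerant sort; the helper name is hypothetical, and it assumes all present values of one key share a comparable type (for example `YYYYMMDD` strings):

```python
def sort_newest_first(items, key):
    """Sort dicts by `key` in descending order, entries without a value last."""

    def sort_key(item):
        value = item.get(key)
        # The first tuple element separates present from missing values, so the
        # placeholder 0 is only ever compared with other placeholders.
        return (value is not None, value if value is not None else 0)

    return sorted(items, key=sort_key, reverse=True)
```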

View File

@@ -1,14 +1,19 @@
from flask import Blueprint, render_template
from ..extensions import caching
from flask import Blueprint, render_template, send_from_directory
from ..extensions import caching, caching_unless
bp = Blueprint('index', __name__, url_prefix='/')
@bp.route('', methods=['GET'])
@caching.cached()
@caching.cached(unless=caching_unless)
def base():
return render_template('index.html')
@bp.route('help', methods=['GET'])
@caching.cached()
@caching.cached(unless=caching_unless)
def help():
return render_template('index.html')
return render_template('help.html')
@bp.route('robots.txt', methods=['GET'])
@caching.cached(unless=caching_unless)
def robots():
return render_template('robots.txt')

View File

@@ -1,10 +1,26 @@
from flask import Blueprint, render_template
from flask import Blueprint, render_template, request, flash, redirect, url_for
from ..nosql import get_nosql
from ..extensions import caching
from ..extensions import limiter, caching, caching_unless
bp = Blueprint('search', __name__, url_prefix='/search')
@bp.route('')
@caching.cached()
@bp.route('', methods=['GET', 'POST'])
@limiter.limit('50 per day', override_defaults=False)
@caching.cached(unless=caching_unless)
def base():
return render_template('search/index.html', stats=get_nosql().gen_stats())
if request.method == 'POST':
task = request.form.get('task')
if task == 'search':
query = request.form.get('query', '')  # default to '' so len() below cannot fail on a missing field
if not 3 <= len(query) <= 64:
flash('Query too short or too long, must be between 3 and 64')
return redirect(url_for('search.base'))
results = get_nosql().search_videos(query)
return render_template('search/index.html', results=results, query=query)
return render_template('search/index.html', stats=get_nosql().gen_stats())

View File

@@ -1,18 +1,38 @@
from flask import Blueprint, render_template, request
from flask import Blueprint, render_template, request, flash, redirect, url_for
from ..nosql import get_nosql
from ..extensions import caching, caching_v_parameter
from ..extensions import caching, caching_v_parameter, caching_unless
bp = Blueprint('watch', __name__, url_prefix='/watch')
@bp.route('', methods=['GET'])
@caching.cached(make_cache_key=caching_v_parameter)
@bp.route('', methods=['GET', 'POST'])
@caching.cached(make_cache_key=caching_v_parameter, unless=caching_unless)
def base():
render = {}
vGet = request.args.get('v')
if not vGet:
flash('Thats not how it works pal')
return redirect(url_for('index.base'))
if not get_nosql().check_exists(vGet):
return render_template('watch/404.html')
flash('The requested video is not in the archive')
return redirect(url_for('index.base'))
render = {}
if request.method == 'POST':
reason = request.form.get('reason')
if reason not in ['auto-video', 'metadata', 'illegal']:
flash('Invalid report reason')
return redirect(url_for('watch.base', v=vGet))
else:
reportId = get_nosql().insert_report(vGet, reason)
if reportId:
flash(f'Report has been received: {reportId}')
return redirect(url_for('watch.base', v=vGet))
else:
flash('Something went wrong with reporting')
return redirect(url_for('watch.base', v=vGet))
render['info'] = get_nosql().get_video_info(vGet)
render['params'] = request.args.get('v')

View File

@@ -2,10 +2,10 @@ import yt_dlp
def checkChannelId(channelId):
if len(channelId) < 24: # channelId lengths are 24 characters
if len(channelId) <= 23: # channelId lengths are 24 characters
return False
if len(channelId) > 25: # But some are 25, idk why
if len(channelId) >= 26: # But some are 25, idk why
return False
if channelId[0:2] not in ['UC', 'UU']:

View File

@@ -3,23 +3,49 @@ from flask_limiter.util import get_remote_address
from flask_caching import Cache
from flask import request
from celery import Celery, Task
from flask_oidc import OpenIDConnect
def caching_only_get(*args, **kwargs):
if request.method == 'GET':
return False
from flask import Flask, request, session
def celery_init_app(app: Flask) -> Celery:
class FlaskTask(Task):
def __call__(self, *args: object, **kwargs: object) -> object:
with app.app_context():
return self.run(*args, **kwargs)
celery_app = Celery(app.name, task_cls=FlaskTask)
celery_app.config_from_object(app.config["CELERY"])
celery_app.set_default()
app.extensions["celery"] = celery_app
return celery_app
def caching_unless(*args, **kwargs):
# if it is not a get request
if request.method != 'GET':
return True
# if username is defined in session cookie
if session.get('username'):
return True
# in the case that a user is not logged in but a message needs to be flashed, do not cache page
if session.get('_flashes'):
return True
return True
# do cache page
return False
def caching_v_parameter(*args, **kwargs):
return request.args.get('v')
limiter = Limiter(
get_remote_address,
default_limits=['1000 per day', '100 per hour'],
default_limits=['86400 per day', '3600 per hour'],
storage_uri="memory://",
)
caching = Cache()
caching = Cache()
oidc = OpenIDConnect()

View File

@@ -1,18 +1,28 @@
from datetime import datetime
def pretty_duration(seconds):
minutes, seconds = divmod(seconds, 60)
return f'{minutes}:{seconds} minutes'
if seconds is None:
return None
minutes, seconds = divmod(seconds, 60) # split total seconds to minute and remaining seconds
return f'{minutes}:{seconds:>02} minutes' # return mm:ss format including padding in seconds
def pretty_time(time):
try:
return time.strftime('%d %b %Y')
return time.strftime('%d %b %Y') # try to return pretty time if given object is datetime
except:
try:
return datetime.strptime(time, '%Y%m%d').strftime('%d %b %Y')
return datetime.strptime(time, '%Y%m%d').strftime('%d %b %Y') # try to parse str and give back pretty time
except:
return time
return time # return given time
def current_time(null):
return datetime.utcnow().isoformat(sep=" ", timespec="seconds")
def epoch_time(time):
try:
return datetime.fromtimestamp(time).strftime('%d %b %Y')
except:
return None
def current_time(null=None, object=False):
if object:
return datetime.utcnow().replace(microsecond=0)
return datetime.utcnow().isoformat(sep=" ", timespec="seconds") # return time in iso format without milliseconds
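
A short worked example of the new `pretty_duration` behaviour, reproduced here as a standalone function. The `>02` format spec right-aligns the remaining seconds in a zero-filled field of width 2; hours are not split off, so long videos show large minute counts.

```python
def pretty_duration(seconds):
    """Format a duration in seconds as m:ss, or return None when unknown."""
    if seconds is None:
        return None
    minutes, seconds = divmod(seconds, 60)  # whole minutes and leftover seconds
    return f"{minutes}:{seconds:>02} minutes"


print(pretty_duration(125))   # 2:05 minutes
print(pretty_duration(3600))  # 60:00 minutes
```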

File diff suppressed because it is too large

Binary file not shown (before: 318 B).

Binary file not shown.

Binary file not shown (after: 401 KiB).

Binary file not shown (after: 14 KiB).

Binary file changed (before: 1.1 KiB, after: 1.1 KiB).

View File

@@ -1,44 +0,0 @@
function channelSort() {
const sortOption = document.querySelector(".sort").value;
const [sortBy, direction] = sortOption.split("-");
const isInt = sortBy !== "search";
const container = document.querySelector(".channels.flex-grid");
[...container.children]
.sort((a,b)=>{
const dir = direction ? 1 : -1;
let valA = a.dataset[sortBy];
let valB = b.dataset[sortBy];
if (isInt) {
valA = parseInt(valA);
valB = parseInt(valB);
}
return (valA>valB?1:-1)*dir;
})
.forEach(node=>container.appendChild(node));
}
function channelSearch() {
let searchTerm = document.querySelector(".search").value.toLowerCase();
const allowedClasses = [];
const filteredClasses = [];
document.querySelectorAll('.searchable').forEach((e) => {
let filtered = false;
for (const c of allowedClasses) {
if (!e.querySelector(`.${c}`)) filtered = true;
}
for (const c of filteredClasses) {
if (e.querySelector(`.${c}`)) filtered = true;
}
if (!filtered && (searchTerm === "" || e.dataset.search.toLowerCase().includes(searchTerm))) {
e.classList.remove("hide");
} else {
e.classList.add("hide");
}
});
}
window.addEventListener("load", () => {
channelSearch();
});

File diff suppressed because it is too large

ayta/tasks.py Normal file

@@ -0,0 +1,47 @@
from celery import shared_task
from flask import current_app


@shared_task()
def subscribe_websub_callback(channelId):
    import requests
    from .nosql import get_nosql

    # register a new callback id for this channel before contacting the hub
    callbackId = get_nosql().websub_newCallback(channelId)

    url = 'https://pubsubhubbub.appspot.com/subscribe'
    data = {
        'hub.callback': f'https://{current_app.config["DOMAIN"]}/api/websub/{callbackId}',
        'hub.topic': f'https://www.youtube.com/xml/feeds/videos.xml?channel_id={channelId}',
        'hub.verify': 'async',
        'hub.mode': 'subscribe',
        'hub.verify_token': '',
        'hub.secret': '',
        'hub.lease_seconds': '86400',
    }

    get_nosql().websub_requestingCallback(callbackId)
    response = requests.post(url, data=data)

    # the hub answers 202 Accepted when it queues the request for verification
    return response.status_code == 202


@shared_task()
def unsubscribe_websub_callback(callbackId, channelId):
    import requests
    from .nosql import get_nosql

    # the hub uses the same endpoint for both modes; hub.mode selects the action
    url = 'https://pubsubhubbub.appspot.com/subscribe'
    data = {
        'hub.callback': f'https://{current_app.config["DOMAIN"]}/api/websub/{callbackId}',
        'hub.topic': f'https://www.youtube.com/xml/feeds/videos.xml?channel_id={channelId}',
        'hub.verify': 'async',
        'hub.mode': 'unsubscribe',
    }

    get_nosql().websub_retiringCallback(callbackId)
    response = requests.post(url, data=data)

    return response.status_code == 202
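
Review note: a 202 from the hub only means the request was queued. The hub then verifies intent by sending a GET request to the callback URL with `hub.mode`, `hub.topic` and `hub.challenge`, and the subscriber has to echo the challenge back. The callback route is not part of this hunk, so the following is only a framework-free sketch of that check as the WebSub specification describes it. `verify_intent` and its parameters are hypothetical names.

```python
def verify_intent(args, expected_topic, expected_mode="subscribe"):
    """Return (status_code, body) for a hub verification GET request.

    `args` is any mapping of query parameters, for example Flask's
    request.args. All names here are illustrative, not from the diff.
    """
    mode = args.get("hub.mode")
    topic = args.get("hub.topic")
    challenge = args.get("hub.challenge")

    # Reject requests that do not match what we asked the hub for
    if mode != expected_mode or topic != expected_topic or not challenge:
        return 404, ""

    # Confirm by echoing the challenge back verbatim with a 2xx status
    return 200, challenge


topic = "https://www.youtube.com/xml/feeds/videos.xml?channel_id=UC_example"
query = {"hub.mode": "subscribe", "hub.topic": topic, "hub.challenge": "abc123"}
status, body = verify_intent(query, topic)
```

Checking the topic against the stored channel for that callback id keeps a third party from confirming subscriptions that were never requested.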

@@ -1,13 +1,18 @@
{% extends 'material_base.html' %}
{% block title %}Channel administration page | AYTA{% endblock %}
{% block title %}Channel administration page{% endblock %}
{% block description %}Channel administration page of the AYTA system{% endblock %}
{% block content %}
<div class="row">
<div class="col s12">
<div class="col s12 l11">
<h4>{{ channelInfo.original_name }} administration page</h4>
<p>The update actions below directly apply to the database!</p>
</div>
<div class="col s12 l1 m-5">
<form method="POST">
<input title="Requests callback URL from youtube API" type="submit" value="subscribe-websub" name="task">
</form>
</div>
</div>
<div class="row">
<div class="col s12 l4">

@@ -1,5 +1,5 @@
{% extends 'material_base.html' %}
{% block title %}Channels administration page | AYTA{% endblock %}
{% block title %}Channels administration page{% endblock %}
{% block description %}Channels administration page of the AYTA system{% endblock %}
{% block content %}
@@ -43,30 +43,6 @@
</div>
</div>
</div>
<div class="col s12 l4 m-4">
<div class="card green">
<div class="card-content white-text">
<span class="card-title">Placeholder</span>
<p>I am a very simple card. I am good at containing small bits of information. I am convenient because I require little markup to use effectively.</p>
</div>
<div class="card-action">
<a href="#">This is a link</a>
<a href="#">This is a link</a>
</div>
</div>
</div>
<div class="col s12 l4 m-4">
<div class="card green">
<div class="card-content white-text">
<span class="card-title">Placeholder</span>
<p>I am a very simple card. I am good at containing small bits of information. I am convenient because I require little markup to use effectively.</p>
</div>
<div class="card-action">
<a href="#">This is a link</a>
<a href="#">This is a link</a>
</div>
</div>
</div>
</div>
<div class="divider"></div>
<div class="row">

@@ -1,5 +1,5 @@
{% extends 'material_base.html' %}
{% block title %}Administration page | AYTA{% endblock %}
{% block title %}Administration page{% endblock %}
{% block description %}Administration page of the AYTA system{% endblock %}
{% block content %}
@@ -16,7 +16,7 @@
</div>
<div class="row">
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.channels') }}">
<a href="{{ url_for('admin.system') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">System</span>
@@ -45,5 +45,35 @@
</div>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.websub') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">WebSub</span>
<p class="grey-text">Edit WebSub YouTube links</p>
</div>
</div>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.reports') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Reports</span>
<p class="grey-text">View user reports</p>
</div>
</div>
</a>
</div>
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.posters') }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">Posters</span>
<p class="grey-text">User extension posters</p>
</div>
</div>
</a>
</div>
</div>
{% endblock %}

@@ -0,0 +1,150 @@
{% extends 'material_base.html' %}
{% block title %}Posters administration page{% endblock %}
{% block description %}Posters administration page of the AYTA system{% endblock %}
{% block content %}
<div class="row">
<div class="col s12 l11">
<h4>Posters administration page</h4>
</div>
<div class="col s12 l1 m-5">
<form method="POST">
<input title="Prunes all deleted endpoints, but keeps last 3 days" type="submit" value="clean-retired" name="task">
</form>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s12">
<h5>Poster options</h5>
</div>
</div>
<div class="row">
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Create new endpoint</span>
<form method="post">
<div class="row">
<div class="col s12 input-field">
<input placeholder="Custom endpoint" name="value" type="text" class="validate" minlength="12">
<span class="supporting-text">Leaving this empty will create a random secure string</span>
</div>
<div class="col s12 input-field">
<input placeholder="Description" name="description" type="text" class="validate" minlength="8" maxlength="64" required>
<span class="supporting-text">Description for the endpoint for better administration</span>
</div>
<button class="btn mt-4" type="submit" name="task" value="add-endpoint">Create</button>
</div>
</form>
</div>
</div>
</div>
<div class="col s12 l4 m-4">
<div class="card">
<div class="card-content">
<span class="card-title">Queue manually</span>
<form method="post">
<div class="row">
<div class="col s12 input-field">
<input placeholder="YouTube video ID" name="value" type="text" class="validate" minlength="11" maxlength="11" required>
<span class="supporting-text">Must be a valid YouTube video ID</span>
</div>
<div class="col s12 mt-5 input-field">
<div class="switch">
<label>Queue<input type="checkbox" value="direct" name="value" disabled><span class="lever"></span>Direct</label>
<span class="supporting-text">Queue up or start directly</span>
</div>
</div>
<button class="btn mt-4" type="submit" name="task" value="manual-queue">Queue</button>
</div>
</form>
</div>
</div>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Registered endpoints</h5>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
<label for="filter_query">Filter results</label>
</div>
</div>
<div class="row">
<div class="col s12">
<table class="striped highlight responsive-table">
<thead>
<tr>
<th>Actions</th>
<th>id</th>
<th>description</th>
<th>status</th>
<th>created_time</th>
<th>retired_time</th>
</tr>
</thead>
<tbody>
{% for endpoint in endpoints %}
<tr class="filterable">
<td>
<form method="post">
<input type="text" value="{{ endpoint.get('id') }}" name="value" hidden>
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="retire" title="Retire endpoint" {% if endpoint.get('status') != 'active' %}disabled{% endif %}>🗑️</button>
</form>
</td>
<td>{{ endpoint.get('id') }}</td>
<td>{{ endpoint.get('description') }}</td>
<td>{{ endpoint.get('status') }}</td>
<td>{{ endpoint.get('created_time') }}</td>
<td>{{ endpoint.get('retired_time') }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Queued IDs</h5>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
<label for="filter_query">Filter results</label>
</div>
</div>
<div class="row">
<div class="col s12">
<table class="striped highlight responsive-table">
<thead>
<tr>
<th>Actions</th>
<th>id</th>
<th>endpoint</th>
<th>status</th>
<th>created_time</th>
</tr>
</thead>
<tbody>
{% for id in queue %}
<tr class="filterable">
<td>
<form method="post">
<input type="text" value="{{ id.get('id') }}" name="value" hidden>
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="delete-queue" title="Delete from queue" {% if id.get('status') != 'queued' %}disabled{% endif %}>🗑️</button>
</form>
</td>
<td>{{ id.get('id') }}</td>
<td>{{ id.get('endpoint') }}</td>
<td>{{ id.get('status') }}</td>
<td>{{ id.get('created_time') }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
{% endblock %}
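
Review note: the form promises "a random secure string" when the custom endpoint is left empty and only enforces lengths in the browser. The view handling `add-endpoint` and `manual-queue` is not visible in this diff, so the sketch below is one way the server side could back those promises. `make_endpoint` and `is_video_id` are hypothetical helpers, and the video ID alphabet (letters, digits, `-`, `_`) is an assumption based on commonly observed IDs, not on documented behaviour.

```python
import re
import secrets

MIN_ENDPOINT_LENGTH = 12  # mirrors minlength="12" on the form field

# Assumption: video IDs are 11 characters from the URL-safe Base64 alphabet
VIDEO_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{11}")


def make_endpoint(custom=None):
    """Validate a custom endpoint, or generate a random one when empty."""
    if custom:
        if len(custom) < MIN_ENDPOINT_LENGTH:
            raise ValueError("custom endpoint is too short")
        return custom
    # token_urlsafe(32) draws 32 random bytes and encodes them URL-safely
    return secrets.token_urlsafe(32)


def is_video_id(value):
    """Return True if value has the assumed shape of a video ID."""
    return VIDEO_ID_PATTERN.fullmatch(value) is not None
```

Repeating the checks on the server matters because `minlength` and `maxlength` are only hints to the browser and are easy to bypass.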

@@ -0,0 +1,71 @@
{% extends 'material_base.html' %}
{% block title %}Reports administration page{% endblock %}
{% block description %}Reports administration page of the AYTA system{% endblock %}
{% block content %}
<div class="row">
<div class="col s12 l11">
<h4>Reports administration page</h4>
</div>
<div class="col s12 l1 m-5">
<form method="POST">
<input title="Prunes all closed reports, but keeps last 30 days" type="submit" value="clean-closed" name="task">
</form>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s12">
<h5>Report options</h5>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>All reports</h5>
</div>
<div class="col s6 l3 m-4 input-field">
<input id="filter_query" type="text">
<label for="filter_query">Filter results</label>
</div>
</div>
<div class="row">
<div class="col s12">
{% if reports is not defined %}
<p>No reports!</p>
{% else %}
<table class="striped highlight responsive-table">
<thead>
<tr>
<th>Actions</th>
<th>_id</th>
<th>videoId</th>
<th>status</th>
<th>reason</th>
<th>reporting_time</th>
<th>closing_time</th>
</tr>
</thead>
<tbody>
{% for report in reports %}
<tr class="filterable">
<td>
<form method="post">
<input type="text" value="{{ report.get('_id') }}" name="value" hidden>
<button class="btn-small waves-effect waves-light" type="submit" name="task" value="close" title="Close the report" {% if report.get('status') != 'open' %}disabled{% endif %}>✅</button>
</form>
</td>
<td>{{ report.get('_id') }}</td>
<td><a href="{{ url_for('watch.base') }}?v={{ report.get('videoId') }}">{{ report.get('videoId') }}</a></td>
<td>{{ report.get('status') }}</td>
<td>{{ report.get('reason') }}</td>
<td>{{ report.get('reporting_time') }}</td>
<td>{{ report.get('closing_time') }}</td>
</tr>
{% endfor %}
</tbody>
</table>
{% endif %}
</div>
</div>
{% endblock %}
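
Review note: the `clean-closed` button is described as pruning closed reports while keeping the last 30 days. The database code is in a suppressed file, so this is only an in-memory sketch of that retention rule. The field names `status` and `closing_time` come from the table above; the status value `"closed"` and the function name are assumptions.

```python
from datetime import datetime, timedelta


def prune_closed_reports(reports, now, keep_days=30):
    """Return the reports that survive pruning.

    A report is dropped only if it is closed AND was closed before the
    cutoff. Open reports are always kept.
    """
    cutoff = now - timedelta(days=keep_days)
    kept = []
    for report in reports:
        closed_at = report.get("closing_time")
        is_old_closed = (
            report.get("status") == "closed"  # assumed status value
            and closed_at is not None
            and closed_at < cutoff
        )
        if not is_old_closed:
            kept.append(report)
    return kept


now = datetime(2024, 4, 2, 12, 0, 0)
reports = [
    {"_id": 1, "status": "open", "closing_time": None},
    {"_id": 2, "status": "closed", "closing_time": now - timedelta(days=45)},
    {"_id": 3, "status": "closed", "closing_time": now - timedelta(days=5)},
]
remaining = prune_closed_reports(reports, now)
```

Passing `now` in explicitly keeps the rule easy to test; a real task would likely pass the current UTC time.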

@@ -1,24 +1,26 @@
{% extends 'base.html' %}
{% block title %}Runs administration page | AYTA{% endblock %}
{% block description %}Cron Runs administration page of the AYTA system{% endblock %}
{% extends 'material_base.html' %}
{% block title %}Runs administration page{% endblock %}
{% block description %}Cron logs administration page of the AYTA system{% endblock %}
{% block content %}
<div class="container channels">
<div class="head">
<div class="title">
<h1 style="display: inline-block;">Cron Run administration page</h1>
<h2 class="subtitle">{{ run.get('_id') }}</h2>
<p><b>Started at</b> {{ run.get('time') }}</p>
<p><b>Finished at</b> {{ run.get('finish_time', 'Probably still running') }}</p>
{% for channel_run in run.get('channel_runs') %}
<hr>
<p><b>Run ID</b> {{ channel_run.get('_id') }}</p>
<p><b>Channel ID</b> {{ channel_run.get('id') }} | <b>Time</b> {{ channel_run.get('time') }} | <b>Exit code</b> {{ channel_run.get('exit_code') }}</p>
<textarea class="info" id={{ channel_run.get('_id') }}>{{ channel_run.get('log') }}</textarea>
{% endfor %}
</div>
<div class="row">
<div class="col s12">
<h4>Cron Run administration page</h4>
</div>
<div class="col s12">
<h5>{{ run.get('_id') }}</h5>
<p><b>Started:</b> {{ run.get('time') }} </p>
<p><b>Finished:</b> {{ run.get('finish_time', 'Probably still running') }}</p>
</div>
</div>
<div class="divider"></div>
{% for channel_run in run.get('channel_runs') %}
<div class="row">
<div class="col s12">
<p><b>Run ID</b> {{ channel_run.get('_id') }}</p>
<p><b>Channel ID</b> {{ channel_run.get('id') }} | <b>Time</b> {{ channel_run.get('time') }} | <b>Exit code</b> {{ channel_run.get('exit_code') }}</p>
<textarea class="info" id={{ channel_run.get('_id') }}>{{ channel_run.get('log') }}</textarea>
</div>
</div>
{% endfor %}
{% endblock %}

@@ -1,31 +1,38 @@
{% extends 'base.html' %}
{% block title %}Runs administration page | AYTA{% endblock %}
{% block description %}Cron Runs administration page of the AYTA system{% endblock %}
{% extends 'material_base.html' %}
{% block title %}Runs administration page{% endblock %}
{% block description %}Cron logs administration page of the AYTA system{% endblock %}
{% block content %}
<div class="container channels">
<div class="head">
<div class="title">
<h1 style="display: inline-block;">Runs administration page</h1>
</div>
<form class="center" method="POST">
<input type="submit" value="clean_runs" name="task">
</form>
<div class="row">
<div class="col s12 l11">
<h4>Runs administration page</h4>
</div>
<h2 class="subtitle">Cron runs list</h2>
<div class="channels flex-grid">
{% for run in runs %}
<div class="card">
<a href="{{ url_for('admin.run', runId=run.get('_id')) }}" class="inner">
<div class="content">
<div class="title">{{ run.get('_id') }}</div>
<div class="meta">Amount of channel logs {{ run.get('channel_runs')|length }}</div>
<div class="description"><b>Started</b> {{ run.get('time') }}</div>
<div class="description"><b>Finished</b> {{ run.get('finish_time', 'Probably still running') }}</div>
</div>
</a>
</div>
{% endfor %}
<div class="col s12 l1 m-5">
<form method="POST">
<input title="Prunes all runs, but keeps last 3" type="submit" value="clean_runs" name="task">
</form>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s6 l9">
<h5>Cron runs list</h5>
</div>
</div>
<div class="row">
{% for run in runs %}
<div class="col s6 l4 m-4">
<a href="{{ url_for('admin.run', runId=run.get('_id')) }}">
<div class="card black-text">
<div class="card-content">
<span class="card-title">{{ run.get('_id') }}</span>
<p class="grey-text">Amount of channel logs {{ run.get('channel_runs')|length }}</p>
<p><b>Started:</b> {{ run.get('time') }} </p>
<p><b>Finished:</b> {{ run.get('finish_time', 'Probably still running') }}</p>
</div>
</div>
</a>
</div>
{% endfor %}
</div>
{% endblock %}
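
Review note: the `clean_runs` tooltip says it prunes all runs but keeps the last 3. The implementation is not shown in this diff. The sketch below assumes "last 3" means the three most recent runs by their `time` field and returns the runs that would be removed. `runs_to_delete` is a hypothetical name.

```python
from datetime import datetime


def runs_to_delete(runs, keep=3):
    """Return every run except the `keep` most recent ones (by 'time')."""
    # Newest first, so the slice after `keep` holds the older runs
    ordered = sorted(runs, key=lambda run: run["time"], reverse=True)
    return ordered[keep:]


runs = [
    {"_id": "a", "time": datetime(2024, 3, 1)},
    {"_id": "b", "time": datetime(2024, 3, 5)},
    {"_id": "c", "time": datetime(2024, 3, 3)},
    {"_id": "d", "time": datetime(2024, 3, 4)},
    {"_id": "e", "time": datetime(2024, 3, 2)},
]
stale = runs_to_delete(runs)
```

Selecting the runs to delete rather than the runs to keep makes it straightforward to hand the resulting ids to a bulk delete.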

@@ -0,0 +1,37 @@
{% extends 'material_base.html' %}
{% block title %}Administration page{% endblock %}
{% block description %}Administration page of the AYTA system{% endblock %}
{% block content %}
<div class="row">
<div class="col s12">
<h4>Administration page</h4>
</div>
</div>
<div class="divider"></div>
<div class="row">
<div class="col s12">
<h5>AYTA system settings</h5>
</div>
</div>
<div class="row">
<div class="col s6 l4 m-4">
<div class="card black-text">
<div class="card-content">
<form method="POST">
<div class="input-field">
<span class="supporting-text">Enable WebSub</span>
<input class="validate" type="text" value="websub" name="key" hidden>
</div>
<div class="input-field m-4">
<div class="switch">
<label>Off<input type="checkbox" value="None" name="value"><span class="lever"></span>On</label>
</div>
</div>
<button class="btn icon-right waves-effect waves-light" type="submit" name="task" value="update-value">Set</button>
</form>
</div>
</div>
</div>
</div>
{% endblock %}

Some files were not shown because too many files have changed in this diff