1
mirror of https://git.videolan.org/git/ffmpeg.git synced 2024-08-25 10:51:49 +02:00
Commit Graph

13 Commits

Author SHA1 Message Date
Guo, Yejun
9bcf2aa477 vf_dnn_processing.c: add dnn backend openvino
We can try with the srcnn model from sr filter.
1) get srcnn.pb model file, see filter sr
2) convert srcnn.pb into openvino model with command:
python mo_tf.py --input_model srcnn.pb --data_type=FP32 --input_shape [1,960,1440,1] --keep_shape_ops

See the script at https://github.com/openvinotoolkit/openvino/tree/master/model-optimizer
We'll see srcnn.xml and srcnn.bin at current path, copy them to the
directory where ffmpeg is.

I have also uploaded the model files at https://github.com/guoyejun/dnn_processing/tree/master/models

3) run with openvino backend:
ffmpeg -i input.jpg -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=openvino:model=srcnn.xml:input=x:output=srcnn/Maximum -y srcnn.ov.jpg
(The input.jpg resolution is 720*480)

Also copy the logs on my skylake machine (4 cpus) locally with openvino backend
and tensorflow backend. just for your information.

$ time ./ffmpeg -i 480p.mp4 -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=tensorflow:model=srcnn.pb:input=x:output=y -y srcnn.tf.mp4
…
frame=  343 fps=2.1 q=31.0 Lsize=    2172kB time=00:00:11.76 bitrate=1511.9kbits/s speed=0.0706x
video:1973kB audio:187kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.517637%
[aac @ 0x2f5db80] Qavg: 454.353
real    2m46.781s
user    9m48.590s
sys     0m55.290s

$ time ./ffmpeg -i 480p.mp4 -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=openvino:model=srcnn.xml:input=x:output=srcnn/Maximum -y srcnn.ov.mp4
…
frame=  343 fps=4.0 q=31.0 Lsize=    2172kB time=00:00:11.76 bitrate=1511.9kbits/s speed=0.137x
video:1973kB audio:187kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.517640%
[aac @ 0x31a9040] Qavg: 454.353
real    1m25.882s
user    5m27.004s
sys     0m0.640s

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-07-02 09:56:55 +08:00
Guo, Yejun
2114c42418 avfilter/vf_dnn_processing.c: fix typo for the linesize of dnn data
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
2020-04-07 11:03:25 +08:00
Linjie Fu
acc6f632b4 lavfi/vf_dnn_processing: Fix compile warning of mixed declarations and code
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
2020-03-19 14:27:23 +08:00
Guo, Yejun
e35f966853 avfilter/vf_dnn_processing.c: add frame size change support for planar yuv format
The Y channel is handled by dnn, and also resized by dnn. The UV channels
are resized with swscale.

The command to use espcn.pb (see vf_sr) looks like:
./ffmpeg -i 480p.jpg -vf format=yuv420p,dnn_processing=dnn_backend=tensorflow:model=espcn.pb:input=x:output=y -y tmp.espcn.jpg

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Reviewed-by: Pedro Arthur <bygrandao@gmail.com>
2020-03-12 18:22:51 +08:00
Guo, Yejun
bd50453894 avfilter/vf_dnn_processing.c: add planar yuv format support
Only the Y channel is handled by dnn, the UV channels are copied
without changes.

The command to use srcnn.pb (see vf_sr) looks like:
./ffmpeg -i 480p.jpg -vf format=yuv420p,scale=w=iw*2:h=ih*2,dnn_processing=dnn_backend=tensorflow:model=srcnn.pb:input=x:output=y -y srcnn.jpg

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Reviewed-by: Pedro Arthur <bygrandao@gmail.com>
2020-03-12 18:22:39 +08:00
Guo, Yejun
d86a8c056b avfilter/vf_dnn_processing.c: use swscale for uint8<->float32 convert
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Reviewed-by: Pedro Arthur <bygrandao@gmail.com>
2020-03-12 18:22:18 +08:00
Guo, Yejun
4e1ae43b17 lavfi/dnn_processing: refine code to use function av_image_copy_plane for data copy
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-01-14 11:29:43 -03:00
Guo, Yejun
37d24a6c8f vf_dnn_processing: add support for more formats gray8 and grayf32
The following is a python script to halve the value of the gray
image. It demos how to setup and execute dnn model with python+tensorflow.
It also generates .pb file which will be used by ffmpeg.

import tensorflow as tf
import numpy as np
from skimage import color
from skimage import io
in_img = io.imread('input.jpg')
in_img = color.rgb2gray(in_img)
io.imsave('ori_gray.jpg', np.squeeze(in_img))
in_data = np.expand_dims(in_img, axis=0)
in_data = np.expand_dims(in_data, axis=3)
filter_data = np.array([0.5]).reshape(1,1,1,1).astype(np.float32)
filter = tf.Variable(filter_data)
x = tf.placeholder(tf.float32, shape=[1, None, None, 1], name='dnn_in')
y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
sess=tf.Session()
sess.run(tf.global_variables_initializer())
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'halve_gray_float.pb', as_text=False)
print("halve_gray_float.pb generated, please use \
path_to_ffmpeg/tools/python/convert.py to generate halve_gray_float.model\n")
output = sess.run(y, feed_dict={x: in_data})
output = output * 255.0
output = output.astype(np.uint8)
io.imsave("out.jpg", np.squeeze(output))

To do the same thing with ffmpeg:
- generate halve_gray_float.pb with the above script
- generate halve_gray_float.model with tools/python/convert.py
- try with following commands
  ./ffmpeg -i input.jpg -vf format=grayf32,dnn_processing=model=halve_gray_float.model:input=dnn_in:output=dnn_out:dnn_backend=native out.native.png
  ./ffmpeg -i input.jpg -vf format=grayf32,dnn_processing=model=halve_gray_float.pb:input=dnn_in:output=dnn_out:dnn_backend=tensorflow out.tf.png

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-01-07 10:51:38 -03:00
Guo, Yejun
04e6f8a143 vf_dnn_processing: remove parameter 'fmt'
do not request AVFrame's format in vf_ddn_processing with 'fmt',
but to add another filter for the format.

command examples:
./ffmpeg -i input.jpg -vf format=bgr24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native -y out.native.png
./ffmpeg -i input.jpg -vf format=rgb24,dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:dnn_backend=native -y out.native.png

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2020-01-07 10:35:59 -03:00
Guo, Yejun
ed9fc2e3c5 avfilter/vf_dnn_processing: refine code for better naming
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2019-12-13 11:41:10 -03:00
leozhang
c79307b7de avfilter/vf_dnn_processing: correct duplicate statement
Signed-off-by: leozhang <leozhang@qiyi.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-11-08 14:57:01 +01:00
Guo, Yejun
f6e942251c avfilter/vf_dnn_processing: fix fate-source
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-11-08 14:56:38 +01:00
Guo, Yejun
4d980a8ceb avfilter/vf_dnn_processing: add a generic filter for image proccessing with dnn networks
This filter accepts all the dnn networks which do image processing.
Currently, frame with formats rgb24 and bgr24 are supported. Other
formats such as gray and YUV will be supported next. The dnn network
can accept data in float32 or uint8 format. And the dnn network can
change frame size.

The following is a python script to halve the value of the first
channel of the pixel. It demos how to setup and execute dnn model
with python+tensorflow. It also generates .pb file which will be
used by ffmpeg.

import tensorflow as tf
import numpy as np
import imageio
in_img = imageio.imread('in.bmp')
in_img = in_img.astype(np.float32)/255.0
in_data = in_img[np.newaxis, :]
filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
filter = tf.Variable(filter_data)
x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
sess=tf.Session()
sess.run(tf.global_variables_initializer())
output = sess.run(y, feed_dict={x: in_data})
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
output = output * 255.0
output = output.astype(np.uint8)
imageio.imsave("out.bmp", np.squeeze(output))

To do the same thing with ffmpeg:
- generate halve_first_channel.pb with the above script
- generate halve_first_channel.model with tools/python/convert.py
- try with following commands
  ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
  ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Pedro Arthur <bygrandao@gmail.com>
2019-11-07 15:46:00 -03:00