FFmpeg is a set of open source tools for audio and video processing, such as creating, converting/transcoding, and publishing media content.
The FFmpeg docker images are compiled with the following audio and video codecs:
Codec | Version | Codec | Version |
---|---|---|---|
fdk-aac | 0.1.6 | x265 | 2.9 |
mp3lame | 3.100 | vpx | 1.7.0 |
opus | 1.2.1 | aom | 1.0.0 |
ogg | 1.3.3 | SVT-HEVC | 1.3.0 |
vorbis | 1.3.6 | SVT-AV1 | custom |
x264 | stable | SVT-VP9 | custom |
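
To confirm which encoders a given image actually exposes, one option is to scan the output of `ffmpeg -encoders`. The following is a minimal sketch, not part of the images: the helper name `missing_encoders` is hypothetical, and the encoder names in the usage comment (`libx264`, `libx265`, `libsvt_hevc`) are assumed to be the names under which this build registers those codecs.

```shell
# Print every requested encoder that does NOT appear in an `ffmpeg -encoders`
# listing. The listing is read from stdin; encoder names are the arguments.
missing_encoders() {
    listing=$(cat)
    for enc in "$@"; do
        # In the listing, the encoder name is the second whitespace-separated column.
        if ! printf '%s\n' "$listing" |
            awk -v e="$enc" '$2 == e { found = 1 } END { exit !found }'; then
            printf '%s\n' "$enc"
        fi
    done
}

# Usage inside a container (assumes ffmpeg is on PATH):
#   ffmpeg -hide_banner -encoders | missing_encoders libx264 libx265 libsvt_hevc
```

An empty result means every requested encoder was found in the listing.
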
The FFmpeg builds include the following patches for feature enhancements, performance improvements, or bug fixes:
Patch | Description |
---|---|
11625 | Enhance 1:N transcoding performance. |
11035 | Fix libvpx to run on Intel(R) Xeon(R) processors. |
H.265 FLV | Support H.265 in FLV for RTMP streaming. |
IE_FILTERS_01 | Intel inference engine detection filter. |
IE_FILTERS_02 | New filter for inference classification. |
IE_FILTERS_03 | IE metadata converter muxer. |
IE_FILTERS_04 | Kafka protocol producer. |
IE_FILTERS_05 | Support object detection and featured face identification. |
IE_FILTERS_06 | Send metadata in a packet and refine the JSON format. |
IE_FILTERS_07 | Refine features of IE filters. |
IE_FILTERS_08 | Fixed extra comma in iemetadata. |
IE_FILTERS_09 | Add a source URL option and calculate times in nanoseconds. |
IE_FILTERS_10 | Fixed buffer overflow issue in iemetadata. |
IE_FILTERS_11 | Add RGBP pixel format. |
IE_FILTERS_12 | Add more devices into target. |
IE_FILTERS_13 | Enable vaapi scale for IE inference filters. |
IE_FILTERS_14 | iemetadata provides data frame by frame. |
IE_FILTERS_15 | Add libcjson for model pre/post processing. |
IE_FILTERS_16 | Change IE filters to use model proc. |
In the GPU images, FFmpeg is hardware-accelerated through vaapi and/or qsv (Intel Media SDK).
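
The vaapi examples in this document address the GPU through the render node `/dev/dri/renderD128`, so it can help to check that the node is visible before running them. The following is a minimal sketch: the helper name `have_render_node` is hypothetical, and the `--device /dev/dri:/dev/dri` hint assumes the container is started with `docker run`.

```shell
# Succeed if the given path (default: /dev/dri/renderD128) is a character
# device, i.e. a DRM render node is visible to the current process.
have_render_node() {
    [ -c "${1:-/dev/dri/renderD128}" ]
}

if have_render_node; then
    echo "render node found: hardware acceleration may be available"
else
    echo "no render node: try starting the container with --device /dev/dri:/dev/dri"
fi
```

On a system with several GPUs the node name may differ; pass the path as the first argument in that case.
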
Transcode raw yuv420 content to SVT-HEVC and mp4:
ffmpeg -f rawvideo -vcodec rawvideo -s 320x240 -r 30 -pix_fmt yuv420p -i test.yuv -c:v libsvt_hevc -y test.mp4
1:N Transcoding:
ffmpeg -i input.h264 -vf "scale=1280:720" -pix_fmt nv12 -f null /dev/null -vf "scale=720:480" -pix_fmt nv12 -f null /dev/null -abr_pipeline
Encoding/decoding with vaapi:
ffmpeg -y -vaapi_device /dev/dri/renderD128 -f rawvideo -video_size 320x240 -r 30 -i test.yuv -vf 'format=nv12,hwupload' -c:v h264_vaapi test.mp4
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -i test.mp4 -f null /dev/null
Encoding/decoding with qsv (Intel Media SDK):
ffmpeg -y -init_hw_device qsv=hw -filter_hw_device hw -f rawvideo -pix_fmt yuv420p -s:v 320x240 -i test.yuv -vf hwupload=extra_hw_frames=64,format=qsv -c:v h264_qsv -b:v 5M test.mp4
ffmpeg -hwaccel qsv -c:v h264_qsv -i test.mp4 -f null /dev/null
Face detection and emotion identification, saving the metadata in JSON format:
ffmpeg -i ~/Videos/xxx.mp4 \
-vf "detect=model=./face-detection-adas-0001/FP32/face-detection-adas-0001.xml,classify=model=./emotions_recognition/emotions-recognition-retail-0003.xml:model_proc=emotions-recognition-retail-0003.json" \
-an -f iemetadata -source_url "$URL" -custom_tag "$TAG" emotion-meta.json
Object detection:
ffmpeg -i ~/Videos/xxx.mp4 -vf detect=model=./mobilenet-ssd.xml:model_proc=mobilenet-ssd.json -an -f null /dev/null
Face detection and reidentification:
ffmpeg -i ~/Videos/xxx.mp4 \
-vf "detect=model=./face-detection-retail-0004.xml,classify=model=./face-reidentification-retail-0095.xml:label=./labels.txt:name=face_id:feature_file=./registered_faces.bin" \
-an -f null /dev/null
GPU decode + face detection:
# Device IDs: CPU=2 GPU=3 VPU=5 HDDL=6. To use a different device, replace the -i line with one of the commented alternatives.
ffmpeg -flags unaligned -hwaccel vaapi -hwaccel_output_format vaapi -hwaccel_device /dev/dri/renderD128 \
-i $STREAM -vf "detect=model=$D_FACE_RT_MODEL:device=$CPU" -an -f null -
#-i $STREAM -vf "detect=model=$D_FACE_RT_FP16_MODEL:device=$GPU" -an -f null -
#-i $STREAM -vf "detect=model=$D_FACE_RT_FP16_MODEL:device=$VPU" -an -f null -
#-i $STREAM -vf "detect=model=$D_FACE_RT_FP16_MODEL:device=$HDDL" -an -f null -
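
Rather than commenting lines in and out, the device can be selected by name. The following is a minimal sketch that relies only on the device IDs listed in the comment above (CPU=2 GPU=3 VPU=5 HDDL=6); the helper names `ie_device_id` and `detect_filter` are hypothetical and not part of the images.

```shell
# Map a device name to its numeric ID (CPU=2 GPU=3 VPU=5 HDDL=6).
# Unknown names print nothing and return a non-zero status.
ie_device_id() {
    case "$1" in
        CPU)  echo 2 ;;
        GPU)  echo 3 ;;
        VPU)  echo 5 ;;
        HDDL) echo 6 ;;
        *)    return 1 ;;
    esac
}

# Build the -vf argument for the detect filter.
# $1: path to the model file, $2: device name (CPU, GPU, VPU or HDDL).
detect_filter() {
    id=$(ie_device_id "$2") || return 1
    printf 'detect=model=%s:device=%s\n' "$1" "$id"
}

# Usage (assumes $STREAM and $D_FACE_RT_FP16_MODEL are set as in the example above):
#   ffmpeg -flags unaligned -hwaccel vaapi -hwaccel_output_format vaapi \
#       -hwaccel_device /dev/dri/renderD128 -i "$STREAM" \
#       -vf "$(detect_filter "$D_FACE_RT_FP16_MODEL" GPU)" -an -f null -
```
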