dic*_*oce 4 encoding ffmpeg vaapi
我已经看到了使用libavcodec和vaapi来加速解码的各种例子,但你如何使用它来加速编码呢?
林正浩*_*林正浩 10
As of today, FFmpeg and libav have implemented hardware-accelerated encoding via VAAPI on supported platforms and hardware SKUs, and I have written a write-up on the same that will enable you to set up, deploy and use both ffmpeg and libav to achieve the same effect.
And in the same note, I've added references to hardware surface limits so you'll know what hardware platforms support specific video codecs for use with VAAPI encoding.
A sample FFmpeg build that's loaded via the environment-modules system is shown below, step-by-step:
Building a VAAPI-enabled FFmpeg with support for VP8/9 decode and encode hardware acceleration on a Skylake validation testbed as an example:
Build platform: Ubuntu 16.04LTS.
First things first:
Build the dependency chain first.
This is the C for Media Runtime GPU Kernel Manager for Intel G45 & HD Graphics family. it's a prerequisite for building the intel-hybrid-driver package.
git clone https://github.com/01org/cmrt
cd cmrt
./autogen.sh
./configure
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
sudo ldconfig -vvvv
Run Code Online (Sandbox Code Playgroud)
This package provides support for WebM project VPx codecs. GPU acceleration is provided via media kernels executed on Intel GEN GPUs. The hybrid driver provides the CPU bound entropy (e.g., CPBAC) decoding and manages the GEN GPU media kernel parameters and buffers.
It's a prerequisite to building libva with the desired configuration so we can access VPX-series hybrid decode capabilities on supported hardware configurations.
git clone https://github.com/01org/intel-hybrid-driver
cd intel-hybrid-driver
./autogen.sh
./configure
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
sudo ldconfig -vvv
Run Code Online (Sandbox Code Playgroud)
This package provides the VA-API (Video Acceleration API) user mode driver for Intel GEN Graphics family SKUs.
The current video driver back-end provides a bridge to the GEN GPUs through the packaging of buffers and commands to be sent to the i915 driver for exercising both hardware and shader functionality for video decode, encode, and processing.
it also provides a wrapper to the intel-hybrid-driver when called up to handle VP8/9 hybrid decode tasks on supported hardware (Must be configured with the --enable-hybrid-codec option).
git clone https://github.com/01org/intel-vaapi-driver
cd intel-vaapi-driver
./autogen.sh
./configure --enable-hybrid-codec
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
sudo ldconfig -vvvv
Run Code Online (Sandbox Code Playgroud)
Libva is an implementation for VA-API (Video Acceleration API)
VA-API is an open-source library and API specification, which provides access to graphics hardware acceleration capabilities for video processing. It consists of a main library and driver-specific acceleration backends for each supported hardware vendor.
git clone https://github.com/01org/libva
cd libva
./autogen.sh
./configure
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
sudo ldconfig -vvvv
Run Code Online (Sandbox Code Playgroud)
When done, test the VAAPI supported featureset by running vainfo:
vainfo
Run Code Online (Sandbox Code Playgroud)
The output on my current testbed is:
libva info: VA-API version 0.40.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/local/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_40
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.40 (libva 1.7.3)
vainfo: Driver version: Intel i965 driver for Intel(R) Skylake - 1.8.3.pre1 (glk-alpha-58-g5a984ae)
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Simple : VAEntrypointEncSlice
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileH264MultiviewHigh : VAEntrypointVLD
VAProfileH264MultiviewHigh : VAEntrypointEncSlice
VAProfileH264StereoHigh : VAEntrypointVLD
VAProfileH264StereoHigh : VAEntrypointEncSlice
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileVP8Version0_3 : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileVP9Profile0 : VAEntrypointVLD
Run Code Online (Sandbox Code Playgroud)
Making a usable FFmpeg build to test the encoders:
Now, we will build an FFmpeg binary that can take advantage of VAAPI to test the encode and decode capabilities on Skylake, using a custom prefix because we load FFmpeg via the environment-modules system on the testbed.
Prepare the target directories first:
sudo mkdir -p /apps/ffmpeg/dyn
sudo chown -Rc $USER:$USER /apps/ffmpeg/dyn
mkdir -p ~/ffmpeg_sources
Run Code Online (Sandbox Code Playgroud)
Include extra components as needed:
(一个).构建和部署nasm: Nasm是x264和FFmpeg使用的x86优化的汇编程序.强烈推荐或您的结果可能非常慢.
请注意,我们现在已经从Yasm切换到nasm,因为这是x265,x264等采用的当前汇编程序.
cd ~/ffmpeg_sources
wget wget http://www.nasm.us/pub/nasm/releasebuilds/2.14rc0/nasm-2.14rc0.tar.gz
tar xzvf nasm-2.14rc0.tar.gz
cd nasm-2.14rc0
./configure --prefix="/apps/ffmpeg/dyn" --bindir="/apps/ffmpeg/dyn/bin"
make -j$(nproc) VERBOSE=1
make -j$(nproc) install
make -j$(nproc) distclean
Run Code Online (Sandbox Code Playgroud)
(b)中.静态构建和部署libx264: 该库提供H.264视频编码器.有关更多信息和用法示例,请参阅H.264编码指南.这需要使用--enable-gpl --enable-libx264配置ffmpeg .
cd ~/ffmpeg_sources
wget http://download.videolan.org/pub/x264/snapshots/last_x264.tar.bz2
tar xjvf last_x264.tar.bz2
cd x264-snapshot*
PATH="/apps/ffmpeg/dyn/bin:$PATH" ./configure --prefix="/apps/ffmpeg/dyn" --bindir="/apps/ffmpeg/dyn/bin" --enable-static --disable-opencl
PATH="/apps/ffmpeg/dyn/bin:$PATH" make -j$(nproc) VERBOSE=1
make -j$(nproc) install VERBOSE=1
make -j$(nproc) distclean
Run Code Online (Sandbox Code Playgroud)
(C).构建和配置libx265: 该库提供H.265/HEVC视频编码器.有关更多信息和用法示例,请参阅H.265编码指南.
sudo apt-get install cmake mercurial
cd ~/ffmpeg_sources
hg clone https://bitbucket.org/multicoreware/x265
cd ~/ffmpeg_sources/x265/build/linux
PATH="$/apps/ffmpeg/dyn/bin:$PATH" cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX="/apps/ffmpeg/dyn" -DENABLE_SHARED:bool=off ../../source
make -j$(nproc) VERBOSE=1
make -j$(nproc) install VERBOSE=1
make -j$(nproc) clean VERBOSE=1
Run Code Online (Sandbox Code Playgroud)
(d). Build and deploy the libfdk-aac library: This provides an AAC audio encoder. See the AAC Audio Encoding Guide for more information and usage examples. This requires ffmpeg to be configured with --enable-libfdk-aac (and --enable-nonfree if you also included --enable-gpl).
cd ~/ffmpeg_sources
wget -O fdk-aac.tar.gz https://github.com/mstorsjo/fdk-aac/tarball/master
tar xzvf fdk-aac.tar.gz
cd mstorsjo-fdk-aac*
autoreconf -fiv
./configure --prefix="/apps/ffmpeg/dyn" --disable-shared
make -j$(nproc)
make -j$(nproc) install
make -j$(nproc) distclean
Run Code Online (Sandbox Code Playgroud)
(e). Build and configure libvpx
cd ~/ffmpeg_sources
git clone https://github.com/webmproject/libvpx/
cd libvpx
./configure --prefix="/apps/ffmpeg/dyn" --enable-runtime-cpu-detect --enable-vp9 --enable-vp8 \
--enable-postproc --enable-vp9-postproc --enable-multi-res-encoding --enable-webm-io --enable-vp9-highbitdepth --enable-onthefly-bitpacking --enable-realtime-only \
--cpu=native --as=yasm
time make -j$(nproc)
time make -j$(nproc) install
time make clean -j$(nproc)
time make distclean
Run Code Online (Sandbox Code Playgroud)
(f). Build LibVorbis
cd ~/ffmpeg_sources
wget -c -v http://downloads.xiph.org/releases/vorbis/libvorbis-1.3.5.tar.xz
tar -xvf libvorbis-1.3.5.tar.xz
cd libvorbis-1.3.5
./configure --enable-static --prefix="/apps/ffmpeg/dyn"
time make -j$(nproc)
time make -j$(nproc) install
time make clean -j$(nproc)
time make distclean
Run Code Online (Sandbox Code Playgroud)
(g). Build FFmpeg:
cd ~/ffmpeg_sources
git clone https://github.com/FFmpeg/FFmpeg -b master
cd FFmpeg
PATH="/apps/ffmpeg/dyn/bin:$PATH" PKG_CONFIG_PATH="/apps/ffmpeg/dyn/lib/pkgconfig" ./configure \
--pkg-config-flags="--static" \
--prefix="/apps/ffmpeg/dyn" \
--extra-cflags="-I/apps/ffmpeg/dyn/include" \
--extra-ldflags="-L/apps/ffmpeg/dyn/lib" \
--bindir="/apps/ffmpeg/dyn/bin" \
--enable-debug=3 \
--enable-vaapi \
--enable-libvorbis \
--enable-libvpx \
--enable-gpl \
--cpu=native \
--enable-opengl \
--enable-libfdk-aac \
--enable-libx264 \
--enable-libx265 \
--enable-nonfree
PATH="/apps/ffmpeg/dyn/bin:$PATH" make -j$(nproc)
make -j$(nproc) install
make -j$(nproc) distclean
hash -r
Run Code Online (Sandbox Code Playgroud)
Note: To get debug builds, omit the distclean step and you'll find the ffmpeg_g binary under the sources subdirectory.
We often want debug builds when an issue crops up and a gdb trace may be required for debugging purposes.
The environment-modules file for FFmpeg (edit as necessary if your prefixes differ, and to add conflicts if needed):
less /usr/share/modules/modulefiles/ffmpeg/vaapi
#%Module1.0#####################################################################
##
## ffmpeg media transcoder modulefile
## By Dennis Mungai <dmngaie@gmail.com>
## February, 2016
##
# for Tcl script use only
set appname ffmpeg
set version dyn
set prefix /apps/${appname}/${version}
set exec_prefix ${prefix}/bin
conflict ffmpeg/git
prepend-path PATH ${exec_prefix}
prepend-path LD_LIBRARY_PATH ${prefix}/lib
Run Code Online (Sandbox Code Playgroud)
To load and test, run:
module load ffmpeg/vaapi
Run Code Online (Sandbox Code Playgroud)
Confirm all is sane via:
which ffmpeg
Run Code Online (Sandbox Code Playgroud)
Expected output:
/apps/ffmpeg/dyn/bin/ffmpeg
Run Code Online (Sandbox Code Playgroud)
Sample snippets to test the new encoders:
Confirm that the VAAPI encoders have been built successfully:
ffmpeg -hide_banner -encoders | grep vaapi
V..... h264_vaapi H.264/AVC (VAAPI) (codec h264)
V..... hevc_vaapi H.265/HEVC (VAAPI) (codec hevc)
V..... mjpeg_vaapi MJPEG (VAAPI) (codec mjpeg)
V..... mpeg2_vaapi MPEG-2 (VAAPI) (codec mpeg2video)
V..... vp8_vaapi VP8 (VAAPI) (codec vp8)
Run Code Online (Sandbox Code Playgroud)
See the help documentation for each encoder in question:
ffmpeg -hide_banner -h encoder='encoder name'
Run Code Online (Sandbox Code Playgroud)
Test the encoders;
Using GNU parallel, we will encode some mp4 files (4k H.264 test samples, 40 minutes each, AAC 6-channel audio) on the ~/src path on the system to VP8 and HEVC respectively using the examples below. Note that I've tuned the encoders to suit my use-cases, and re-scaling to 1080p is enabled. Adjust as necessary.
To VP8, launching 10 encode jobs simultaneously:
parallel -j 10 --verbose '/apps/ffmpeg/dyn/bin/ffmpeg -loglevel debug -threads 4 -hwaccel vaapi -i "{}" -vaapi_device /dev/dri/renderD129 -c:v vp8_vaapi -loop_filter_level:v 63 -loop_filter_sharpness:v 15 -b:v 4500k -maxrate:v 7500k -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' -c:a libvorbis -b:a 384k -ac 6 -f webm "{.}.webm"' ::: $(find . -type f -name '*.mp4')
Run Code Online (Sandbox Code Playgroud)
To HEVC with GNU Parallel:
To HEVC Main Profile, launching 10 encode jobs simultaneously:
parallel -j 4 --verbose '/apps/ffmpeg/dyn/bin/ffmpeg -loglevel debug -threads 4 -hwaccel vaapi -i "{}" -vaapi_device /dev/dri/renderD129 -c:v hevc_vaapi -qp:v 19 -b:v 2100k -maxrate:v 3500k -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' -c:a libvorbis -b:a 384k -ac 6 -f matroska "{.}.mkv"' ::: $(find . -type f -name '*.mp4')
Run Code Online (Sandbox Code Playgroud)
Some notes:
Hey guys, small update: VP9 hardware-accelerated encoding is now available for FFmpeg. However, you'll need an Intel Kabylake-based Integrated GPU to take advantage of this feature.
And now, with the new vp9_vaapi encoder, here's what we get.
Encoder options now available:
ffmpeg -h vp9_vaapi
Run Code Online (Sandbox Code Playgroud)
Output:
Encoder vp9_vaapi [VP9 (VAAPI)]:
General capabilities: delay
Threading capabilities: none
Supported pixel formats: vaapi_vld
vp9_vaapi AVOptions:
-loop_filter_level <int> E..V.... Loop filter level (from 0 to 63) (default 16)
-loop_filter_sharpness <int> E..V.... Loop filter sharpness (from 0 to 15) (default 4)
Run Code Online (Sandbox Code Playgroud)
What happens when you try to pull this off on unsupported hardware, say Skylake?
See the sample output below:
[Parsed_format_0 @ 0x42cb500] compat: called with args=[nv12]
[Parsed_format_0 @ 0x42cb500] Setting 'pix_fmts' to value 'nv12'
[Parsed_scale_vaapi_2 @ 0x42cc300] Setting 'w' to value '1920'
[Parsed_scale_vaapi_2 @ 0x42cc300] Setting 'h' to value '1080'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'video_size' to value '3840x2026'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'pix_fmt' to value '0'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'time_base' to value '1/1000'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'pixel_aspect' to value '1/1'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'frame_rate' to value '24000/1001'
[graph 0 input from stream 0:0 @ 0x42cce00] w:3840 h:2026 pixfmt:yuv420p tb:1/1000 fr:24000/1001 sar:1/1 sws_param:flags=2
[format @ 0x42cba40] compat: called with args=[vaapi_vld]
[format @ 0x42cba40] Setting 'pix_fmts' to value 'vaapi_vld'
[auto_scaler_0 @ 0x42cd580] Setting 'flags' to value 'bicubic'
[auto_scaler_0 @ 0x42cd580] w:iw h:ih flags:'bicubic' interl:0
[Parsed_format_0 @ 0x42cb500] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_format_0'
[AVFilterGraph @ 0x42ca360] query_formats: 6 queried, 4 merged, 1 already done, 0 delayed
[auto_scaler_0 @ 0x42cd580] w:3840 h:2026 fmt:yuv420p sar:1/1 -> w:3840 h:2026 fmt:nv12 sar:1/1 flags:0x4
[hwupload @ 0x42cbcc0] Surface format is nv12.
[AVHWFramesContext @ 0x42ccbc0] Created surface 0x4000000.
[AVHWFramesContext @ 0x42ccbc0] Direct mapping possible.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000001.
[AVHWFramesContext @ 0x42c3e40] Direct mapping possible.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000002.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000003.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000004.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000005.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000006.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000007.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000008.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000009.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x400000a.
[vp9_vaapi @ 0x409da40] Encoding entrypoint not found (19 / 6).
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
[AVIOContext @ 0x40fdac0] Statistics: 0 seeks, 0 writeouts
[aac @ 0x40fcb00] Qavg: -nan
[AVIOContext @ 0x409f820] Statistics: 32768 bytes read, 0 seeks
Conversion failed!
Run Code Online (Sandbox Code Playgroud)
The interesting bits are the entrypoint warnings for VP9 encoding being absent on this particular platform, as confirmed by vainfo's output:
libva info: VA-API version 0.40.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/local/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_40
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.40 (libva 1.7.3)
vainfo: Driver version: Intel i965 driver for Intel(R) Skylake - 1.8.4.pre1 (glk-alpha-71-gc3110dc)
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Simple : VAEntrypointEncSlice
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileH264MultiviewHigh : VAEntrypointVLD
VAProfileH264MultiviewHigh : VAEntrypointEncSlice
VAProfileH264StereoHigh : VAEntrypointVLD
VAProfileH264StereoHigh : VAEntrypointEncSlice
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileVP8Version0_3 : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileVP9Profile0 : VAEntrypointVLD
Run Code Online (Sandbox Code Playgroud)
The VLD (for Variable Length Decode) entry point for VP9 profile 0 is the furthest that Skylake comes to in terms of VP9 hardware-acceleration.
These with Kabylake test beds, run these encode tests and report back :-)
2020 更新:FFMPEG 现在完全支持 VAAPI 编码 + 解码
你可以用类似的东西进行编码:
ffmpeg -y -vaapi_device /dev/dri/renderD128 -i file_in.avi -vf 'format=nv12,hwupload' -c:v h264_vaapi file_out.mp4