Understanding pixel format and profile when encoding 10-bit video in ffmpeg with nvenc

Question

Understanding pixel format and profile when encoding 10-bit video in ffmpeg with nvenc

I am trying to encode a 10-bit H.265 video from a 8-bit H.264 source using ffmpeg with CUDA hardware acceleration.

Without hardware acceleration, a typical command would be ffmpeg -i input.mkv -pix_fmt yuv420p10le -c:v libx265 -crf 21 -x265-params profile=main10 out.mkv.

Using CUDA (on a Pascal 1050 Ti), I expect the corresponding command to be ffmpeg -i input.mkv -pix_fmt yuv420p10le -c:v hevc_nvenc -profile:v main10 -cq 21 out.mkv.

However, when I list the supported encoder settings using ffmpeg -h encoder=hevc_nvenc (output of command pasted below), even though it supports main10, no 10-bit pixel format is supported: Supported pixel formats: yuv420p nv12 p010le yuv444p p016le yuv444p16le bgr0 rgb0 cuda d3d11. Does the encoding work for my use case, or not?

ffmpeg version 4.3.1-2021-01-01-full_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 10.2.0 (Rev5, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-libsnappy --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libzvbi --enable-librav1e --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Encoder hevc_nvenc [NVIDIA NVENC hevc encoder]:
    General capabilities: delay hardware
    Threading capabilities: none
    Supported hardware devices: cuda cuda d3d11va d3d11va
    Supported pixel formats: yuv420p nv12 p010le yuv444p p016le yuv444p16le bgr0 rgb0 cuda d3d11
hevc_nvenc AVOptions:
  -preset            <int>        E..V...... Set the encoding preset (from 0 to 11) (default medium)
     default         0            E..V......
     slow            1            E..V...... hq 2 passes
     medium          2            E..V...... hq 1 pass
     fast            3            E..V...... hp 1 pass
     hp              4            E..V......
     hq              5            E..V......
     bd              6            E..V......
     ll              7            E..V...... low latency
     llhq            8            E..V...... low latency hq
     llhp            9            E..V...... low latency hp
     lossless        10           E..V...... lossless
     losslesshp      11           E..V...... lossless hp
  -profile           <int>        E..V...... Set the encoding profile (from 0 to 4) (default main)
     main            0            E..V......
     main10          1            E..V......
     rext            2            E..V......
  -level             <int>        E..V...... Set the encoding level restriction (from 0 to 186) (default auto)
     auto            0            E..V......
     1               30           E..V......
     1.0             30           E..V......
     2               60           E..V......
     2.0             60           E..V......
     2.1             63           E..V......
     3               90           E..V......
     3.0             90           E..V......
     3.1             93           E..V......
     4               120          E..V......
     4.0             120          E..V......
     4.1             123          E..V......
     5               150          E..V......
     5.0             150          E..V......
     5.1             153          E..V......
     5.2             156          E..V......
     6               180          E..V......
     6.0             180          E..V......
     6.1             183          E..V......
     6.2             186          E..V......
  -tier              <int>        E..V...... Set the encoding tier (from 0 to 1) (default main)
     main            0            E..V......
     high            1            E..V......
  -rc                <int>        E..V...... Override the preset rate-control (from -1 to INT_MAX) (default -1)
     constqp         0            E..V...... Constant QP mode
     vbr             1            E..V...... Variable bitrate mode
     cbr             2            E..V...... Constant bitrate mode
     vbr_minqp       8388612      E..V...... Variable bitrate mode with MinQP (deprecated)
     ll_2pass_quality 8388616      E..V...... Multi-pass optimized for image quality (deprecated)
     ll_2pass_size   8388624      E..V...... Multi-pass optimized for constant frame size (deprecated)
     vbr_2pass       8388640      E..V...... Multi-pass variable bitrate mode (deprecated)
     cbr_ld_hq       8            E..V...... Constant bitrate low delay high quality mode
     cbr_hq          16           E..V...... Constant bitrate high quality mode
     vbr_hq          32           E..V...... Variable bitrate high quality mode
  -rc-lookahead      <int>        E..V...... Number of frames to look ahead for rate-control (from 0 to INT_MAX) (default 0)
  -surfaces          <int>        E..V...... Number of concurrent surfaces (from 0 to 64) (default 0)
  -cbr               <boolean>    E..V...... Use cbr encoding mode (default false)
  -2pass             <boolean>    E..V...... Use 2pass encoding mode (default auto)
  -gpu               <int>        E..V...... Selects which NVENC capable GPU to use. First GPU is 0, second is 1, and so on. (from -2 to INT_MAX) (default any)
     any             -1           E..V...... Pick the first device available
     list            -2           E..V...... List the available devices
  -delay             <int>        E..V...... Delay frame output by the given amount of frames (from 0 to INT_MAX) (default INT_MAX)
  -no-scenecut       <boolean>    E..V...... When lookahead is enabled, set this to 1 to disable adaptive I-frame insertion at scene cuts (default false)
  -forced-idr        <boolean>    E..V...... If forcing keyframes, force them as IDR frames. (default false)
  -spatial_aq        <boolean>    E..V...... set to 1 to enable Spatial AQ (default false)
  -spatial-aq        <boolean>    E..V...... set to 1 to enable Spatial AQ (default false)
  -temporal_aq       <boolean>    E..V...... set to 1 to enable Temporal AQ (default false)
  -temporal-aq       <boolean>    E..V...... set to 1 to enable Temporal AQ (default false)
  -zerolatency       <boolean>    E..V...... Set 1 to indicate zero latency operation (no reordering delay) (default false)
  -nonref_p          <boolean>    E..V...... Set this to 1 to enable automatic insertion of non-reference P-frames (default false)
  -strict_gop        <boolean>    E..V...... Set 1 to minimize GOP-to-GOP rate fluctuations (default false)
  -aq-strength       <int>        E..V...... When Spatial AQ is enabled, this field is used to specify AQ strength. AQ strength scale is from 1 (low) - 15 (aggressive) (from 1 to 15) (default 8)
  -cq                <float>      E..V...... Set target quality level (0 to 51, 0 means automatic) for constant quality mode in VBR rate control (from 0 to 51) (default 0)
  -aud               <boolean>    E..V...... Use access unit delimiters (default false)
  -bluray-compat     <boolean>    E..V...... Bluray compatibility workarounds (default false)
  -init_qpP          <int>        E..V...... Initial QP value for P frame (from -1 to 51) (default -1)
  -init_qpB          <int>        E..V...... Initial QP value for B frame (from -1 to 51) (default -1)
  -init_qpI          <int>        E..V...... Initial QP value for I frame (from -1 to 51) (default -1)
  -qp                <int>        E..V...... Constant quantization parameter rate control method (from -1 to 51) (default -1)
  -weighted_pred     <int>        E..V...... Set 1 to enable weighted prediction (from 0 to 1) (default 0)
  -b_ref_mode        <int>        E..V...... Use B frames as references (from 0 to 2) (default disabled)
     disabled        0            E..V...... B frames will not be used for reference
     each            1            E..V...... Each B frame will be used for reference
     middle          2            E..V...... Only (number of B frames)/2 will be used for reference
  -dpb_size          <int>        E..V...... Specifies the DPB size used for encoding (0 means automatic) (from 0 to INT_MAX) (default 0)

Run Code Online (Sandbox Code Playgroud)

ffmpeg log of a 2-minute encode of a UHD video:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.83.100
  Duration: 00:40:30.04, start: 0.000000, bitrate: 25161 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 24993 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 164 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> hevc (hevc_nvenc))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
Incompatible pixel format 'yuv420p10le' for codec 'hevc_nvenc', auto-selecting format 'p010le'
Output #0, matroska, to 'out.mkv':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.45.100
    Stream #0:0(und): Video: hevc (hevc_nvenc) (Main 10), p010le, 3840x2160 [SAR 1:1 DAR 16:9], q=-1--1, 23.98 fps, 1k tbn, 23.98 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : Lavc58.91.100 hevc_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 4000000 vbv_delay: N/A
    Stream #0:1(und): Audio: aac (LC) ([255][0][0][0] / 0x00FF), 48000 Hz, stereo, fltp, 164 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

Run Code Online (Sandbox Code Playgroud)

MediaInfo of the output.

General
Format                         : Matroska
Format version                 : Version 4 / Version 2
File size                      : 288 MiB
Duration                       : 2 min 0 s
Overall bit rate               : 20.2 Mb/s
Writing application            : Lavf58.45.100
Writing library                : Lavf58.45.100
ErrorDetectionType             : Per level 1

Video
ID                             : 1
Format                         : HEVC
Format/Info                    : High Efficiency Video Coding
Format profile                 : Main 10@L5@Main
Codec ID                       : V_MPEGH/ISO/HEVC
Duration                       : 2 min 0 s
Width                          : 3 840 pixels
Height                         : 2 160 pixels
Display aspect ratio           : 16:9
Frame rate mode                : Constant
Frame rate                     : 23.976 (24000/1001) FPS
Color space                    : YUV
Chroma subsampling             : 4:2:0
Bit depth                      : 10 bits
Writing library                : Lavc58.91.100 hevc_nvenc
Default                        : Yes
Forced                         : No
DURATION                       : 00:02:00.037000000
HANDLER_NAME                   : VideoHandler

Audio
ID                             : 2
Format                         : AAC
Format/Info                    : Advanced Audio Codec
Format profile                 : LC
Codec ID                       : A_AAC
Duration                       : 2 min 0 s
Channel(s)                     : 2 channels
Channel positions              : Front: L R
Sampling rate                  : 48.0 kHz
Frame rate                     : 46.875 FPS (1024 SPF)
Compression mode               : Lossy
Delay relative to video        : -42 ms
Default                        : Yes
Forced                         : No
DURATION                       : 00:02:00.000000000
HANDLER_NAME                   : SoundHandler

Run Code Online (Sandbox Code Playgroud)

If I disable the fallback to p010le, the output would be Bit depth : 8 bits but Format profile : Main 10@L5@Main. What does it mean? The encoding process is 10-bit while the output is reduced to 8-bit?

Answer 1

Rad*_*ech 7

...尽管它支持 main10，但不支持 10 位像素格式：

硬件 HEVC 编码器使用像素格式p010le并p016le用于 10 位输出，其中第一个生成yuv 4:2:0，第二个生成yuv 4:4:4。

如果我禁用回退到 p010le，输出将为位深度：8 位，但格式配置文件：Main 10@L5@Main。这是什么意思？

该配置文件指定设备能够播放视频的最低能力，反之亦然，编码器指定可用于对视频进行编码的最大值。

这意味着：如果您指定视频为 Main 10@L5@Main，则它可以在任何支持 10 位格式且能够至少 25Mbps 解码的电视上播放。然而，这并没有告诉编码器如何实际编码视频，而是告诉它视频每个样本最多可以有 10 位，并且比特率不能超过 25Mbps，这意味着如果编码器创建 5Mbps 的 8 位视频，它仍然满足给定的条件，并且可以将视频标记为Main 10@L5@Main。

如果您想告诉编码器应该使用什么颜色深度和比特率，则必须通过其他参数指定（见下文）。

以下是我使用 Pascal 编码器（GTX 10x0 卡）将视频从 AVC 转换为 HEVC 10 位的命令：

ffmpeg -y -hide_banner -hwaccel nvdec -hwaccel_device 0 -vsync 0 -i "input.mp4" -c copy -c:v:0 hevc_nvenc -profile:v main10 -pix_fmt p010le -rc:v:0 vbr_hq -rc-lookahead 32 -cq 21 -qmin 1 -qmax 51 -b:v:0 10M -maxrate:v:0 20M -gpu 0 "output.mkv"

Run Code Online (Sandbox Code Playgroud)

类似的命令可用于较新的 Turing (GTX 20x0) 和 Ampere (RTX 30x0) 编码器：

ffmpeg -y -hide_banner -vsync 0 -hwaccel cuda -hwaccel_output_format cuda  -hwaccel_device 0 -c:v:0 h264_cuvid -i "input.mp4" -vf "hwdownload,format=nv12" -c copy -c:v:0 hevc_nvenc -profile:v main10 -pix_fmt p010le -rc:v:0 vbr -tune hq -preset p5 -multipass 1 -bf 4 -b_ref_mode 1 -nonref_p 1 -rc-lookahead 75 -spatial-aq 1 -aq-strength 8 -temporal-aq 1 -cq 21 -qmin 1 -qmax 99 -b:v:0 10M -maxrate:v:0 20M -gpu 0 "output.mkv"

Run Code Online (Sandbox Code Playgroud)

参数解释：

-pix_fmt p010le将8bit输入转换为10bit；请注意，转换是由 CPU 完成的，因此它会使编码速度变慢，但会产生质量更高的视频，并且在 CRF 中比特率也更低（文件更小）。-vf "hwdownload,format=nv12"对于 CUDA 解码器，必须与（或-vf "hwdownload,format=p010le"对于 10 位输入视频）一起使用，将解码帧从 CUDA 复制到 CPU 中进行转换（NVDEC 解码器自动将帧发送到 CPU 中。）-profile main10需要指定以允许 10 位编码，但不会准确影响编码器的方式对视频进行编码 - 编码器本身不会更改输入的位深度！
-rc:v:0 vbr_hq -cq 21 -qmin 1 -qmax 99需要完全启用 CRF 模式。增加qmin以降低比特率峰值，降低比特率qmax峰值以防止低质量帧（建议在没有 AQ 的情况下进行编码）。在图灵和安培上使用-rc:v:0 vbr -tune hq代替vbr_hq可以获得相同的结果。顺便说一句，HEVC 的质量是推荐质量-cq 28（或-cq 30启用 AQ）。
-b:v:0 10M -maxrate:v:0 20M指定目标设备支持的建议和最大比特率。对于主层@L5，您可以使用最大。25M，@L6 最大为 60Mbps（对于 30fps 视频）。硬件编码器也需要知道如何计算 CRF 模式下的 QP 值。我使用 10M/20M 来存储存储在 NAS 上并通过 LAN 在电视上播放的视频。
present=slow启用 2 遍处理和其他高级优化；由于硬件编码器比软件编码器更快，因此您可以使用慢速预设，但仍然比 CPU更快的预设获得更快的处理速度。在 Ampere 上，您必须使用-preset p5 -multipass 2等于慢速预设的速度（您可以达到p7等于非常慢的速度，但在大多数情况下对文件大小几乎没有额外影响；您可以使用-multipass 14 倍快的第一遍）。
hwaccell启用硬件解码器并指定哪个设备将解码视频（如果您有 SLI）。根据您的 CPU 速度，您可以测试哪个最适合您。NVDEC可以解码任何MPEG视频，但速度较慢；为了获得更快的 CUDA，您必须指定源是 AVC、HEVC 还是 AV1。对于 DivX、Xvid 和非 MPEG 输入，完全删除它以切换到使用 CPU 的软件解码器。
-bf 4 -b_ref_mode 1 -nonref_p 1启用 Turing 和 Ampere 上改进的 B 帧处理（请注意，不支持h264_nvenc）。
或者，如果您的源光照不均匀（闪烁的灯光或大量淡入/淡出），您可以-bf 0 -weighted_pred 1使用加权预测而不是 B 帧，以获得更好的质量和更小的文件（但是，禁用 B 帧会增加其他文件的文件大小）光源稳定）。
-rc-lookahead 75 -spatial-aq 1 -aq-strength 8 -temporal-aq 1启用Turing 和 Ampere 支持的自适应量词。这可以在 CRF 模式下以相同或更低的比特率提高视频质量。更改rc-lookahead以获得更快的速度或更好的质量。aq-strength如果您看到颜色非常深的伪影，请增加。
-gpu 0如果您有 SLI 或板载 (Intel/AMD) 卡，请使用您指定对视频进行编码的设备。
除了 CUDA 解码器之外，您还可以添加-resize WIDTHxHEIGHT和/或-crop TOPxBOTTOMxLEFTxRIGHT（在-i参数之前）使用硬件解码器更改输入。这比使用速度更快-vf scale，并且-vf crop是在 CPU 上完成的。

Answer 2

Isa*_*lla 2

根据@Anton1699 的Reddit 帖子：

p010le 相当于 yuv420p10le（它是 10 位视频，子采样率为 4:2:0 = 每像素 15 位）。

我还没有在文档中找到更权威的来源。

p010le由 nvenc 支持。输出日志也表明这是有效的。因此，我将其作为暂定答案。命令示例：

ffmpeg -i input.mkv -pix_fmt p010le -c:v hevc_nvenc -profile:v main10 -cq 21 out.mkv

p010le 不等于 yuv420p10le。 p010le 打包了色度和平面亮度...重点是虽然 hevc 没有任何像这样的简单东西，但内部它非常复杂。 p010le 适用于 Windows 和现在的 Linux 中使用的输出表面。 (3认同)

归档时间：	5 年，1 月前
查看次数：	29350 次
最近记录：	4 年，7 月前