Piping video frames from ffmpeg to a numpy array without loading the whole movie into memory

Question


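I'm not sure whether what I'm asking is feasible or sensible, but I'm trying to load frames from a video in an ordered but "on-demand" fashion.

Basically, what I have now reads the entire uncompressed video into a buffer by piping it through stdout, e.g.: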

import subprocess
import numpy as np

H, W = 1080, 1920 # video dimensions
video = '/path/to/video.mp4' # path to video

# ffmpeg command
command = [ "ffmpeg",
            '-i', video,
            '-pix_fmt', 'rgb24',
            '-f', 'rawvideo',
            'pipe:1' ]

# run ffmpeg and load all frames into numpy array (num_frames, H, W, 3)
pipe = subprocess.run(command, stdout=subprocess.PIPE, bufsize=10**8)
video = np.frombuffer(pipe.stdout, dtype=np.uint8).reshape(-1, H, W, 3)

# or alternatively load individual frames in a loop
nb_img = H*W*3 # H * W * 3 channels * 1-byte/channel
for i in range(0, len(pipe.stdout), nb_img):
    img = np.frombuffer(pipe.stdout, dtype=np.uint8, count=nb_img, offset=i).reshape(H, W, 3)


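I'm wondering whether the same process can be done in Python without first loading the entire video into memory. In my mind, I'm picturing something like:

Open a buffer
Seek to memory locations on demand
Load frames into numpy arrays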

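I know there are other libraries, such as OpenCV, that enable this same sort of behavior, but I'm wondering:

Is it possible to do this efficiently with this kind of ffmpeg-pipe-to-numpy-array operation?
Does this defeat the speed benefit of using ffmpeg directly, compared with seeking/loading through OpenCV or first extracting the frames and then loading individual files?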

Answer 1

Rot*_*tem 7

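Seeking and extracting frames without loading the entire movie into memory is possible, and relatively simple.

There is some loss of speed when the requested frame to seek to is not a key frame.
When FFmpeg is asked to seek to a non-key frame, it seeks to the nearest key frame before the requested frame and decodes all the frames from that key frame up to the requested frame.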
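To make the cost of seeking to a non-key frame concrete, here is a small worked sketch. It assumes an idealized stream where key frames sit exactly at multiples of the GOP size; `frames_decoded_for_seek` is a hypothetical helper written for this illustration, not an FFmpeg API. Real encoders may insert extra key frames, and B-frames can change the exact count.

```python
def frames_decoded_for_seek(target_frame: int, gop_size: int) -> int:
    """Estimate how many frames must be decoded to output `target_frame`.

    Assumption (idealized): key frames are located exactly at frame
    indices 0, gop_size, 2 * gop_size, ...
    """
    if target_frame < 0 or gop_size <= 0:
        raise ValueError("target_frame must be >= 0 and gop_size must be > 0")
    # Nearest key frame at or before the requested frame.
    keyframe = (target_frame // gop_size) * gop_size
    # Everything from the key frame up to and including the target is decoded.
    return target_frame - keyframe + 1


# With a key frame every 20 frames:
print(frames_decoded_for_seek(40, 20))  # 1  (the target is itself a key frame)
print(frames_decoded_for_seek(11, 20))  # 12 (frames 0..11)
print(frames_decoded_for_seek(39, 20))  # 20 (worst case: frames 20..39)
```

Under this assumption, the worst-case extra work per seek is bounded by the GOP size, which is why a shorter key frame interval makes random access cheaper at the cost of a larger file.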

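The demo code sample does the following:

Builds a synthetic 1 fps video with a running frame counter, which is convenient for testing.
Executes FFmpeg as a sub-process with stdout as the output PIPE.
Seeks to the 11th second and sets the duration to 5 seconds.
Reads (and displays) the decoded video frames from the PIPE until there are no more frames to read.

Here is the code sample: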

import numpy as np
import cv2
import subprocess as sp
import shlex

# Build synthetic 1fps video (with a frame counter):
# Set GOP size to 20 frames (place key frame every 20 frames - for testing).
#########################################################################
W, H = 320, 240 # video dimensions
video_path = 'video.mp4'  # path to video
sp.run(shlex.split(f'ffmpeg -y -f lavfi -i testsrc=size={W}x{H}:rate=1 -vcodec libx264 -g 20 -crf 17 -pix_fmt yuv420p -t 60 {video_path}'))
#########################################################################


# ffmpeg command
command = [ 'ffmpeg',
            '-ss', '00:00:11',    # Seek to the 11th second.
            '-i', video_path,
            '-pix_fmt', 'bgr24',  # bgr24 matches OpenCV's channel order
            '-f', 'rawvideo',
            '-t', '5',            # Play for 5 seconds
            'pipe:' ]

# Execute FFmpeg as sub-process with stdout as a pipe
process = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)

# Load individual frames in a loop
nb_img = H*W*3  # H * W * 3 channels * 1-byte/channel

# Read decoded video frames from the PIPE until no more frames to read
while True:
    # Read decoded video frame (in raw video format) from stdout process.
    buffer = process.stdout.read(W*H*3)

    # Break the loop if buffer length is not W*H*3 (when FFmpeg streaming ends).
    if len(buffer) != W*H*3:
        break

    img = np.frombuffer(buffer, np.uint8).reshape(H, W, 3)

    cv2.imshow('img', img)  # Show the image for testing
    cv2.waitKey(1000)

process.stdout.close()
process.wait()
cv2.destroyAllWindows()

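The seek-then-read pattern in the sample can be parameterized so that a different start time is requested each time frames are needed. The following is only a sketch: `build_seek_command` is a hypothetical helper name, and it uses only the FFmpeg options already shown in the sample (`-ss`, `-i`, `-pix_fmt`, `-f`, `-t`).

```python
def build_seek_command(video_path, start_seconds, duration_seconds=None,
                       pix_fmt="bgr24"):
    """Build an FFmpeg argument list that seeks, then writes raw frames to stdout.

    Placing -ss before -i asks FFmpeg to seek in the input, as in the sample.
    """
    command = ["ffmpeg",
               "-ss", str(start_seconds),   # seek position in seconds
               "-i", video_path,
               "-pix_fmt", pix_fmt,
               "-f", "rawvideo"]
    if duration_seconds is not None:
        # Only limit the output when the duration is known in advance.
        command += ["-t", str(duration_seconds)]
    command.append("pipe:")
    return command


print(build_seek_command("video.mp4", 11, 5))
```

The returned list can be passed to `sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)` exactly as in the sample. Each on-demand seek starts a new FFmpeg process, so this suits occasional jumps better than fetching many scattered single frames.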

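Note: the -t 5 argument is relevant when the playback duration is known in advance. If the duration is not known in advance, you can remove -t and break out of the loop when needed.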
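Breaking out of the loop on demand can be sketched with a generator that yields one frame at a time, so the caller decides when to stop. `iter_frames` is a hypothetical helper; in the demo below an `io.BytesIO` object stands in for `process.stdout` so the snippet runs without FFmpeg.

```python
import io
import numpy as np


def iter_frames(stream, width, height, channels=3):
    """Yield raw frames from a binary stream as (height, width, channels) arrays."""
    frame_bytes = width * height * channels
    while True:
        buffer = stream.read(frame_bytes)
        # A short read means the stream ended (possibly mid-frame): stop.
        if len(buffer) != frame_bytes:
            return
        yield np.frombuffer(buffer, np.uint8).reshape(height, width, channels)


# Demo: three complete 4x2 frames followed by a truncated tail.
W, H = 4, 2
fake_stdout = io.BytesIO(bytes(range(W * H * 3)) * 3 + b"\x00" * 5)

for index, frame in enumerate(iter_frames(fake_stdout, W, H)):
    print(index, frame.shape)
    if index == 1:
        break  # stop early, as one would when the wanted frame has been reached

# With a real FFmpeg sub-process, one would typically follow the early break with:
#   process.terminate(); process.stdout.close(); process.wait()
```

Terminating the sub-process after an early break matters because FFmpeg would otherwise block once the pipe fills up and nothing reads from it.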

Time measurement:

Measure reading all the frames at once.
Measure reading frame by frame in a loop.

import time  # needed in addition to the imports from the sample above

# 6000 frames:
sp.run(shlex.split(f'ffmpeg -y -f lavfi -i testsrc=size={W}x{H}:rate=1 -vcodec libx264 -g 20 -crf 17 -pix_fmt yuv420p -t 6000 {video_path}'))

# ffmpeg command
command = [ 'ffmpeg',
            '-ss', '00:00:11',    # Seek to the 11th second.
            '-i', video_path,
            '-pix_fmt', 'bgr24',  # bgr24 matches OpenCV's channel order
            '-f', 'rawvideo',
            '-t', '5000',         # Play for 5000 seconds (5000 frames at 1 fps).
            'pipe:' ]



# Load all frames into numpy array
################################################################################
t = time.time()

# run ffmpeg and load all frames into numpy array (num_frames, H, W, 3)
process = sp.run(command, stdout=sp.PIPE, bufsize=10**8)
video = np.frombuffer(process.stdout, dtype=np.uint8).reshape(-1, H, W, 3)

elapsed1 = time.time() - t
################################################################################


# Load individual frames in a loop
################################################################################
t = time.time()

# Execute FFmpeg as sub-process with stdout as a pipe
process = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)

# Read decoded video frames from the PIPE until no more frames to read
while True:
    # Read decoded video frame (in raw video format) from stdout process.
    buffer = process.stdout.read(W*H*3)

    # Break the loop if buffer length is not W*H*3 (when FFmpeg streaming ends).
    if len(buffer) != W*H*3:
        break

    img = np.frombuffer(buffer, np.uint8).reshape(H, W, 3)

elapsed2 = time.time() - t

process.wait()


################################################################################

print(f'Read all frames at once elapsed time: {elapsed1}')
print(f'Read frame by frame elapsed time: {elapsed2}')


Results:

Read all frames at once elapsed time: 7.371837854385376

Read frame by frame elapsed time: 10.089557886123657

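The results show that there is some overhead when reading frame by frame: about 10.1 s versus about 7.4 s for 5000 frames, roughly 2.7 s (around 37%) more.

The overhead is relatively small.
It is possible that the overhead is related to Python rather than to FFmpeg.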
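If the overhead does come from the Python side, one thing that could be tried is reading each frame into a preallocated array instead of creating a new bytes object per frame. This is an untested idea with respect to the timings above, not a measured improvement; `read_frames_into` is a hypothetical helper, and `io.BytesIO` again stands in for `process.stdout`.

```python
import io
import numpy as np


def read_frames_into(stream, frame):
    """Refill `frame` in place from `stream` and yield it once per complete frame.

    The same array is reused on every iteration, so the caller must copy it
    (frame.copy()) to keep a frame beyond the current iteration.
    """
    view = memoryview(frame.reshape(-1))  # flat, writable view of the array
    total = len(view)
    while True:
        filled = 0
        while filled < total:
            got = stream.readinto(view[filled:])
            if not got:  # end of stream (possibly mid-frame)
                return
            filled += got
        yield frame


W, H = 4, 2
data = bytes([1]) * (W * H * 3) + bytes([2]) * (W * H * 3)  # two tiny frames
frame = np.empty((H, W, 3), np.uint8)
sums = [int(f.sum()) for f in read_frames_into(io.BytesIO(data), frame)]
print(sums)  # [24, 48]
```

The trade-off is that every yielded frame aliases the same buffer, which is fine for processing frames one at a time but not for collecting them in a list.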
