How to decode a video (in-memory file / byte string) and step through it frame by frame in Python?

Question

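I am doing some basic image processing in Python and would like to extend it to process a video frame by frame.

I receive the video from a server as a blob (WebM-encoded) and hold it in Python as a byte string (b'\x1aE\xdf\xa3\xa3B\x86\x81\x01B\xf7\x81\x01B\xf2\x81\x04B\xf3\x81\x08B\x82\x88matroskaB\x87\x81\x04B\x85\x81\x02\x18S\x80g\x01\xff\xff\xff\xff\xff\xff\xff\x15I\xa9f\x99*\xd7\xb1\x83\x0fB@M\x80\x86ChromeWA\x86Chrome\x16T\xaek\xad\xae\xab\xd7\x81\x01s\xc5\x87\x04\xe8\xfc\x16\t^\x8c\x83\x81\x01\x86\x8fV_MPEG4/ISO/AVC\xe0\x88\xb0\x82\x02\x80\xba\x82\x01\xe0\x1fC\xb6u\x01\xff\xff\xff\xff\xff\xff ... ).

I know there is cv.VideoCapture, which does almost what I need. The problem is that I would first have to write the file to disk and then load it again. It seems much cleaner to, for example, wrap the byte string in an IO stream and feed it to some function that does the decoding.

Is there a clean way to do this in Python, or is writing to disk and loading it again the way to go?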

Answer 1

Rot*_*tem 7

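According to this post, you cannot use cv.VideoCapture to decode an in-memory stream.
You can decode the stream by piping it to FFmpeg.

This solution is somewhat complicated; writing to disk is much simpler and is probably the cleaner solution.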
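
For comparison, the write-to-disk route mentioned above could look like the following minimal sketch. The helper name `bytes_as_tempfile` is hypothetical (it is not part of OpenCV or of this answer), and the commented usage assumes that OpenCV (`cv2`) is installed and that its build can open WebM files.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def bytes_as_tempfile(data, suffix=".webm"):
    """Write `data` to a temporary file, yield its path, then delete the file."""
    # mkstemp is used (rather than NamedTemporaryFile) so that the file is
    # fully closed before another library opens it, which matters on Windows.
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        yield path
    finally:
        os.remove(path)


# Usage sketch (assumes `import cv2` and `video_bytes` holding the blob):
# with bytes_as_tempfile(video_bytes) as path:
#     cap = cv2.VideoCapture(path)
#     while True:
#         ok, frame = cap.read()
#         if not ok:
#             break
#         ...  # process `frame` (a BGR NumPy array)
#     cap.release()
```

The file only lives for the duration of the `with` block, so the disk round trip stays contained even though it is not avoided.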

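I am posting a solution that uses FFmpeg (and FFprobe).
FFmpeg has Python bindings, but this solution runs FFmpeg as an external application using the subprocess module.
(The Python bindings work well with FFmpeg, but piping to FFprobe does not.)
I am using Windows 10, and I put ffmpeg.exe and ffprobe.exe in the execution folder (you can set the execution path instead).
For Windows, download the latest stable version (preferably statically linked).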

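I created a standalone example that does the following:

Generate a synthetic video and save it to a WebM file (used as input for testing).
Read the file into memory as binary data (replace this with the blob from the server).
Pipe the binary stream to FFprobe to find the video resolution.
    If the resolution is known in advance, you can skip this part.
    Piping to FFprobe makes the solution more complicated than it should be.
Pipe the binary stream to FFmpeg's stdin for decoding, and read the decoded raw frames from its stdout pipe.
    Writing to stdin is done in chunks, using a Python thread.
    (stdin and stdout are used instead of named pipes for Windows compatibility.)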
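
The "write to stdin from a thread while reading stdout" step can be tried in isolation. The sketch below is an illustration under assumptions, not the answer's code: `pipe_through` is a hypothetical helper, and the child process is a small Python one-liner standing in for FFmpeg so the example runs without FFmpeg installed.

```python
import subprocess as sp
import sys
import threading


def pipe_through(cmd, data, chunk_size=1024):
    """Feed `data` to `cmd` via stdin from a thread; return everything it prints."""
    process = sp.Popen(cmd, stdin=sp.PIPE, stdout=sp.PIPE)

    def writer():
        try:
            for start in range(0, len(data), chunk_size):
                process.stdin.write(data[start:start + chunk_size])
            process.stdin.close()
        except OSError:  # includes BrokenPipeError
            pass  # the child exited or closed its stdin early

    thread = threading.Thread(target=writer)
    thread.start()
    # Reading here while the thread writes keeps either pipe from filling up
    # and blocking the other side.
    output = process.stdout.read()
    thread.join()
    process.wait()
    return output


# Stand-in child process (not FFmpeg): upper-cases whatever arrives on stdin.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"]
print(pipe_through(child, b"frame data " * 1000)[:10])
```

Doing the write and the read in the same thread risks a deadlock once the data exceeds the OS pipe buffer, which is the reason for the separate writer thread.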

Pipeline structure:

 --------------------  Encoded      ---------  Decoded      ------------
| Input WebM encoded | data        | ffmpeg  | raw frames  | reshape to |
| stream (VP9 codec) | ----------> | process | ----------> | NumPy array|
 --------------------  stdin PIPE   ---------  stdout PIPE  -------------

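
The last stage of the diagram (cutting the decoded byte stream into frames) depends only on the frame size, so it can be sketched without FFmpeg. `iter_raw_frames` is a hypothetical helper name, and the input here is a fake in-memory stream rather than real decoder output.

```python
import io


def iter_raw_frames(stream, width, height, channels=3):
    """Yield one bytes object per complete frame read from a raw video stream."""
    frame_size = width * height * channels  # bgr24 uses 3 bytes per pixel
    while True:
        buf = stream.read(frame_size)
        if len(buf) < frame_size:
            break  # end of stream, or a truncated trailing frame
        yield buf


# Fake "decoded" stream: two 4x2 frames followed by 5 stray bytes.
fake = io.BytesIO(bytes(range(24)) + bytes(range(24, 48)) + b"\x00" * 5)
frames = list(iter_raw_frames(fake, width=4, height=2))
print(len(frames), len(frames[0]))

# With NumPy installed, each frame can then be reshaped into an image:
# np.frombuffer(frame, np.uint8).reshape([height, width, 3])
```

Stopping on a short read, rather than only on an empty one, avoids a reshape error if the stream ends in the middle of a frame.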

Here is the code:

import numpy as np
import cv2
import io
import subprocess as sp
import threading
import json
from functools import partial
import shlex

# Build synthetic video and read binary data into memory (for testing):
#########################################################################
width, height = 640, 480
sp.run(shlex.split('ffmpeg -y -f lavfi -i testsrc=size={}x{}:rate=1 -vcodec vp9 -crf 23 -t 50 test.webm'.format(width, height)))

with open('test.webm', 'rb') as binary_file:
    in_bytes = binary_file.read()
#########################################################################


# /sf/ask/413795371/
# /sf/ask/1091974761/
# Write to stdin in chunks of 1024 bytes.
def writer():
    try:
        for chunk in iter(partial(stream.read, 1024), b''):
            process.stdin.write(chunk)
        process.stdin.close()
    except BrokenPipeError:
        # Raised when executing FFprobe (reason not confirmed; possibly FFprobe
        # exits before it has consumed all of the input).
        pass


# Get resolution of video frames using FFprobe
# (in case the resolution is known, skip this part):
################################################################################
# Open In-memory binary streams
stream = io.BytesIO(in_bytes)

process = sp.Popen(shlex.split('ffprobe -v error -i pipe: -select_streams v -print_format json -show_streams'), stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)

pthread = threading.Thread(target=writer)
pthread.start()

pthread.join()

in_bytes = process.stdout.read()

process.wait()

p = json.loads(in_bytes)

width = (p['streams'][0])['width']
height = (p['streams'][0])['height']
################################################################################


# Decoding the video using FFmpeg:
################################################################################
stream.seek(0)

# FFmpeg input PIPE: WebM encoded data as stream of bytes.
# FFmpeg output PIPE: decoded video frames in BGR format.
process = sp.Popen(shlex.split('ffmpeg -i pipe: -f rawvideo -pix_fmt bgr24 -an -sn pipe:'), stdin=sp.PIPE, stdout=sp.PIPE, bufsize=10**8)

thread = threading.Thread(target=writer)
thread.start()


# Read decoded video (frame by frame), and display each frame (using cv2.imshow)
while True:
    # Read raw video frame from stdout as bytes array.
    in_bytes = process.stdout.read(width * height * 3)

    if not in_bytes:
        break  # Break loop if no more bytes.

    # Transform the byte read into a NumPy array
    in_frame = (np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3]))

    # Display the frame (for testing)
    cv2.imshow('in_frame', in_frame)

    if cv2.waitKey(100) & 0xFF == ord('q'):
        break

if not in_bytes:
    # Wait for thread to end only if not exit loop by pressing 'q'
    thread.join()

try:
    process.wait(1)
except (sp.TimeoutExpired):
    process.kill()  # In case 'q' is pressed.
################################################################################

cv2.destroyAllWindows()

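
The FFprobe step in the code above takes the width and height from `streams[0]`. A slightly more defensive variant is sketched below: `parse_resolution` is a hypothetical helper, and the sample JSON is hand-written and abbreviated (real `-show_streams` output contains many more fields).

```python
import json


def parse_resolution(probe_output):
    """Return (width, height) of the first video stream in FFprobe's JSON output."""
    info = json.loads(probe_output)
    for stream in info.get("streams", []):
        if stream.get("codec_type") == "video":
            return stream["width"], stream["height"]
    raise ValueError("no video stream found in FFprobe output")


# Hand-written sample in the shape of `-print_format json -show_streams` output.
sample = b'{"streams": [{"index": 0, "codec_type": "video", "width": 640, "height": 480}]}'
print(parse_resolution(sample))
```

Raising a clear error when no video stream is present is easier to diagnose than an IndexError on `streams[0]`.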

Notes:

If you get an error like "file not found: ffmpeg...", try using the full path.
For example (on Linux): '/usr/bin/ffmpeg -i pipe: -f rawvideo -pix_fmt bgr24 -an -sn pipe:'
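
Rather than hard-coding the path, the standard library can look the executable up on PATH. This is a minimal sketch; `resolve_executable` is a hypothetical helper name, and the commented command simply reuses the FFmpeg arguments from the answer.

```python
import shutil


def resolve_executable(name):
    """Return the full path of `name` as found on PATH, or raise FileNotFoundError."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"{name!r} was not found on PATH")
    return path


# Usage sketch: build the command as a list instead of a shlex-split string,
# so that a path containing spaces survives intact.
# ffmpeg = resolve_executable("ffmpeg")
# cmd = [ffmpeg, "-i", "pipe:", "-f", "rawvideo", "-pix_fmt", "bgr24", "-an", "-sn", "pipe:"]
```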

Answer 2

Fir*_*ger 5

Two years after Rotem wrote his answer, there is now a cleaner/easier way to do this using ImageIO.

Note: assuming ffmpeg is on your path, you can generate a test video to try this example with: ffmpeg -f lavfi -i testsrc=duration=10:size=1280x720:rate=30 testsrc.webm

import imageio.v3 as iio
from pathlib import Path

webm_bytes = Path("testsrc.webm").read_bytes()

# read all frames from the bytes string
frames = iio.imread(webm_bytes, index=None, format_hint=".webm")
frames.shape
# Output:
#    (300, 720, 1280, 3)

for frame in iio.imiter(webm_bytes, format_hint=".webm"):
    print(frame.shape)

# Output:
#    (720, 1280, 3)
#    (720, 1280, 3)
#    (720, 1280, 3)
#    ...

To use this you need the ffmpeg backend (which implements a solution similar to the one Rotem proposed): pip install imageio[ffmpeg]

In response to Rotem's comment, a bit of explanation:

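The snippet above uses imageio==2.16.0. The v3 API is an upcoming user-facing API that streamlines reading and writing. The API has been available since imageio==2.10.0; however, on versions older than 2.16.0 you have to use import imageio as iio and call iio.v3.imiter and iio.v3.imread.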
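
The version rule just described can be written down as a small check. This is only a sketch of the rule as stated in this answer: `version_tuple` and `v3_import_style` are hypothetical helper names, and the version parsing is deliberately simplistic (it keeps the leading digits of the first three components).

```python
def version_tuple(version):
    """Turn a version string such as '2.16.0' into a tuple of ints: (2, 16, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def v3_import_style(version):
    """Pick the import style described in the answer for an imageio version."""
    v = version_tuple(version)
    if v >= (2, 16, 0):
        return "import imageio.v3 as iio"
    if v >= (2, 10, 0):
        return "import imageio as iio  # then call iio.v3.imread / iio.v3.imiter"
    return "v3 API unavailable; use the v2 API"


print(v3_import_style("2.12.1"))
```

On Python 3.8+, the installed version string can be obtained with `importlib.metadata.version("imageio")`.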

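The ability to read video bytes has existed forever (more than 5 years and counting), but (as I am now realizing) it has never been documented directly... so I will add a PR for that soon™ :)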

On older versions of ImageIO (v2 API; tested on v2.9.0) you can still read video byte strings; however, this is slightly more verbose:

import imageio as iio
import numpy as np
from pathlib import Path

webm_bytes = Path("testsrc.webm").read_bytes()

# read all frames from the bytes string
frames = np.stack(iio.mimread(webm_bytes, format="FFMPEG", memtest=False))

# iterate over frames one by one
reader = iio.get_reader(webm_bytes, format="FFMPEG")
for frame in reader:
    print(frame.shape)
reader.close()
