OpenCV: reading frames from VideoCapture advances the video to a bizarrely wrong location

tim*_*geb 9 python video ubuntu opencv video-processing

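(I will place a 500 reputation bounty on this question as soon as it is eligible - unless the question has been closed.)

Problem in one sentence

Reading frames from a VideoCapture advances the video much further than it is supposed to.

Explanation

I need to read and analyze frames from a 100 fps video (according to cv2 and VLC media player) within certain time intervals. In the minimal example that follows, I am trying to read all the frames of the first ten seconds of a three-minute video.

I am creating a cv2.VideoCapture object from which I read frames until the desired position in milliseconds is reached. In my actual code each frame is analyzed, but that fact is irrelevant for showcasing the error.

Checking the current frame and millisecond position of the VideoCapture after reading the frames yields correct values, so the VideoCapture thinks it is at the right position - but it is not. Saving an image of the last frame read reveals that my iteration overshoots the destination time by over two minutes.

What is even more bizarre is that if I manually set the millisecond position of the capture to 10 seconds with VideoCapture.set (the same value VideoCapture.get returns after reading the frames) and save an image, the video is at (almost) the right position!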
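For readers who want the read-until-position loop spelled out, here is a minimal Python 3 sketch of it. It is not the code from the question: `read_until_msec` and `FakeCapture` are hypothetical names, the capture is only assumed to expose `read()` and `get()` the way `cv2.VideoCapture` does, and the stand-in capture exists purely so the loop can run without a video file.

```python
class FakeCapture(object):
    """Hypothetical stand-in for cv2.VideoCapture (100 fps, counter-based clock)."""

    def __init__(self, fps=100.0, frame_count=18000):
        self.fps = fps
        self.frame_count = frame_count
        self.frames_read = 0

    def read(self):
        if self.frames_read >= self.frame_count:
            return False, None
        self.frames_read += 1
        return True, 'frame {}'.format(self.frames_read)

    def get(self, prop_id):
        # Only the millisecond position is modelled in this stand-in.
        return self.frames_read * 1000.0 / self.fps


def read_until_msec(cap, target_msec, pos_msec_prop):
    """Read frames until cap reports a position >= target_msec.

    Returns the number of frames read and the last frame obtained.
    """
    frames_read = 0
    frame = None
    while cap.get(pos_msec_prop) < target_msec:
        ok, next_frame = cap.read()
        if not ok:
            break  # end of stream or read failure
        frame = next_frame
        frames_read += 1
        # in real code, each frame would be analyzed here
    return frames_read, frame


POS_MSEC = 0  # numeric value of cv2.CAP_PROP_POS_MSEC in OpenCV 3
count, last = read_until_msec(FakeCapture(), 10000, POS_MSEC)
print(count, last)
```

With a real capture, `cap` would be the `cv2.VideoCapture` object and `pos_msec_prop` the position constant of the installed OpenCV version.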

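Demo video file

In case you want to run the MCVE, you need the demo.avi video file. You can download it here.

MCVE

This MCVE is carefully crafted and commented. Please leave a comment under the question if anything remains unclear.

If you are using OpenCV 3, you have to replace all instances of cv2.cv.CV_ with cv2. (The problem occurs in both versions for me.)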

import cv2

# set up capture and print properties
print('cv2 version = {}'.format(cv2.__version__))
cap = cv2.VideoCapture('demo.avi')
fps = cap.get(cv2.cv.CV_CAP_PROP_FPS)
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('initial attributes: fps = {}, pos_msec = {}, pos_frames = {}'
      .format(fps, pos_msec, pos_frames))

# get first frame and save as picture
_, frame = cap.read()
cv2.imwrite('first_frame.png', frame)

# advance 10 seconds, that's 100*10 = 1000 frames at 100 fps
for _ in range(1000):
    _, frame = cap.read()
    # in the actual code, the frame is now analyzed

# save a picture of the current frame
cv2.imwrite('after_iteration.png', frame)

# print properties after iteration
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('attributes after iteration: pos_msec = {}, pos_frames = {}'
      .format(pos_msec, pos_frames))

# assert that the capture (thinks it) is where it is supposed to be
# (assertions succeed)
assert pos_frames == 1000 + 1 # (+1: iteration started with second frame)
assert pos_msec == 10000 + 10

# manually set the capture to msec position 10010
# note that this should change absolutely nothing in theory
cap.set(cv2.cv.CV_CAP_PROP_POS_MSEC, 10010)

# print properties  again to be extra sure
pos_msec = cap.get(cv2.cv.CV_CAP_PROP_POS_MSEC)
pos_frames = cap.get(cv2.cv.CV_CAP_PROP_POS_FRAMES)
print ('attributes after setting msec pos manually: pos_msec = {}, pos_frames = {}'
      .format(pos_msec, pos_frames))

# save a picture of the next frame, should show the same clock as
# previously taken image - but does not
_, frame = cap.read()
cv2.imwrite('after_setting.png', frame)
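Instead of editing every constant by hand when switching between OpenCV 2.4 and OpenCV 3, the lookup could be wrapped in a small helper. The sketch below is only an illustration (Python 3): `cap_prop` is a hypothetical name, and the two stand-in modules merely mimic the attribute layout of the two versions so that the helper can be tried without OpenCV installed.

```python
import types


def cap_prop(cv2_module, name):
    """Look up a capture property constant in either API layout.

    OpenCV 2.4 exposes cv2.cv.CV_CAP_PROP_<name>;
    OpenCV 3 exposes cv2.CAP_PROP_<name>.
    """
    legacy = getattr(cv2_module, 'cv', None)
    legacy_name = 'CV_CAP_PROP_' + name
    if legacy is not None and hasattr(legacy, legacy_name):
        return getattr(legacy, legacy_name)
    return getattr(cv2_module, 'CAP_PROP_' + name)


# Stand-in modules (not the real cv2) that mimic the two layouts.
fake_cv2_v2 = types.SimpleNamespace(cv=types.SimpleNamespace(CV_CAP_PROP_FPS=5))
fake_cv2_v3 = types.SimpleNamespace(CAP_PROP_FPS=5)
print(cap_prop(fake_cv2_v2, 'FPS'), cap_prop(fake_cv2_v3, 'FPS'))
```

With the real module this would be called as `cap.get(cap_prop(cv2, 'POS_MSEC'))`.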

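MCVE output

The print statements produce the following output.

cv2 version = 2.4.9.1
initial attributes: fps = 100.0, pos_msec = 0.0, pos_frames = 0.0
attributes after iteration: pos_msec = 10010.0, pos_frames = 1001.0
attributes after setting msec pos manually: pos_msec = 10010.0, pos_frames = 1001.0

As you can see, all properties have the expected values.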
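The expected values follow from simple arithmetic on the frame count, which can be checked with a few lines of Python. This is only a sketch of that arithmetic under the assumption that the position is derived from the number of frames read; `expected_position` is a hypothetical helper, not part of OpenCV.

```python
def expected_position(frames_read, fps):
    """Return (pos_frames, pos_msec) implied purely by a frame count."""
    return float(frames_read), frames_read * 1000.0 / fps


# 1 frame read before the loop + 1000 frames read inside it, at 100 fps
pos_frames, pos_msec = expected_position(1 + 1000, 100.0)
print(pos_frames, pos_msec)
```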

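imwrite saves the following pictures.

first_frame.png

after_iteration.png

after_setting.png

You can see the problem in the second picture. The target of 9:26:15 (the real-time clock shown in the picture) is missed by over two minutes. Setting the target time manually (third picture) puts the video at (almost) the correct position.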

What am I doing wrong and how do I fix it?

Tried so far

cv2 2.4.9.1 @ Ubuntu 16.04
cv2 2.4.13 @ Scientific Linux 7.3 (three computers)
cv2 3.1.0 @ Scientific Linux 7.3 (three computers)

Creating the capture with

cap = cv2.VideoCapture('demo.avi', apiPreference=cv2.CAP_FFMPEG)
and

cap = cv2.VideoCapture('demo.avi', apiPreference=cv2.CAP_GSTREAMER)

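in OpenCV 3 (version 2 does not seem to have the apiPreference argument). Using cv2.CAP_GSTREAMER takes extremely long (about 2-3 minutes to run the MCVE), but both API preferences produce the same incorrect images.

When using ffmpeg directly to read frames (credit to this tutorial), the correct output images are produced.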

import numpy as np
import subprocess as sp
import pylab

# video properties
path = './demo.avi'
resolution = (593, 792)
framesize = resolution[0]*resolution[1]*3

# set up pipe
FFMPEG_BIN = "ffmpeg"
command = [FFMPEG_BIN,
           '-i', path,
           '-f', 'image2pipe',
           '-pix_fmt', 'rgb24',
           '-vcodec', 'rawvideo', '-']
pipe = sp.Popen(command, stdout = sp.PIPE, bufsize=10**8)

# read first frame and save as image
raw_image = pipe.stdout.read(framesize)
image = np.frombuffer(raw_image, dtype='uint8')
image = image.reshape(resolution[0], resolution[1], 3)
pylab.imshow(image)
pylab.savefig('first_frame_ffmpeg_only.png')
pipe.stdout.flush()

# forward 1000 frames
for _ in range(1000):
    raw_image = pipe.stdout.read(framesize)
    pipe.stdout.flush()

# save frame 1001
image = np.frombuffer(raw_image, dtype='uint8')
image = image.reshape(resolution[0], resolution[1], 3)
pylab.imshow(image)
pylab.savefig('frame_1001_ffmpeg_only.png')

pipe.terminate()

This produces the correct result! (Correct timestamp 9:26:15)

frame_1001_ffmpeg_only.png
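The pipe-based approach works because raw rgb24 frames have a fixed size, so skipping ahead is just a matter of reading a known number of bytes per frame. A minimal Python 3 sketch of that idea follows; `frame_size_bytes` and `read_frame_at` are hypothetical helpers, and a tiny in-memory stream stands in for the ffmpeg pipe.

```python
import io


def frame_size_bytes(height, width, channels=3):
    """Number of bytes in one raw frame (rgb24 has 3 bytes per pixel)."""
    return height * width * channels


def read_frame_at(stream, index, height, width, channels=3):
    """Return the raw bytes of frame `index` (0-based), or None if the
    stream ends before that frame is complete."""
    size = frame_size_bytes(height, width, channels)
    raw = b''
    for _ in range(index + 1):
        raw = stream.read(size)
        if len(raw) < size:
            return None
    return raw


# Synthetic stream: five 2x2 RGB frames, frame k filled with byte value k.
demo_stream = io.BytesIO(
    b''.join(bytes([k]) * frame_size_bytes(2, 2) for k in range(5)))
third = read_frame_at(demo_stream, 2, 2, 2)
print(len(third), third[0])
```

For the demo video, `frame_size_bytes(593, 792)` gives the same frame size the script above computes by hand.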

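Additional information

In the comments I was asked for my cvconfig.h file. I only seem to have this file for cv2 version 3.1.0, under /opt/opencv/3.1.0/include/opencv2/cvconfig.h.

Here is a paste of this file.

In case it helps, I was able to extract the following video information with VideoCapture.get.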

brightness 0.0
contrast 0.0
convert_rgb 0.0
exposure 0.0
format 0.0
fourcc 1684633187.0
fps 100.0
frame_count 18000.0
frame_height 593.0
frame_width 792.0
gain 0.0
hue 0.0
mode 0.0
openni_baseline 0.0
openni_focal_length 0.0
openni_frame_max_depth 0.0
openni_output_mode 0.0
openni_registration 0.0
pos_avi_ratio 0.01
pos_frames 0.0
pos_msec 0.0
rectification 0.0
saturation 0.0
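Two of these values are easier to read once decoded. The sketch below (`decode_fourcc` is a hypothetical helper) unpacks the numeric fourcc into its four characters, assuming the usual packing with the first character in the lowest byte, and derives the nominal duration from frame_count and fps.

```python
def decode_fourcc(value):
    """Turn the numeric fourcc reported by VideoCapture.get into 4 characters."""
    code = int(value)
    return ''.join(chr((code >> (8 * i)) & 0xFF) for i in range(4))


fourcc = decode_fourcc(1684633187.0)
duration_sec = 18000.0 / 100.0  # frame_count / fps
print(fourcc, duration_sec)
```

The decoded tag is `cvid`, which is commonly associated with the Cinepak codec, and the nominal duration comes out at the three minutes mentioned in the question.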

Leo*_*eon 5

Your video file data contains only 1313 non-duplicate frames (i.e. between 7 and 8 frames per second of duration):

$ ffprobe -i demo.avi -loglevel fatal -show_streams -count_frames|grep frame
has_b_frames=0
r_frame_rate=100/1
avg_frame_rate=100/1
nb_frames=18000
nb_read_frames=1313        # !!!
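The "between 7 and 8 frames per second" figure can be reproduced from the ffprobe fields above. The following Python sketch parses the key=value lines and divides the number of frames actually read by the nominal duration; `parse_key_values` and `effective_fps` are hypothetical helper names, and the input is simply the output shown above with the trailing comment removed.

```python
FFPROBE_OUTPUT = """\
has_b_frames=0
r_frame_rate=100/1
avg_frame_rate=100/1
nb_frames=18000
nb_read_frames=1313
"""


def parse_key_values(text):
    """Parse ffprobe-style key=value lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition('=')
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def effective_fps(fields):
    """Frames with real data per second of nominal duration."""
    num, den = fields['avg_frame_rate'].split('/')
    nominal_fps = float(num) / float(den)
    duration_sec = int(fields['nb_frames']) / nominal_fps
    return int(fields['nb_read_frames']) / duration_sec


fields = parse_key_values(FFPROBE_OUTPUT)
rate = effective_fps(fields)
print(round(rate, 2))
```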

Converting the avi file with ffmpeg reports 16697 duplicate frames (for some reason 10 additional frames are added, hence 16697=18010-1313).

$ ffmpeg -i demo.avi demo.mp4
...
frame=18010 fps=417 Lsize=3705kB time=03:00.08 bitrate=168.6kbits/s dup=16697
#                                                                   ^^^^^^^^^
...

BTW, the video converted this way (demo.mp4) is free of the problem being discussed, i.e. OpenCV processes it correctly.
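A rough back-of-the-envelope model shows why the two files behave so differently. It is only a sketch under two labeled assumptions: each read returns the next frame that carries real data, and those frames are spread evenly over the nominal duration. `position_after_reads` is a hypothetical helper, and the second call models an idealized file in which all 18000 nominal frames are physically present.

```python
def position_after_reads(reads, real_frames, nominal_frames, nominal_fps):
    """Estimate the true timestamp (in seconds) reached after `reads` reads.

    Assumes every read yields the next frame with real data and that
    those frames are evenly spaced over the nominal duration.
    """
    duration_sec = nominal_frames / float(nominal_fps)
    seconds_per_real_frame = duration_sec / real_frames
    return reads * seconds_per_real_frame


# demo.avi: 1313 frames with real data across 18000 nominal frames
avi_estimate = position_after_reads(1001, 1313, 18000, 100)
# idealized file in which every nominal frame is physically present
full_estimate = position_after_reads(1001, 18000, 18000, 100)
print(round(avi_estimate, 1), round(full_estimate, 2))
```

Under these assumptions, 1001 reads land at roughly 137 seconds in the avi file instead of about 10 seconds, which is in line with the overshoot of more than two minutes reported in the question.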

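In this case the duplicate frames are not physically present in the avi file; instead, each duplicate frame is represented by an instruction to repeat the previous frame. This can be checked as follows: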

$ ffplay -loglevel trace demo.avi
...
[ffplay_crop @ 0x7f4308003380] n:16 t:2.180000 pos:1311818.000000 x:0 y:0 x+w:792 y+h:592
[avi @ 0x7f4310009280] dts:574 offset:574 1/100 smpl_siz:0 base:1000000 st:0 size:81266
video: delay=0.130 A-V=0.000094
    Last message repeated 9 times
video: delay=0.130 A-V=0.000095
video: delay=0.130 A-V=0.000094
video: delay=0.130 A-V=0.000095
[avi @ 0x7f4310009280] dts:587 offset:587 1/100 smpl_siz:0 base:1000000 st:0 size:81646
[ffplay_crop @ 0x7f4308003380] n:17 t:2.320000 pos:1393538.000000 x:0 y:0 x+w:792 y+h:592
video: delay=0.140 A-V=0.000091
    Last message repeated 4 times
video: delay=0.140 A-V=0.000092
    Last message repeated 1 times
video: delay=0.140 A-V=0.000091
    Last message repeated 6 times
...

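In the above log, frames with actual data are represented by the lines starting with "[avi @ 0xHHHHHHHHHHH]". The "video: delay=xxxxx A-V=yyyyy" messages indicate that the last frame must be displayed for xxxxx more seconds.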
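The spacing between frames with real data can be pulled out of such a log with a short script. The Python sketch below runs on a condensed excerpt of the log above and assumes that the "1/100" field on the "[avi @ ...]" lines is the time base of the dts value; that interpretation, like the helper name `real_frame_times`, is an assumption made for illustration.

```python
import re

# Condensed excerpt of the trace log shown above.
LOG_EXCERPT = """\
[avi @ 0x7f4310009280] dts:574 offset:574 1/100 smpl_siz:0 base:1000000 st:0 size:81266
video: delay=0.130 A-V=0.000094
    Last message repeated 9 times
[avi @ 0x7f4310009280] dts:587 offset:587 1/100 smpl_siz:0 base:1000000 st:0 size:81646
video: delay=0.140 A-V=0.000091
"""

AVI_LINE = re.compile(r'^\[avi @ 0x[0-9a-f]+\] dts:(\d+) offset:\d+ (\d+)/(\d+)')


def real_frame_times(log_text):
    """Return the timestamps (seconds) of the lines that carry real frame data."""
    times = []
    for line in log_text.splitlines():
        match = AVI_LINE.match(line)
        if match:
            dts, num, den = (int(group) for group in match.groups())
            times.append(dts * num / float(den))
    return times


times = real_frame_times(LOG_EXCERPT)
gaps = [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
print(times, gaps)
```

The gap between the two real frames in the excerpt matches the delay=0.130 value logged after the first of them, i.e. 13 nominal frame slots at 100 fps are covered by a single stored frame.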

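cv2.VideoCapture() skips such duplicate frames, reading only frames that have real data. Here is the corresponding (though slightly edited) code from the 2.4 branch of opencv (note, BTW, that ffmpeg is used underneath, which I verified by running python under gdb and setting a breakpoint on CvCapture_FFMPEG::grabFrame):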
