How to encode a video from several images generated in a C++ program without writing the separate frame images to disk?

Question


ksb*_*496 18 c++ video ffmpeg image x264

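I am writing C++ code in which a sequence of N different frames is generated after performing some operations implemented in it. After each frame is completed, I write it to disk as IMG_%d.png, and finally I encode the frames into a video through ffmpeg using the x264 codec.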

The summarized pseudocode of the main part of the program is the following:

std::vector<int> B(width*height*3);
for (i=0; i<N; i++)
{
  // void generateframe(std::vector<int> &, int)
  generateframe(B, i); // Returns different images for different i values.
  sprintf(s, "IMG_%d.png", i+1);
  WriteToDisk(B, s); // void WriteToDisk(std::vector<int>, char[])
}


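The problem with this implementation is that the number of desired frames, N, is usually high (N~100000), as is the resolution of the pictures (1920x1080), which overloads the disk with write cycles of dozens of GB after each execution.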

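In order to avoid this, I have been trying to find documentation about passing each image stored in the vector B directly to an encoder such as x264 (without having to write the intermediate image files to disk). Although I found some interesting topics, none of them solved exactly what I want: many of them concern running the encoder on image files that already exist on disk, whilst others provide solutions for other programming languages such as Python (for which a fully satisfactory solution can be found).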

The pseudocode of what I would like to obtain is something similar to this:

std::vector<int> B(width*height*3);
video_file=open_video("Generated_Video.mp4", ...[encoder options]...);
for (i=0; i<N; i++)
{
  generateframe(B, i+1);
  add_frame(video_file, B);
}
video_file.close();


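According to what I have read on related topics, the x264 C++ API might be able to do this but, as stated above, I did not find a satisfactory answer to my specific question. I tried learning and using the ffmpeg source code directly, but both its low ease of use and compilation issues forced me to discard this possibility, as I am merely a non-professional programmer (I take it just as a hobby and unfortunately I cannot spend that much time learning something so demanding).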

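Another possible solution that came to my mind is to find a way to call the ffmpeg binary from the C++ code and somehow transfer the image data of each iteration (stored in B) to the encoder, letting it add each frame without "closing" the video file until the last frame, so that more frames can be added until the N-th one is reached, at which point the video file is "closed". In other words: call ffmpeg.exe from the C++ program to write the first frame to the video, but make the encoder "wait" for more frames; then call ffmpeg again to add the second frame and make the encoder "wait" again, and so on until the last frame, when the video is finished. However, I do not know how to proceed or whether it is actually possible.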

Edit 1:

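As suggested in the replies, I have been reading about named pipes and tried to use them in my code. First of all, it should be noted that I am working with Cygwin, so my named pipes are created as they would be under Linux. The modified pseudocode I used (including the corresponding system libraries) is the following: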

FILE *fd;
mkfifo("myfifo", 0666);

for (i=0; i<N; i++)
{
  fd=fopen("myfifo", "wb");
  generateframe(B, i+1);
  WriteToPipe(B, fd); // void WriteToPipe(std::vector<int>, FILE *&fd)
  fflush(fd);
  fclose(fd);
}
unlink("myfifo");


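WriteToPipe is a slight modification of the previous WriteToDisk function, in which I made sure that the write buffer used to send the image data is small enough to fit the pipe buffering limits.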

Then I compile and write the following command in the Cygwin terminal:

./myprogram | ffmpeg -i pipe:myfifo -c:v libx264 -preset slow -crf 20 Video.mp4


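However, it remains stuck in the loop at i=0, on the "fopen" line (that is, the first fopen call). If I had not called ffmpeg that would be natural, as the server (my program) would be waiting for a client program to connect to the "other side" of the pipe, but that is not the case. It looks like they cannot be connected through the pipe somehow, but I have not been able to find further documentation to overcome this issue. Any suggestion?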

Answer 1

ksb*_*496 21

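After some intense struggle, I finally managed to make it work after learning a bit about how to use the FFmpeg and libx264 C APIs for my specific purpose, thanks to the useful information that some users provided on this site and some others, as well as some of FFmpeg's documentation examples. For the sake of illustration, the details are presented next.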

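First of all, the libx264 C library was compiled and, after that, FFmpeg was compiled with the configure options --enable-gpl --enable-libx264. Now let us go to the coding. The relevant part of the code that achieved the requested purpose is the following: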

Includes:

#include <stdint.h>
extern "C"{
#include <x264.h>
#include <libswscale/swscale.h>
#include <libavcodec/avcodec.h>
#include <libavutil/mathematics.h>
#include <libavformat/avformat.h>
#include <libavutil/opt.h>
}


LDFLAGS in the Makefile:

-lx264 -lswscale -lavutil -lavformat -lavcodec


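Inner code (for the sake of simplicity, error checking is omitted, and variables are declared where they are needed rather than at the beginning, for better understanding):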

av_register_all(); // Loads the whole database of available codecs and formats.

struct SwsContext* convertCtx = sws_getContext(width, height, AV_PIX_FMT_RGB24, width, height, AV_PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL); // Preparing to convert my generated RGB images to YUV frames.

// Preparing the data concerning the format and codec in order to write properly the header, frame data and end of file.
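const char *fmtext="mp4";
char filename[64]; // Real storage for the name; writing through an uninitialized pointer is undefined behavior.
snprintf(filename, sizeof(filename), "GeneratedVideo.%s", fmtext);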
AVOutputFormat * fmt = av_guess_format(fmtext, NULL, NULL);
AVFormatContext *oc = NULL;
avformat_alloc_output_context2(&oc, NULL, NULL, filename);
AVStream * stream = avformat_new_stream(oc, 0);
AVCodec *codec=NULL;
AVCodecContext *c= NULL;
int ret;

codec = avcodec_find_encoder_by_name("libx264");

// Setting up the codec:
AVDictionary *opt = NULL; // Encoder options, handed to avcodec_open2() below.
av_dict_set( &opt, "preset", "slow", 0 );
av_dict_set( &opt, "crf", "20", 0 );
avcodec_get_context_defaults3(stream->codec, codec);
c=avcodec_alloc_context3(codec);
c->width = width;
c->height = height;
c->pix_fmt = AV_PIX_FMT_YUV420P;

// Setting up the format, its stream(s), linking with the codec(s) and write the header:
if (oc->oformat->flags & AVFMT_GLOBALHEADER) // Some formats require a global header.
    c->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
avcodec_open2( c, codec, &opt );
av_dict_free(&opt);
stream->time_base=(AVRational){1, 25};
stream->codec=c; // Once the codec is set up, we need to let the container know which codec are the streams using, in this case the only (video) stream.
av_dump_format(oc, 0, filename, 1);
avio_open(&oc->pb, filename, AVIO_FLAG_WRITE);
ret=avformat_write_header(oc, &opt);
av_dict_free(&opt); 

// Preparing the containers of the frame data:
AVFrame *rgbpic, *yuvpic;

// Allocating memory for each RGB frame, which will later be converted to YUV:
rgbpic=av_frame_alloc();
rgbpic->format=AV_PIX_FMT_RGB24;
rgbpic->width=width;
rgbpic->height=height;
ret=av_frame_get_buffer(rgbpic, 1);

// Allocating memory for each conversion output YUV frame:
yuvpic=av_frame_alloc();
yuvpic->format=AV_PIX_FMT_YUV420P;
yuvpic->width=width;
yuvpic->height=height;
ret=av_frame_get_buffer(yuvpic, 1);

// After the format, codec and general frame data are set, we write the video in the frame generation loop:
// std::vector<uint8_t> B(width*height*3);


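The vector commented out above has the same structure as the one I showed in my question; however, the RGB data is stored in the AVFrames in a specific way. Therefore, for the sake of exposition, let us assume we have instead a pointer to a structure of the form uint8_t[3] Matrix(int, int), where the color values of the pixel at a given coordinate (x, y) are accessed as Matrix(x, y)->Red, Matrix(x, y)->Green and Matrix(x, y)->Blue, giving respectively the red, green and blue values at (x, y). The first argument stands for the horizontal position, from left to right as x increases, and the second one for the vertical position, from top to bottom as y increases.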

That said, the for loop that transfers the data, encodes it and writes each frame would be the following one:

Matrix B(width, height);
int got_output;
AVPacket pkt;
for (i=0; i<N; i++)
{
    generateframe(B, i); // This one is the function that generates a different frame for each i.
    // The AVFrame data will be stored as RGBRGBRGB... row-wise, from left to right and from top to bottom, hence we have to proceed as follows:
    for (y=0; y<height; y++)
    {
        for (x=0; x<width; x++)
        {
            // rgbpic->linesize[0] is the row stride in bytes (at least 3*width for RGB24), not the width in pixels.
            rgbpic->data[0][y*rgbpic->linesize[0]+3*x]=B(x, y)->Red;
            rgbpic->data[0][y*rgbpic->linesize[0]+3*x+1]=B(x, y)->Green;
            rgbpic->data[0][y*rgbpic->linesize[0]+3*x+2]=B(x, y)->Blue;
        }
    }
    sws_scale(convertCtx, rgbpic->data, rgbpic->linesize, 0, height, yuvpic->data, yuvpic->linesize); // Not actually scaling anything, but just converting the RGB data to YUV and store it in yuvpic.
    av_init_packet(&pkt);
    pkt.data = NULL;
    pkt.size = 0;
    yuvpic->pts = i; // The PTS of the frame are just in a reference unit, unrelated to the format we are using. We set them, for instance, as the corresponding frame number.
    ret=avcodec_encode_video2(c, &pkt, yuvpic, &got_output);
    if (got_output)
    {
        fflush(stdout);
        av_packet_rescale_ts(&pkt, (AVRational){1, 25}, stream->time_base); // We set the packet PTS and DTS taking in the account our FPS (second argument) and the time base that our selected format uses (third argument).
        pkt.stream_index = stream->index;
        printf("Write frame %6d (size=%6d)\n", i, pkt.size);
        av_interleaved_write_frame(oc, &pkt); // Write the encoded frame to the mp4 file.
        av_packet_unref(&pkt);
    }
}
// Writing the delayed frames:
for (got_output = 1; got_output; i++) {
    ret = avcodec_encode_video2(c, &pkt, NULL, &got_output);
    if (got_output) {
        fflush(stdout);
        av_packet_rescale_ts(&pkt, (AVRational){1, 25}, stream->time_base);
        pkt.stream_index = stream->index;
        printf("Write frame %6d (size=%6d)\n", i, pkt.size);
        av_interleaved_write_frame(oc, &pkt);
        av_packet_unref(&pkt);
    }
}
av_write_trailer(oc); // Writing the end of the file.
if (!(fmt->flags & AVFMT_NOFILE))
    avio_closep(&oc->pb); // Closing the file (avio_closep takes the address of the AVIOContext pointer).
avcodec_close(stream->codec);
// Freeing all the allocated memory:
sws_freeContext(convertCtx);
av_frame_free(&rgbpic);
av_frame_free(&yuvpic);
avformat_free_context(oc);


Side notes:

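For future reference, as the information available on the net about the time stamps (PTS/DTS) looks so confusing, I will next explain how I managed to solve the issues by setting the proper values. Setting these values incorrectly made the output much bigger than the one obtained through the command-line tool built from ffmpeg, because the frame data was being redundantly written at time intervals shorter than those actually set by the FPS.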

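First of all, it should be noted that when encoding there are two kinds of time stamps: one associated with the frame (PTS) (pre-encoding stage) and two associated with the packet (PTS and DTS) (post-encoding stage). In the first case, it looks like the frame PTS values can be assigned using a custom unit of reference (with the only restriction that they must be equally spaced if one wants a constant FPS), so one can take, for instance, the frame number, as we did in the code above. In the second case, we have to take the following parameters into account: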

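The time base of the output format container, in our case mp4 (= 12800 Hz), whose information is held in stream->time_base.
The desired FPS of the video.
Whether or not the encoder generates B-frames (if it does not, the PTS and DTS values of a packet must be set to the same value; if it does, as in this example, things are more complicated). See the answer to a related question for more references.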

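The key point is that, luckily, it is not necessary to struggle with the computation of these quantities, as libav provides a function that computes the correct time stamps of the packet from the data mentioned above: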

av_packet_rescale_ts(AVPacket *pkt, AVRational FPS, AVRational time_base)


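Thanks to these considerations, I was finally able to generate a sane output container with essentially the same compression rate as the one obtained using the command-line tool. These were the two issues that remained before I investigated more deeply how the format header and trailer are written and how the time stamps are properly set.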

Answer 2

Dmi*_*hin 8

Thanks for your excellent work, @ksb496!

One minor improvement:

c=avcodec_alloc_context3(codec);


would be better written as:

c = stream->codec;


to avoid a memory leak.

If you don't mind, I've uploaded the complete ready-to-deploy library to GitHub: https://github.com/apc-llc/moviemaker-cpp.git

Archived:	9 years, 12 months ago
Viewed:	12748 times
Last activity:	7 years, 6 months ago