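I have code that acquires frames from a camera and saves them to disk. The structure is: multiple threads malloc a buffer, copy their frame into it, and enqueue it. Another thread then removes frames from the queue and writes them to their files using the ffmpeg API (raw video, no compression). (Actually I use my own memory pool, so malloc is only called when I need more buffers.) I can have up to 8 files/cameras open at the same time.
The problem is that for the first 45 sec everything works fine: there's never more than one frame on queue. But after that my queue gets backed up, processing takes just a few ms longer resulting in increased ram usage because I cannot save the frames fast enough so I have to malloc more memory to store them.
I have an 8-core, 16 GB RAM, Windows 7 64-bit computer (NTFS, lots of free space on the second disk drive). The disk is supposed to be able to write at up to 6 Gbit/s. To save my data in time I need to be able to write at 50 MB/s. I tested the disk speed with "PassMark PerformanceTest", with 8 threads writing files simultaneously in the same way ffmpeg saves files (synchronous, uncached I/O), and it was able to reach 100 MB/s. So why can't my writes achieve that?
Here is how the ffmpeg writes look in the Process Monitor log: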
Time of Day         Operation  File#  Result       Detail
2:30:32.8759350 PM  WriteFile  8      SUCCESS      Offset: 749,535,120, Length: 32,768
2:30:32.8759539 PM  WriteFile  8      SUCCESS      Offset: 749,567,888, Length: 32,768
2:30:32.8759749 PM  WriteFile  8      SUCCESS      Offset: 749,600,656, Length: 32,768
2:30:32.8759939 PM  WriteFile  8      SUCCESS      Offset: 749,633,424, Length: 32,768
2:30:32.8760314 PM  WriteFile  8      SUCCESS      Offset: 749,666,192, Length: 32,768
2:30:32.8760557 PM  WriteFile  8      SUCCESS      Offset: 749,698,960, Length: 32,768
2:30:32.8760866 PM  WriteFile  8      SUCCESS      Offset: 749,731,728, Length: 32,768
2:30:32.8761259 PM  WriteFile  8      SUCCESS      Offset: 749,764,496, Length: 32,768
2:30:32.8761452 PM  WriteFile  8      SUCCESS      Offset: 749,797,264, Length: 32,768
2:30:32.8761629 PM  WriteFile  8      SUCCESS      Offset: 749,830,032, Length: 32,768
2:30:32.8761803 PM  WriteFile  8      SUCCESS      Offset: 749,862,800, Length: 32,768
2:30:32.8761977 PM  WriteFile  8      SUCCESS      Offset: 749,895,568, Length: 32,768
2:30:32.8762235 PM  WriteFile  8      SUCCESS      Offset: 749,928,336, Length: 32,768, Priority: Normal
2:30:32.8762973 PM  WriteFile  8      SUCCESS      Offset: 749,961,104, Length: 32,768
2:30:32.8763160 PM  WriteFile  8      SUCCESS      Offset: 749,993,872, Length: 32,768
2:30:32.8763352 PM  WriteFile  8      SUCCESS      Offset: 750,026,640, Length: 32,768
2:30:32.8763502 PM  WriteFile  8      SUCCESS      Offset: 750,059,408, Length: 32,768
2:30:32.8763649 PM  WriteFile  8      SUCCESS      Offset: 750,092,176, Length: 32,768
2:30:32.8763790 PM  WriteFile  8      SUCCESS      Offset: 750,124,944, Length: 32,768
2:30:32.8763955 PM  WriteFile  8      SUCCESS      Offset: 750,157,712, Length: 32,768
2:30:32.8764072 PM  WriteFile  8      SUCCESS      Offset: 750,190,480, Length: 4,104
2:30:32.8848241 PM  WriteFile  4      SUCCESS      Offset: 750,194,584, Length: 32,768
2:30:32.8848481 PM  WriteFile  4      SUCCESS      Offset: 750,227,352, Length: 32,768
2:30:32.8848749 PM  ReadFile   4      END OF FILE  Offset: 750,256,128, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
2:30:32.8848989 PM  WriteFile  4      SUCCESS      Offset: 750,260,120, Length: 32,768
2:30:32.8849157 PM  WriteFile  4      SUCCESS      Offset: 750,292,888, Length: 32,768
2:30:32.8849319 PM  WriteFile  4      SUCCESS      Offset: 750,325,656, Length: 32,768
2:30:32.8849475 PM  WriteFile  4      SUCCESS      Offset: 750,358,424, Length: 32,768
2:30:32.8849637 PM  WriteFile  4      SUCCESS      Offset: 750,391,192, Length: 32,768
2:30:32.8849880 PM  WriteFile  4      SUCCESS      Offset: 750,423,960, Length: 32,768, Priority: Normal
2:30:32.8850400 PM  WriteFile  4      SUCCESS      Offset: 750,456,728, Length: 32,768
2:30:32.8850727 PM  WriteFile  4      SUCCESS      Offset: 750,489,496, Length: 32,768, Priority: Normal
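This looks very efficient. However, the actual disk writes recorded by DiskMon look badly fragmented, jumping back and forth across the disk, which may account for the slow speed. See the graph of write speed computed from this data (~5 MB/s).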
Time  Write duration  Sector      Length  MB/sec
95.6  0.00208855      1490439632  896     0.409131784
95.6  0.00208855      1488197000  128     0.058447398
95.6  0.00009537      1482323640  128     1.279965529
95.6  0.00009537      1482336312  768     7.679793174
95.6  0.00009537      1482343992  384     3.839896587
95.6  0.00009537      1482350648  768     7.679793174
95.6  0.00039101      1489278984  1152    2.809730729
95.6  0.00039101      1489393672  896     2.185346123
95.6  0.0001812       1482349368  256     1.347354443
95.6  0.0001812       1482358328  896     4.715740549
95.6  0.0001812       1482370616  640     3.368386107
95.6  0.0001812       1482378040  256     1.347354443
95.6  0.00208855      1488197128  384     0.175342193
95.6  0.00208855      1488202512  640     0.292236989
95.6  0.00208855      1488210320  1024    0.467579182
95.6  0.00009537      1482351416  256     2.559931058
95.6  0.00009537      1482360120  896     8.959758703
95.6  0.00009537      1482371896  640     6.399827645
95.6  0.00009537      1482380088  256     2.559931058
95.7  0.00039101      1489394568  1152    2.809730729
95.7  0.00039101      1489396744  352     0.858528834
95.7  0.00039101      1489507944  544     1.326817289
95.7  0.0001812       1482378296  768     4.042063328
95.7  0.0001812       1482392120  768     4.042063328
95.7  0.0001812       1482400568  512     2.694708885
95.7  0.00208855      1488224144  768     0.350684386
95.7  0.00208855      1488232208  384     0.175342193
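I'm pretty confident that it's not my code, because I timed everything: enqueuing, for example, takes a few microseconds, which suggests that the threads don't get stuck waiting for each other. It must be the disk writes. So the question is: how can I improve my disk writes, and how can I profile the actual disk writes? (Remember that I rely on the FFmpeg DLLs to save, so I cannot access the low-level write functions directly.) If I cannot figure it out, I'll dump all the frames into a single sequential binary file (which should increase I/O speed) and then split it into video files in post-processing.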
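For illustration, the fallback described above (one sequential dump file, split later) could be sketched roughly as follows. The record layout and the names `FrameHeader`, `append_frame` and `read_frame` are assumptions made up for this sketch, not anything from FFmpeg or from the original code. The header is written as a raw struct, so it assumes the file is read back on the same kind of machine that wrote it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical record header: each frame is stored as header + payload.
struct FrameHeader {
    std::uint32_t camera_id;   // which camera produced the frame
    std::uint32_t size_bytes;  // length of the payload that follows
};

// Append one frame to an already-open dump file.
// Returns false if the header or payload could not be written in full.
bool append_frame(std::FILE* f, std::uint32_t camera_id,
                  const std::uint8_t* data, std::uint32_t size) {
    assert(f != nullptr);
    FrameHeader h{camera_id, size};
    if (std::fwrite(&h, sizeof h, 1, f) != 1) return false;
    return size == 0 || std::fwrite(data, 1, size, f) == size;
}

// Read the next record during post-processing.
// Returns false at end of file or if the record is truncated.
bool read_frame(std::FILE* f, FrameHeader& h, std::vector<std::uint8_t>& data) {
    assert(f != nullptr);
    if (std::fread(&h, sizeof h, 1, f) != 1) return false;
    data.resize(h.size_bytes);
    return h.size_bytes == 0 ||
           std::fread(data.data(), 1, h.size_bytes, f) == h.size_bytes;
}
```

In this sketch the single writer thread would call `append_frame` for every dequeued frame regardless of camera, and the post-processing step would loop over `read_frame` and route each payload to the video file for its `camera_id`.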
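I don't know how much of my disk I/O is cached (CacheSet only shows the cache size for drive C), but the image below from Performance Monitor, taken at 0 and at 45 sec into the video (just before my queue starts piling up), looks weird to me. Basically, the modified set and the standby set grew from almost nothing to this large value. Is that the data being cached? Is it possible that data only starts being written to disk at 45 sec, so that suddenly everything slows down?
(FYI, LabVIEW is the program that loads my DLL.)
I'd appreciate any help.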
M.
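The problem is the repeated malloc and free, which puts a load on the system. I suggest creating a buffer pool, i.e. allocating N buffers in the initialization stage and reusing them, instead of allocating and freeing memory each time. Since you mentioned ffmpeg, here is an example from multimedia: in gstreamer, buffer management takes the form of buffer pools, and in a gstreamer pipeline buffers are usually taken from and passed around via buffer pools. Most multimedia systems do this.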
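A minimal sketch of such a pool, assuming all frames fit in a fixed `frame_bytes` size; the class and method names (`BufferPool`, `acquire`, `release`) are illustrative and are not gstreamer or ffmpeg API. Unlike a pool that grows on demand, this one never allocates after construction: when it runs dry, `acquire` returns `nullptr` and the caller has to decide whether to wait or drop the frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Fixed-size pool: every frame buffer is allocated once, up front, and recycled.
class BufferPool {
public:
    BufferPool(std::size_t count, std::size_t frame_bytes)
        : storage_(count * frame_bytes), frame_bytes_(frame_bytes) {
        assert(count > 0 && frame_bytes > 0);
        free_.reserve(count);
        for (std::size_t i = 0; i < count; ++i)
            free_.push_back(storage_.data() + i * frame_bytes);
    }

    // Returns a free buffer, or nullptr when every buffer is in use.
    std::uint8_t* acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return nullptr;
        std::uint8_t* buf = free_.back();
        free_.pop_back();
        return buf;
    }

    // Hands a buffer back once the writer thread has finished with it.
    void release(std::uint8_t* buf) {
        assert(buf != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(buf);
    }

    std::size_t frame_bytes() const { return frame_bytes_; }

private:
    std::vector<std::uint8_t> storage_;  // one contiguous allocation for all buffers
    std::vector<std::uint8_t*> free_;    // buffers currently available
    std::size_t frame_bytes_;
    std::mutex mutex_;
};
```

A camera thread would call `acquire`, copy its frame into the returned buffer and enqueue the pointer; the writer thread would call `release` after the frame has been handed to ffmpeg.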
Regarding:
The problem is that for the first 45 sec everything works fine: there's never more than one frame on queue. But after that my queue gets backed up, processing takes just a few ms longer resulting in increased ram usage because I cannot save the frames fast enough so I have to malloc more memory to store them.
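The application is thrashing at this point, and calling malloc now only makes matters worse. I suggest implementing a producer-consumer model in which one side waits, depending on the situation. In your case, set a threshold of N buffers: if there are already N buffers in the queue, new frames from the camera are not enqueued until the existing buffers have been processed.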
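The threshold idea can be sketched with a bounded queue built on standard C++ synchronization primitives. This is only an illustration under assumptions: `BoundedQueue` and its method names are made up here, and the capacity N has to be chosen for the actual frame size and available memory.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Bounded FIFO: at most `capacity` items are ever queued.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Producer side, blocking: waits while the queue holds `capacity` items.
    void push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push_back(std::move(item));
        not_empty_.notify_one();
    }

    // Producer side, non-blocking: returns false instead of waiting when full.
    bool try_push(T item) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.size() >= capacity_) return false;
        items_.push_back(std::move(item));
        not_empty_.notify_one();
        return true;
    }

    // Consumer side: waits while the queue is empty.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop_front();
        not_full_.notify_one();
        return item;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    std::size_t capacity_;
    std::deque<T> items_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};
```

Whether a camera thread should use `push` (wait for the writer) or `try_push` (skip the frame when the queue is full) depends on whether the camera driver tolerates a blocked callback; either way, memory use stays capped at N buffers.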
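Another idea: instead of writing raw frames, why not write encoded data? Assuming you want a video, you could at least write an elementary H264 stream (and ffmpeg comes with a good H264 encoder!), or even better, if you have access to an MPEG-4 muxer, an mp4 file. This would dramatically reduce both the memory requirements and the I/O load.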