How to get the Y component from a CMSampleBuffer produced by an AVCaptureSession?

Nih*_*hao 10 iphone stream avcapturesession

Hey there, I'm trying to access the raw data from the iPhone camera using AVCaptureSession. I'm following the guide provided by Apple (link here).

The raw data from the sample buffer is in YUV format (am I correct here about the raw video frame format?). How do I directly obtain the data for the Y component out of the raw data stored in the sample buffer?

Bra*_*son 20

When setting up the AVCaptureVideoDataOutput that returns the raw camera frames, you can set the format of the frames using code like the following:

[videoOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];

In this case a BGRA pixel format is specified (I used this to match a color format for an OpenGL ES texture). Each pixel in that format has one byte for blue, green, red, and alpha, in that order. Going with this makes it easy to pull out the color components, but you do sacrifice a little performance by needing to do the conversion from the camera-native YUV colorspace.
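For example, a minimal sketch (not from the original answer) of pulling the components out of a locked 32BGRA pixel buffer for a given (x, y); the CoreVideo calls are the standard ones, the helper itself is hypothetical:

// Sketch: read the blue, green, red and alpha bytes of the pixel at (x, y)
// in a kCVPixelFormatType_32BGRA buffer that has already been locked with
// CVPixelBufferLockBaseAddress.
static void ReadBGRAPixel(CVPixelBufferRef pixelBuffer, size_t x, size_t y,
                          unsigned char *b, unsigned char *g,
                          unsigned char *r, unsigned char *a)
{
    unsigned char *base  = (unsigned char *)CVPixelBufferGetBaseAddress(pixelBuffer);
    size_t bytesPerRow   = CVPixelBufferGetBytesPerRow(pixelBuffer);
    unsigned char *pixel = base + y * bytesPerRow + 4 * x; // 4 bytes per BGRA pixel
    *b = pixel[0];
    *g = pixel[1];
    *r = pixel[2];
    *a = pixel[3];
}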

Other supported colorspaces are kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange and kCVPixelFormatType_420YpCbCr8BiPlanarFullRange on newer devices, and kCVPixelFormatType_422YpCbCr8 on the iPhone 3G. The VideoRange or FullRange suffix simply indicates whether the bytes are returned between 16 - 235 for Y and 16 - 240 for UV, or the full 0 - 255 for each component.
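So if all you want is luminance, a sketch of the alternative configuration is simply to swap the constant in the snippet above (on the iPhone 3G you'd have to fall back to kCVPixelFormatType_422YpCbCr8):

// Sketch: request the camera-native bi-planar YUV format so the Y (luminance)
// plane can be read directly, without a YUV -> BGRA conversion.
[videoOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];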

I believe the default colorspace used by an AVCaptureVideoDataOutput instance is the YUV 4:2:0 planar colorspace (except on the iPhone 3G, where it's YUV 4:2:2 interleaved). This means that the video frame contains two planes of image data, with the Y plane coming first. For every pixel in the resulting image there is one byte for the Y value at that pixel.

You can get at this raw Y data by implementing something like this in your delegate callback:

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    unsigned char *rawPixelBase = (unsigned char *)CVPixelBufferGetBaseAddress(pixelBuffer);

    // Do something with the raw pixels here

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
}

You can then figure out the location in the frame data for each X, Y coordinate on the image and pull out the byte that corresponds to the Y component at that coordinate.
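As a minimal sketch of that lookup (not part of the original answer; it uses the per-plane accessors, since in the bi-planar formats the Y bytes live in plane 0):

// Sketch: read the luminance (Y) byte at (x, y) from plane 0 of a locked
// 420YpCbCr8BiPlanar pixel buffer. Rows may be padded, so index with the
// plane's bytes-per-row rather than the image width.
static unsigned char LumaAt(CVPixelBufferRef pixelBuffer, size_t x, size_t y)
{
    unsigned char *yPlane = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
    return yPlane[y * bytesPerRow + x];
}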

Apple's FindMyiCone sample from WWDC 2010 (accessible along with the session videos) shows how to process raw BGRA data from each frame. I also created a sample application, whose code you can download here, that does color-based object tracking using live video from the iPhone camera. Both show how to process raw pixel data, but neither of them works in the YUV colorspace.


Cod*_*odo 18

In addition to Brad's answer and your own code, you want to consider the following:

Since your image has two separate planes, the function CVPixelBufferGetBaseAddress will not return the base address of the plane but rather the base address of an additional data structure. It's probably due to the current implementation that you get an address close enough to the first plane that you can see the image. But that's the reason it's shifted and has garbage at the top left. The correct way to receive the first plane is:

unsigned char *rowBase = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);

A row in the image may be longer than the width of the image (due to rounding). That's why there are separate functions for getting the width and the number of bytes per row. You don't have this problem at the moment, but that might change with the next version of iOS. So your code should be:

size_t bufferHeight = CVPixelBufferGetHeight(pixelBuffer);
size_t bufferWidth  = CVPixelBufferGetWidth(pixelBuffer);
// rows may be padded, so size the copy with the plane's bytes-per-row, not the width
size_t bytesPerRow  = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
size_t size = bufferHeight * bytesPerRow;

unsigned char *pixel = (unsigned char *)malloc(size);

unsigned char *rowBase = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
memcpy(pixel, rowBase, size);
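If you ever need the luminance tightly packed (exactly width bytes per row, no padding), one way is a row-by-row copy; a sketch reusing the variables from the snippet above:

// Sketch: copy the Y plane row by row, dropping any per-row padding, so the
// destination holds exactly bufferWidth * bufferHeight luminance bytes.
unsigned char *packed = (unsigned char *)malloc(bufferWidth * bufferHeight);
for (size_t row = 0; row < bufferHeight; row++) {
    memcpy(packed + row * bufferWidth, rowBase + row * bytesPerRow, bufferWidth);
}
// ... use "packed", then free() it when you're done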

Also note that your code will fail on an iPhone 3G.


Taf*_*soh 7

If you only need the luminance channel, I recommend against using the BGRA format, as it comes with a conversion overhead. Apple suggests using BGRA if you're doing rendering stuff, but you don't need it for extracting the luminance information. As Brad already mentioned, the most efficient format is the camera-native YUV format.

However, extracting the right bytes from the sample buffer is a bit tricky, especially on the iPhone 3G with its interleaved YUV 422 format. So here is my code, which works fine with the iPhone 3G, 3GS, iPod Touch 4 and iPhone 4S.

#pragma mark -
#pragma mark AVCaptureVideoDataOutputSampleBufferDelegate Methods
#if !(TARGET_IPHONE_SIMULATOR)
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection;
{
    // get image buffer reference
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

    // extract needed informations from image buffer
    CVPixelBufferLockBaseAddress(imageBuffer, 0);
    size_t bufferSize = CVPixelBufferGetDataSize(imageBuffer);
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
    CGSize resolution = CGSizeMake(CVPixelBufferGetWidth(imageBuffer), CVPixelBufferGetHeight(imageBuffer));

    // variables for grayscaleBuffer 
    void *grayscaleBuffer = 0;
    size_t grayscaleBufferSize = 0;

    // the pixelFormat differs between iPhone 3G and later models
    OSType pixelFormat = CVPixelBufferGetPixelFormatType(imageBuffer);

    if (pixelFormat == kCVPixelFormatType_422YpCbCr8) { // iPhone 3G
        // kCVPixelFormatType_422YpCbCr8     = '2vuy',
        /* Component Y'CbCr 8-bit 4:2:2, ordered Cb Y'0 Cr Y'1 */

        // copy every second byte (luminance bytes form Y-channel) to new buffer
        grayscaleBufferSize = bufferSize/2;
        grayscaleBuffer = malloc(grayscaleBufferSize);
        if (grayscaleBuffer == NULL) {
            NSLog(@"ERROR in %@:%@:%d: couldn't allocate memory for grayscaleBuffer!", NSStringFromClass([self class]), NSStringFromSelector(_cmd), __LINE__);
            CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
            return; // the delegate method returns void, so no value here
        }
        memset(grayscaleBuffer, 0, grayscaleBufferSize);
        // in '2vuy' the luminance bytes sit at the odd offsets (Cb Y'0 Cr Y'1)
        unsigned char *sourceMemPos = (unsigned char *)baseAddress + 1;
        unsigned char *destinationMemPos = grayscaleBuffer;
        unsigned char *destinationEnd = destinationMemPos + grayscaleBufferSize;
        while (destinationMemPos < destinationEnd) {
            *destinationMemPos = *sourceMemPos;
            destinationMemPos += 1;
            sourceMemPos += 2;
        }
    }

    if (pixelFormat == kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange ||
        pixelFormat == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange) {
        // kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange = '420v',
        // kCVPixelFormatType_420YpCbCr8BiPlanarFullRange  = '420f',
        // Bi-Planar Component Y'CbCr 8-bit 4:2:0, video-range (luma=[16,235] chroma=[16,240]).
        // Bi-Planar Component Y'CbCr 8-bit 4:2:0, full-range (luma=[0,255] chroma=[1,255]).
        // For planar buffers CVPixelBufferGetBaseAddress points to a
        // CVPlanarPixelBufferInfo_YCbCrBiPlanar struct, so fetch the Y data
        // through the plane-0 accessors instead.
        size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 0);
        baseAddress = CVPixelBufferGetBaseAddressOfPlane(imageBuffer,0);
        grayscaleBufferSize = resolution.height * bytesPerRow ;
        grayscaleBuffer = malloc(grayscaleBufferSize);
        if (grayscaleBuffer == NULL) {
            NSLog(@"ERROR in %@:%@:%d: couldn't allocate memory for grayscaleBuffer!", NSStringFromClass([self class]), NSStringFromSelector(_cmd), __LINE__);
            CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
            return; // the delegate method returns void
        }
        memset(grayscaleBuffer, 0, grayscaleBufferSize);
        memcpy (grayscaleBuffer, baseAddress, grayscaleBufferSize); 
    }

    // do whatever you want with the grayscale buffer
    ...

    // clean-up: balance the earlier lock, then release the copy
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    free(grayscaleBuffer);
}
#endif