ScreenCaptureKit/CVPixelBuffer format produces unexpected results

jn_*_*pdx 7 cgimage swift cvpixelbuffer screencapturekit

I have a project that uses ScreenCaptureKit. For various reasons beyond the scope of this question, I've configured ScreenCaptureKit to use the kCVPixelFormatType_32BGRA format: I need the raw BGRA data, which I manipulate later.

When I construct a CGImage or NSImage from the data, displays and some windows look fine (the full code is included at the bottom of the question; this is just an excerpt of the conversion).

guard let cvPixelBuffer = sampleBuffer.imageBuffer else { return }
CVPixelBufferLockBaseAddress(cvPixelBuffer, .readOnly)
defer { CVPixelBufferUnlockBaseAddress(cvPixelBuffer, .readOnly) }
let vImageBuffer: vImage_Buffer = vImage_Buffer(data: CVPixelBufferGetBaseAddress(cvPixelBuffer),
                                                        height: vImagePixelCount(CVPixelBufferGetHeight(cvPixelBuffer)),
                                                        width: vImagePixelCount(CVPixelBufferGetWidth(cvPixelBuffer)),
                                                        rowBytes: CVPixelBufferGetWidth(cvPixelBuffer) * 4)
        
let cgImageFormat: vImage_CGImageFormat = vImage_CGImageFormat(
                    bitsPerComponent: 8,
                    bitsPerPixel: 32,
                    colorSpace: CGColorSpaceCreateDeviceRGB(),
                    bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue),
                    renderingIntent: .defaultIntent
                )!
if let cgImage: CGImage = try? vImageBuffer.createCGImage(format: cgImageFormat) {
  let nsImage = NSImage(cgImage: cgImage, size: .init(width: CGFloat(cgImage.width), height: CGFloat(cgImage.height)))
  Task { @MainActor in
    self.image = nsImage
  }
}

The resulting image looks reasonable (apart from the incorrect colors, since the incoming data is BGRA and the CGImage expects RGBA; that is handled elsewhere in my project).
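As an aside, the BGRA-to-RGBA swap mentioned above can be done with Accelerate's `vImagePermuteChannels_ARGB8888`. A minimal sketch, assuming a `vImage_Buffer` built as in the excerpt above (the helper name is my own; the function is documented to support in-place operation):

```swift
import Accelerate

// Swap BGRA -> RGBA in place. The permute map indexes the *source*
// channels (destination[i] = source[map[i]]), so [2, 1, 0, 3]
// exchanges the first and third channels and leaves G and A alone.
func permuteBGRAtoRGBA(_ buffer: inout vImage_Buffer) {
    let map: [UInt8] = [2, 1, 0, 3]
    vImagePermuteChannels_ARGB8888(&buffer, &buffer, map, vImage_Flags(kvImageNoFlags))
}
```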

[screenshot: the captured display renders as expected]

However, some windows (not all) come through with very strange distortion and tearing effects. Calendar.app, for example:

[screenshot: Calendar.app capture with heavy distortion and tearing]

And here is Mail.app, which is corrupted to a lesser degree:

[screenshot: Mail.app capture with milder corruption]

As far as I can tell, the CVPixelBuffer format is identical in every case. When I inspect the CVPixelBuffer in the debugger (rather than converting it to a CGImage/NSImage), it displays perfectly in Quick Look, so the actual data isn't corrupted either; there is just something about its format that I don't understand.
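One quick way to compare the buffers beyond Quick Look is to log their layout properties in the stream callback. A small diagnostic sketch (the helper name is my own; the CoreVideo calls are standard):

```swift
import CoreVideo

// Print the buffer's actual row stride next to the naive
// width-times-4 expectation; a mismatch would mean each row
// carries extra bytes beyond the visible pixels.
func logBufferLayout(_ pixelBuffer: CVPixelBuffer) {
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    print("width * 4 =", width * 4, "bytesPerRow =", bytesPerRow)
}
```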

My question:

How can I reliably get RGBA data out of these windows, the way it is always returned for displays?


The full, runnable sample code:


class ScreenCaptureManager: NSObject, ObservableObject {
    @Published var availableWindows: [SCWindow] = []
    @Published var availableDisplays: [SCDisplay] = []
    @Published var image: NSImage?
    private var stream: SCStream?
    private let videoSampleBufferQueue = DispatchQueue(label: "com.sample.VideoSampleBufferQueue")
    
    func getAvailableContent() {
        Task { @MainActor in
            do {
                let availableContent: SCShareableContent = try await SCShareableContent.excludingDesktopWindows(true,
                                                                                                                onScreenWindowsOnly: true)
                self.availableWindows = availableContent.windows
                self.availableDisplays = availableContent.displays
            } catch {
                print(error)
            }
        }
    }
    
    func basicStreamConfig() -> SCStreamConfiguration {
        let streamConfig = SCStreamConfiguration()
        streamConfig.minimumFrameInterval = CMTime(value: 1, timescale: 5)
        streamConfig.showsCursor = true
        streamConfig.queueDepth = 5
        streamConfig.pixelFormat = kCVPixelFormatType_32BGRA
        return streamConfig
    }
    
    func startCaptureForDisplay(display: SCDisplay) {
        Task { @MainActor in
            try? await stream?.stopCapture()
            let filter = SCContentFilter(display: display, including: availableWindows)
            let streamConfig = basicStreamConfig()
            streamConfig.width = Int(display.frame.width * 2)
            streamConfig.height = Int(display.frame.height * 2)
            stream = SCStream(filter: filter, configuration: streamConfig, delegate: self)
            do {
                try stream?.addStreamOutput(self, type: .screen, sampleHandlerQueue: videoSampleBufferQueue)
                try await stream?.startCapture()
            } catch {
                print("ERROR: ", error)
            }
        }
    }
    
    func startCaptureForWindow(window: SCWindow) {
        Task { @MainActor in
            try? await stream?.stopCapture()
            let filter = SCContentFilter(desktopIndependentWindow: window)
            let streamConfig = basicStreamConfig()
            streamConfig.width = Int(window.frame.width * 2)
            streamConfig.height = Int(window.frame.height * 2)
            
            stream = SCStream(filter: filter, configuration: streamConfig, delegate: self)
            do {
                try stream?.addStreamOutput(self, type: .screen, sampleHandlerQueue: videoSampleBufferQueue)
                try await stream?.startCapture()
            } catch {
                print(error)
            }
        }
    }
}

extension ScreenCaptureManager: SCStreamOutput, SCStreamDelegate {
    func stream(_: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of _: SCStreamOutputType) {
        guard let cvPixelBuffer = sampleBuffer.imageBuffer else { return }
        
        print("PixelBuffer", cvPixelBuffer)
        
        CVPixelBufferLockBaseAddress(cvPixelBuffer, .readOnly)

        defer {
            CVPixelBufferUnlockBaseAddress(cvPixelBuffer, .readOnly)
        }
        
        let vImageBuffer: vImage_Buffer = vImage_Buffer(data: CVPixelBufferGetBaseAddress(cvPixelBuffer),
                                                        height: vImagePixelCount(CVPixelBufferGetHeight(cvPixelBuffer)),
                                                        width: vImagePixelCount(CVPixelBufferGetWidth(cvPixelBuffer)),
                                                        rowBytes: CVPixelBufferGetWidth(cvPixelBuffer) * 4)
        
        let cgImageFormat: vImage_CGImageFormat = vImage_CGImageFormat(
                    bitsPerComponent: 8,
                    bitsPerPixel: 32,
                    colorSpace: CGColorSpaceCreateDeviceRGB(),
                    bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue),
                    renderingIntent: .defaultIntent
                )!
        if let cgImage: CGImage = try? vImageBuffer.createCGImage(format: cgImageFormat) {
            let nsImage = NSImage(cgImage: cgImage, size: .init(width: CGFloat(cgImage.width), height: CGFloat(cgImage.height)))
            Task { @MainActor in
                self.image = nsImage
            }
        }
    }

    func stream(_: SCStream, didStopWithError error: Error) {
        print("JN: Stream error", error)
    }
}

struct ContentView: View {
    @StateObject private var screenCaptureManager = ScreenCaptureManager()
    
    var body: some View {
        HStack {
            ScrollView {
                ForEach(screenCaptureManager.availableDisplays, id: \.displayID) { display in
                    HStack {
                        Text("Display: \(display.width) x \(display.height)")
                    }.frame(height: 60).frame(maxWidth: .infinity).border(Color.black).contentShape(Rectangle())
                    .onTapGesture {
                        screenCaptureManager.startCaptureForDisplay(display: display)
                    }
                }
                ForEach(screenCaptureManager.availableWindows.filter { $0.title != nil && !$0.title!.isEmpty }, id: \.windowID) { window in
                    HStack {
                        Text(window.title!)
                    }.frame(height: 60).frame(maxWidth: .infinity).border(Color.black).contentShape(Rectangle())
                    .onTapGesture {
                        screenCaptureManager.startCaptureForWindow(window: window)
                    }
                }
            }
            .frame(width: 200)
            Divider()
            
            if let image = screenCaptureManager.image {
                Image(nsImage: image)
                    .resizable()
                    .aspectRatio(contentMode: .fit)
                    .frame(maxWidth: .infinity, maxHeight: .infinity)
            }
        }
        .frame(width: 800, height: 600, alignment: .leading)
        .onAppear {
            screenCaptureManager.getAvailableContent()
        }
    }
}

Note: I'm aware that displaying the captured content via an NSImage isn't the most efficient way to preview it; it's only there to demonstrate the problem.

jn_*_*pdx 2

ScreenCaptureKit can return CVPixelBuffers (via CMSampleBuffer) that have padding bytes at the end of each row. The problematic line in my code was:

rowBytes: CVPixelBufferGetWidth(cvPixelBuffer) * 4

That line assumes that rowBytes is the image's width multiplied by 4, since in an RGBA format there are four bytes per pixel.

The line should instead be:

rowBytes: CVPixelBufferGetBytesPerRow(cvPixelBuffer)

This value can vary by hardware, as described in Technical Q&A QA1829.
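Put together, the corrected excerpt from the conversion looks like this (the same code as in the question, refactored into a helper of my own naming, with only the rowBytes argument changed):

```swift
import Accelerate
import CoreVideo

// Wrap a locked CVPixelBuffer in a vImage_Buffer, honoring the
// buffer's own stride instead of assuming width * 4 bytes per row.
func makeVImageBuffer(from pixelBuffer: CVPixelBuffer) -> vImage_Buffer {
    vImage_Buffer(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                  height: vImagePixelCount(CVPixelBufferGetHeight(pixelBuffer)),
                  width: vImagePixelCount(CVPixelBufferGetWidth(pixelBuffer)),
                  rowBytes: CVPixelBufferGetBytesPerRow(pixelBuffer))
}
```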

It appears that a hardware-aligned rowBytes yields some efficiency gains when moving pixel data around on the local machine. But when moving the data elsewhere (over a network, for example), the destination will typically expect the extra bytes not to be present, which means you must copy the rows without the padding bytes before transmitting the data.
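That copy can be sketched like this, assuming the CVPixelBuffer's base address is already locked and the format is four bytes per pixel such as 32BGRA (the helper name is my own):

```swift
import CoreVideo
import Foundation

// Copy the pixel rows into tightly packed Data, dropping the
// per-row padding. Assumes the buffer is locked and uses a
// 4-bytes-per-pixel format such as kCVPixelFormatType_32BGRA.
func packedPixelData(from pixelBuffer: CVPixelBuffer) -> Data? {
    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let packedRowBytes = width * 4

    var data = Data(capacity: packedRowBytes * height)
    for row in 0..<height {
        // Each source row starts bytesPerRow apart, but only the
        // first packedRowBytes of it are real pixels.
        let rowStart = base.advanced(by: row * bytesPerRow)
        data.append(Data(bytes: rowStart, count: packedRowBytes))
    }
    return data
}
```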