Recognizing text from a live video stream with ML Kit (using CMSampleBuffer)

dav*_*ave 6 ios firebase swift firebase-mlkit

I am trying to modify the on-device text recognition sample that Google provides here so that it works with a live camera feed.

When I hold the camera over text (it works fine with the image sample), my console streams the following before eventually running out of memory:

2018-05-16 10:48:22.129901+1200 TextRecognition[32138:5593533] An empty result returned from from GMVDetector for VisionTextDetector.

Here is my video capture method:

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {

    if let textDetector = self.textDetector {

        let visionImage = VisionImage(buffer: sampleBuffer)
        let metadata = VisionImageMetadata()
        metadata.orientation = .rightTop
        visionImage.metadata = metadata

        textDetector.detect(in: visionImage) { (features, error) in
            guard error == nil, let features = features, !features.isEmpty else {
                // Error. You should also check the console for error messages.
                // ...
                return
            }

            // Recognized and extracted text
            print("Detected text has: \(features.count) blocks")
            // ...
        }

    }

}
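
For context (not part of the original question), this delegate is only called if an AVCaptureVideoDataOutput has been configured to deliver frames to it. A minimal setup sketch, with illustrative names such as setupCaptureSession, might look like this:

import AVFoundation
import UIKit

final class CaptureViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {

    private let session = AVCaptureSession()
    private let videoQueue = DispatchQueue(label: "video.output.queue")

    // Illustrative setup: back camera in, video frames out to captureOutput(_:didOutput:from:).
    private func setupCaptureSession() {
        session.sessionPreset = .medium

        guard let camera = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .back),
              let input = try? AVCaptureDeviceInput(device: camera),
              session.canAddInput(input) else { return }
        session.addInput(input)

        let output = AVCaptureVideoDataOutput()
        // Late frames are dropped rather than queued while a previous frame is still being processed.
        output.alwaysDiscardsLateVideoFrames = true
        output.setSampleBufferDelegate(self, queue: videoQueue)
        guard session.canAddOutput(output) else { return }
        session.addOutput(output)

        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        // The text detection code from the question would run here.
    }
}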

Is this the right approach?

Don*_*hen 7

ML Kit has long since moved out of Firebase and become a standalone SDK (migration guide).
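
In practice the migration amounts to swapping the Firebase ML Kit pod and imports for the standalone ones; roughly (pod and module names as I recall them, see the migration guide for your SDK version):

// Before: pod 'Firebase/MLVision' (plus 'Firebase/MLVisionTextModel' for on-device text)
// import FirebaseMLVision

// After: pod 'GoogleMLKit/TextRecognition'
import MLKitVision
import MLKitTextRecognition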

The quickstart sample app in Swift, showing how to recognize text from a live video stream with ML Kit (using CMSampleBuffer), is now available here:

https://github.com/googlesamples/mlkit/tree/master/ios/quickstarts/textrecognition/TextRecognitionExample

The live feed is handled in CameraViewController.swift:

https://github.com/googlesamples/mlkit/blob/master/ios/quickstarts/textrecognition/TextRecognitionExample/CameraViewController.swift
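
Reduced to the CMSampleBuffer path, the approach in that sample looks roughly like the sketch below. This is a simplified paraphrase using the standalone ML Kit API rather than a verbatim excerpt of CameraViewController.swift, and depending on the SDK version the recognizer may be created with or without TextRecognizerOptions; see the linked file for the real orientation handling and frame management.

import AVFoundation
import UIKit
import MLKitVision
import MLKitTextRecognition

// Create the recognizer once and reuse it across frames.
let textRecognizer = TextRecognizer.textRecognizer(options: TextRecognizerOptions())

func recognizeText(in sampleBuffer: CMSampleBuffer, orientation: UIImage.Orientation) {
    let visionImage = VisionImage(buffer: sampleBuffer)
    // In the standalone SDK the orientation is set directly on VisionImage,
    // replacing the old VisionImageMetadata from the Firebase API.
    visionImage.orientation = orientation

    textRecognizer.process(visionImage) { result, error in
        guard error == nil, let result = result else {
            // Inspect `error`; a frame with no text comes back as an empty result, not a failure.
            return
        }
        print("Detected text has \(result.blocks.count) blocks")
    }
}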