Combining CoreML and ARKit

Swi*_*bit 5 swift ios11 xcode9-beta arkit coreml

I'm trying to combine CoreML and ARKit in my project, using the InceptionV3 model provided on Apple's website.

I started from the standard ARKit template (Xcode 9 beta 3).

Instead of setting up a new camera session, I reuse the session started by the ARSCNView.

At the end of my viewDelegate, I write:

sceneView.session.delegate = self

I then extend my ViewController to conform to the ARSessionDelegate protocol (an optional protocol):

// MARK: ARSessionDelegate
extension ViewController: ARSessionDelegate {

    func session(_ session: ARSession, didUpdate frame: ARFrame) {

        do {
            let prediction = try self.model.prediction(image: frame.capturedImage)
            DispatchQueue.main.async {
                if let prob = prediction.classLabelProbs[prediction.classLabel] {
                    self.textLabel.text = "\(prediction.classLabel) \(String(describing: prob))"
                }
            }
        }
        catch let error as NSError {
            print("Unexpected error ocurred: \(error.localizedDescription).")
        }
    }
}

At first I tried that code, but then noticed that the InceptionV3 model requires a pixel buffer of type Image <RGB, 299, 299>.
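
For reference, the expected input can be checked from the generated model class itself; a minimal sketch, assuming Xcode generated an Inceptionv3 class from Apple's model download:

import CoreML

// Minimal sketch: print what the generated model class expects as input.
// "Inceptionv3" is assumed to be the class Xcode generated from Apple's InceptionV3.mlmodel.
let modelDescription = Inceptionv3().model.modelDescription
for (name, feature) in modelDescription.inputDescriptionsByName {
    if let constraint = feature.imageConstraint {
        // Prints e.g. the 299x299 size and the required pixel format type.
        print(name, constraint.pixelsWide, constraint.pixelsHigh, constraint.pixelFormatType)
    }
}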

Rather than starting over from scratch, I figured I would just resize the frame and then try to get a prediction out of it. I'm resizing with this function (taken from https://github.com/yulingtianxia/Core-ML-Sample):

func resize(pixelBuffer: CVPixelBuffer) -> CVPixelBuffer? {
    let imageSide = 299
    var ciImage = CIImage(cvPixelBuffer: pixelBuffer, options: nil)
    // Scale the captured frame down to 299x299, then crop to a square.
    let scaleX = CGFloat(imageSide) / CGFloat(CVPixelBufferGetWidth(pixelBuffer))
    let scaleY = CGFloat(imageSide) / CGFloat(CVPixelBufferGetHeight(pixelBuffer))
    ciImage = ciImage.transformed(by: CGAffineTransform(scaleX: scaleX, y: scaleY))
        .cropped(to: CGRect(x: 0, y: 0, width: imageSide, height: imageSide))
    // Render the scaled image into a new pixel buffer with the same pixel format as the input.
    let ciContext = CIContext()
    var resizeBuffer: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault, imageSide, imageSide, CVPixelBufferGetPixelFormatType(pixelBuffer), nil, &resizeBuffer)
    ciContext.render(ciImage, to: resizeBuffer!)
    return resizeBuffer
}
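
The plan was then to feed the resized buffer to the model inside the delegate callback; roughly, a sketch of the adjusted call (error handling trimmed):

// Sketch: run the resize(pixelBuffer:) helper above before asking for a prediction.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    guard let resizedBuffer = resize(pixelBuffer: frame.capturedImage),
          let prediction = try? self.model.prediction(image: resizedBuffer) else { return }
    DispatchQueue.main.async {
        self.textLabel.text = prediction.classLabel
    }
}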

Unfortunately, that wasn't enough to make it work. Here's the error that gets caught:

Unexpected error ocurred: Input image feature image does not match model description.
2017-07-20 AR+MLPhotoDuplicatePrediction[928:298214] [core] 
    Error Domain=com.apple.CoreML Code=1 
    "Input image feature image does not match model description" 
    UserInfo={NSLocalizedDescription=Input image feature image does not match model description, 
    NSUnderlyingError=0x1c4a49fc0 {Error Domain=com.apple.CoreML Code=1 
    "Image is not expected type 32-BGRA or 32-ARGB, instead is Unsupported (875704422)" 
    UserInfo={NSLocalizedDescription=Image is not expected type 32-BGRA or 32-ARGB, instead is Unsupported (875704422)}}}

I'm not sure what to do from here.

If there's a better suggestion for combining the two, I'm all ears.

Edit: I also tried the resizePixelBuffer method from YOLO-CoreML-MPSNNGraph, suggested by @dfd, and the error is exactly the same.

Edit 2: So I changed the pixel format to kCVPixelFormatType_32BGRA (instead of passing along the format of the pixelBuffer handed to resizePixelBuffer).

let pixelFormat = kCVPixelFormatType_32BGRA // line 48
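
Applied to the resize helper shown earlier, the change sits roughly here (a sketch; only the destination buffer's pixel format differs):

// Sketch: create the destination buffer as 32BGRA instead of copying the source format.
let pixelFormat = kCVPixelFormatType_32BGRA
var resizeBuffer: CVPixelBuffer?
CVPixelBufferCreate(kCFAllocatorDefault, imageSide, imageSide, pixelFormat, nil, &resizeBuffer)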

I no longer get the error. But as soon as I try to make a prediction, the AVCaptureSession stops. It seems I'm running into the same issue Enric_SA describes on the Apple Developer forums.

Edit 3: So I tried implementing rickster's solution. It works well with InceptionV3. I then wanted to try a feature observation (VNClassificationObservation). At the moment it's not working with TinyYOLO; the bounding boxes are wrong. Trying to figure it out.

ric*_*ter 5

Don't process images yourself to feed them to Core ML. Use Vision. (No, not that one. This one.) Vision takes an ML model and any of several image types (CVPixelBuffer included) and automatically gets the image to the right size, aspect ratio, and pixel format for the model to evaluate, then gives you the model's results.

Here's a rough skeleton of the code you'd need:

var request: VNRequest!

func setup() {
    // Rough skeleton: handle errors properly instead of force-trying in real code.
    let model = try! VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
    request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
}

func classifyARFrame() {
    guard let frame = session.currentFrame else { return }
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
        orientation: .up, // fix based on your UI orientation
        options: [:])
    try? handler.perform([request])
}

func myResultsMethod(request: VNRequest, error: Error?) {
    guard let results = request.results as? [VNClassificationObservation]
        else { fatalError("huh") }
    for classification in results {
        print(classification.identifier, // the scene label
              classification.confidence)
    }
}
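
One possible way to drive this from the ARSessionDelegate callback in the question, sketched under the assumption that the Vision work is moved off the session's delegate callback so it doesn't stall the session (the queue label and the simple in-flight flag are illustrative, not from the skeleton above):

// Sketch: call the Vision request from the ARSession delegate, one frame at a time.
let visionQueue = DispatchQueue(label: "vision.requests") // illustrative label
var isProcessing = false

func session(_ session: ARSession, didUpdate frame: ARFrame) {
    guard !isProcessing else { return } // drop frames while a request is in flight
    isProcessing = true
    let pixelBuffer = frame.capturedImage
    visionQueue.async {
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                            orientation: .up, // fix for your UI orientation
                                            options: [:])
        try? handler.perform([self.request])
        self.isProcessing = false
    }
}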

See also the answer to another question for more pointers.