Jam*_*mes 28 android machine-learning objective-c ios tensorflow
我有一个Android应用程序,模仿Tensorflow Android演示模型,用于分类图像,
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android
原始应用程序使用张量流图(.pb)文件来分类来自Inception v3的一组通用图像(我认为)
然后我按照Tensorflow for Poets博客中的说明,为我自己的图像训练自己的图形,
https://petewarden.com/2016/02/28/tensorflow-for-poets/
在更改设置后,这在Android应用程序中运行得很好,
ClassifierActivity
private static final int INPUT_SIZE = 299;
private static final int IMAGE_MEAN = 128;
private static final float IMAGE_STD = 128.0f;
private static final String INPUT_NAME = "Mul";
private static final String OUTPUT_NAME = "final_result";
private static final String MODEL_FILE = "file:///android_asset/optimized_graph.pb";
private static final String LABEL_FILE = "file:///android_asset/retrained_labels.txt";
Run Code Online (Sandbox Code Playgroud)
为了将应用程序移植到iOS,我使用iOS相机演示, https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/ios/camera
并使用相同的图形文件并更改了设置,
CameraExampleViewController.mm
// If you have your own model, modify this to the file name, and make sure
// you've added the file to your app resources too.
static NSString* model_file_name = @"tensorflow_inception_graph";
static NSString* model_file_type = @"pb";
// This controls whether we'll be loading a plain GraphDef proto, or a
// file created by the convert_graphdef_memmapped_format utility that wraps a
// GraphDef and parameter file that can be mapped into memory from file to
// reduce overall memory usage.
const bool model_uses_memory_mapping = false;
// If you have your own model, point this to the labels file.
static NSString* labels_file_name = @"imagenet_comp_graph_label_strings";
static NSString* labels_file_type = @"txt";
// These dimensions need to match those the model was trained with.
const int wanted_input_width = 299;
const int wanted_input_height = 299;
const int wanted_input_channels = 3;
const float input_mean = 128f;
const float input_std = 128.0f;
const std::string input_layer_name = "Mul";
const std::string output_layer_name = "final_result";
Run Code Online (Sandbox Code Playgroud)
在此之后,该应用程序正在iOS上工作,但是......
Android上的应用程序在检测机密图像时比iOS运行得更好.如果我用图像填充相机的视口,两者都表现相似.但通常情况下,要检测的图像只是摄像头视图端口的一部分,在Android上这似乎没有太大影响,但在iOS上它影响很大,因此iOS无法对图像进行分类.
我的猜测是Android正在裁剪,如果摄像头视图端口到中央299x299区域,那么iOS正在将其摄像机视口缩放到中央299x299区域.
谁能证实这一点?有没有人知道如何修复iOS演示以更好地检测聚焦图像?(让它裁剪)
在演示Android类中,
ClassifierActivity.onPreviewSizeChosen()
rgbFrameBitmap = Bitmap.createBitmap(previewWidth, previewHeight, Config.ARGB_8888);
croppedBitmap = Bitmap.createBitmap(INPUT_SIZE, INPUT_SIZE, Config.ARGB_8888);
frameToCropTransform =
ImageUtils.getTransformationMatrix(
previewWidth, previewHeight,
INPUT_SIZE, INPUT_SIZE,
sensorOrientation, MAINTAIN_ASPECT);
cropToFrameTransform = new Matrix();
frameToCropTransform.invert(cropToFrameTransform);
Run Code Online (Sandbox Code Playgroud)
在iOS上有,
CameraExampleViewController.runCNNOnFrame()
const int sourceRowBytes = (int)CVPixelBufferGetBytesPerRow(pixelBuffer);
const int image_width = (int)CVPixelBufferGetWidth(pixelBuffer);
const int fullHeight = (int)CVPixelBufferGetHeight(pixelBuffer);
CVPixelBufferLockFlags unlockFlags = kNilOptions;
CVPixelBufferLockBaseAddress(pixelBuffer, unlockFlags);
unsigned char *sourceBaseAddr =
(unsigned char *)(CVPixelBufferGetBaseAddress(pixelBuffer));
int image_height;
unsigned char *sourceStartAddr;
if (fullHeight <= image_width) {
image_height = fullHeight;
sourceStartAddr = sourceBaseAddr;
} else {
image_height = image_width;
const int marginY = ((fullHeight - image_width) / 2);
sourceStartAddr = (sourceBaseAddr + (marginY * sourceRowBytes));
}
const int image_channels = 4;
assert(image_channels >= wanted_input_channels);
tensorflow::Tensor image_tensor(
tensorflow::DT_FLOAT,
tensorflow::TensorShape(
{1, wanted_input_height, wanted_input_width, wanted_input_channels}));
auto image_tensor_mapped = image_tensor.tensor<float, 4>();
tensorflow::uint8 *in = sourceStartAddr;
float *out = image_tensor_mapped.data();
for (int y = 0; y < wanted_input_height; ++y) {
float *out_row = out + (y * wanted_input_width * wanted_input_channels);
for (int x = 0; x < wanted_input_width; ++x) {
const int in_x = (y * image_width) / wanted_input_width;
const int in_y = (x * image_height) / wanted_input_height;
tensorflow::uint8 *in_pixel =
in + (in_y * image_width * image_channels) + (in_x * image_channels);
float *out_pixel = out_row + (x * wanted_input_channels);
for (int c = 0; c < wanted_input_channels; ++c) {
out_pixel[c] = (in_pixel[c] - input_mean) / input_std;
}
}
}
CVPixelBufferUnlockBaseAddress(pixelBuffer, unlockFlags);
Run Code Online (Sandbox Code Playgroud)
我认为这个问题在这里,
tensorflow::uint8 *in_pixel =
in + (in_y * image_width * image_channels) + (in_x * image_channels);
float *out_pixel = out_row + (x * wanted_input_channels);
Run Code Online (Sandbox Code Playgroud)
我的理解是,这只是通过选择每个第x个像素而不是将原始图像缩放到299大小来缩放到299大小.因此,这会导致缩放不良和图像识别不良.
解决方案是首先将pixelBuffer缩放到299号.我试过这个,
UIImage *uiImage = [self uiImageFromPixelBuffer: pixelBuffer];
float scaleFactor = (float)wanted_input_height / (float)fullHeight;
float newWidth = image_width * scaleFactor;
NSLog(@"width: %d, height: %d, scale: %f, height: %f", image_width, fullHeight, scaleFactor, newWidth);
CGSize size = CGSizeMake(wanted_input_width, wanted_input_height);
UIGraphicsBeginImageContext(size);
[uiImage drawInRect:CGRectMake(0, 0, newWidth, size.height)];
UIImage *destImage = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
pixelBuffer = [self pixelBufferFromCGImage: destImage.CGImage];
Run Code Online (Sandbox Code Playgroud)
并将图像转换为pixle缓冲区,
- (CVPixelBufferRef) pixelBufferFromCGImage: (CGImageRef) image
{
NSDictionary *options = @{
(NSString*)kCVPixelBufferCGImageCompatibilityKey : @YES,
(NSString*)kCVPixelBufferCGBitmapContextCompatibilityKey : @YES,
};
CVPixelBufferRef pxbuffer = NULL;
CVReturn status = CVPixelBufferCreate(kCFAllocatorDefault, CGImageGetWidth(image),
CGImageGetHeight(image), kCVPixelFormatType_32ARGB, (__bridge CFDictionaryRef) options,
&pxbuffer);
if (status!=kCVReturnSuccess) {
NSLog(@"Operation failed");
}
NSParameterAssert(status == kCVReturnSuccess && pxbuffer != NULL);
CVPixelBufferLockBaseAddress(pxbuffer, 0);
void *pxdata = CVPixelBufferGetBaseAddress(pxbuffer);
CGColorSpaceRef rgbColorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(pxdata, CGImageGetWidth(image),
CGImageGetHeight(image), 8, 4*CGImageGetWidth(image), rgbColorSpace,
kCGImageAlphaNoneSkipFirst);
NSParameterAssert(context);
CGContextConcatCTM(context, CGAffineTransformMakeRotation(0));
CGAffineTransform flipVertical = CGAffineTransformMake( 1, 0, 0, -1, 0, CGImageGetHeight(image) );
CGContextConcatCTM(context, flipVertical);
CGAffineTransform flipHorizontal = CGAffineTransformMake( -1.0, 0.0, 0.0, 1.0, CGImageGetWidth(image), 0.0 );
CGContextConcatCTM(context, flipHorizontal);
CGContextDrawImage(context, CGRectMake(0, 0, CGImageGetWidth(image),
CGImageGetHeight(image)), image);
CGColorSpaceRelease(rgbColorSpace);
CGContextRelease(context);
CVPixelBufferUnlockBaseAddress(pxbuffer, 0);
return pxbuffer;
}
- (UIImage*) uiImageFromPixelBuffer: (CVPixelBufferRef) pixelBuffer {
CIImage *ciImage = [CIImage imageWithCVPixelBuffer: pixelBuffer];
CIContext *temporaryContext = [CIContext contextWithOptions:nil];
CGImageRef videoImage = [temporaryContext
createCGImage:ciImage
fromRect:CGRectMake(0, 0,
CVPixelBufferGetWidth(pixelBuffer),
CVPixelBufferGetHeight(pixelBuffer))];
UIImage *uiImage = [UIImage imageWithCGImage:videoImage];
CGImageRelease(videoImage);
return uiImage;
}
Run Code Online (Sandbox Code Playgroud)
不确定这是否是调整大小的最佳方法,但这有效.但它似乎使图像分类更糟糕,而不是更好......
任何想法或图像转换/调整大小的问题?
由于您未使用YOLO Detector,因此MAINTAIN_ASPECT标志设置为false.因此,Android应用上的图像不会被裁剪,但它会缩放.但是,在提供的代码片段中,我没有看到标志的实际初始化.确认该标志的值实际上false在您的应用中.
我知道这不是一个完整的解决方案,但希望这可以帮助您调试问题.
| 归档时间: |
|
| 查看次数: |
897 次 |
| 最近记录: |