Running TensorFlow on multi-core devices

Chr*_*our 9 java android tensorflow

I have a basic Android TensorFlowInferenceInterface example that runs fine in a single thread.

public class InferenceExample {

    private static final String MODEL_FILE = "file:///android_asset/model.pb";
    private static final String INPUT_NODE = "input_node0";
    private static final String OUTPUT_NODE = "output_node0"; 
    private static final int[] INPUT_SIZE = {1, 8000, 1};
    public static final int CHUNK_SIZE = 8000;
    public static final int STRIDE = 4;
    private static final int NUM_OUTPUT_STATES = 5;

    private static TensorFlowInferenceInterface inferenceInterface;

    public InferenceExample(final Context context) {
        inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), MODEL_FILE);
    }

    public float[] run(float[] data) {

        float[] res = new float[CHUNK_SIZE / STRIDE * NUM_OUTPUT_STATES];

        inferenceInterface.feed(INPUT_NODE, data, INPUT_SIZE[0], INPUT_SIZE[1], INPUT_SIZE[2]);
        inferenceInterface.run(new String[]{OUTPUT_NODE});
        inferenceInterface.fetch(OUTPUT_NODE, res);

        return res;
    }
}

The example crashes with various exceptions, including java.lang.ArrayIndexOutOfBoundsException and java.lang.NullPointerException, when run from a ThreadPool as in the example below, so I guess it is not thread safe.

final InferenceExample inference = new InferenceExample(context);

ExecutorService executor = Executors.newFixedThreadPool(NUMBER_OF_CORES);    
Collection<Future<?>> futures = new LinkedList<Future<?>>();

for (int i = 1; i <= 100; i++) {
    Future<?> result = executor.submit(new Runnable() {
        public void run() {
           inference.run(randomData);
        }
    });
    futures.add(result);
}

for (Future<?> future:futures) {
    try { future.get(); }
    catch(ExecutionException | InterruptedException e) {
        Log.e("TF", e.getMessage());
    }
}
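Since the crashes come from sharing one stateful instance across threads, one pattern worth trying (a sketch, not from the original post; `FakeInference` is a hypothetical stand-in for the TensorFlow wrapper so the example runs without the library) is to give each pool thread its own instance via ThreadLocal:

```java
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalPoolDemo {

    // Hypothetical stand-in for InferenceExample: stateful, so a single
    // shared copy would not be thread safe.
    static class FakeInference {
        private float[] pending;                         // mimics feed()

        float[] run(float[] data) {
            pending = data;                              // feed
            float[] res = new float[pending.length];     // run + fetch
            for (int i = 0; i < pending.length; i++) {
                res[i] = pending[i] * 2f;
            }
            return res;
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();

        // One FakeInference per pool thread: no instance is ever shared.
        ThreadLocal<FakeInference> local = ThreadLocal.withInitial(FakeInference::new);
        Set<FakeInference> created = ConcurrentHashMap.newKeySet();

        ExecutorService executor = Executors.newFixedThreadPool(cores);
        CompletionService<float[]> results = new ExecutorCompletionService<>(executor);

        for (int i = 0; i < 100; i++) {
            results.submit(() -> {
                FakeInference inference = local.get();   // this thread's own copy
                created.add(inference);
                return inference.run(new float[] {1f, 2f, 3f});
            });
        }

        int ok = 0;
        for (int i = 0; i < 100; i++) {
            float[] res = results.take().get();
            if (res[2] == 6f) ok++;
        }
        executor.shutdown();
        System.out.println(ok + " results ok, " + created.size() + " distinct instances");
    }
}
```

Note the trade-off: with the real TensorFlowInferenceInterface, each per-thread instance loads its own copy of the model, so memory use grows with the pool size.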

Is it possible to take advantage of multi-core Android devices with TensorFlowInferenceInterface?

Chr*_*our 1

To make InferenceExample thread safe I changed the TensorFlowInferenceInterface field from static to an instance member and made the run method synchronized:

private TensorFlowInferenceInterface inferenceInterface;

public InferenceExample(final Context context) {
    inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), MODEL_FILE);
}

public synchronized float[] run(float[] data) { ... }
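To see why a single synchronized instance caps parallelism, here is a small self-contained demo (the `SyncWorker` class is hypothetical, not TensorFlow code): an AtomicInteger records how many threads are inside run() at once, and with synchronized that number can never exceed one, no matter how large the thread pool is.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class SynchronizedCapDemo {

    static class SyncWorker {
        private final AtomicInteger active = new AtomicInteger();
        private final AtomicInteger maxActive = new AtomicInteger();

        // synchronized: only one thread may be inside at a time,
        // exactly like the synchronized run() on InferenceExample.
        synchronized float[] run(float[] data) {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            float[] res = new float[data.length];
            for (int i = 0; i < data.length; i++) {
                res[i] = data[i] + 1f;
            }
            active.decrementAndGet();
            return res;
        }

        int observedMaxConcurrency() {
            return maxActive.get();
        }
    }

    public static void main(String[] args) throws Exception {
        SyncWorker worker = new SyncWorker();
        ExecutorService executor = Executors.newFixedThreadPool(8);
        CountDownLatch done = new CountDownLatch(100);

        for (int i = 0; i < 100; i++) {
            executor.submit(() -> {
                worker.run(new float[] {0f, 1f});
                done.countDown();
            });
        }
        done.await();
        executor.shutdown();

        // synchronized guarantees this is 1, regardless of pool size.
        System.out.println("max concurrency: " + worker.observedMaxConcurrency());
    }
}
```

This is what motivates the next step: holding several instances and spreading calls across them.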

Then I round-robin over a list of numThreads InferenceExample instances:

for (int i = 1; i <= 100; i++) {
    final int id = i % numThreads;
    Future<?> result = executor.submit(new Runnable() {
        public void run() {
            list.get(id).run(data);
        }
    });
    futures.add(result);
}
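The `i % numThreads` indexing ties each task to a fixed instance even when that instance is busy and another is idle. A BlockingQueue-based pool (a sketch, not from the original post; `DummyInference` is a hypothetical stand-in for InferenceExample) lets every task borrow whichever instance is free and return it afterwards:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class InferencePoolDemo {

    // Hypothetical stand-in for InferenceExample.
    static class DummyInference {
        float[] run(float[] data) {
            float[] res = new float[data.length];
            for (int i = 0; i < data.length; i++) {
                res[i] = data[i] * 0.5f;
            }
            return res;
        }
    }

    public static void main(String[] args) throws Exception {
        int numThreads = Runtime.getRuntime().availableProcessors();

        // Pre-create the instances once; tasks borrow and return them.
        BlockingQueue<DummyInference> pool = new ArrayBlockingQueue<>(numThreads);
        for (int i = 0; i < numThreads; i++) {
            pool.put(new DummyInference());
        }

        ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        CompletionService<float[]> results = new ExecutorCompletionService<>(executor);

        for (int i = 0; i < 100; i++) {
            results.submit(() -> {
                DummyInference inf = pool.take();  // block until one is free
                try {
                    return inf.run(new float[] {2f, 4f});
                } finally {
                    pool.put(inf);                 // always return it to the pool
                }
            });
        }

        int ok = 0;
        for (int i = 0; i < 100; i++) {
            float[] res = results.take().get();
            if (res[0] == 1f && res[1] == 2f) ok++;
        }
        executor.shutdown();
        System.out.println(ok + " results ok");
    }
}
```

Because no instance is ever used by two threads at once, the synchronized keyword on run() also becomes unnecessary under this scheme.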

This did improve performance, but it peaked at a numThreads of 2, and on an 8-core device Android Studio Monitor showed only about 50% CPU usage, so something beyond the number of instances still appears to limit scaling.