Posts by Kat*_*Lee

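Why is a TensorFlow Lite model slower at inference than the regular model?

I have a regular model, and I used tf.lite.TFLiteConverter.from_keras_model_file to convert it to a .tflite model. I then use the interpreter to run inference on images.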

import tensorflow as tf  # TF 1.x API, as used in the question

tf.logging.set_verbosity(tf.logging.DEBUG)
interpreter = tf.lite.Interpreter(model_path=model_path)  # model_path: path to the .tflite file
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]
for image in images:  # images: the input images, read one at a time
    interpreter.set_tensor(input_index, image)
    interpreter.invoke()
    result = interpreter.get_tensor(output_index)


For the regular model, I predict as follows.

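# h5_model_path: path to the .h5 model file; loss: the custom loss function
model = keras.models.load_model(h5_model_path, custom_objects={'loss': loss})
for image in images:  # images: the input images, read one at a time
    result = model.predict(image)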


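However, inference with the .tflite model takes much longer than with the regular model. I also tried post-training quantization on the .tflite model, but that model is the slowest of the three. Does this make sense? Why does this happen? Is there a way to make the TensorFlow Lite model faster than the regular model? Thanks.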
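To compare the two loops above on equal terms, it can help to time only the inference call itself and to exclude the first few calls, which often include one-time setup. The following is a minimal sketch, not part of the original post: the helper name `time_inference` and its parameters `warmup` and `runs` are hypothetical, and it uses only the Python standard library.

```python
import statistics
import time
from typing import Callable, Dict


def time_inference(run_once: Callable[[], object],
                   warmup: int = 3,
                   runs: int = 20) -> Dict[str, float]:
    """Time a zero-argument inference callable, excluding warm-up calls."""
    # Warm-up calls are executed but not measured.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)

    # The median is less sensitive to occasional slow calls than the mean.
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }
```

For the regular model, `run_once` could be `lambda: model.predict(image)`; for the .tflite model, it could be a small function that calls `interpreter.set_tensor(input_index, image)`, `interpreter.invoke()` and `interpreter.get_tensor(output_index)`. Image reading is deliberately left outside the timed call so that both measurements cover the same work.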

tensorflow tensorflow-lite

Kat*_*Lee


Votes: 8 | Answers: 0 | Views: 5648

Why does the GPU take longer than the CPU to process the first video frame in Caffe?

I tried running the code from the SegNet web demo. When I use

caffe.set_mode_cpu()


processing takes about 9 seconds per frame on average.

However, when I use

caffe.set_mode_gpu()


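processing the first frame takes about 107 seconds, and from the second frame onward it runs at about 5 frames per second.

Why does the GPU take longer than the CPU to process the first frame? Is there a way to speed up processing of the first frame?

Thanks! I really need help.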

cpu gpu caffe

Kat*_*Lee

2017-02-22

Votes: 6 | Answers: 0 | Views: 179

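What does "Flex Op" mean in TensorFlow?

While studying TensorFlow Lite, I found that custom ops are exported as "flex ops" rather than as native ops. I don't understand what a "flex op" is or what "native" means. Thank you!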

tensorflow tensorflow-lite

Kat*_*Lee


Votes: 5 | Answers: 1 | Views: 2082

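CMake linking a shared library: "No such file or directory" when including a header file from the library

I am learning to build a library with CMake. The code for the library is structured as follows: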

include:  
   Test.hpp       
   ITest.hpp     // interface
src:
   Test.cpp
   ITest.cpp


In CMakeLists.txt, the statements I use to build the library are:

file(GLOB SRC_LIST "src/iTest.cpp" "src/Test.cpp" "include/Test.hpp"
        "include/iTest.hpp"  "include/deadreckoning.hpp")
add_library(test SHARED ${SRC_LIST})
target_link_libraries(test ${OpenCV_LIBS})  # link the OpenCV libs into libtest.so


Then I wrote a test file (main.cpp), copied the library into the same directory, linked against it, and called functions from it. The CMakeLists.txt for this test is

cmake_minimum_required(VERSION 2.8)
project(myapp)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -pthread -O3 -Wall -ftree-vectorize -ffast-math -funroll-loops")

add_executable(myapp main.cpp)
target_link_libraries(myapp "/home/labUser/test_lib/libtest.so")


If I do not include the library's header file, main.cpp compiles and runs successfully:

#include <iostream>
using namespace std;

int main(){
    cout << "hello world" << endl;
    return -1;
}


But when I include the header file with #include "ITest.hpp", I get an error:

fatal error: iTest.hpp: No such file or …


c++ cmake shared-libraries

Kat*_*Lee


Votes: 4 | Answers: 1 | Views: 9459

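ARM NEON: convert a binary 8-bit-per-pixel image (only 0/1) to 1 bit per pixel?

I am working on a task to convert a large binary label image, which has 8 bits (uint8_t) per pixel and where each pixel can only be 0 or 1 (or 255), into an array of uint64_t numbers, where each bit of a uint64_t represents one label pixel.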

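For example,

input array: 0 1 1 0 ... (00000000 00000001 00000001 00000000 ...)

or input array: 0 255 255 0 ... (00000000 11111111 11111111 00000000 ...)

output array (numbers): 6 (because after converting each uint8_t to a single bit, it becomes 0110)

The C code I currently use to do this is: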

for (int j = 0; j < width >> 6; j++) {
    uint8_t* in_ptr= in + (j << 6);
    uint64_t out_bits = 0;
    if (in_ptr[0]) out_bits |= 0x0000000000000001;
 …
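As a reference for checking any faster implementation, the packing described above can be sketched in plain Python. This is an illustrative scalar version, not the author's code and not a NEON implementation; the function name `pack_bits` is hypothetical. It assumes, as the visible part of the C code suggests, that pixel 0 of each 64-pixel group maps to the least significant bit, and that a trailing group of fewer than 64 pixels is ignored (mirroring `width >> 6`).

```python
from typing import List


def pack_bits(pixels: bytes) -> List[int]:
    """Pack 0/nonzero bytes into 64-bit words, pixel i of a group -> bit i."""
    words = []
    # Only complete groups of 64 pixels are packed.
    full = (len(pixels) // 64) * 64
    for start in range(0, full, 64):
        word = 0
        for bit, value in enumerate(pixels[start:start + 64]):
            if value:  # treats 1 and 255 (any nonzero value) as a set pixel
                word |= 1 << bit
        words.append(word)
    return words
```

Under these assumptions, a group that starts with the pixels 0 1 1 0 (or 0 255 255 0) followed by zeros packs to the value 6, which matches the example in the question.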


arm neon

Kat*_*Lee


Votes: 2 | Answers: 1 | Views: 461