TensorFlow: Is there a way to measure FLOPS for a model?

Question

The closest example I could find is this issue: https://github.com/tensorflow/tensorflow/issues/899

With this minimal reproducible code:

import tensorflow as tf
import tensorflow.python.framework.ops as ops 
g = tf.Graph()
with g.as_default():
  A = tf.Variable(tf.random_normal( [25,16] ))
  B = tf.Variable(tf.random_normal( [16,9] ))
  C = tf.matmul(A,B) # shape=[25,9]
for op in g.get_operations():
  flops = ops.get_stats_for_node_def(g, op.node_def, 'flops').value
  if flops is not None:
    print('Flops should be ~', 2*25*16*9)
    print('25 x 25 x 9 would be', 2*25*25*9)  # ignores internal dim, repeats first
    print('TF stats gives', flops)

However, the returned FLOPS value is always None. Is there a way to measure FLOPS concretely, especially from a PB file?

Answer 1

BiB*_*iBi 14

I would like to build on Tobias Schnek's answer and also answer the original question: how to get the FLOPs from a pb file.

Running the first snippet from Tobias's answer with TensorFlow 1.6.0

import tensorflow as tf

g = tf.Graph()
run_meta = tf.RunMetadata()
with g.as_default():
    A = tf.Variable(tf.random_normal([25,16]))
    B = tf.Variable(tf.random_normal([16,9]))
    C = tf.matmul(A,B)

    opts = tf.profiler.ProfileOptionBuilder.float_operation()    
    flops = tf.profiler.profile(g, run_meta=run_meta, cmd='op', options=opts)
    if flops is not None:
        print('Flops should be ~',2*25*16*9)
        print('TF stats gives',flops.total_float_ops)

We get the following output:

Flops should be ~ 7200
TF stats gives 8288

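So why do we get 8288 instead of the expected result 7200 = 2*25*16*9 [a]? The answer lies in how the tensors A and B are initialized. Initializing them from a Gaussian distribution costs some FLOPs. Changing the definitions of A and B to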

    A = tf.Variable(initial_value=tf.zeros([25, 16]))
    B = tf.Variable(initial_value=tf.zeros([16, 9]))

gives the expected output of 7200.
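The gap between 8288 and 7200 can be checked by hand. The following is only a sketch of the arithmetic: it assumes that each element of a random-normal initializer is charged two floating-point operations. That per-element cost is my assumption, not something the profiler output states; only the totals 8288 and 7200 come from the runs above.

```python
# Shapes from the example: A is [m, p], B is [p, q].
m, p, q = 25, 16, 9

# FLOPs that TensorFlow reports for the matmul itself (2 * m * p * q).
matmul_flops = 2 * m * p * q

# ASSUMPTION: 2 FLOPs per element of each random-normal initializer.
init_elements = m * p + p * q      # elements in A plus elements in B
init_flops = 2 * init_elements

total = matmul_flops + init_flops
print(matmul_flops, init_flops, total)  # 7200 1088 8288
```

Under that assumption the initialization overhead accounts for the whole difference, which is consistent with the zero-initialized version reporting exactly 7200.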

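Usually, a network's variables are initialized from Gaussian distributions, among other schemes. Most of the time we are not interested in the initialization FLOPs, since they are spent once during initialization and occur neither during training nor during inference. So how can we get the exact number of FLOPs while disregarding the initialization FLOPs?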

Freeze the graph to a pb file. Computing the FLOPs from a pb file was, in fact, the OP's use case.

The following snippet illustrates this:

import tensorflow as tf
from tensorflow.python.framework import graph_util

def load_pb(pb):
    with tf.gfile.GFile(pb, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        return graph

# ***** (1) Create Graph *****
g = tf.Graph()
sess = tf.Session(graph=g)
with g.as_default():
    A = tf.Variable(initial_value=tf.random_normal([25, 16]))
    B = tf.Variable(initial_value=tf.random_normal([16, 9]))
    C = tf.matmul(A, B, name='output')
    sess.run(tf.global_variables_initializer())
    flops = tf.profiler.profile(g, options = tf.profiler.ProfileOptionBuilder.float_operation())
    print('FLOP before freezing', flops.total_float_ops)
# *****************************        

# ***** (2) freeze graph *****
output_graph_def = graph_util.convert_variables_to_constants(sess, g.as_graph_def(), ['output'])

with tf.gfile.GFile('graph.pb', "wb") as f:
    f.write(output_graph_def.SerializeToString())
# *****************************


# ***** (3) Load frozen graph *****
g2 = load_pb('./graph.pb')
with g2.as_default():
    flops = tf.profiler.profile(g2, options = tf.profiler.ProfileOptionBuilder.float_operation())
    print('FLOP after freezing', flops.total_float_ops)

Output:

FLOP before freezing 8288
FLOP after freezing 7200

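[a] Usually the FLOP count of a matrix multiplication is mq(2p - 1) for the product AB, where A has shape [m, p] and B has shape [p, q], but TensorFlow returns 2mpq for some reason. An issue has been opened to understand why.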
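To make the two formulas in the footnote concrete, here is a small sketch that evaluates both for the shapes used in this answer. The function names are hypothetical helpers written for illustration; they are not part of TensorFlow.

```python
def matmul_flops_exact(m, p, q):
    """mq(2p - 1): each of the m*q outputs needs p multiplies and p - 1 adds."""
    return m * q * (2 * p - 1)


def matmul_flops_reported(m, p, q):
    """2mpq: the count that the profiler returns in the runs above."""
    return 2 * m * p * q


m, p, q = 25, 16, 9
exact = matmul_flops_exact(m, p, q)
reported = matmul_flops_reported(m, p, q)
print(exact, reported, reported - exact)  # 6975 7200 225
```

The difference is exactly m*q = 225, that is, one extra addition per output element.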

Answer 2

小智 12

A bit late, but maybe it will help some visitors in the future. For your example, I successfully tested the following snippet:

import tensorflow as tf

g = tf.Graph()
run_meta = tf.RunMetadata()
with g.as_default():
    A = tf.Variable(tf.random_normal( [25,16] ))
    B = tf.Variable(tf.random_normal( [16,9] ))
    C = tf.matmul(A,B) # shape=[25,9]

    opts = tf.profiler.ProfileOptionBuilder.float_operation()    
    flops = tf.profiler.profile(g, run_meta=run_meta, cmd='op', options=opts)
    if flops is not None:
        print('Flops should be ~',2*25*16*9)
        print('25 x 25 x 9 would be',2*25*25*9) # ignores internal dim, repeats first
        print('TF stats gives',flops.total_float_ops)

The profiler can also be used together with Keras, as in the following snippet:

import tensorflow as tf
import keras.backend as K
from keras.applications.mobilenet import MobileNet

run_meta = tf.RunMetadata()
with tf.Session(graph=tf.Graph()) as sess:
    K.set_session(sess)
    net = MobileNet(alpha=.75, input_tensor=tf.placeholder('float32', shape=(1,32,32,3)))

    opts = tf.profiler.ProfileOptionBuilder.float_operation()    
    flops = tf.profiler.profile(sess.graph, run_meta=run_meta, cmd='op', options=opts)

    opts = tf.profiler.ProfileOptionBuilder.trainable_variables_parameter()    
    params = tf.profiler.profile(sess.graph, run_meta=run_meta, cmd='op', options=opts)

    print("{:,} --- {:,}".format(flops.total_float_ops, params.total_parameters))

I hope I could help!

The first snippet outputs `Flops should be ~ 7200` and `TF stats gives 8288`. Why is there this discrepancy? I built on this answer to explain it. (2 upvotes)

Answer 3

小智 5

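The approaches above no longer work in TF 2.0, because the profiler methods have been deprecated and moved to compat.v1. It seems this feature still needs to be implemented.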

Here is an issue on GitHub: https://github.com/tensorflow/tensorflow/issues/32809

Asked: 8 years, 6 months ago
Viewed: 9,515 times
Last activity: 6 years, 2 months ago