Posts by Bob*_*ith

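How do I convert a tf.keras model to run on a TPU with TensorFlow 2.0 in Google Colab?

Since TF 2.0 has no tf.contrib layer, how do I convert my model to train on a TPU without access to tf.contrib.tpu.keras_to_tpu_model()?

I tried looking for code, but everything I found runs on TensorFlow 1.x.

My data is in .npy files and I have a simple model. I only call model.compile() and model.fit() to train it, but the model appears to run on the CPU (each epoch takes 30 minutes, versus 2 minutes on a GPU).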
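For readers on TF 2.x, a common replacement for `keras_to_tpu_model()` is to build and compile the model inside a `tf.distribute` TPU strategy scope. The following is a minimal sketch under assumptions, not a verified fix for this exact notebook: `build_compiled_model` is a hypothetical callable that returns a compiled `tf.keras` model, `x` and `y` stand for the arrays loaded from the `.npy` files, and the no-argument `TPUClusterResolver()` is assumed to find the Colab TPU on its own.

```python
def global_batch_size(per_replica_batch, num_replicas):
    """Batch size seen by model.fit when each replica gets per_replica_batch examples."""
    return per_replica_batch * num_replicas


def connect_to_tpu():
    """Connect to the TPU and return a distribution strategy (needs a TPU runtime)."""
    import tensorflow as tf  # imported lazily so this module loads without TF

    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    # Newer TF 2.x releases expose tf.distribute.TPUStrategy; older ones only
    # have the experimental name, so fall back to it when needed.
    strategy_cls = getattr(tf.distribute, "TPUStrategy", None)
    if strategy_cls is None:
        strategy_cls = tf.distribute.experimental.TPUStrategy
    return strategy_cls(resolver)


def train_on_tpu(build_compiled_model, x, y, per_replica_batch=128, epochs=1):
    """Create the model under the strategy scope, then train with model.fit."""
    import tensorflow as tf

    strategy = connect_to_tpu()
    with strategy.scope():
        # Variables must be created inside the scope to be placed on the TPU.
        # build_compiled_model is a hypothetical user-supplied callable.
        model = build_compiled_model()
    batch = global_batch_size(per_replica_batch, strategy.num_replicas_in_sync)
    # drop_remainder=True keeps every batch the same shape.
    dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch, drop_remainder=True)
    model.fit(dataset, epochs=epochs)
    return model
```

With 8 TPU cores and `per_replica_batch=128`, `model.fit` would receive batches of 1024 examples. `from_tensor_slices` copies the arrays into the graph, so very large `.npy` files may need a different input pipeline.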

keras tensorflow google-colaboratory google-cloud-tpu

Leo*_*Leo

2019-04-13

Votes: 5 | Answers: 1 | Views: 2429

RuntimeError: Mixing different tf.distribute.Strategy objects

Hello, I ran into a problem when compiling a model with a TPU. The relevant part of the code is:

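import tensorflow as tf  # tf.contrib exists only in TF 1.x

# TF_MASTER and create_model() are defined elsewhere in the notebook.
resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TF_MASTER)
tf.contrib.distribute.initialize_tpu_system(resolver)
strategy = tf.contrib.distribute.TPUStrategy(resolver)

with strategy.scope():
  model = create_model()
  # metrics must be a list; the posted line had an unbalanced bracket
  model.compile(optimizer=tf.keras.optimizers.Adadelta(),
                loss='categorical_crossentropy',
                metrics=['accuracy'])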


I get a RuntimeError (shown as an image in the original post).

Can you help me?

google-colaboratory google-cloud-tpu tpu

o s*_* sy

2019-09-28

Votes: 5 | Answers: 1 | Views: 844

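Keras: all operations in a TPU model must have a constant shape

I am using a pretrained Keras model and want to run it on a TPU through Google Colaboratory, but I get the following error:

ValueError: Layer has a variable shape in a non-batch dimension. TPU models must have constant shapes for all operations.

You may have to specify `input_length` for RNN/TimeDistributed layers.

Layer:
Input shape: [(None, 128, 768), (None, 1)]
Output shape: (None, None, 768)

I am using keras-xlnet. As I understand it from the explanations here and here, the TPU needs a fixed batch size when the model is compiled.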
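The shapes in the error message point at the offending dimension: the output shape `(None, None, 768)` has `None` in its second position, which is not the batch dimension. As an illustration only, a small helper (hypothetical, not part of Keras or keras-xlnet) can flag such shapes:

```python
def dynamic_non_batch_dims(shape):
    """Return the indices of non-batch dimensions whose size is None (unknown)."""
    # Index 0 is the batch dimension, which is allowed to be None here.
    return [i for i, dim in enumerate(shape) if i > 0 and dim is None]


def find_variable_shapes(shapes):
    """Map each shape with a variable non-batch dimension to the offending indices."""
    result = {}
    for shape in shapes:
        bad = dynamic_non_batch_dims(shape)
        if bad:
            result[tuple(shape)] = bad
    return result


# Shapes copied from the error message above.
input_shapes = [(None, 128, 768), (None, 1)]
output_shape = (None, None, 768)

print(find_variable_shapes(input_shapes))    # inputs are fixed apart from the batch
print(find_variable_shapes([output_shape]))  # dimension 1 of the output is variable
```

Assuming each layer of the loaded model exposes an `output_shape` attribute, the same check could be run over `model.layers` to locate the first layer that produces a variable dimension.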

The model is loaded from a checkpoint:

import os

from keras_xlnet import (
    Tokenizer,
    load_trained_model_from_checkpoint,
    ATTENTION_TYPE_BI,
)

# BATCH_SIZE and SEQ_LEN are defined elsewhere in the notebook.
checkpoint_path = 'xlnet_cased_L-12_H-768_A-12'

tokenizer = Tokenizer(os.path.join(checkpoint_path, 'spiece.model'))
model = load_trained_model_from_checkpoint(
    config_path=os.path.join(checkpoint_path, 'xlnet_config.json'),
    checkpoint_path=os.path.join(checkpoint_path, 'xlnet_model.ckpt'),
    batch_size=BATCH_SIZE,
    memory_len=512,
    target_len=SEQ_LEN,
    in_train_phase=False,
    attention_type=ATTENTION_TYPE_BI,
)
model.summary()


The model is then compiled (after some changes):

from keras_bert import AdamWarmup, calc_train_steps

decay_steps, warmup_steps = calc_train_steps(
    y_train.shape[0],
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    )


model.compile(
    AdamWarmup(decay_steps=decay_steps, warmup_steps=warmup_steps, lr=LR),
    loss='binary_crossentropy',
    )


The model is then loaded onto the TPU, which is where the error occurs:

tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
strategy = tf.contrib.tpu.TPUDistributionStrategy(
    tf.contrib.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
)

with tf.keras.utils.custom_object_scope(get_custom_objects()):
    tpu_model = tf.contrib.tpu.keras_to_tpu_model(model, strategy=strategy) …


keras tensorflow google-colaboratory google-cloud-tpu tpu

che*_*ose

2019-11-04

Votes: 5 | Answers: 1 | Views: 135

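Colab tells me to create a bucket, but where?

When using a TPU on Google Colab (for example in the MNIST example), we are told to create a GCS bucket. However, it does not say where. Without knowing the region/zone of the Colab instance, I am reluctant to create a bucket for fear of running into billing problems.

There are actually several questions:

Is accessing a GCS bucket from Colab free, or do the normal network egress charges apply?
Can I find out the region/zone of the Colab instance? Most likely not.
If the answer to both questions above is "no": is there any way to minimize costs when using a TPU in Colab?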

google-colaboratory google-cloud-tpu

Dav*_*key

2020-02-14

Votes: 4 | Answers: 1 | Views: 715

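TPU slower than GPU?

I just tried using a TPU in Google Colab and wanted to see how much faster the TPU is than the GPU. To my surprise, I got the opposite result.

Here is the NN.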

  random_image = tf.random_normal((100, 100, 100, 3))
  result = tf.layers.conv2d(random_image, 32, 7)
  result = tf.reduce_sum(result)


Performance results:

CPU: 8s
GPU: 0.18s
TPU: 0.50s


I don't know why. ... The complete code for the TPU is as follows:

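import os
import time

import tensorflow as tf  # TF 1.x API (tf.contrib, tf.Session)

# Assumption: tpu_address is not defined in the post; in Colab it is
# usually built from the COLAB_TPU_ADDR environment variable.
tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']

def calc():
  random_image = tf.random_normal((100, 100, 100, 3))
  result = tf.layers.conv2d(random_image, 32, 7)
  result = tf.reduce_sum(result)
  return result

# Replicate calc() across the 8 TPU cores.
tpu_ops = tf.contrib.tpu.batch_parallel(calc, [], num_shards=8)

session = tf.Session(tpu_address)
try:
  print('Initializing global variables...')
  session.run(tf.global_variables_initializer())
  print('Warming up...')
  session.run(tf.contrib.tpu.initialize_system())
  print('Profiling')
  # Only this single, first execution of tpu_ops is timed.
  start = time.time()
  session.run(tpu_ops)
  end = time.time()
  elapsed = end - start
  print(elapsed)
finally:
  session.run(tf.contrib.tpu.shutdown_system())
  session.close()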
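One plausible contributor to the 0.50s figure is that the timed `session.run(tpu_ops)` is the first execution of the op, so the measurement may include one-time costs such as compilation rather than only the computation. The sketch below shows a generic way to separate the two. It uses a stand-in callable (`FakeCompiledOp`, hypothetical) instead of a real TPU op, and the sleep durations are made up for illustration.

```python
import time


def time_steady_state(fn, warmup=1, repeats=5):
    """Call fn `warmup` times untimed, then return the mean seconds over `repeats` calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


class FakeCompiledOp:
    """Stand-in for an op whose first call pays a one-time setup cost."""

    def __init__(self, setup_seconds=0.2, run_seconds=0.01):
        self.setup_seconds = setup_seconds
        self.run_seconds = run_seconds
        self.calls = 0

    def __call__(self):
        if self.calls == 0:
            time.sleep(self.setup_seconds)  # paid only once
        time.sleep(self.run_seconds)
        self.calls += 1


op = FakeCompiledOp()
mean_seconds = time_steady_state(op, warmup=1, repeats=3)
print(f"steady-state mean: {mean_seconds:.3f}s over 3 runs")
```

Applied to the code above, the same helper could be called as `time_steady_state(lambda: session.run(tpu_ops))` inside the `try` block. Whether this changes the comparison for such a small workload would need to be measured.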


gpu machine-learning tensorflow google-colaboratory google-cloud-tpu

fat*_*gon

2018-11-20

Votes: 3 | Answers: 1 | Views: 2854

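Google Colab TPU version

How can I print, in Google Colab, which TPU version I am using and how much memory the TPU has?

With the following code I get the output shown below: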

tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(tpu)
tf.tpu.experimental.initialize_tpu_system(tpu)

tpu_strategy = tf.distribute.experimental.TPUStrategy(tpu)


Output

INFO:tensorflow:Initializing the TPU system: grpc://10.123.109.90:8470
INFO:tensorflow:Initializing the TPU system: grpc://10.123.109.90:8470
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Finished initializing TPU system.
WARNING:absl:`tf.distribute.experimental.TPUStrategy` is deprecated, please use  the non experimental symbol `tf.distribute.TPUStrategy` instead.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Workers: …


python google-colaboratory google-cloud-tpu

作者

2020-11-07

Votes: 2 | Answers: 1 | Views: 2459

Google Colab KeyError: 'COLAB_TPU_ADDR'

I am trying to run a simple MNIST classifier on Google Colab with the TPU option. After creating the model with Keras, I try to convert it for the TPU as follows:

import tensorflow as tf
import os

tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
    )
)
tpu_model.compile(
    optimizer='rmsprop',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)


print(model.summary())


The error I get is:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-5-63c528142aab> in <module>()
      5     model,
      6     strategy=tf.contrib.tpu.TPUDistributionStrategy(
----> 7         tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
      8     )
      9 )

/usr/lib/python3.6/os.py in __getitem__(self, key)
    667         except KeyError:
    668             # raise KeyError with the original key value
--> 669             raise KeyError(key) from None
    670         return self.decodevalue(value)
    671 

KeyError: 'COLAB_TPU_ADDR'


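It looks like I need to change the TPU address, but I have been searching on Google and have not found anything yet. I would appreciate some help, thanks!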
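The traceback shows that `os.environ['COLAB_TPU_ADDR']` raised the `KeyError`, meaning the variable is not set in the running process. In Colab this often indicates that the notebook's runtime type is not set to TPU, though that is an assumption about this particular session. A defensive lookup, sketched below with a hypothetical helper name, makes the situation explicit instead of failing inside the resolver call:

```python
import os


def colab_tpu_address(environ=None):
    """Return 'grpc://<addr>' when COLAB_TPU_ADDR is set, otherwise None."""
    env = os.environ if environ is None else environ
    addr = env.get("COLAB_TPU_ADDR")  # .get() returns None instead of raising KeyError
    return f"grpc://{addr}" if addr else None


address = colab_tpu_address()
if address is None:
    print("COLAB_TPU_ADDR is not set; check that the runtime type is TPU.")
else:
    print("TPU address:", address)
```

The optional `environ` argument exists only so the helper can be exercised with a plain dictionary outside Colab.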

deep-learning google-colaboratory google-cloud-tpu

Nic*_*Ben

2018-11-20

Votes: 1 | Answers: 1 | Views: 1119

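Running the Cloud TPU profiler in a Google Colab environment

I am running a Google Colab notebook and trying to capture TPU profiling data for use in TensorBoard, but I cannot get capture_tpu_profile to run in the background while my TensorFlow code is running.

So far I have tried running the capture process in the background with the following commands: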

!capture_tpu_profile --logdir=gs://<my_logdir> --tpu=$COLAB_TPU_ADDR &


and

!bg capture_tpu_profile --logdir=gs://<my_logdir> --tpu=$COLAB_TPU_ADDR
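One alternative that may be worth trying is to start the tool from Python with `subprocess.Popen`, which returns immediately and leaves the process running while later cells execute. This is a sketch, not a confirmed solution for Colab: the flag names are copied from the commands above, and `build_capture_command` and `start_in_background` are hypothetical helper names.

```python
import subprocess


def build_capture_command(logdir, tpu_addr):
    """Build the argument list for capture_tpu_profile, using the flags from the post."""
    return ["capture_tpu_profile", f"--logdir={logdir}", f"--tpu={tpu_addr}"]


def start_in_background(command):
    """Start `command` without waiting for it to finish and return the Popen handle."""
    # Output is discarded so the child cannot stall on a full, unread pipe.
    return subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

In a notebook this would be used as `proc = start_in_background(build_capture_command("gs://<my_logdir>", os.environ["COLAB_TPU_ADDR"]))` before running the TensorFlow code. `proc.poll()` returns `None` while the capture is still running.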


python tensorflow google-colaboratory google-cloud-tpu

Jan*_*ann

2018-11-20

Votes: 1 | Answers: 1 | Views: 380

Tag statistics

google-cloud-tpu ×8
google-colaboratory ×8
tensorflow ×4
keras ×2
python ×2
tpu ×2
deep-learning ×1
gpu ×1
machine-learning ×1
