I want to connect Colab to a paid TPU (upgrading from the free TPU). I created a JSON key using this guide: https://cloud.google.com/docs/authentication/production#auth-cloud-explicit-python, then uploaded it to Colab. I can connect to my storage, but not to the TPU:
%tensorflow_version 2.x
import os

import tensorflow as tf
from google.cloud import storage  # needed for storage.Client below

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = './gcp-permissions.json'

# Authenticated API request - works.
storage_client = storage.Client.from_service_account_json(
    'gcp-permissions.json')
print(list(storage_client.list_buckets()))

# Accessing the TPU - does not work. Request times out.
cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
    tpu='My-TPU-Name',
    zone='us-central1-a',
    project='My-Project-Name'
)
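I also tried the TPUClusterResolver call with only the TPU name and 'credentials=gcp-permissions.json'; the result was the same. I have double-checked that my TPU is up and running in the GCP console. It is not preemptible. What am I missing?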
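Since the request times out rather than failing with a permission error, it may be worth checking whether the notebook machine can open a TCP connection to the TPU endpoint at all. Below is a minimal sketch using only the Python standard library. `can_reach` is a hypothetical helper, the IP address in the usage comment is a placeholder, and port 8470 is assumed to be the TPU gRPC port; none of these come from the setup above.

```python
import socket


def can_reach(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


# Usage (assumed values, replace with the TPU's IP from the GCP console):
#   can_reach("10.0.0.2", 8470)
```

If this returns False from Colab, the timeout is probably a network reachability issue rather than a credentials issue, although this check alone cannot confirm the cause.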
Thanks!
google-cloud-platform google-colaboratory google-cloud-tpu tpu
I am trying to fine-tune a Huggingface Transformers BERT model on a TPU. It works in Colab, but fails when I switch to a paid TPU on GCP. The Jupyter notebook code is as follows:
[1] model = transformers.TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
    # Works.
[2] cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
        tpu='[My TPU]',
        zone='us-central1-a',
        project='[My Project]'
    )
    tf.config.experimental_connect_to_cluster(cluster_resolver)
    tf.tpu.experimental.initialize_tpu_system(cluster_resolver)
    tpu_strategy = tf.distribute.experimental.TPUStrategy(cluster_resolver)
    # Also works. Got a bunch of startup messages from the TPU - all good.
[3] with tpu_strategy.scope():
        model = transformers.TFBertModel.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
    # Generates the error below (long). Same line works in Colab.
Here is the error message:
NotFoundError Traceback (most recent call last)
<ipython-input-14-2cfc1a238903> in <module>
1 with tpu_strategy.scope():
----> 2 model …
google-cloud-platform google-colaboratory google-cloud-tpu bert-language-model huggingface-transformers