Tag: tensorboard

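TensorFlow GPU profiling

I am training a model with the TF Keras API, and I cannot get full use out of the GPU: it is underutilized in both memory and compute.

When profiling the model, I can see many operations labeled _Send, which I assume are data moving back and forth between the GPU and the CPU.

Since I am using Keras, I am not placing variables on devices directly, so it is not clear to me why this happens or how to optimize it.

Another interesting side effect seems to be that larger batches make training slower, with the GPU waiting a long time to get data from the CPU.

The profiler also suggests: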

59.4 % of the total step time sampled is spent on 'Kernel Launch'. It could be due to CPU contention with tf.data. In this case, you may try to set the environment variable TF_GPU_THREAD_MODE=gpu_private.


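I have set this environment variable at the top of my notebook, but it has no effect, and it is not clear to me how to check whether it is doing what it should.

Any help here would be much appreciated; I have read all the available guides in the TensorFlow documentation.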
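
One thing that can be checked from inside the notebook is whether the variable was set before TensorFlow was loaded. The sketch below assumes that TensorFlow reads TF_GPU_THREAD_MODE when it initializes its GPU devices, so a value set after `import tensorflow` (or in a kernel that has already used the GPU) may be ignored. The helper name `set_gpu_thread_mode` is hypothetical.

```python
import os
import sys


def set_gpu_thread_mode(mode="gpu_private"):
    """Set TF_GPU_THREAD_MODE and report whether it was set early enough.

    Returns True if TensorFlow had not been imported yet when the
    variable was set, False otherwise.
    """
    tensorflow_already_imported = "tensorflow" in sys.modules
    os.environ["TF_GPU_THREAD_MODE"] = mode
    return not tensorflow_already_imported


applied_in_time = set_gpu_thread_mode()
print("TF_GPU_THREAD_MODE =", os.environ["TF_GPU_THREAD_MODE"])
print("set before TensorFlow was imported:", applied_in_time)

# Import TensorFlow only after the variable is in place:
# import tensorflow as tf
```

If the second line prints False, restarting the kernel and running this cell first is the simplest way to rule out ordering as the cause. Whether the setting then changes the 'Kernel Launch' share can only be judged by profiling again.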

tensorflow tensorboard tensorflow-datasets tensorflow2.0

den*_*dog

lucky-day

5
votes

0
answers

996
views

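How do I log metrics (e.g. validation loss) to TensorBoard when using PyTorch Lightning?

I am using PyTorch Lightning to train my model (on GPU devices, using DDP), and TensorBoard is the default logger that Lightning uses.

My code is set up to log the training loss at every training step and the validation loss at every validation step.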

import pytorch_lightning as pl
import torch.nn.functional as F


class MyLightningModel(pl.LightningModule):

    def training_step(self, batch, batch_idx):
        x, labels = batch
        out = self(x)
        loss = F.mse_loss(out, labels)
        self.log("train_loss", loss)  # logged once per training step
        return loss

    def validation_step(self, batch, batch_idx):
        x, labels = batch
        out = self(x)
        loss = F.mse_loss(out, labels)
        self.log("val_loss", loss)  # logged once per validation step
        return loss


TensorBoard correctly plots both the train_loss and val_loss charts in the SCALARS tab.

However, in the HPARAMS tab, only hp_metric is visible under Metrics in the left sidebar.

How can we add train_loss and val_loss to the Metrics section? That way, we could use val_loss in the PARALLEL COORDINATES VIEW instead of hp_metric …

python machine-learning tensorboard pytorch pytorch-lightning

Ath*_*dom

2021 04-05

5
votes

1
answers

2345
views

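Hyperparameter filtering for scalar outputs in TensorBoard

I use TensorBoard with PyTorch to monitor my model's training metrics. When doing hyperparameter optimization, I end up with quite a lot of scalar events (training losses) for various parameter combinations. Is it possible to filter out some of these charts based on hyperparameter values? It seems the hparam feature only lets you log final metrics. Ideally, I would rather not do this by using the comment argument of SummaryWriter and filtering on it with a regex, since it allows only a limited number of characters.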
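
A possible workaround that avoids long run names is to keep the hyperparameters in a small JSON file next to each run and generate the run-filter regex from them. The sketch below uses only the standard library. The file name hparams.json, the helper names and the run directory layout (one subdirectory per run directly under the log root) are assumptions made for illustration.

```python
import json
import re
import tempfile
from pathlib import Path


def save_hparams(run_dir, hparams):
    """Store the hyperparameters of one run next to its event files."""
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "hparams.json").write_text(json.dumps(hparams, sort_keys=True))


def select_runs(log_root, predicate):
    """Return the names of runs whose hyperparameters satisfy predicate."""
    selected = []
    for path in sorted(Path(log_root).glob("*/hparams.json")):
        if predicate(json.loads(path.read_text())):
            selected.append(path.parent.name)
    return selected


def runs_regex(run_names):
    """Build a regex that matches exactly the given run names."""
    return "^(" + "|".join(re.escape(name) for name in run_names) + ")$"


# Hypothetical sweep: three runs with short names and arbitrary hyperparameters.
log_root = Path(tempfile.mkdtemp())
save_hparams(log_root / "run_000", {"lr": 0.1, "batch_size": 32})
save_hparams(log_root / "run_001", {"lr": 0.001, "batch_size": 32})
save_hparams(log_root / "run_002", {"lr": 0.001, "batch_size": 128})

small_lr_runs = select_runs(log_root, lambda hp: hp["lr"] < 0.01)
pattern = runs_regex(small_lr_runs)
print(pattern)
```

Each run would write its scalars with `SummaryWriter(log_dir=...)` pointing at the same run directory, and the printed pattern can be pasted into the run filter box of the SCALARS tab.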

tensorboard pytorch

Yan*_*tha

lucky-day

5
votes

0
answers

199
views

How to visualize Gensim Word2vec embeddings in the TensorBoard projector

Following the gensim word2vec embeddings tutorial, I trained a simple word2vec model:

from gensim.test.utils import common_texts
from gensim.models import Word2Vec
model = Word2Vec(sentences=common_texts, size=100, window=5, min_count=1, workers=4)
model.save("/content/word2vec.model")


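I would like to visualize it using the Embedding Projector in TensorBoard. There is another simple tutorial for this in the gensim documentation. I did the following in Colab: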

!python3 -m gensim.scripts.word2vec2tensor -i /content/word2vec.model -o /content/my_model

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/gensim/scripts/word2vec2tensor.py", line 94, in <module>
    word2vec2tensor(args.input, args.output, args.binary)
  File "/usr/local/lib/python3.7/dist-packages/gensim/scripts/word2vec2tensor.py", line 68, in word2vec2tensor
    model = gensim.models.KeyedVectors.load_word2vec_format(word2vec_model_path, binary=binary) …
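
The traceback shows the script loading its input with KeyedVectors.load_word2vec_format, which reads the original word2vec text or binary format, whereas model.save writes gensim's own format. One option may be to export the vectors first with model.wv.save_word2vec_format and pass that file to the script. Another is to write the two TSV files for the projector directly, as in the standard-library sketch below. The helper name, the output prefix and the toy vectors are hypothetical; with gensim, the words and vectors would come from the trained model's keyed vectors (model.wv).

```python
from pathlib import Path


def write_projector_files(vectors, out_prefix):
    """Write <prefix>_tensor.tsv and <prefix>_metadata.tsv.

    `vectors` maps each word to a sequence of floats of equal length.
    """
    tensor_path = Path(f"{out_prefix}_tensor.tsv")
    metadata_path = Path(f"{out_prefix}_metadata.tsv")
    with tensor_path.open("w", encoding="utf-8") as tensor_file, \
            metadata_path.open("w", encoding="utf-8") as metadata_file:
        for word, vector in vectors.items():
            # One tab-separated vector per line; the matching label sits on
            # the same line number of the metadata file. A single-column
            # metadata file has no header row.
            tensor_file.write("\t".join(str(float(v)) for v in vector) + "\n")
            metadata_file.write(word + "\n")
    return tensor_path, metadata_path


# Hypothetical toy vectors standing in for a trained model's word vectors.
toy_vectors = {"human": [0.1, 0.2, 0.3], "computer": [0.4, 0.5, 0.6]}
tensor_path, metadata_path = write_projector_files(toy_vectors, "my_model")
print(tensor_path, metadata_path)
```

Both files can then be uploaded through the Load button of the Embedding Projector.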


python gensim word2vec tensorflow tensorboard

G. *_*cia

2021 09-20

5
votes

1
answers

769
views

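How can I visualize the graph in TensorBoard without training the model?

I am trying to visualize a model in TensorBoard without training it.

I checked this and that, but it still does not work, even for the simplest model.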

import tensorflow as tf
import tensorflow.keras as keras
# Both tf.__version__ tensorboard.__version__ are 2.5.0

s_model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

logdir = '.../logs'
_callbacks = keras.callbacks.TensorBoard(log_dir=logdir)
_callbacks.set_model(s_model) # This is exactly suggested in the link


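When I run the above, I get the error message:

Graph visualization failed.

Error: Malformed GraphDef. This can sometimes be caused by a bad network connection or difficulty reconciling multiple GraphDefs; for the latter case, please refer to https://github.com/tensorflow/tensorboard/issues/1929.

I do not think this is a reconciliation issue, since there are no custom functions, and if I compile and train the model, I do get the graph visualization I want.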

s_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])

(train_images, train_labels), _ = keras.datasets.fashion_mnist.load_data()
train_images = train_images / 255.0

logdir = '.../logs'
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)

s_model.fit(
    train_images,
    train_labels, …
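
An alternative that does not involve fit() at all is to trace a single forward pass and export the trace with the TF 2.x summary API. This is a sketch of that approach, not a fix for the set_model call above. The log directory name, the wrapper function `forward` and the trace name are hypothetical, and the model is the same small Sequential model, written with an explicit Input layer.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])


@tf.function
def forward(x):
    # Wrapping the call in tf.function gives the tracer a graph to record.
    return model(x)


logdir = "logs/graph_only"  # hypothetical log directory
writer = tf.summary.create_file_writer(logdir)

tf.summary.trace_on(graph=True)      # start recording graphs
forward(tf.zeros((1, 28, 28)))       # one dummy forward pass, no training
with writer.as_default():
    tf.summary.trace_export(name="model_trace", step=0)
writer.flush()
```

Running `tensorboard --logdir logs/graph_only` should then offer the exported trace in the GRAPHS tab.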


graph-visualization keras tensorflow tensorboard

Hye*_*oun

lucky-day

5
votes

0
answers

1056
views

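TensorFlow image classification

I am very new to TensorFlow. I am using my own training database for image classification.

However, after training on my own dataset, I do not know how to classify an input image.

Here is the code I use to prepare my own dataset: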

filenames = ['01.jpg', '02.jpg', '03.jpg', '04.jpg']
label = [0,1,1,1]
filename_queue = tf.train.string_input_producer(filenames)

reader = tf.WholeFileReader()
filename, content = reader.read(filename_queue)
image = tf.image.decode_jpeg(content, channels=3)
image = tf.cast(image, tf.float32)
resized_image = tf.image.resize_images(image, 224, 224)

image_batch , label_batch= tf.train.batch([resized_image,label], batch_size=8, num_threads = 3, capacity=5000)

Is this the correct code for training on the dataset?

After that, I tried to classify input images using the following code.

test = ['test.jpg', 'test2.jpg']
test_queue=tf.train.string_input_producer(test)
reader = tf.WholeFileReader()
testname, test_content = reader.read(test_queue)
test = tf.image.decode_jpeg(test_content, channels=3)
test = tf.cast(test, tf.float32)
resized_image = tf.image.resize_images(test, 224,224)

with tf.Session() as sess:
    coord = …


python image-processing tensorflow tensorboard

VIC*_*TOR

2018 09-18

4
votes

1
answers

9922
views

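TensorBoard cannot read summaries on Google Cloud Storage

When I used TensorBoard with TensorFlow v0.9.0, it could read summaries stored on Google Cloud Storage via tensorboard --logdir=gs://....

However, TensorBoard with TensorFlow v0.11.0 cannot read them. Did anything change between v0.9.0 and v0.11.0?

The error message is as follows.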

W tensorflow/core/platform/cloud/google_auth_provider.cc:151] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Unavailable: libcurl failed with error code 23: Failed writing body (101 != 188)". Retrieving token from GCE failed with "Unavailable: Unexpected response code 0".

python google-cloud-storage tensorflow tensorboard

Shu*_*ara

lucky-day

4
votes

1
answers

1684
views

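TensorFlow: understanding the `collections` argument in tf.summary.scalar

I am working with TensorBoard and tf.summary.scalar. In the documentation it has an argument collections=None, described as follows:

collections: Optional list of graph collections keys. The new summary op is added to these collections. Defaults to [GraphKeys.SUMMARIES].

I do not understand this description, or what collections is used for. Could someone explain it to me, and perhaps point to a good example use case?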
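
A common use case is keeping training and validation summaries apart so that each group can be merged and run separately. The sketch below illustrates this with the graph-mode API exposed as tf.compat.v1 in TensorFlow 2.x, which is an assumption about the installed version. The collection names train_summaries and val_summaries and the placeholder tensors are hypothetical.

```python
import tensorflow as tf

tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    loss = tf1.placeholder(tf.float32, shape=(), name="loss")
    accuracy = tf1.placeholder(tf.float32, shape=(), name="accuracy")

    # Each summary op goes into the named collection instead of the
    # default GraphKeys.SUMMARIES collection.
    tf1.summary.scalar("train_loss", loss, collections=["train_summaries"])
    tf1.summary.scalar("val_accuracy", accuracy, collections=["val_summaries"])

    # merge_all(key=...) merges only the ops stored under that key.
    train_summary_op = tf1.summary.merge_all(key="train_summaries")
    val_summary_op = tf1.summary.merge_all(key="val_summaries")

    # Nothing was added to the default collection, so this returns None.
    default_summary_op = tf1.summary.merge_all()

with tf1.Session(graph=graph) as sess:
    # Running the training group needs only the tensors it depends on.
    serialized_train = sess.run(train_summary_op, feed_dict={loss: 0.5})

print(type(serialized_train), default_summary_op)
```

With the default collection, a single merge_all() would pull in every summary in the graph, and running it would require feeding both placeholders at once.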

python tensorflow tensorboard

Tok*_*rby

lucky-day

4
votes

1
answers

1803
views

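TensorBoard cannot find event files

I am trying to use TensorBoard to visualize an image classifier built with a DNN. I am fairly sure the directory path is correct, but no data is shown. When I run tensorboard --inspect --logdir='PATH/' it returns: No event files found within logdir 'PATH/'

I think there must be something wrong with my code.

Graph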

batch_size = 500

graph = tf.Graph()
with graph.as_default():

  # Input data. For the training data, we use a placeholder that will be fed
  # at run time with a training minibatch.
  with tf.name_scope('train_input'):
    tf_train_dataset = tf.placeholder(tf.float32,
                                      shape=(batch_size, image_size * image_size),
                                      name = 'train_x_input')

    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels),
                                     name = 'train_y_input')
  with tf.name_scope('validation_input'):
    tf_valid_dataset = tf.constant(valid_dataset, name = 'valid_x_input')
    tf_test_dataset = tf.constant(test_dataset, name = 'valid_y_input')

  # Variables.
  with tf.name_scope('layer'):
    with tf.name_scope('weights'):
        weights …
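
Before looking at the graph code, it can help to confirm independently whether any event file exists under the directory passed to --logdir. Event files are only written once the program creates a summary writer that points at that directory. The helper below is a standard-library sketch; the function name is hypothetical, and 'PATH/' is the placeholder from the question.

```python
import os


def find_event_files(logdir):
    """Return all files under logdir whose name contains 'tfevents'."""
    matches = []
    for root, _dirs, files in os.walk(os.path.expanduser(logdir)):
        for name in files:
            if "tfevents" in name:
                matches.append(os.path.join(root, name))
    return sorted(matches)


# Prints nothing if the directory is missing or holds no event files.
for path in find_event_files("PATH/"):
    print(path)
```

An empty result suggests that the writer was never created or was given a different directory than the one passed to TensorBoard.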


deep-learning tensorflow tensorboard

Xiu*_*anc

2017 03-14

4
votes

1
answers

9763
views

How do I use TensorBoard with tf.layers?

Since the weights are not explicitly defined, how can I pass them to the summary writer?

For example:

conv1 = tf.layers.conv2d(
    tf.reshape(X,[FLAGS.batch,3,160,320]),
    filters = 16,
    kernel_size = (8,8),
    strides=(4, 4),
    padding='same',
    kernel_initializer=tf.contrib.layers.xavier_initializer(),
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    name = 'conv1',
    activation = tf.nn.elu
    )
