I am trying to import weights saved from a Tensorflow model into PyTorch. So far the results have been very similar. I ran into trouble when the model calls for conv2d with stride=2.
To verify the mismatch, I set up a very simple comparison between TF and PyTorch. First I compare conv2d with stride=1.
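For context on where stride=2 usually diverges: TF's "SAME" padding pads asymmetrically (the extra row/column goes on the bottom/right) when the total padding needed is odd, while PyTorch's padding= argument always pads symmetrically. A minimal sketch of emulating "SAME" in PyTorch with an explicit F.pad (the helper name conv2d_same is mine, not from the snippet below):

```python
import torch
import torch.nn.functional as F

def conv2d_same(x, w, stride=2):
    # Emulate TF "SAME" padding: compute the total padding per dimension
    # and put the odd leftover row/column on the bottom/right.
    in_h, in_w = x.shape[2:]
    k_h, k_w = w.shape[2:]
    out_h = -(-in_h // stride)  # ceil division
    out_w = -(-in_w // stride)
    pad_h = max((out_h - 1) * stride + k_h - in_h, 0)
    pad_w = max((out_w - 1) * stride + k_w - in_w, 0)
    # F.pad order for 4-D input: (left, right, top, bottom)
    x = F.pad(x, (pad_w // 2, pad_w - pad_w // 2,
                  pad_h // 2, pad_h - pad_h // 2))
    return F.conv2d(x, w, stride=stride)

x = torch.randn(1, 3, 10, 10)
w = torch.randn(8, 3, 3, 3)
print(conv2d_same(x, w).shape)  # torch.Size([1, 8, 5, 5])
```

With stride=1 the total padding is even, both frameworks pad identically, and the outputs match; with stride=2 on a 10x10 input the padding is odd, which is where the symmetric-padding comparison below breaks down.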
import tensorflow as tf
import numpy as np
import torch
import torch.nn.functional as F
np.random.seed(0)
sess = tf.Session()
# Create random weights and input
weights = torch.empty(3, 3, 3, 8)
torch.nn.init.constant_(weights, 5e-2)
x = np.random.randn(1, 3, 10, 10)
weights_tf = tf.convert_to_tensor(weights.numpy(), dtype=tf.float32)
# PyTorch adopts [outputC, inputC, kH, kW]
weights_torch = torch.Tensor(weights.permute((3, 2, 0, 1)))
# Tensorflow defaults to NHWC
x_tf = tf.convert_to_tensor(x.transpose((0, 2, 3, 1)), dtype=tf.float32)
x_torch = …

I have seen two ways to save the weights of a keras model.
The first way:
checkpointer = ModelCheckpoint(filepath="weights.hdf5", verbose=1, save_best_only=True)
model.fit(x_train, y_train,
          nb_epoch=number_of_epoch,
          batch_size=128,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[reduce_lr, checkpointer],
          shuffle=True)
The second way:
model.save_weights("model_weights.h5")
What is the difference between the two ways? And is there any difference in prediction performance between loading weights.hdf5 and loading model_weights.h5?
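For what it's worth, both files hold plain weight snapshots in the same HDF5 layout and are loaded the same way; the difference is only which moment of training was captured. With save_best_only=True, weights.hdf5 holds the weights from the epoch with the best validation loss, while model.save_weights() writes whatever the weights are at the time of the call (typically after the final epoch). A small round-trip sketch, assuming TF 2.x tf.keras and a toy architecture of my own (the .weights.h5 suffix is used because newer Keras versions require it; plain .h5 behaves the same in older tf.keras):

```python
import numpy as np
from tensorflow import keras

# toy stand-in architecture (assumed, for illustration only)
def build():
    inp = keras.Input(shape=(3,))
    out = keras.layers.Dense(1)(keras.layers.Dense(4, activation="relu")(inp))
    return keras.Model(inp, out)

m = build()
m.save_weights("model_weights.weights.h5")   # snapshot of the current weights

m2 = build()                                 # must match the saved architecture
m2.load_weights("model_weights.weights.h5")  # same call for either file

x = np.random.randn(2, 3).astype("float32")
print(np.allclose(m.predict(x), m2.predict(x)))  # True
```

So the two approaches differ in *which* weights you end up loading (best-on-validation vs. final), not in the format or the loading mechanics.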
When executing the following lines,
!pip install kaggle
!kaggle competitions download -c dogs-vs-cats -p /content/
I get the following error message:
Traceback (most recent call last):
File "/usr/local/bin/kaggle", line 7, in <module>
from kaggle.cli import main
File "/usr/local/lib/python3.6/dist-packages/kaggle/__init__.py", line 23, in <module>
api.authenticate()
File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 109, in authenticate
self._load_config(config_data)
File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 151, in _load_config
raise ValueError('Error: Missing %s in configuration.' % item)
ValueError: Error: Missing username in configuration.
I have no idea what just happened... the same lines worked fine before. This is the first time I've run into this problem.
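For reference, this ValueError comes from the kaggle package failing to find API credentials: it reads them from ~/.kaggle/kaggle.json (or from the KAGGLE_USERNAME / KAGGLE_KEY environment variables). A sketch of the setup, with placeholder credentials:

```shell
# Download an API token from kaggle.com ("Create New API Token" under
# your account settings), then place it where the CLI looks for it:
mkdir -p ~/.kaggle
cat > ~/.kaggle/kaggle.json <<'EOF'
{"username": "YOUR_KAGGLE_USERNAME", "key": "YOUR_API_KEY"}
EOF
chmod 600 ~/.kaggle/kaggle.json  # the CLI warns if the file is world-readable
```

On Colab specifically, the runtime can be recycled between sessions, which would explain lines that "worked before" suddenly failing: the credentials file has to be recreated in each fresh runtime.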
When saving a model in Pytorch with:
torch.save(model, 'checkpoint.pth')
I get the following warning:
/opt/conda/lib/python3.6/site-packages/torch/serialization.py:193: UserWarning: Couldn't retrieve source code for container of type Network. It won't be checked for correctness upon loading.
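For context: torch.save(model, ...) pickles the whole model object, so loading requires the Network class to be importable, and this warning means torch could not capture the class source to verify it later. The commonly recommended alternative is to save only the state_dict and rebuild the model from code; a minimal sketch with a stand-in module of my own (the real Network architecture is not shown in the question):

```python
import torch
import torch.nn as nn

# stand-in for the Network class from the question (assumed architecture)
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = Network()
torch.save(model.state_dict(), 'checkpoint_state_dict.pth')  # weights only

model2 = Network()  # the class definition must be available at load time
model2.load_state_dict(torch.load('checkpoint_state_dict.pth'))
model2.eval()
```

Saved this way, the checkpoint contains only tensors, so it loads regardless of how the class source was defined when it was saved.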
And when I load it, I get the following error:
state_dict = torch.load('checkpoint_state_dict.pth')
model = torch.load('checkpoint.pth')
model.load_state_dict(state_dict)
AttributeError Traceback (most recent call last)
<ipython-input-2-6a79854aef0f> in <module>()
2 state_dict = torch.load('checkpoint_state_dict.pth')
3 model = 0
----> 4 model = torch.load('checkpoint.pth')
5 model.load_state_dict(state_dict)
/opt/conda/lib/python3.6/site-packages/torch/serialization.py in load(f, map_location, pickle_module)
301 f = open(f, 'rb')
302 try:
--> 303 return _load(f, map_location, pickle_module)
304 finally:
305 if new_fd:
/opt/conda/lib/python3.6/site-packages/torch/serialization.py in _load(f, map_location, pickle_module)
467 unpickler = pickle_module.Unpickler(f)
468 unpickler.persistent_load = persistent_load
--> 469 result …

I am trying to use the functional API to have a shared layer where only one of the paths is trainable:
a_in = Input(x_shape)
b_in = Input(x_shape)
a_out = my_model(a_in) # I want these weights to be trainable
b_out = my_model(b_in) # I want these weights to be non-trainable (no gradient update)
y_out = my_merge(a_out, b_out)
full_model = Model(inputs=[a_in, b_in], outputs=[y_out])
full_model.compile(...)
I can't figure out how to do this. Setting the trainable flag on my_model affects both layers. I can compile 2 different models with different trainable flags, but then I don't see how to combine the 2 pre-compiled models to optimize my single merged cost function.
Is this possible with Keras? If not, is it possible in TensorFlow?
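One possible approach (a sketch with tf.keras, assuming TF 2.x; my_merge is replaced by a simple concatenate and x_shape by a made-up shape): keep a single shared submodel so the weights are genuinely shared, and block gradients on the second path with tf.stop_gradient, so only the first path contributes updates to the shared weights.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

x_shape = (8,)  # placeholder shape for illustration

# the shared submodel (stand-in for my_model)
inp = keras.Input(x_shape)
shared = keras.Model(inp, layers.Dense(4, activation="relu")(inp))

a_in = keras.Input(x_shape)
b_in = keras.Input(x_shape)
a_out = shared(a_in)  # gradients flow back into `shared` here
# same weights, but no gradient flows back through this branch:
b_out = layers.Lambda(lambda t: tf.stop_gradient(t))(shared(b_in))
y_out = layers.concatenate([a_out, b_out])  # stand-in for my_merge

full_model = keras.Model(inputs=[a_in, b_in], outputs=[y_out])
full_model.compile(optimizer="sgd", loss="mse")
```

Because there is only one model object and one compile call, the single merged cost function is optimized directly; the non-trainable behavior comes from the stop_gradient, not from a second compiled model.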
I am building my CNN model with the tf.keras API and using the tf.Dataset API to create the input pipeline for my model. The mnist dataset from tf.keras.datasets is used for testing and is prepared in memory by executing:
(train_images,train_labels),(test_images,test_labels) = tf.keras.datasets.mnist.load_data()
along with some preprocessing to make it compatible with my keras model:
Train_images = np.expand_dims(train_images,3).astype('float')/255.0
Test_images = np.expand_dims(test_images,3).astype('float')/255.0
Train_labels = tf.keras.utils.to_categorical(train_labels)
Test_labels = tf.keras.utils.to_categorical(test_labels)
With this data stored in memory as arrays, there are two options for creating a Dataset object. The first is simply to use tf.data.Dataset.from_tensor_slices:
image = tf.data.Dataset.from_tensor_slices((Train_images,Train_labels))
and feed the resulting object into model.fit():
model.fit(x=image,steps_per_epoch=1000)
or to feed in an iterator over that dataset:
iterator = image.make_one_shot_iterator()
model.fit(x=iterator,steps_per_epoch=1000)
Both of these options work fine, since the dataset named image here is created from data that is already in memory. However, according to the Importing Data guide, we may want to avoid this because it copies the data multiple times and takes up memory. So the other option is to create the Dataset object from a tf.placeholder with an initializable iterator:
X = tf.placeholder(tf.float32,shape = [60000,28,28,1])
Y = tf.placeholder(tf.float32,shape = [60000,10])
image2 = tf.data.Dataset.from_tensor_slices((X,Y))
iterator2 = image2.make_initializable_iterator()
with tf.Session() as sess:
    sess.run(iterator2.initializer, feed_dict={X: Train_images, Y: Train_labels})
    sess.run(iterator2.get_next())
This kind of iterator works fine with in-memory data using tf.Session() and avoids multiple copies of the data. But I can't find a way to make it work with keras.model.fit(), because you can't really call …
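One workaround worth sketching here (hedged; the array shapes are stand-ins for the mnist arrays above, shrunk for brevity) is tf.data.Dataset.from_generator, which streams numpy data lazily instead of embedding it in the graph as constants, and yields a dataset that model.fit() can consume directly in recent tf.keras versions:

```python
import numpy as np
import tensorflow as tf

# stand-ins for Train_images / Train_labels from above
Train_images = np.zeros((100, 28, 28, 1), np.float32)
Train_labels = np.zeros((100, 10), np.float32)

def gen():
    # pulls one example at a time; nothing is baked into the graph
    for img, lab in zip(Train_images, Train_labels):
        yield img, lab

dataset = tf.data.Dataset.from_generator(
    gen,
    output_types=(tf.float32, tf.float32),
    output_shapes=((28, 28, 1), (10,)),
).batch(32)
# model.fit(dataset, epochs=1)  # keras iterates the dataset itself
```

This sidesteps the placeholder/initializer handshake entirely, at the cost of per-example Python generator overhead (prefetching with dataset.prefetch() can hide some of it).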
Consider the following tensorflow code snippet:
import time
import numpy as np
import tensorflow as tf
def fn(i):
    # do some junk work
    for _ in range(100):
        i ** 2
    return i
n = 1000
n_jobs = 8
stuff = np.arange(1, n + 1)
eager = False
t0 = time.time()
if eager:
    tf.enable_eager_execution()
res = tf.map_fn(fn, stuff, parallel_iterations=n_jobs)
if not eager:
    with tf.Session() as sess:
        res = sess.run(res)
        print(sum(res))
else:
    print(sum(res))
dt = time.time() - t0
print("(eager=%s) Took %ims" % (eager, dt * 1000))
If using …
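One detail relevant to any timing of this snippet: per the tf.map_fn documentation, parallel_iterations has no effect when executing eagerly; the body runs sequentially unless the call is staged into a graph. A hedged sketch of staging it with tf.function (TF 2.x API, not the Session-based code above):

```python
import numpy as np
import tensorflow as tf

def fn(i):
    for _ in range(100):  # some junk work, as in the snippet above
        i ** 2
    return i

stuff = np.arange(1, 1001)

@tf.function  # builds a graph, where parallel_iterations can apply
def mapped(t):
    return tf.map_fn(fn, t, parallel_iterations=8)

print(int(tf.reduce_sum(mapped(stuff))))  # 1 + 2 + ... + 1000 = 500500
```

Note also that the Python-level loop in fn runs once at trace time, not per element, so it contributes almost nothing to the graph-mode timing; this alone can make eager-vs-graph comparisons of such "junk work" misleading.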
I have a Windows 10 jupyter notebook acting as a server and run some training on it.
I have properly installed CUDA 9.0 and cuDNN, and python detects the GPU. This is what I get in the anaconda prompt.
>>> torch.cuda.get_device_name(0)
'GeForce GTX 1070'
I have also put the model and tensors on cuda via .cuda():
model = LogPPredictor(1, 58, 64, 128, 1, 'gsc')
if torch.cuda.is_available():
    torch.set_default_tensor_type(torch.cuda.DoubleTensor)
    model.cuda()
else:
    torch.set_default_tensor_type(torch.FloatTensor)
list_train_loss = list()
list_val_loss = list()
acc = 0
mse = 0
optimizer = args.optim(model.parameters(),
                       lr=args.lr,
                       weight_decay=args.l2_coef)
data_train = DataLoader(args.dict_partition['train'],
                        batch_size=args.batch_size,
                        pin_memory=True,
                        shuffle=args.shuffle)
data_val = DataLoader(args.dict_partition['val'],
                      batch_size=args.batch_size,
                      pin_memory=True,
                      shuffle=args.shuffle)
for epoch in tqdm_notebook(range(args.epoch), desc='Epoch'):
    model.train()
    epoch_train_loss = 0
    for i, batch …

First of all, I have already converted the model and the data to cuda with calls like 'model.cuda()'. But the problem still occurs. I debugged every layer of the model, and every module's weights have is_cuda = True. So does anyone know why such a problem happens?
I have two models; one is resnet50, and the other contains the first one as its backbone.
class FC_Resnet(nn.Module):
    def __init__(self, model, num_classes):
        super(FC_Resnet, self).__init__()
        # feature encoding
        self.features = nn.Sequential(
            model.conv1,
            model.bn1,
            model.relu,
            model.maxpool,
            model.layer1,
            model.layer2,
            model.layer3,
            model.layer4)
        # classifier
        num_features = model.layer4[1].conv1.in_channels
        self.classifier = nn.Sequential(
            nn.Conv2d(num_features, num_classes, kernel_size=1, bias=True))

    def forward(self, x):
        # children = self.features.children()
        # for child in children:
        #     if child.weight is not None:
        #         print(child.weight.device)
        x = self.features(x)
        x = self.classifier(x)
        return x


def fc_resnet50(num_classes=20, pre_trained=True):
    model = FC_Resnet(models.resnet50(pre_trained), num_classes)
    return model
And the other:
class PeakResponseMapping(nn.Sequential):
    def __init__(self, *args, **kargs): …

Following the Pytorch tutorial at https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html,
I get the following error:
(pt_gpu) [martin@A08-R32-I196-3-FZ2LTP2 mlm]$ python pytorch-1.py
Traceback (most recent call last):
File "pytorch-1.py", line 39, in <module>
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
AttributeError: module 'torch' has no attribute 'device'
In the code below, I added the following statements:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device)
But that doesn't seem right, or at least not enough. This is the first time I'm running Pytorch with a GPU on a linux machine. What else should I do to run it properly?
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120) …
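For reference, torch.device was added in PyTorch 0.4.0, so this AttributeError typically means an older torch is installed in the pt_gpu environment; upgrading is the fix rather than changing the code. After that, the usual pattern (a minimal sketch, not the tutorial's full script) is to move both the model and every input batch to the chosen device:

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)    # moves the parameters
x = torch.randn(8, 4, device=device)  # inputs must live on the same device
out = model(x)
print(out.shape)  # torch.Size([8, 2])
```

net.to(device) alone is not enough: the tensors fed to the network in the training loop must also be sent with .to(device), or the forward pass fails with a CPU/GPU mismatch.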