Error when training with the resnet152 model

I get an error when training the resnet152 model on fashion_mnist.

The code is as follows:

model = torchvision.models.resnet152(pretrained=True)
fashion_mnist = keras.datasets.fashion_mnist
data,label = fashion_mnist.load_data()

for param in model.parameters():
    param.requires_grad = False
model.fc = torch.nn.Linear(2048, 28)
loss_fn = torch.nn.MSELoss(reduction='sum')
opt = torch.optim.SGD(model.fc.parameters(), lr=0.001)

However, the following code, i.e. the training loop, raises an error:

for i in range(200):
    y = model(data)
    loss = loss_fn(label, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(loss)

Error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-20-042e5c6d56e7> in <module>
      1 for i in range(200):
----> 2     y = model(data)
      3     loss = loss_fn(label, y)
      4     opt.zero_grad()
      5     loss.backward()

~/yes/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/yes/lib/python3.8/site-packages/torchvision/models/resnet.py in forward(self, x)
    247 
    248     def forward(self, x: Tensor) -> Tensor:
--> 249         return self._forward_impl(x)
    250 
    251 

~/yes/lib/python3.8/site-packages/torchvision/models/resnet.py in _forward_impl(self, x)
    230     def _forward_impl(self, x: Tensor) -> Tensor:
    231         # See note [TorchScript super()]
--> 232         x = self.conv1(x)
    233         x = self.bn1(x)
    234         x = self.relu(x)

~/yes/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

~/yes/lib/python3.8/site-packages/torch/nn/modules/conv.py in forward(self, input)
    444 
    445     def forward(self, input: Tensor) -> Tensor:
--> 446         return self._conv_forward(input, self.weight, self.bias)
    447 
    448 class Conv3d(_ConvNd):

~/yes/lib/python3.8/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    440                             weight, bias, self.stride,
    441                             _pair(0), self.dilation, self.groups)
--> 442         return F.conv2d(input, weight, bias, self.stride,
    443                         self.padding, self.dilation, self.groups)
    444 

TypeError: conv2d() received an invalid combination of arguments - got (tuple, Parameter, NoneType, tuple, tuple, tuple, int), but expected one of:
 * (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, tuple of ints padding, tuple of ints dilation, int groups)
      didn't match because some of the arguments have invalid types: (!tuple!, !Parameter!, !NoneType!, !tuple!, !tuple!, !tuple!, int)
 * (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, str padding, tuple of ints dilation, int groups)
      didn't match because some of the arguments have invalid types: (!tuple!, !Parameter!, !NoneType!, !tuple!, !tuple!, !tuple!, int)

I tried converting data and label to torch.tensor:

data = torch.tensor(data)

But it still raises an error:

TypeError                                 Traceback (most recent call last)
<ipython-input-32-42ca4109f755> in <module>
----> 1 data = torch.tensor(data)

TypeError: not a sequence
Any guidance would be appreciated!
Thanks in advance!

fashion_mnist.load_data() returns four arrays (as two (images, labels) tuples), doesn't it? Why are you unpacking it into only two variables?
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()?
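
That is also why both errors appear: data is a tuple of numpy arrays, so conv2d receives a tuple instead of a Tensor, and torch.tensor() refuses the nested tuple as well. Below is a minimal sketch of one way to unpack and convert the data for resnet152; the 224×224 resize, the 3-channel repeat, the 10-class head and CrossEntropyLoss are my assumptions about what you want, not code from your post:

import torch
import torch.nn.functional as F
import torchvision
from tensorflow import keras   # assuming keras comes from tensorflow, as in your snippet

# keras's load_data() returns two (images, labels) tuples of numpy arrays
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()

# (N, 28, 28) uint8 arrays -> (N, 1, 28, 28) float tensors in [0, 1]
x = torch.from_numpy(x_train).float().unsqueeze(1) / 255.0
y = torch.from_numpy(y_train).long()

model = torchvision.models.resnet152(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = torch.nn.Linear(2048, 10)      # Fashion-MNIST has 10 classes
loss_fn = torch.nn.CrossEntropyLoss()     # assumption: classification loss instead of MSE
opt = torch.optim.SGD(model.fc.parameters(), lr=0.001)

loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(x, y), batch_size=64, shuffle=True)

model.train()
for xb, yb in loader:
    # resnet152 expects 3-channel images; upscale and repeat the gray channel per batch
    xb = F.interpolate(xb, size=(224, 224)).repeat(1, 3, 1, 1)
    pred = model(xb)
    loss = loss_fn(pred, yb)
    opt.zero_grad()
    loss.backward()
    opt.step()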

Have a look at this: https://so.csdn.net/so/search?q=Resnet&spm=1001.2101.3001.7020
ResNet in practice on the Fashion-MNIST dataset
The origin of residual networks (ResNet)
In practice, adding too many layers often causes the training error to rise rather than fall. Even with the numerical stability brought by batch normalization, which makes deep models easier to train, the problem persists. To address it, Kaiming He et al. proposed the residual network (ResNet). It won the ImageNet image recognition challenge in 2015 and profoundly influenced the design of later deep neural networks. Below we use the Fashion-MNIST dataset to get to know the ResNet-18 network.

Plain network structure vs. network structure with residual connections
Let the input be x and suppose the ideal mapping we want to learn is f(x), which serves as the input to the activation function at the top of the figure. The part inside the dashed box on the left has to fit the mapping f(x) directly, while the part inside the dashed box on the right only has to fit the residual mapping f(x) − x with respect to the identity mapping. The residual mapping is often easier to optimize in practice: we only need to learn the weights and biases of the upper weighted operation (e.g. the affine transform) inside the right dashed box as zero, and f(x) becomes the identity mapping. And when the ideal mapping f(x) is very close to the identity mapping, the residual mapping also makes it easy to capture small fluctuations around identity. The right figure is also ResNet's basic building block, the residual block: inside it, the input can propagate forward faster through the cross-layer data path.

Code implementation of the residual block
ResNet adopts VGG's design of using 3×3 convolutional layers throughout. A residual block first has two 3×3 convolutional layers with the same number of output channels, each followed by a batch normalization layer and a ReLU activation. The input then skips these two convolution operations and is added just before the final ReLU activation. This design requires the output of the two convolutional layers to have the same shape as the input so that they can be added. To change the number of channels, an extra 1×1 convolutional layer is introduced to transform the input into the required shape before the addition.
The residual block is implemented below. It lets you set the number of output channels, whether to use the extra 1×1 convolutional layer to change the channel count, and the stride of the convolutional layers.

import time
import torch
from torch import nn, optim
import torch.nn.functional as F

import sys

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class Residual(nn.Module):
    def __init__(self, in_channels, out_channels, use_1x1conv=False, stride=1):
        super(Residual, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, stride=stride)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        if use_1x1conv:
            self.conv3 = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride)
        else:
            self.conv3 = None
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)

    def forward(self, X):
        Y = F.relu(self.bn1(self.conv1(X)))
        Y = self.bn2(self.conv2(Y))
        if self.conv3:
            X = self.conv3(X)
        return F.relu(Y + X)
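
As a quick sanity check (not in the original post, but assuming the Residual class above), the block keeps the input shape when the channel counts match, and halves the spatial size when use_1x1conv=True with stride=2:

blk = Residual(3, 3)
X = torch.rand(4, 3, 6, 6)
print(blk(X).shape)    # torch.Size([4, 3, 6, 6])

blk = Residual(3, 6, use_1x1conv=True, stride=2)
print(blk(X).shape)    # torch.Size([4, 6, 3, 3])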

The ResNet model
The first two layers of ResNet are the same as in GoogLeNet: a 7×7 convolutional layer with 64 output channels and stride 2, followed by a 3×3 max-pooling layer with stride 2. The difference is the batch normalization layer that ResNet adds after each convolutional layer.

net = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3),
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
ResNet then uses four modules made up of residual blocks, each of which uses several residual blocks with the same number of output channels. The first module keeps the same number of channels as the input; since a max-pooling layer with stride 2 has already been used, it does not need to reduce the height and width. Each later module doubles the channel count of the previous module in its first residual block and halves the height and width.
Now we implement this module. Note the special handling of the first module (i.e. the first module has in_channels == out_channels).

def resnet_block(in_channels, out_channels, num_residuals, first_block=False):
    if first_block:
        assert in_channels == out_channels  # the first module keeps the input channel count
    blk = []
    for i in range(num_residuals):
        if i == 0 and not first_block:
            blk.append(Residual(in_channels, out_channels, use_1x1conv=True, stride=2))
        else:
            blk.append(Residual(out_channels, out_channels))
    return nn.Sequential(*blk)
Next we add all the residual blocks to ResNet. Here each module uses two residual blocks, i.e. [2, 2, 2, 2].

net.add_module("resnet_block1", resnet_block(64, 64, 2, first_block=True))
net.add_module("resnet_block2", resnet_block(64, 128, 2))
net.add_module("resnet_block3", resnet_block(128, 256, 2))
net.add_module("resnet_block4", resnet_block(256, 512, 2))
Finally, as in GoogLeNet, we add a global average pooling layer followed by a fully connected layer for the output.

net.add_module("global_avg_pool", GlobalAvgPool2d()) # GlobalAvgPool2d的输出: (Batch, 512, 1, 1)
net.add_module("fc", nn.Sequential(FlattenLayer(), nn.Linear(512, 10)))
Each module here has 4 convolutional layers (not counting the 1×1 convolutions). Together with the first convolutional layer and the final fully connected layer, there are 18 layers in total, which is why this model is usually called ResNet-18. By configuring different channel counts and numbers of residual blocks per module, we can obtain different ResNet models, such as the deeper, 152-layer ResNet-152. Although ResNet's main architecture is similar to GoogLeNet's, ResNet's structure is simpler and easier to modify, and it was quickly and widely adopted.
Before training ResNet, let's observe how the input shape changes across ResNet's modules.

X = torch.rand((1, 1, 224, 224))
for name, layer in net.named_children():
    X = layer(X)
    print(name, ' output shape:\t', X.shape)
Result:

Getting the data and training the model
Below we train ResNet on the Fashion-MNIST dataset.

batch_size = 64  # the batch size can be set larger, e.g. 256
# if an "out of memory" error appears, reduce batch_size or the resize value

train_iter, test_iter = load_data_fashion_mnist(batch_size, resize=96)
lr, num_epochs = 0.001, 5
optimizer = torch.optim.Adam(net.parameters(), lr=lr)
train_ch5(net, train_iter, test_iter, batch_size, optimizer, device, num_epochs)
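
load_data_fashion_mnist and train_ch5 are also helpers from the book's accompanying utility code and are not defined in this excerpt. A rough sketch of the data loader, assuming torchvision.datasets.FashionMNIST as the data source (train_ch5 is a standard train/evaluate loop and is omitted here):

import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

def load_data_fashion_mnist(batch_size, resize=None, root='./data'):
    # optionally resize the 28x28 images, then convert them to tensors in [0, 1]
    trans = []
    if resize:
        trans.append(transforms.Resize(size=resize))
    trans.append(transforms.ToTensor())
    transform = transforms.Compose(trans)
    train_set = torchvision.datasets.FashionMNIST(root=root, train=True, download=True, transform=transform)
    test_set = torchvision.datasets.FashionMNIST(root=root, train=False, download=True, transform=transform)
    train_iter = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    test_iter = DataLoader(test_set, batch_size=batch_size, shuffle=False)
    return train_iter, test_iter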
Result:

Complete code
import time
import torch
from torch import nn, optim
import torch.nn.functional as F
import torchvision

device = torch.device('cuda:1' if torch.cuda.is_available() else 'cpu')


class Residual(nn.Module):
    def __init__(self, in_channels, out_channels, use_1x1conv=False, stride=1):
        super(Residual, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, stride=stride)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        if use_1x1conv:
            self.conv3 = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride)
        else:
            self.conv3 = None
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)

    def forward(self, X):
        Y = F.relu(self.bn1(self.conv1(X)))
        Y = self.bn2(self.conv2(Y))
        if self.conv3:
            X = self.conv3(X)
        return F.relu(Y + X)