"PyTorch Beginner's Tutorial"

Source: PyTorch Deep Learning Quick-Start Tutorial (absolutely easy to understand!) 【小土堆】

Code reference: Github | PyTorch-Tutorial code

2024-01-30@isSeymour

1. Installation Basics

For environment setup, the following article explains it well:

001 - Setting up a Deep Learning PyTorch Environment (Anaconda, importing into PyCharm)

1.1 Installation and Environment

Anaconda

# Show the configured channels
conda config --show
# Add mirror channels
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
# Show channel URLs when searching
conda config --set show_channel_urls yes

# To remove a mirror channel
conda config --remove channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/

CUDA GPU Driver

After installation, verify in cmd:

nvcc -V
image-20240130152954122

Version Checks

# Check the Python version (optional; recent PyTorch releases support a range of Python versions)
python --version

# Check the Anaconda version
conda --version

# Check the GPU configuration (shows the highest CUDA version the current driver supports)
nvidia-smi

# Check the installed CUDA version
nvcc --version

# Check the compute capability
E:\Program Files\CUDA12.1\extras\demo_suite>deviceQuery.exe

1.2 Creating a Working Environment

Create the Python environment from cmd

# Create the environment
conda create -n PyTorchEnv python=3.11

# Activate the environment
conda activate PyTorchEnv

# Deactivate the environment
conda deactivate

1.3 Installing PyTorch

  • Go to the official PyTorch website, find the matching version, and run the install command

    Run the install inside the target working environment (mine is PyTorchEnv)

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
  • When the installation finishes, a done message is printed

If the download is too slow:

  1. Switch to a mirror source

  2. Download the package manually and put it into the Anaconda3\pkgs\ folder,

    then run conda install --use-local <package-name> in cmd to finish the installation locally

If you install with pip, you can also switch to a mirror source

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

1.4 Testing

  • Enter the interactive interpreter and check whether the GPU is available
python

>>> import torch
>>> torch.cuda.is_available()
True

If it returns True, the setup succeeded!

You can start your PyTorch journey!

If you do not have an NVIDIA GPU, torch.cuda.is_available() returns False, which is correct. You can still follow the rest of this tutorial without a GPU.

image-20240130204030988

2. Choosing an Editor

2.1 PyCharm

  • When creating a new Project
Using PyCharm
  • Test it
Testing the PyCharm setup

2.2 Jupyter

  • Open the Anaconda Prompt and switch to the working environment (PyTorchEnv),

    then install the packages Jupyter needs

# Activate the working environment
conda activate PyTorchEnv

# Check whether the Jupyter kernel packages are already there (look for ipykernel)
conda list

# If not, install one of the following (whichever works; one is enough)
conda install nb_conda
conda install nb_conda_kernels
  • Launch it
jupyter notebook
Using Jupyter

2.3 Comparison

image-20240131002953999

2.4 Getting Help on Packages

image-20240131000800321
dir(torch)
dir(torch.cuda)
dir(torch.cuda.is_available)

help(torch.cuda.is_available)

3. Basic Usage

  • Basic concepts:
Dataset vs. DataLoader

3.1 Custom Datasets: Dataset

Dataset: Data | ants & bees

from torch.utils.data import Dataset
import os
from PIL import Image

class MyData(Dataset):

    def __init__(self, root_dir, label_dir):
        self.root_dir = root_dir
        self.label_dir = label_dir
        self.path = os.path.join(self.root_dir, self.label_dir)
        self.img_path = os.listdir(self.path)

    def __getitem__(self, idx):
        img_name = self.img_path[idx]
        img_item_path = os.path.join(self.root_dir, self.label_dir, img_name)
        img = Image.open(img_item_path)
        label = self.label_dir
        return img, label

    def __len__(self):
        return len(self.img_path)

# Load the data
root_dir = "train"
ants_label_dir = "ants"
bees_label_dir = "bees"
ants_data = MyData(root_dir, ants_label_dir)
bees_data = MyData(root_dir, bees_label_dir)

# Lengths
len1 = len(ants_data)
len2 = len(bees_data)
print(len1, len2)

# Show an image
img0, label0 = ants_data[0]
img0.show()

# Concatenate the two datasets
train_data = ants_data + bees_data
len3 = len(train_data)
print(len3)

# Check the items around the boundary (index len1-1 is ants, index len1 is bees)
img1, label1 = train_data[len1-1]
img1.show()

img2, label2 = train_data[len1]
img2.show()

3.2 Data Dashboard: TensorBoard

add_scalar()

  • Install TensorBoard

    pip install tensorboard
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("logs")

for i in range(100):
    writer.add_scalar("y=2x", 2*i, i)

writer.close()
  • Running the code above creates event data files under the logs directory in the current working directory:
TensorBoard event data
  • Then, in a terminal with the current environment (PyTorchEnv) active, run the command below to display the data:
tensorboard --logdir=logs --port=6007
TensorBoard
  • The event files are never deleted; every run appends to them.

    Data written under different tags go to separate series under logs;

    data written under the same tag keeps getting appended to the corresponding file.

  • Every run of the script therefore appends a new set of points, and TensorBoard tries to fit them together when rendering,

    so the displayed curve can look wrong.

  • Fix:

    manually delete the corresponding event files (if they cannot be deleted, stop TensorBoard first),

    pressing Ctrl + C to stop it. A small sketch for keeping runs separate follows below.
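
    As an alternative to deleting files by hand, here is a minimal sketch of a workaround (my own, not from the tutorial): give every run its own log subdirectory, for example named after the current timestamp, so old and new runs never get mixed.

import time
from torch.utils.tensorboard import SummaryWriter

# One subdirectory per run, e.g. logs/run_1706600000
run_dir = "logs/run_{}".format(int(time.time()))
writer = SummaryWriter(run_dir)

for i in range(100):
    writer.add_scalar("y=2x", 2 * i, i)

writer.close()
# TensorBoard can still be pointed at the parent directory: tensorboard --logdir=logs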

add_image()

from torch.utils.tensorboard import SummaryWriter
import numpy as np
from PIL import Image

writer = SummaryWriter("logs2")
img_path = "hymenoptera_data/val/ants/8124241_36b290d372.jpg"
img_PIL = Image.open(img_path)
img_array = np.array(img_PIL)
print(type(img_array))
print(img_array.shape)

writer.add_image("train", img_array, 1, dataformats="HWC")

writer.close()
<class 'numpy.ndarray'>
(500, 375, 3)
  • Then, in a terminal with the current environment (PyTorchEnv) active, run the command below to display the data:
tensorboard --logdir=logs2 --port=6007
image-20240131210317853

Note:

  • A PIL Image cannot be passed in directly; the type is not accepted. It must be converted to a numpy.ndarray first.

    The img_tensor argument must be a torch.Tensor, numpy.ndarray, or string/blobname.

  • Also check the shape of the data; it has to match the dataformats argument.

(height, width, 3 channels)   HWC
(3 channels, height, width)   CHW
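
  • If you already have an HWC ndarray and want the default CHW layout instead, a minimal sketch of the conversion (plain torch/numpy calls, not specific to this tutorial) is:

import numpy as np
import torch

hwc = np.zeros((500, 375, 3), dtype=np.uint8)   # an HWC image, like the ant photo above
chw = torch.from_numpy(hwc).permute(2, 0, 1)    # reorder the axes to CHW
print(chw.shape)                                # torch.Size([3, 500, 375])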

3.3 Image Transforms

image-20240201133932086 (common transforms)

Hold Alt and click to jump into the corresponding package and read its source code.

The Structure panel on the left side of PyCharm lists the classes, functions, and attributes, and lets you jump to them.

ToTensor()

  • Convert a PIL Image or ndarray to tensor and scale the values accordingly.
    
    Converts a PIL Image or an ndarray into a tensor, so it can be used in later steps (e.g. passed to SummaryWriter.add_image).

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms

    img_path = "hymenoptera_data/train/ants_image/0013035.jpg"
    img = Image.open(img_path)

    writer = SummaryWriter("logs3")

    tensor_trans = transforms.ToTensor()
    tensor_img = tensor_trans(img)

    writer.add_image("Tensor_img", tensor_img)

    writer.close()
tensorboard --logdir=logs3
Using transforms

Normalize()

  • Normalization
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs4")
img = Image.open("hymenoptera_data/train/bees_image/16838648_415acd9e3f.jpg")
print(img)

# ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
print(img_tensor[0][0][0])
writer.add_image("ToTensor", img_tensor)

# Normalize
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm)

writer.close()
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x450 at 0x1F80D45F1D0>
tensor(0.0980)
tensor(-0.8039)
tensorboard --logdir=logs4
Using Normalize
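
As a check on the numbers above: Normalize applies output[c] = (input[c] - mean[c]) / std[c] per channel, so with mean = std = 0.5 the range [0, 1] is mapped to [-1, 1], and the first value 0.0980 becomes (0.0980 - 0.5) / 0.5 ≈ -0.8039, exactly the second printed tensor.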

You can change the parameters and see how the result differs:

# Normalize
trans_norm = transforms.Normalize([6, 3, 2], [9, 3, 5])  # changed parameters
img_norm = trans_norm(img_tensor)
print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm, 1)  # changed global step

Resize()

  • Resizes an image
    • Argument: a sequence (h, w), or a single int (the shorter edge is resized to that value and the aspect ratio is kept)
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs4")
img = Image.open("hymenoptera_data/train/bees_image/16838648_415acd9e3f.jpg")
print(img)

# ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
print(img_tensor[0][0][0])
writer.add_image("ToTensor", img_tensor)

# Normalize
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm)

# Resize
print(img.size)
trans_resize = transforms.Resize((512, 512))

# img PIL -> resize -> img_resize PIL
img_resize = trans_resize(img)
# img_resize PIL -> totensor -> img_resize tensor
img_resize = trans_totensor(img_resize)

writer.add_image("Resize", img_resize, 0)
print(img_resize)

writer.close()
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x450 at 0x2760C376750>
tensor(0.0980)
tensor(-0.8039)
(500, 450)
tensor([[[0.0980, 0.0863, 0.0902, ..., 0.0314, 0.0314, 0.0431],
[0.0824, 0.0863, 0.0863, ..., 0.0235, 0.0235, 0.0235],
[0.0588, 0.0824, 0.0863, ..., 0.0353, 0.0314, 0.0314],
...,
[0.7451, 0.7843, 0.7176, ..., 0.7490, 0.7255, 0.7843],
[0.6980, 0.7843, 0.7373, ..., 0.7412, 0.7098, 0.7882],
[0.7882, 0.7216, 0.6471, ..., 0.7333, 0.6863, 0.7529]],

[[0.0863, 0.0667, 0.0667, ..., 0.0706, 0.0706, 0.0667],
[0.0824, 0.0824, 0.0784, ..., 0.0549, 0.0510, 0.0431],
[0.0863, 0.0980, 0.0902, ..., 0.0510, 0.0431, 0.0392],
...,
[0.8667, 0.9098, 0.8471, ..., 0.8824, 0.8392, 0.8784],
[0.8078, 0.8863, 0.8353, ..., 0.8549, 0.8118, 0.8824],
[0.8980, 0.8235, 0.7451, ..., 0.8471, 0.7882, 0.8471]],

[[0.1216, 0.0902, 0.0667, ..., 0.0392, 0.0392, 0.0510],
[0.1020, 0.0941, 0.0745, ..., 0.0275, 0.0235, 0.0275],
[0.0745, 0.0902, 0.0745, ..., 0.0275, 0.0235, 0.0275],
...,
[0.8941, 0.9529, 0.9176, ..., 0.9373, 0.8863, 0.9176],
[0.8549, 0.9451, 0.9098, ..., 0.9176, 0.8627, 0.9216],
[0.9451, 0.8824, 0.8235, ..., 0.9098, 0.8353, 0.8863]]])
Using Resize

Compose()

  • Chains several transforms; here it applies Resize and then ToTensor in one call
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs4")
img = Image.open("hymenoptera_data/train/bees_image/16838648_415acd9e3f.jpg")
print(img)

# ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
print(img_tensor[0][0][0])
writer.add_image("ToTensor", img_tensor)

# Normalize
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm)

# Resize
print(img.size)
trans_resize = transforms.Resize((512, 512))
img_resize = trans_resize(img)
img_resize = trans_totensor(img_resize)
writer.add_image("Resize", img_resize, 0)
print(img_resize)

# Compose
trans_resize_2 = transforms.Resize(512)
trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
img_resize_2 = trans_compose(img)
writer.add_image("Compose", img_resize_2, 1)

writer.close()
Using Compose

RandomCrop()

  • Crops a patch of the given size at a random position
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs4")
img = Image.open("hymenoptera_data/train/bees_image/16838648_415acd9e3f.jpg")
print(img)

# ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
print(img_tensor[0][0][0])
writer.add_image("ToTensor", img_tensor)

# Normalize
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
img_norm = trans_norm(img_tensor)
print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm)

# Resize
print(img.size)
trans_resize = transforms.Resize((512, 512))
img_resize = trans_resize(img)
img_resize = trans_totensor(img_resize)
writer.add_image("Resize", img_resize, 0)
print(img_resize)

# Compose
trans_resize_2 = transforms.Resize(512)
trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
img_resize_2 = trans_compose(img)
writer.add_image("Compose", img_resize_2, 1)

# RandomCrop
trans_random = transforms.RandomCrop((50,100))
trans_compose_2 = transforms.Compose([trans_random, trans_totensor])
for i in range(10):
    img_crop = trans_compose_2(img)
    writer.add_image("RandomCrop", img_crop, i)


writer.close()
Using RandomCrop

3.4 Built-in Datasets: torchvision

Below we use CIFAR10 as the example:

CIFAR10 description

Download and inspect

import torchvision

train_set = torchvision.datasets.CIFAR10(root="./dataset", train=True, download=True)
test_set = torchvision.datasets.CIFAR10(root="./dataset", train=False, download=True)

print(test_set[0])
print(test_set.classes)

img, target = test_set[0]
print(img)
print(target)
print(test_set.classes[target])
img.show()
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./dataset\cifar-10-python.tar.gz
100.0%
Extracting ./dataset\cifar-10-python.tar.gz to ./dataset
Files already downloaded and verified
(<PIL.Image.Image image mode=RGB size=32x32 at 0x201E45FEC50>, 3)
['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
<PIL.Image.Image image mode=RGB size=32x32 at 0x201E3E51F90>
3
cat
torchvision dataset test result

Using a transform

import torchvision
from torch.utils.tensorboard import SummaryWriter

dataset_transform = torchvision.transforms.Compose([
torchvision.transforms.ToTensor()
])

train_set = torchvision.datasets.CIFAR10(root="./dataset", train=True, transform=dataset_transform, download=True)
test_set = torchvision.datasets.CIFAR10(root="./dataset", train=False, transform=dataset_transform,download=True)

writer = SummaryWriter("logs5")
for i in range(10):
    img, target = test_set[i]
    writer.add_image("test_set", img, i)

writer.close()
Files already downloaded and verified
Files already downloaded and verified
tensorboard --logdir=logs5
Using CIFAR10

3.5 Fetching Batches: DataLoader

A single epoch

import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor())

test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True, num_workers=0, drop_last=False)

# Inspect the first image of the dataset
img, target = test_data[0]
print(img.shape)
print(target)

# Trying the "first image" of the dataloader: note that a DataLoader does NOT support index access!
# img2, target2 = test_loader[0]
# print(img2.shape)
# print(target2)

# Iterate over the batches
writer = SummaryWriter("logs6")
step = 0
for data in test_loader:
    imgs, targets = data
    # print(imgs.shape)
    # print(targets)
    writer.add_images("test_data", imgs, step)  # note: add_images, not add_image!
    step = step + 1

writer.close()
torch.Size([3, 32, 32])
3
tensorboard --logdir=logs6
DataLoader test result 1 / DataLoader test result 2

Each data item yielded by the DataLoader has the following shape and content:

print(imgs.shape)
print(targets)
torch.Size([64, 3, 32, 32])
tensor([1, 6, 3, 1, 3, 4, 0, 0, 6, 9, 4, 1, 1, 3, 8, 4, 9, 9, 0, 3, 6, 1, 1, 0,
8, 2, 4, 6, 0, 5, 2, 8, 7, 1, 8, 8, 7, 9, 7, 7, 9, 5, 4, 9, 9, 1, 0, 5,
5, 3, 0, 5, 9, 1, 2, 9, 9, 4, 2, 7, 3, 3, 3, 0])
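
In other words, with batch_size=64 the DataLoader stacks 64 images of shape [3, 32, 32] into one tensor of shape [64, 3, 32, 32] and collects their 64 labels into a single tensor; shuffle=True reshuffles the order every epoch, and drop_last=False keeps the final, smaller batch.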

Multiple epochs

import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor())
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True, num_workers=0, drop_last=True)

writer = SummaryWriter("logs7")
for epoch in range(3):
    step = 0
    for data in test_loader:
        imgs, targets = data
        writer.add_images("Epoch:{}".format(epoch), imgs, step)
        step = step + 1

writer.close()
tensorboard --logdir=logs7
DataLoader multi-epoch test result

4. Neural Networks

NN official docs

4.1 The Basic Skeleton: nn.Module

nn.Module docs / nn.Module notes / nn.Module notes 2
import torch
from torch import nn

# Every custom model must inherit from nn.Module
# and must override the two methods below
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()

    def forward(self, input):
        output = input + 1
        return output


mynn = MyModule()
x = torch.tensor(1.0)
print(x)
output = mynn(x)
print(output)
tensor(1.)
tensor(2.)

4.2 Convolutional Layers

Convolution animation illustrations

Reference:

Convolution animations

N.B.: Blue maps are inputs, and cyan maps are outputs.

No padding, no strides Arbitrary padding, no strides Half padding, no strides Full padding, no strides
No padding, strides Padding, strides Padding, strides (odd)

Transposed convolution animations

N.B.: Blue maps are inputs, and cyan maps are outputs.

No padding, no strides, transposed Arbitrary padding, no strides, transposed Half padding, no strides, transposed Full padding, no strides, transposed
No padding, strides, transposed Padding, strides, transposed Padding, strides, transposed (odd)

Dilated convolution animations

N.B.: Blue maps are inputs, and cyan maps are outputs.

No padding, no stride, dilation
image-20240202194225022
import torch
import torch.nn.functional as F

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])
print(input.shape)
print(kernel.shape)

input = torch.reshape(input, (1, 1, 5, 5))
kernel = torch.reshape(kernel, (1, 1, 3, 3))
print(input.shape)
print(kernel.shape)

output1 = F.conv2d(input, kernel, stride=1)
print(output1)

output2 = F.conv2d(input, kernel, stride=2)
print(output2)
torch.Size([5, 5])
torch.Size([3, 3])
torch.Size([1, 1, 5, 5])
torch.Size([1, 1, 3, 3])
tensor([[[[10, 12, 12],
[18, 16, 16],
[13, 9, 3]]]])
tensor([[[[10, 12],
[13, 3]]]])

Convolution: Conv2d()

CONV2D
  • 2-D convolution
Conv2d output-size formula
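
From the Conv2d documentation, the output height is H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1), and likewise for the width. In the example below H_in = 32, kernel_size = 3, stride = 1, padding = 0, dilation = 1, so H_out = floor((32 - 2 - 1) / 1 + 1) = 30, which is why each output batch has shape [64, 6, 30, 30].
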
import torch
import torchvision
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

    def forward(self, x):
        x = self.conv1(x)
        return x

tudui = Tudui()

writer = SummaryWriter("logs8")
step = 0
for data in dataloader:
    imgs, targets = data
    output = tudui(imgs)
    # imgs: torch.Size([64, 3, 32, 32])
    writer.add_images("input", imgs, step)
    # output: torch.Size([64, 6, 30, 30]) -> torch.Size([-1, 3, 30, 30]); it must be reshaped to 3 channels before it can be displayed
    output = torch.reshape(output, (-1, 3, 30, 30))  # the -1 dimension is computed automatically
    writer.add_images("output", output, step)
    step = step + 1

writer.close()
Files already downloaded and verified
tensorboard --logdir=logs8
Using Conv2d

4.3 Max Pooling: MaxPool2d()

MAXPOOL2D
Max pooling notes
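
A note on the sizes in the example below: stride defaults to kernel_size for MaxPool2d, so a 3x3 window moves across the 5x5 input in steps of 3. With ceil_mode=True the partial windows at the border are kept and the output is 2x2; with ceil_mode=False they are dropped (floor), leaving only a 1x1 output.
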
import torch
from torch import nn
from torch.nn import MaxPool2d

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)  # note: the values must be floating point

input = torch.reshape(input, (-1, 1, 5, 5))

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, input):
        output = self.maxpool1(input)
        return output

tudui = Tudui()
output = tudui(input)
print(output)
tensor([[[[2., 3.],
[5., 1.]]]])
  • If you change it to
self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=False)
  • the output becomes
tensor([[[[2.]]]])
  • Now apply it to real images below; the result looks like a blurred, lower-resolution version
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, download=True,
transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=False)

    def forward(self, input):
        output = self.maxpool1(input)
        return output

tudui = Tudui()
writer = SummaryWriter("logs9")
step = 0
for data in dataloader:
    imgs, targets = data
    writer.add_images("input", imgs, step)
    output = tudui(imgs)
    writer.add_images("output", output, step)
    step = step + 1

writer.close()
tensorboard --logdir=logs9

Using MaxPool2d

4.4 Non-linear Activations: ReLU(), Sigmoid()

RELU ReLU() SIGMOID Sigmoid()
ReLU parameter notes
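
For reference: ReLU(x) = max(0, x), and Sigmoid(x) = 1 / (1 + exp(-x)). Sigmoid squashes every value into (0, 1), which is why the output images written to TensorBoard below look like washed-out versions of the inputs.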
  • The example below actually only uses Sigmoid:
import torchvision
from torch import nn
from torch.nn import ReLU, Sigmoid
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, download=True,
transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.relu1 = ReLU()
        self.sigmoid1 = Sigmoid()

    def forward(self, input):
        output = self.sigmoid1(input)
        return output

tudui = Tudui()
writer = SummaryWriter("logs10")
step = 0
for data in dataloader:
    imgs, targets = data
    writer.add_images("input", imgs, step)
    output = tudui(imgs)
    writer.add_images("output", output, step)
    step = step + 1

writer.close()
tensorboard --logdir=logs10
Using Sigmoid()

4.5 Linear Layers: Linear()

LINEAR
Linear layer notes
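
Linear(in_features, out_features) computes y = x W^T + b over the last dimension of the input. The 196608 below comes from flattening an entire batch into a single vector: 64 * 3 * 32 * 32 = 196608; the layer then maps it to 10 output features.
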
import torch
import torchvision
from torch import nn
from torch.nn import Linear
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, download=True,
transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.linear1 = Linear(196608, 10)

    def forward(self, input):
        output = self.linear1(input)
        return output

tudui = Tudui()

for data in dataloader:
    imgs, targets = data
    print(imgs.shape)
    # flatten first, then apply the linear layer
    # output = torch.reshape(imgs, (1, 1, 1, -1))  # has the same effect as flatten below
    output = torch.flatten(imgs)
    print(output.shape)
    output = tudui(output)
    print(output.shape)
Files already downloaded and verified
torch.Size([64, 3, 32, 32])
torch.Size([1, 1, 1, 196608])
torch.Size([1, 1, 1, 10])
......

Files already downloaded and verified
torch.Size([64, 3, 32, 32])
torch.Size([196608])
torch.Size([10])

4.6 Other Layers

  • Layers intended for specific kinds of processing
  • e.g. for speech, image, or text processing

4.7 Layer Sequences: Sequential()

SEQUENTIAL
  • Below we build the CIFAR10 model as an example:
CIFAR10 model structure
  • Working out the Conv2d parameters (see the calculation below):
Conv2d parameter calculation example
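
Using the output-size formula from 4.2: to keep a 32x32 feature map with a 5x5 kernel and stride 1 we need 32 = (32 + 2*padding - 5) / 1 + 1, i.e. padding = 2, which is why every Conv2d below uses padding=2. After three (Conv2d, 2x2 MaxPool2d) stages the feature map is 64 channels of 4x4, so Flatten yields 64 * 4 * 4 = 1024 values, the in_features of the first Linear layer.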

Without Sequential

from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear


class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.conv1 = Conv2d(3, 32, 5, padding=2)
        self.maxpool1 = MaxPool2d(2)
        self.conv2 = Conv2d(32, 32, 5, padding=2)
        self.maxpool2 = MaxPool2d(2)
        self.conv3 = Conv2d(32, 64, 5, padding=2)
        self.maxpool3 = MaxPool2d(2)
        self.flatten = Flatten()  # flatten before the linear layers (not drawn in the structure figure)
        self.linear1 = Linear(1024, 64)
        self.linear2 = Linear(64, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = self.maxpool2(x)
        x = self.conv3(x)
        x = self.maxpool3(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.linear2(x)
        return x

tudui = Tudui()
print(tudui)
Tudui(
(conv1): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(maxpool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(maxpool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv3): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(maxpool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(flatten): Flatten(start_dim=1, end_dim=-1)
(linear1): Linear(in_features=1024, out_features=64, bias=True)
(linear2): Linear(in_features=64, out_features=10, bias=True)
)

With Sequential

from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

tudui = Tudui()
print(tudui)
Tudui(
(model1): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Flatten(start_dim=1, end_dim=-1)
(7): Linear(in_features=1024, out_features=64, bias=True)
(8): Linear(in_features=64, out_features=10, bias=True)
)
)

Displaying the graph in TensorBoard

import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.tensorboard import SummaryWriter

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

tudui = Tudui()
input = torch.ones((64, 3, 32, 32))
output = tudui(input)
print(output.shape)

writer = SummaryWriter("logs12")
writer.add_graph(tudui, input)
writer.close()
torch.Size([64, 10])
tensorboard --logdir=logs12
Sequential example 1 / Sequential example 2

4.8 Loss Functions and Backpropagation

LOSS
Loss notes / Loss notes 2

L1Loss()

L1LOSS

MSELoss()

MSELOSS

CrossEntropyLoss()

CROSSENTROPYLOSS-1 CROSSENTROPYLOSS-2
CROSSENTROPYLOSS-3

Basic usage

import torch
from torch import nn

inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)

inputs = torch.reshape(inputs, (1, 1, 1, 3))
targets = torch.reshape(targets, (1, 1, 1, 3))

# the default reduction is 'mean'; it can be changed to 'sum'
loss_mae = nn.L1Loss(reduction='mean')
result_mae = loss_mae(inputs, targets)
print(result_mae)

loss_mse = nn.MSELoss()
result_mse = loss_mse(inputs, targets)
print(result_mse)

x = torch.tensor([0.1, 0.2, 0.3])
y = torch.tensor([1])
x = torch.reshape(x, (1, 3))
loss_cross = nn.CrossEntropyLoss()
result_cross = loss_cross(x, y)
print(result_cross)
tensor(0.6667)
tensor(1.3333)
tensor(1.1019)
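
These values can be checked by hand: L1Loss with mean reduction gives (|1-1| + |2-2| + |3-5|) / 3 = 2/3 ≈ 0.6667; MSELoss gives (0 + 0 + 2^2) / 3 = 4/3 ≈ 1.3333; and CrossEntropyLoss for target class 1 is -x[1] + log(exp(x[0]) + exp(x[1]) + exp(x[2])) = -0.2 + log(exp(0.1) + exp(0.2) + exp(0.3)) ≈ 1.1019.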

Example with a model

import torch
import torchvision
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, download=True,
transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

loss = nn.CrossEntropyLoss()
tudui = Tudui()
for data in dataloader:
    imgs, targets = data
    outputs = tudui(imgs)
    result_loss = loss(outputs, targets)
    print(result_loss)
Files already downloaded and verified
tensor(2.3101, grad_fn=<NllLossBackward0>)
tensor(2.3090, grad_fn=<NllLossBackward0>)
tensor(2.3187, grad_fn=<NllLossBackward0>)
tensor(2.3161, grad_fn=<NllLossBackward0>)
tensor(2.3075, grad_fn=<NllLossBackward0>)

Inspecting grad

  • Add the line
result_loss.backward()
  • Set a breakpoint and run the debugger

    Then step over it and watch how the grad attribute changes.

    grad can be found under:

    • tudui
      • model1
        • Protected Attributes
          • _modules
            • ‘0’
              • weight
                • grad
DEBUG: grad / DEBUG: grad 2

4.9 Optimizers

OPTIM

Inspecting data

  • Add the optimizer code below
loss = nn.CrossEntropyLoss()
tudui = Tudui()
optimizer = torch.optim.SGD(tudui.parameters(), lr=0.01)
for data in dataloader:
    imgs, targets = data
    outputs = tudui(imgs)
    result_loss = loss(outputs, targets)
    optimizer.zero_grad()  # do not leave this line out
    result_loss.backward()
    optimizer.step()
DEBUG: optim 1 / DEBUG: optim 2

Look closely at data: the numbers have actually been updated.

The optimizer uses the grad gradients to update each parameter's data, and that is what updates the model.
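
For plain SGD without momentum, that update is simply param.data <- param.data - lr * param.grad for every tensor returned by tudui.parameters().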

Full example

  • We now run multiple epochs and print the summed loss for each epoch
import torch
import torchvision
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, download=True,
transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

loss = nn.CrossEntropyLoss()
tudui = Tudui()
optimizer = torch.optim.SGD(tudui.parameters(), lr=0.01)
for epoch in range(20):
    running_loss = 0.0
    for data in dataloader:
        imgs, targets = data
        outputs = tudui(imgs)
        result_loss = loss(outputs, targets)
        optimizer.zero_grad()  # do not leave this line out
        result_loss.backward()
        optimizer.step()
        running_loss = running_loss + result_loss
    print(running_loss)

Files already downloaded and verified
tensor(360.3722, grad_fn=<AddBackward0>)
tensor(355.9296, grad_fn=<AddBackward0>)
tensor(341.5743, grad_fn=<AddBackward0>)
......

You can see that the loss keeps getting smaller.

5. Existing Network Models

5.1 Saving and Loading Models

Saving

import torch
import torchvision

vgg16 = torchvision.models.vgg16(pretrained=False)

# Saving method 1: save the whole model
torch.save(vgg16, "vgg16_method1.pth")

# Saving method 2 (officially recommended): save only the state_dict
torch.save(vgg16.state_dict(), "vgg16_method2.pth")
Save

Loading

import torch
import torchvision

# Loading method 1
model1 = torch.load("vgg16_method1.pth")
print(model1)

# Loading method 2 (officially recommended)
vgg16 = torchvision.models.vgg16(pretrained=False)
vgg16.load_state_dict(torch.load("vgg16_method2.pth"))
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
......

Note:

  • After saving a model that you trained yourself, when you load it again

    you first need to import the class that defines that model, otherwise it cannot be loaded properly.

from Tudui import *

The content of torch.load("vgg16_method2.pth") above is:

OrderedDict([('features.0.weight', tensor([[[[-0.1135,  0.1008, -0.0002],
[ 0.0986, 0.0091, 0.0706],
[-0.1599, 0.0261, 0.1146]],

[[-0.0203, -0.0541, 0.1053],
[ 0.0094, -0.0109, 0.0597],
[ 0.0154, -0.1059, 0.0040]],

[[ 0.0358, -0.0205, 0.0215],
[-0.0723, 0.0951, 0.1017],
[ 0.0250, -0.0216, -0.1106]]],
......

5.2 Using Existing Network Models

Available MODELS

Usage

import torchvision

# vgg16_false = torchvision.models.vgg16(pretrained=False)
vgg16_true = torchvision.models.vgg16(pretrained=True)

print(vgg16_true)
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to C:\Users\86182/.cache\torch\hub\checkpoints\vgg16-397923af.pth
100.0%
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): ReLU(inplace=True)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(25): ReLU(inplace=True)
(26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(27): ReLU(inplace=True)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace=True)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)

Modifying

import torchvision
from torch import nn

# vgg16_false = torchvision.models.vgg16(pretrained=False)
vgg16_true = torchvision.models.vgg16(pretrained=True)

# print(vgg16_true)

train_data = torchvision.datasets.CIFAR10('./dataset', train=True,
transform=torchvision.transforms.ToTensor(),
download=True)
# Method 1: append a new layer
vgg16_true.classifier.add_module('add_linear', nn.Linear(1000, 10))
print(vgg16_true)

# Method 2: replace an existing layer
# vgg16_true.classifier[6] = nn.Linear(1000, 10)
Files already downloaded and verified
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): ReLU(inplace=True)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(25): ReLU(inplace=True)
(26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(27): ReLU(inplace=True)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace=True)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=1000, bias=True)
(add_linear): Linear(in_features=1000, out_features=10, bias=True)
)
)

In the last line you can see an add_linear entry, which is the layer we added.

6. The Complete Workflow

6.1 Model Training

Structure diagram

CIFAR10 model structure

mymodel.py

import torch
from torch import nn

# Build the network
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, padding=2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, padding=2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(1024, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model(x)
        return x

if __name__ == '__main__':
    tudui = Tudui()
    input = torch.ones((64, 3, 32, 32))
    output = tudui(input)
    print(output.shape)
    # prints torch.Size([64, 10])

train.py

# -*- coding = utf-8 -*-
# @Time : 2024/2/3 0:00
# @Author : Seymour0314 2151713
# @File : train.py
# @Software: PyCharm
import torch
import torchvision
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import time

from mymodel import Tudui

# Define the training device
device = torch.device("cuda")

# Prepare the datasets
train_data = torchvision.datasets.CIFAR10(root="./dataset", train=True,
                                           transform=torchvision.transforms.ToTensor(),
                                           download=True)
test_data = torchvision.datasets.CIFAR10(root="./dataset", train=False,
                                          transform=torchvision.transforms.ToTensor(),
                                          download=True)

# Dataset lengths
train_data_size = len(train_data)
test_data_size = len(test_data)
print("The training dataset has {} samples".format(train_data_size))
print("The test dataset has {} samples".format(test_data_size))

# Load the datasets with DataLoader
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)

# Create the network model
tudui = Tudui()
tudui.to(device)

# Loss function
loss_fn = nn.CrossEntropyLoss()
loss_fn.to(device)

# Optimizer
# learning_rate = 0.01
# 1e-2 = 1 x (10)^(-2) = 0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(), lr=learning_rate)

# Training bookkeeping
# number of training steps
total_train_step = 0
# number of test rounds
total_test_step = 0
# number of epochs
epoch = 10

writer = SummaryWriter("logs13")

start_time = time.time()
for i in range(epoch):
    print("------ Epoch {} training starts ------".format(i+1))

    # Training phase
    # tudui.train()  # not always required; it only matters for certain layers, which this model does not use
    for data in train_dataloader:
        imgs, targets = data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = tudui(imgs)
        loss = loss_fn(outputs, targets)

        # Optimizer step
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_train_step = total_train_step + 1
        if total_train_step % 100 == 0:
            end_time = time.time()
            print("Elapsed: {}".format(end_time-start_time))
            print("Training step: {}, Loss: {}".format(total_train_step, loss.item()))
            writer.add_scalar("train_loss", loss.item(), total_train_step)

    # Evaluation phase
    tudui.eval()  # optional here, but good practice
    test_loss_sum = 0
    total_accuracy = 0
    with torch.no_grad():
        for data in test_dataloader:
            imgs, targets = data
            imgs = imgs.to(device)
            targets = targets.to(device)
            outputs = tudui(imgs)
            loss = loss_fn(outputs, targets)
            test_loss_sum = test_loss_sum + loss
            accuracy = (outputs.argmax(1) == targets).sum()
            total_accuracy = total_accuracy + accuracy

    print("Loss on the whole test set: {}".format(test_loss_sum))
    print("Accuracy on the whole test set: {}".format(total_accuracy/test_data_size))
    writer.add_scalar("test_loss", test_loss_sum, total_test_step)
    writer.add_scalar("test_accuracy", total_accuracy/test_data_size, total_test_step)
    total_test_step = total_test_step + 1

    # Save this epoch's model
    # torch.save(tudui, "tudui_{}".format(i))
    torch.save(tudui.state_dict(), "tudui_{}.pth".format(i))
    print("Model saved")

writer.close()

Running on the CPU

# Use the CPU
device = torch.device("cpu")
The training dataset has 50000 samples
The test dataset has 10000 samples
------ Epoch 1 training starts ------
Elapsed: 2.647998094558716
Training step: 100, Loss: 2.287977933883667
Elapsed: 5.382997751235962
Training step: 200, Loss: 2.2727694511413574
Elapsed: 8.066996812820435
Training step: 300, Loss: 2.231820821762085
Elapsed: 10.812994956970215
Training step: 400, Loss: 2.142338275909424
Elapsed: 13.573992013931274
Training step: 500, Loss: 2.0379247665405273
Elapsed: 16.22099018096924
Training step: 600, Loss: 1.9816762208938599
Elapsed: 18.903990268707275
Training step: 700, Loss: 2.0086615085601807
Loss on the whole test set: 312.3843078613281
Accuracy on the whole test set: 0.28760001063346863
Model saved
------ Epoch 2 training starts ------
......

Running on the GPU

# Use the GPU
device = torch.device("cuda")
device = torch.device("cuda:0")  # equivalent to "cuda"
device = torch.device("cuda:1")
Files already downloaded and verified
Files already downloaded and verified
The training dataset has 50000 samples
The test dataset has 10000 samples
------ Epoch 1 training starts ------
Elapsed: 0.9674310684204102
Training step: 100, Loss: 2.296257972717285
Elapsed: 1.6764259338378906
Training step: 200, Loss: 2.2909674644470215
Elapsed: 2.3930132389068604
Training step: 300, Loss: 2.2711126804351807
Elapsed: 3.1179311275482178
Training step: 400, Loss: 2.226407051086426
Elapsed: 3.8293075561523438
Training step: 500, Loss: 2.0999419689178467
Elapsed: 4.5414087772369385
Training step: 600, Loss: 2.0297019481658936
Elapsed: 5.255408048629761
Training step: 700, Loss: 2.018446922302246
Loss on the whole test set: 315.6116027832031
Accuracy on the whole test set: 0.2709999978542328
Model saved
------ Epoch 2 training starts ------
......

TensorBoard display

tensorboard --logdir=logs13
test_accuracy test_loss

Notes:

  • Using the GPU:

    you can also guard each object individually, as below, but the device approach used above is cleaner

# Instead of the .to(device) calls
# Three kinds of objects need to be moved: 1. the network model, 2. the loss function, 3. the data (after unpacking each batch)
if torch.cuda.is_available():
    tudui = tudui.cuda()

if torch.cuda.is_available():
    loss_fn = loss_fn.cuda()

if torch.cuda.is_available():
    imgs = imgs.cuda()
    targets = targets.cuda()
  • Explanation
Accuracy explanation
  • The argmax function

argmax argument 0 / argmax argument 1

# The first one gives
tensor([0, 1])

# The second one gives
tensor([1, 1])
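
The screenshots above are not reproduced here, so here is a minimal sketch with made-up values (chosen only so that they are consistent with the two results shown) of how the dim argument changes argmax:

import torch

# Hypothetical 2x2 score matrix, for illustration only
outputs = torch.tensor([[0.3, 0.4],
                        [0.2, 0.5]])

print(outputs.argmax(0))  # argmax down each column: tensor([0, 1])
print(outputs.argmax(1))  # argmax along each row:   tensor([1, 1])

# In the training loop, outputs has shape [batch, 10], so outputs.argmax(1)
# picks the predicted class for every sample in the batch.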

6.2 Model Validation

class_to_idx

Get a test image

  • A screenshot of an image taken from the web
dog

Run the test

import torch
import torchvision
from PIL import Image

from mymodel import Tudui

image_path = "images/dog.png"
image = Image.open(image_path)
# PNG has four channels (RGBA); convert it to three-channel RGB
image = image.convert('RGB')
print(image)

trans = torchvision.transforms.Compose([
    torchvision.transforms.Resize((32, 32)),
    torchvision.transforms.ToTensor()
])
image = trans(image)
print(image.shape)

model = Tudui()
model.load_state_dict(torch.load("tudui_7.pth"))
print(model)

image = torch.reshape(image, (1, 3, 32, 32))
model.eval()
with torch.no_grad():
    output = model(image)
print(output)
print(output.argmax(1))
<PIL.Image.Image image mode=RGB size=213x230 at 0x17E8A0F2C90>
torch.Size([3, 32, 32])
Tudui(
(model): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Flatten(start_dim=1, end_dim=-1)
(7): Linear(in_features=1024, out_features=64, bias=True)
(8): Linear(in_features=64, out_features=10, bias=True)
)
)
tensor([[-1.2757, -5.4262, 2.5295, 1.7245, 4.3046, 2.5135, 3.9746, 3.1648,
-5.5898, -5.7073]])
tensor([4])
  • Index 4 corresponds to deer,

    so the prediction is wrong.

  • But the score for class 5 (dog) is also 2.51, so it is not wildly off.

    The model simply has not been trained enough; more training would raise the accuracy.

  • Let's try the image below instead

    dog3
<PIL.Image.Image image mode=RGB size=272x250 at 0x24A29D1AA50>
torch.Size([3, 32, 32])
Tudui(
(model): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(4): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Flatten(start_dim=1, end_dim=-1)
(7): Linear(in_features=1024, out_features=64, bias=True)
(8): Linear(in_features=64, out_features=10, bias=True)
)
)
tensor([[-2.6350, -2.5190, -1.2866, 2.8444, 0.6324, 3.7176, 2.9099, 1.6787,
-3.9632, -1.8650]])
tensor([5])

This time the prediction is correct!

6.3* Using Colab to Run on a GPU

colab
  • You need to sign in with a Google account to use the GPU

  • Create a new notebook

    • Edit

      • Notebook settings

        • Hardware accelerator

          choose GPU and save

  • The editing format is similar to Jupyter Notebook

# Write the code
import torch
print(torch.__version__)
print(torch.cuda.is_available())

# Shell command inside a cell: check the GPU device
!nvidia-smi