harry's blog

1. Conv

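We start with standard convolution. Suppose we have a $6 \times 4 \times 4$ feature map and want a $10 \times 3 \times 3$ output. With a plain convolution we need 10 filters (a filter here means the full 3-D kernel, channel dimension included), each of size $6 \times 2 \times 2$. The code and the computation are shown below.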

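import torch
import torch.nn as nn

# 10 filters of size 6 x 2 x 2; groups=1 means every filter sees all 6 input channels
conv = nn.Conv2d(6, 10, kernel_size=2, stride=1, padding=0, bias=False, groups=1)
x = torch.ones((1, 6, 4, 4))  # (batch, channels, height, width)
output = conv(x)
print(output.size())
'''
torch.Size([1, 10, 3, 3])
'''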
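The $3 \times 3$ output size follows from the usual convolution size formula for dilation 1. A minimal sketch in plain Python; `conv_out_size` is a hypothetical helper name used only for illustration, not a PyTorch function:

```python
def conv_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension (dilation = 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# The example in this post: 4x4 input, 2x2 kernel, stride 1, no padding.
out_h = conv_out_size(4, kernel_size=2, stride=1, padding=0)
out_w = conv_out_size(4, kernel_size=2, stride=1, padding=0)
print(out_h, out_w)  # 3 3
```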

Figure: Normal Convolution

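Parameters and computation

Parameters: 10 output channels, 6 input channels, kernel size $2 \times 2$
Computation (counted as multiplications): 10 output channels, 6 input channels, kernel size $2 \times 2$, output map $3 \times 3$

Parameters: $10 \times 6 \times 2 \times 2 = 240$
Computation: $10 \times 6 \times 3 \times 3 \times 2 \times 2 = 2160$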
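The same counting can be written as a small function. This is only a sketch in plain Python: `conv_cost` is a hypothetical helper (not part of PyTorch), it assumes no bias, and it counts one multiplication per weight per output position. The `groups` argument mirrors the `groups` parameter passed to `nn.Conv2d` in the code above.

```python
def conv_cost(c_in, c_out, kernel_size, out_h, out_w, groups=1):
    """Return (params, multiplications) for a bias-free 2-D convolution."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    # Each filter only sees c_in // groups input channels.
    weights_per_filter = (c_in // groups) * kernel_size * kernel_size
    params = c_out * weights_per_filter
    # Every output position of every output channel applies one full filter.
    mults = params * out_h * out_w
    return params, mults


print(conv_cost(6, 10, kernel_size=2, out_h=3, out_w=3))  # (240, 2160)
```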

2. Group Conv

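Group convolution reduces the number of parameters. Take the same $6 \times 4 \times 4$ feature map and the same target of $10 \times 3 \times 3$, and set the number of groups to 2. Each filter then covers only 3 input channels, so its size is $3 \times 2 \times 2$, and there are $5 + 5 = 10$ filters in total. The code and the computation are shown below.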

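import torch
import torch.nn as nn

# groups=2: two groups of 5 filters, each filter of size 3 x 2 x 2
group_conv = nn.Conv2d(6, 10, kernel_size=2, stride=1, padding=0, bias=False, groups=2)
x = torch.ones((1, 6, 4, 4))  # (batch, channels, height, width)
output = group_conv(x)
print(output.size())
'''
torch.Size([1, 10, 3, 3])
'''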

Figure: Group Convolution

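Parameters and computation

Parameters: 10 output channels (2 groups of 5), 6 input channels (2 groups of 3); the filters fall into two groups, each filter has size $3 \times 2 \times 2$, with a $2 \times 2$ spatial kernel
Computation (counted as multiplications): per group, 5 output channels and 3 input channels, kernel size $2 \times 2$, output map $3 \times 3$; 2 groups in total

Parameters: $(5 + 5) \times 3 \times 2 \times 2 = 120$
Computation: $2 \times (5 \times 3 \times 3 \times 3 \times 2 \times 2) = 1080$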
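To make the grouping concrete, the sketch below lists which input channels each output channel reads. It assumes the channels are split into contiguous blocks, which is how the `groups` argument of PyTorch's `Conv2d` is documented to behave; `input_channels_seen` is a hypothetical helper written only for this illustration.

```python
def input_channels_seen(out_channel, c_in, c_out, groups):
    """Input-channel indices that one output channel reads in a grouped conv."""
    in_per_group = c_in // groups
    out_per_group = c_out // groups
    group = out_channel // out_per_group  # which group this output channel is in
    start = group * in_per_group
    return list(range(start, start + in_per_group))


for oc in (0, 4, 5, 9):
    print(oc, input_channels_seen(oc, c_in=6, c_out=10, groups=2))
# 0 [0, 1, 2]
# 4 [0, 1, 2]
# 5 [3, 4, 5]
# 9 [3, 4, 5]
```

Output channels 0 to 4 never touch input channels 3 to 5, which is why each filter needs only 3 channels of weights instead of 6.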

3. Depthwise Separable Convolution

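Depthwise separable convolution is the building block of MobileNet, a model Google published on arXiv in 2017.

MobileNet's main goal is to keep performance while significantly reducing parameters and computation. A depthwise separable convolution has two steps: a depthwise conv followed by a pointwise conv.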

3.1 Depthwise Conv

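In the depthwise step each kernel is responsible for exactly one channel, so the convolution only happens within a 2-D plane. With one kernel per channel, this step cannot increase the number of channels.

Continuing the example above: starting from the $6 \times 4 \times 4$ feature map, the depthwise step cannot add channels, so all we can get is a $6 \times 3 \times 3$ map.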

Figure: Depthwise Convolution

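Parameters and computation

Parameters: 6 output channels, 6 input channels, each filter has size $1 \times 2 \times 2$, with a $2 \times 2$ spatial kernel
Computation (counted as multiplications): 6 output channels, 6 input channels, kernel size $2 \times 2$, output map $3 \times 3$

Parameters: $6 \times 1 \times 2 \times 2 = 24$
Computation: $6 \times 3 \times 3 \times 2 \times 2 = 216$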
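The depthwise step can also be spelled out without any framework. The following is a naive sketch on nested lists (stride 1, no padding), written only to show that kernel `c` touches channel `c` and nothing else; `depthwise_conv` is a hypothetical name, and a real model would use `nn.Conv2d` with `groups` equal to the channel count.

```python
def depthwise_conv(x, kernels):
    """Naive depthwise convolution, stride 1, no padding.

    x:       list of C feature maps, each H x W (nested lists)
    kernels: list of C kernels, each k x k; kernel c is applied only to channel c
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        h_out = len(channel) - k + 1
        w_out = len(channel[0]) - k + 1
        out.append([
            [
                sum(channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k) for dj in range(k))
                for j in range(w_out)
            ]
            for i in range(h_out)
        ])
    return out


x = [[[1.0] * 4 for _ in range(4)] for _ in range(6)]        # 6 x 4 x 4, all ones
kernels = [[[1.0] * 2 for _ in range(2)] for _ in range(6)]  # 6 kernels of 2 x 2
y = depthwise_conv(x, kernels)
print(len(y), len(y[0]), len(y[0][0]))  # 6 3 3
print(y[0][0][0])                       # 4.0
```

Each of the $6 \times 3 \times 3 = 54$ output values costs $2 \times 2 = 4$ multiplications, which gives the 216 above.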

3.2 Pointwise Conv

We want a $10 \times 3 \times 3$ feature map, but the depthwise step only gave us $6 \times 3 \times 3$. We now apply a pointwise operation with $1 \times 1$ kernels: each filter has size $6 \times 1 \times 1$, and there are 10 of them.

Figure: Pointwise Convolution

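Parameters and computation

Parameters: 10 output channels, 6 input channels, each filter has size $6 \times 1 \times 1$, with a $1 \times 1$ spatial kernel
Computation (counted as multiplications): 10 output channels, 6 input channels, kernel size $1 \times 1$, output map $3 \times 3$

Parameters: $10 \times 6 \times 1 \times 1 = 60$
Computation: $10 \times 6 \times 3 \times 3 \times 1 \times 1 = 540$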
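A $1 \times 1$ convolution is just a weighted sum across channels at every pixel. Here is a naive sketch on nested lists; `pointwise_conv` and the constant weights of 0.5 are made up for illustration and are not taken from any library.

```python
def pointwise_conv(x, weights):
    """Naive 1 x 1 convolution.

    x:       list of C_in feature maps, each H x W (nested lists)
    weights: C_out rows of C_in numbers, one row per output channel
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(wc * x[c][i][j] for c, wc in enumerate(row)) for j in range(w)]
            for i in range(h)
        ]
        for row in weights
    ]


x = [[[1.0] * 3 for _ in range(3)] for _ in range(6)]  # 6 x 3 x 3, all ones
weights = [[0.5] * 6 for _ in range(10)]               # 10 filters of size 6 x 1 x 1
y = pointwise_conv(x, weights)
print(len(y), len(y[0]), len(y[0][0]))  # 10 3 3
print(y[0][0][0])                       # 3.0
```

The spatial size is untouched; only the channel count changes, from 6 to 10.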

The code for the two steps above is as follows

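import torch
import torch.nn as nn

# Depthwise: groups=6 gives one 2 x 2 kernel per input channel
depthwise = nn.Conv2d(6, 6, kernel_size=2, stride=1, padding=0, bias=False, groups=6)
# Pointwise: 10 filters of size 6 x 1 x 1 mix the channels
pointwise = nn.Conv2d(6, 10, kernel_size=1, stride=1, padding=0, bias=False, groups=1)
x = torch.ones((1, 6, 4, 4))  # (batch, channels, height, width)
output = depthwise(x)
print(output.size())
output = pointwise(output)
print(output.size())
'''
torch.Size([1, 6, 3, 3])
torch.Size([1, 10, 3, 3])
'''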

Summary: feature map $6 \times 4 \times 4 \rightarrow 10 \times 3 \times 3$

| Model | Params | FLOPs |
| --- | --- | --- |
| Normal Conv | 240 | 2160 |
| Group Conv | 120 | 1080 |
| Separable Conv | 24 + 60 = 84 | 216 + 540 = 756 |
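The saving of the separable version over normal convolution can be checked in general. A short derivation follows, using symbols that do not appear in the original post: $C_{in}$ and $C_{out}$ for the input and output channel counts, $k$ for the kernel size, and $H \times W$ for the output map size. It assumes the same counting as above (no bias, one multiplication per weight per output position).

$$
\frac{C_{in} \, k^2 \, H W + C_{in} \, C_{out} \, H W}{C_{in} \, C_{out} \, k^2 \, H W} = \frac{1}{C_{out}} + \frac{1}{k^2}
$$

For this post's example, $\frac{1}{10} + \frac{1}{4} = 0.35$, which agrees with $(216 + 540) / 2160 = 0.35$ for computation. Dropping the $H W$ factor gives the same ratio for parameters: $(24 + 60) / 240 = 0.35$.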

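This article was written by Yonghui Wang and is licensed under the Creative Commons Attribution 4.0 International License
Unless marked as reposted or sourced elsewhere, articles on this site are original works or translations by this site; please give attribution before reposting
Last edited: Dec 19, 2024 12:13 pm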
