Denoising algorithms: evaluation metrics, parameter counting, and a CBAM review
Denoising review:
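The trials made can be found on GitHub.
Assessment metrics:
PSNR: peak signal-to-noise ratio
Note that MSE is the mean squared error (already averaged over all pixels), and MAX is the maximum pixel value of the image, which is 255 for a typical 8-bit image.
PSNR is measured in dB.
SSIM: structural similarity index
where $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, $C_3 = C_2/2$, and generally $K_1 = 0.01$, $K_2 = 0.03$, $L = 255$.
$SSIM = L(X,Y) \cdot C(X,Y) \cdot S(X,Y)$.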
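As a rough illustration of the two metrics, the NumPy sketch below computes PSNR from the MSE and a simplified SSIM that treats the whole image as a single window. Reference SSIM implementations average the index over local windows, so their values will differ. The function names `psnr` and `global_ssim` are hypothetical, and 8-bit images (MAX = L = 255) are assumed.

```python
import numpy as np

def psnr(x, y, max_val=255.0):
    """PSNR in dB between two images of the same shape."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = np.mean((x - y) ** 2)  # mean squared error, already averaged
    if mse == 0:
        return float("inf")      # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def global_ssim(x, y, max_val=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM (assumption: statistics taken over the whole image)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * max_val) ** 2
    c2 = (k2 * max_val) ** 2
    c3 = c2 / 2.0
    ux, uy = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    sxy = np.mean((x - ux) * (y - uy))                          # covariance
    lum = (2 * ux * uy + c1) / (ux ** 2 + uy ** 2 + c1)         # luminance term
    con = (2 * sx * sy + c2) / (sx ** 2 + sy ** 2 + c2)         # contrast term
    struct = (sxy + c3) / (sx * sy + c3)                        # structure term
    return lum * con * struct
```

Comparing an image with itself gives an infinite PSNR and an SSIM of 1, which is a quick sanity check for either function.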
Note some basic np operations:
np.split(a,4,0)
np.vstack
np.hstack, etc.
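A small sketch of how these operations fit together, for instance when cutting an image into strips and stitching it back. The array `a` is only a toy stand-in for an image.

```python
import numpy as np

a = np.arange(16).reshape(4, 4)   # toy 4x4 "image"

# Split into 4 equal parts along axis 0 (rows); each part has shape (1, 4).
rows = np.split(a, 4, 0)
# Stack the parts vertically to recover the original array.
restored_rows = np.vstack(rows)

# Split into 4 equal parts along axis 1 (columns); each part has shape (4, 1).
cols = np.split(a, 4, 1)
# Stack the parts horizontally to recover the original array.
restored_cols = np.hstack(cols)
```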
parameters & memory calculation
Note:
Total GPU memory = memory for the model + memory for the layer outputs
At training time the layer outputs are counted twice (forward and backward, x2).
Example: VGG16, memory of params: 528 MB, memory of layer outputs: 58.12 MB/image
When training:
SGD + momentum, batch size = 128
memory for model:
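Formulas for the PSNR and SSIM metrics defined above:
$PSNR = 10 \log_{10} \frac{(2^n - 1)^2}{MSE} = 20 \log_{10} \frac{MAX}{\sqrt{MSE}}$
$L(X,Y) = \frac{2 u_X u_Y + C_1}{u_X^2 + u_Y^2 + C_1}$, $C(X,Y) = \frac{2 \sigma_X \sigma_Y + C_2}{\sigma_X^2 + \sigma_Y^2 + C_2}$, $S(X,Y) = \frac{\sigma_{XY} + C_3}{\sigma_X \sigma_Y + C_3}$
$C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, $C_3 = C_2/2$, $K_1 = 0.01$, $K_2 = 0.03$, $L = 255$
$SSIM = L(X,Y) \cdot C(X,Y) \cdot S(X,Y)$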
1. 528 MB * 3 = 1.54 GB
1 copy for the params, 1 for SGD (the gradients), 1 for the momentum
If you use Adam, multiply by 4 instead
2. Memory for outputs:
128 * 58.12 MB * 2 = 14.53 GB
3. Total:
14.53 + 1.54 = 16.07 GB
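The arithmetic above can be wrapped in a few lines of Python. This is only a sketch of the estimate used in these notes: the function name `training_memory_gb` and its parameters are hypothetical, and it ignores framework overhead and workspace buffers.

```python
def training_memory_gb(param_mb, act_mb_per_image, batch_size, state_copies=3):
    """Estimate training memory in GB.

    state_copies: copies of parameter-sized tensors kept by the optimizer
    (3 for SGD + momentum, 4 for Adam, following the notes above).
    """
    model_mb = param_mb * state_copies
    # Layer outputs are counted twice: once for forward, once for backward.
    outputs_mb = act_mb_per_image * batch_size * 2
    total_mb = model_mb + outputs_mb
    return model_mb / 1024, outputs_mb / 1024, total_mb / 1024

# VGG16 example from the notes: 528 MB of params, 58.12 MB/image, batch size 128.
model_gb, outputs_gb, total_gb = training_memory_gb(528, 58.12, 128)
```

With these inputs the model part comes out at roughly 1.547 GB; the notes truncate it to 1.54 GB, which is why the hand-computed total differs in the last digit.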
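FLOPS:
Conv = params * H * W (H, W: size of the output feature map)
Linear = params
TFLOPS (sometimes written simply as T/s) measures throughput and means "one trillion floating-point operations per second"; it is a common measure of a computer's performance. 1 TFLOPS = 1000 GFLOPS, i.e. 1 T = 1000 G.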
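The two counting rules for Conv and Linear layers can be sketched in plain Python. The helper names are hypothetical, one multiply-accumulate is counted as one operation, and H and W are taken to be the size of the output feature map.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Number of weights (and biases) in a k x k convolution."""
    return c_out * (c_in * k * k + (1 if bias else 0))

def conv2d_flops(c_in, c_out, k, h_out, w_out, bias=True):
    """Conv rule from the notes: params * H * W of the output map."""
    return conv2d_params(c_in, c_out, k, bias) * h_out * w_out

def linear_params(n_in, n_out, bias=True):
    """Number of weights (and biases) in a fully connected layer."""
    return n_out * (n_in + (1 if bias else 0))

def linear_flops(n_in, n_out, bias=True):
    """Linear rule from the notes: one operation per parameter."""
    return linear_params(n_in, n_out, bias)

# Example: a 3 -> 64 channel 3x3 convolution with a 224 x 224 output map.
first_conv = conv2d_flops(3, 64, 3, 224, 224)
# Example: a 4096 -> 1000 fully connected layer.
last_fc = linear_flops(4096, 1000)
```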
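from torchstat import stat
import torchvision.models as models
model = models.vgg16()  # example model only; any nn.Module can be used
stat(model, (3, 224, 224))  # example input size as (C, H, W), without the batch dimension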
1. Residual Dense Network for Image Super-Resolution
Here, LR represents Low Resolution Images.
upscale:
sub-pixel convolution
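Sub-pixel convolution upscales by letting a convolution produce r*r times more channels and then rearranging those channels into space. The NumPy sketch below shows only the rearrangement step; the function name `pixel_shuffle` is chosen here for illustration, and the channel ordering is assumed to follow the convention of PyTorch's `nn.PixelShuffle`.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Output pixel (c, h*r + i, w*r + j) is taken from input channel
    c*r*r + i*r + j at position (h, w).
    """
    c_rr, h, w = x.shape
    assert c_rr % (r * r) == 0, "channel count must be divisible by r*r"
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)      # (C, i, j, H, W)
    x = x.transpose(0, 3, 1, 4, 2)    # (C, H, i, W, j)
    return x.reshape(c, h * r, w * r)

# Four 2x2 low-resolution channels become one 4x4 high-resolution channel.
lr = np.arange(16).reshape(4, 2, 2)
hr = pixel_shuffle(lr, 2)
```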
The best position for CBAM is after the up-sampling layer, which gives a gain of about 0.0x dB. If the patch size is increased, performance goes down.
GRDN:
… yeah, concatenate , concatenate , concatenate .
CBAM:
The channel attention is computed as:
$M_c(F) = \sigma(MLP(AvgPool(F)) + MLP(MaxPool(F)))$
where $\sigma$ denotes the sigmoid function; the shared MLP uses a ReLU activation after its first layer.
The spatial attention is computed as:
$M_s(F) = \sigma(f^{7\times 7}([AvgPool(F); MaxPool(F)])) = \sigma(f^{7\times 7}([F^s_{avg}; F^s_{max}]))$
where $F^s_{avg}, F^s_{max} \in R^{1\times H\times W}$.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, in_planes, ratio=16):
        super(ChannelAttention, self).__init__()
        # Global pooling over the spatial dimensions: (N, C, H, W) -> (N, C, 1, 1)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        # Shared MLP implemented with 1x1 convolutions
        self.fc1 = nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x))))
        max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x))))
        out = avg_out + max_out
        return self.sigmoid(out)
Channel attention: $C \times H \times W$ -> pool -> $C \times 1 \times 1$
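class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super(SpatialAttention, self).__init__()
        assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
        padding = 3 if kernel_size == 7 else 1
        # 2 input channels (avg and max maps), 1 output channel (attention map)
        self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # Pool along the channel dimension: (N, C, H, W) -> (N, 1, H, W)
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        x = torch.cat([avg_out, max_out], dim=1)
        x = self.conv1(x)
        return self.sigmoid(x)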
Across channels, use torch.mean or torch.max (with dim=1) to implement the average or max pooling of the channel information.
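To make the two pooling directions concrete, the NumPy sketch below computes only the pooled descriptors and their shapes for a single feature map. It leaves out the MLP, the convolution and the sigmoid, and the array `feat` is random data used purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 5, 5))   # one feature map, shape (C, H, W)

# Channel attention descriptors: pool over the spatial axes -> (C, 1, 1).
chan_avg = feat.mean(axis=(1, 2), keepdims=True)
chan_max = feat.max(axis=(1, 2), keepdims=True)

# Spatial attention descriptors: pool over the channel axis -> (1, H, W).
spat_avg = feat.mean(axis=0, keepdims=True)
spat_max = feat.max(axis=0, keepdims=True)

# The two spatial maps are concatenated along the channel axis -> (2, H, W),
# which is what the 7x7 convolution takes as input.
spat_in = np.concatenate([spat_avg, spat_max], axis=0)
```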