The CAM pipeline, implemented in PyTorch
Updated 2021/2/28
I previously wrote a simplified version () of this visualization; the simplified version did not take the relationships between channels into account. This post walks through the CAM pipeline.
The next post:
Contents
Flowchart
Algorithm overview
1. Feed the image you want to visualize into the network and get its predicted class
2. Grab the output feature maps of the last convolutional layer
3. Use the predicted class to pick the corresponding weights, weight each channel of the feature maps with them, and sum the channels into a single-channel map
An example
Suppose we feed an image through the network and it is predicted to be class 500 (out of 1000 classes). The captured feature maps have shape (1, 512, 13, 13), and assume the classification head consists of a 1 x 1 convolution (which counts as part of the classification head here, not as the last convolutional layer) followed by global average pooling. The 1000 classes then come with 1000 sets of weights, i.e. there are 1000 different ways to weight the feature maps. Each set of weights attends to different things, which is why we need to know which class the image belongs to. Once we know it is class 500, we simply take the 500th class's weights and apply them to the feature maps.
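To make the weighting step concrete, here is a minimal sketch with dummy data; the shapes follow the example above, and the array names are made up for illustration (the real code appears in the walkthrough below):

import numpy as np

# dummy stand-ins for the real tensors
feature_map = np.random.rand(1, 512, 13, 13)   # output of the last conv layer
fc_weights  = np.random.rand(1000, 512)        # one 512-dim weight vector per class
class_idx   = 500                              # the predicted class

# weighted sum over the 512 channels -> a single 13 x 13 activation map
cam = fc_weights[class_idx].dot(feature_map.reshape(512, 13 * 13))
cam = cam.reshape(13, 13)
print(cam.shape)   # (13, 13)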
CAM comes with one constraint: it relies on a global average pooling operation. If the network ends with several fully connected layers, CAM no longer applies. Take VGG16, for example: the last convolutional layer is followed by three fully connected layers, and since the convolutional feature maps have to be flattened before entering them, after three fully connected layers it is hard to trace the relationship between channels, so the importance weight of each feature-map channel can no longer be computed. In that case you need the Grad-CAM algorithm instead.
Code walkthrough
1. Import the packages and read the class labels
from PIL import Image
import torch
from torchvision import models, transforms
from torch.autograd import Variable
import torch.nn.functional as F
import numpy as np
import cv2
import json

# read the class labels of the ImageNet dataset
json_path = './cam/labels.json'
with open(json_path, 'r') as load_f:
    load_json = json.load(load_f)
classes = {int(key): value for (key, value) in load_json.items()}
2. Read the image and preprocess it
# read an image from the ImageNet dataset
img_path = './cam/9933031-large.jpg'

normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)
# image preprocessing
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalize
])
img_pil = Image.open(img_path)
img_tensor = preprocess(img_pil)
img_variable = Variable(img_tensor.unsqueeze(0))
3. Load a pretrained model
# load a pretrained model
model_id = 1
if model_id == 1:
    net = models.squeezenet1_1(pretrained=False)
    pthfile = r'./pretrained/squeezenet1_1-f364aa15.pth'
    net.load_state_dict(torch.load(pthfile))
    finalconv_name = 'features'  # module that produces the conv feature maps
elif model_id == 2:
    net = models.resnet18(pretrained=False)
    finalconv_name = 'layer4'
elif model_id == 3:
    net = models.densenet161(pretrained=False)
    finalconv_name = 'features'
net.eval()  # switch to evaluation mode
print(net)
I only downloaded the weights for squeezenet1_1; if you want to use the other two models, adapt the code in the same way.
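If you would rather not keep a local .pth file, a simpler option (assuming you have network access) is to let torchvision download the weights for you:

net = models.squeezenet1_1(pretrained=True)  # downloads the weights on first use
finalconv_name = 'features'
net.eval()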
The printed model:
SqueezeNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2))
    (1): ReLU(inplace)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
    (3): Fire(
      (squeeze): Conv2d(64, 16, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(16, 64, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(16, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (4): Fire(
      (squeeze): Conv2d(128, 16, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(16, 64, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(16, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
    (6): Fire(
      (squeeze): Conv2d(128, 32, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(32, 128, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(32, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (7): Fire(
      (squeeze): Conv2d(256, 32, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(32, 128, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(32, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (8): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
    (9): Fire(
      (squeeze): Conv2d(256, 48, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(48, 192, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(48, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (10): Fire(
      (squeeze): Conv2d(384, 48, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(48, 192, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(48, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (11): Fire(
      (squeeze): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
    (12): Fire(
      (squeeze): Conv2d(512, 64, kernel_size=(1, 1), stride=(1, 1))
      (squeeze_activation): ReLU(inplace)
      (expand1x1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
      (expand1x1_activation): ReLU(inplace)
      (expand3x3): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (expand3x3_activation): ReLU(inplace)
    )
  )
  (classifier): Sequential(
    (0): Dropout(p=0.5)
    (1): Conv2d(512, 1000, kernel_size=(1, 1), stride=(1, 1))
    (2): ReLU(inplace)
    (3): AdaptiveAvgPool2d(output_size=(1, 1))
  )
)
You can see that the feature-extraction part lives in (features) and the classification head in (classifier).
4. Get the feature maps
features_blobs = []   # will hold the captured feature maps

def hook_feature(module, input, output):
    features_blobs.append(output.data.cpu().numpy())

# capture the output of the `features` module
net._modules.get(finalconv_name).register_forward_hook(hook_feature)
register_forward_hook lets you capture the output of an intermediate layer; look it up if you want more details. A minimal standalone sketch is shown below.
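Here is a minimal self-contained sketch of how a forward hook behaves (a dummy module, not part of the CAM code):

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
outputs = []

def hook(module, input, output):
    # called every time `conv` runs a forward pass
    outputs.append(output.detach())

handle = conv.register_forward_hook(hook)
_ = conv(torch.randn(1, 3, 13, 13))
print(outputs[0].shape)   # torch.Size([1, 8, 13, 13])
handle.remove()           # remove the hook once it is no longer needed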
5. Get the weights
# get the weights
net_name = []
params = []
for name, param in net.named_parameters():
    net_name.append(name)
    params.append(param)
print(net_name[-1], net_name[-2])  # classifier.1.bias classifier.1.weight
print(len(params))                 # 52
weight_softmax = np.squeeze(params[-2].data.numpy())  # shape: (1000, 512)
params holds every weight of the model, so how do we index the one we need? Look back at the printed model: pooling, dropout and ReLU layers carry no parameters, so counting only the convolution layers (each contributes a weight and a bias) gives 52 parameter tensors in total. The weights connecting the features module to the classifier module are the parameters of (1): Conv2d(512, 1000, kernel_size=(1, 1), stride=(1, 1)) inside classifier. You can also check the printed net_name: index -1 is the bias of classifier.1 and index -2 is its weight. The parameter we want is therefore the one at index -2; a couple of equivalent ways to fetch it directly are sketched below.
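Instead of counting positions in params, you could also (a hedged alternative, not part of the original code) grab the same tensor directly from the classifier module or the state dict:

# two equivalent ways to get the (1000, 512, 1, 1) weight of classifier.1
w1 = net.classifier[1].weight.data.numpy()
w2 = net.state_dict()['classifier.1.weight'].numpy()
weight_softmax = np.squeeze(w1)   # shape: (1000, 512)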
logit = net(img_variable)             # forward the input image through the network
print(logit.shape)                    # torch.Size([1, 1000])
print(params[-2].data.numpy().shape)  # 1000 sets of weights: (1000, 512, 1, 1)
print(features_blobs[0].shape)        # feature map size: (1, 512, 13, 13)

# there are 1000 classes; sort the scores and keep the sorted indices
h_x = F.softmax(logit, dim=1).data.squeeze()
print(h_x.shape)                      # torch.Size([1000])
probs, idx = h_x.sort(0, True)
probs = probs.numpy()                 # probabilities in descending order
idx = idx.numpy()                     # class indices, highest probability first
# look at the top-5 classes and their probabilities
for i in range(0, 5):
    print('{:.3f} -> {}'.format(probs[i], classes[idx[i]]))
'''
0.678 -> mountain bike, all-terrain bike, off-roader
0.088 -> bicycle-built-for-two, tandem bicycle, tandem
0.042 -> unicycle, monocycle
0.038 -> horse cart, horse-cart
0.019 -> lakeside, lakeshore
'''
6. Define the function that computes the CAM
# define the function that computes the CAM
def returnCAM(feature_conv, weight_softmax, class_idx):
    # upsample the class activation map to 256 x 256
    size_upsample = (256, 256)
    bz, nc, h, w = feature_conv.shape
    output_cam = []
    # apply the weights to the conv feature maps:
    # weight_softmax.shape is (1000, 512)
    # feature_conv.shape is (1, 512, 13, 13)
    # weight_softmax[class_idx] selects a single class's weights, so its shape is (1, 512)
    # after reshape((nc, h * w)), feature_conv has shape (512, 169)
    cam = weight_softmax[class_idx].dot(feature_conv.reshape((nc, h * w)))
    print(cam.shape)         # after the matrix product every channel has been weighted; shape is (1, 169)
    cam = cam.reshape(h, w)  # a single-channel feature map
    # normalize all elements of the map to 0-1
    cam_img = (cam - cam.min()) / (cam.max() - cam.min())
    # then rescale to 0-255
    cam_img = np.uint8(255 * cam_img)
    output_cam.append(cv2.resize(cam_img, size_upsample))
    return output_cam
7. Generate the images
# generate the class activation map for the highest-probability class
CAMs = returnCAM(features_blobs[0], weight_softmax, [idx[0]])
# blend the class activation map with the original image
img = cv2.imread(img_path)
height, width, _ = img.shape
heatmap = cv2.applyColorMap(cv2.resize(CAMs[0], (width, height)), cv2.COLORMAP_JET)
result = heatmap * 0.3 + img * 0.7
cv2.imwrite('CAM0.jpg', result)
I will not repeat what cv2.applyColorMap does here; it was covered in the previous post.
# generate the class activation map for the fifth-ranked class
CAMs = returnCAM(features_blobs[0], weight_softmax, [idx[4]])
# blend the class activation map with the original image
img = cv2.imread(img_path)
height, width, _ = img.shape
heatmap = cv2.applyColorMap(cv2.resize(CAMs[0], (width, height)), cv2.COLORMAP_JET)
result = heatmap * 0.3 + img * 0.7
cv2.imwrite('CAM1.jpg', result)