Interpretability of Deep Neural Networks: Visualizing Convolution Kernels, Weights, and Activations (PyTorch + TensorBoard)
Introduction
Deep neural networks have long been regarded as powerful "black boxes" with weak interpretability. Currently, one typical and widely used approach to interpretability analysis is visualization.
This article collects visualization techniques commonly used during deep neural network training, to help analyze and inspect the training process.
Convolution Kernel Visualization
Taking resnet18 as an example, we extract the first-layer convolution kernels (7x7) and visualize them. Most of them clearly capture low-level visual features such as edges and corners.
The convolutional layers just before the fully connected layer use 3x3 kernels, which encode more abstract, high-level semantic information:
Here torchvision.utils.make_grid is used to display the kernels in a grid; the number of columns in the grid is controlled by the nrow parameter.
The kernel-visualization code, adapted from a reference implementation, is as follows:
import torchvision

def plot_conv(writer, model):
    for name, param in model.named_parameters():
        if 'conv' in name and 'weight' in name:
            in_channels = param.size()[1]   # number of input channels
            out_channels = param.size()[0]  # number of output channels
            k_h, k_w = param.size()[2], param.size()[3]  # kernel size
            # flatten each (output, input) channel pair into a single-channel image
            kernel_all = param.view(-1, 1, k_h, k_w)
            kernel_grid = torchvision.utils.make_grid(
                kernel_all, normalize=True, scale_each=True, nrow=in_channels)
            writer.add_image(f'{name}_all', kernel_grid, global_step=0)
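A minimal usage sketch (the pretrained resnet18 and the SummaryWriter log directory here are assumptions for illustration, not part of the original code):

import torchvision
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir='runs/kernel_vis')  # hypothetical log directory
model = torchvision.models.resnet18(pretrained=True)
plot_conv(writer, model)  # writes one kernel grid per convolutional layer
writer.close()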
Parameter Histogram Visualization
Histograms provide an intuitive view of the parameter distribution of each layer, which makes it easier to analyze how well the model's parameters are being learned.
The parameter distribution of the fully connected layer is shown below:
Sample code:
def plot_param_hist(writer, model):
    for name, param in model.named_parameters():
        writer.add_histogram(f"{name}", param, 0)
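The snippet above logs everything at global step 0; in practice the histograms are usually logged once per epoch so TensorBoard can show how the distributions evolve over training. A sketch under that assumption (the training loop and its helper are hypothetical):

def plot_param_hist_at(writer, model, step):
    # same as plot_param_hist, but tags each histogram with the current step
    for name, param in model.named_parameters():
        writer.add_histogram(name, param, global_step=step)

# inside a hypothetical training loop:
# for epoch in range(num_epochs):
#     train_one_epoch(model, optimizer, loader)  # assumed helper
#     plot_param_hist_at(writer, model, epoch)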
Activation Visualization
The activation maps of an input image after the first convolutional layer:
The activations after layer2 and layer3:
The following code, which extracts the weights and activations of specified layers from a PyTorch model, is adapted from a Facebook project:
class GetWeightAndActivation:
    """
    A class used to get weights and activations from specified layers of a PyTorch model.
    """

    def __init__(self, model, layers):
        """
        Args:
            model (nn.Module): the model containing layers to obtain weights and activations from.
            layers (list of strings): a list of layer names to obtain weights and activations from.
                Names are hierarchical, separated by /. For example, if a layer follows the path
                "s1" ---> "pathway0_stem" ---> "conv", the layer path is "s1/pathway0_stem/conv".
        """
        self.model = model
        self.hooks = {}
        self.layers_names = layers
        # eval mode
        self.model.eval()
        self._register_hooks()
    def _get_layer(self, layer_name):
        """
        Return a layer (nn.Module object) given a hierarchical layer name, separated by /.
        Args:
            layer_name (str): the name of the layer.
        """
        layer_ls = layer_name.split("/")
        prev_module = self.model
        for layer in layer_ls:
            # walk down the module hierarchy one level at a time
            prev_module = prev_module._modules[layer]
        return prev_module
    def _register_single_hook(self, layer_name):
        """
        Register a hook to a layer, given layer_name, to obtain activations.
        Args:
            layer_name (str): name of the layer.
        """
        def hook_fn(module, input, output):
            # store a detached copy of the layer's output at every forward pass
            self.hooks[layer_name] = output.clone().detach()

        layer = self._get_layer(layer_name)
        layer.register_forward_hook(hook_fn)
    def _register_hooks(self):
        """
        Register hooks to layers in `self.layers_names`.
        """
        for layer_name in self.layers_names:
            self._register_single_hook(layer_name)
    def get_activations(self, input, bboxes=None):
        """
        Obtain all activations from the layers that we registered hooks for.
        Args:
            input (tensors, list of tensors): the model input.
            bboxes (Optional): bounding-box data that might be required
                by the model.
        Returns:
            activation_dict (Python dictionary): a dictionary of the pair
                {layer_name: list of activations}, where activations are outputs returned
                by the layer.
        """
        input_clone = [inp.clone() for inp in input]
        if bboxes is not None:
            preds = self.model(input_clone, bboxes)
        else:
            preds = self.model(input_clone)

        activation_dict = {}
        for layer_name, hook in self.hooks.items():
            # list of activations for each instance.
            activation_dict[layer_name] = hook
        return activation_dict, preds
    def get_weights(self):
        """
        Returns weights from registered layers.
        Returns:
            weights (Python dictionary): a dictionary of the pair
                {layer_name: weight}, where weight is the weight tensor.
        """
        weights = {}
        for layer in self.layers_names:
            cur_layer = self._get_layer(layer)
            if hasattr(cur_layer, "weight"):
                weights[layer] = cur_layer.weight.clone().detach()
            else:
                print(
                    "Layer {} does not have weight attribute.".format(layer)
                )
        return weights
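get_weights can be combined with the make_grid approach from earlier to render the extracted kernels. A minimal sketch, assuming torchvision's pretrained resnet18 (whose conv1 weight has shape 64x3x7x7) and a standard SummaryWriter, neither of which appears in the original:

import torchvision
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir='runs/weight_vis')  # hypothetical log directory
model = torchvision.models.resnet18(pretrained=True)
model_vis = GetWeightAndActivation(model, ["conv1"])
w = model_vis.get_weights()["conv1"]  # shape (64, 3, 7, 7)
# one single-channel image per (output, input) channel pair, 3 per row
grid = torchvision.utils.make_grid(
    w.view(-1, 1, 7, 7), normalize=True, scale_each=True, nrow=w.size(1))
writer.add_image("conv1/weights", grid, global_step=0)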
Run the model on a given input, obtain the activation maps of the specified layers, and plot them in TensorBoard:
# evaluation mode, to avoid changing the weights
model.eval()
# Set up writer for logging to Tensorboard format.
writer = tb.TensorboardWriter(cfg)
# register activation hooks for the specified layers
layer_ls = ["conv1", "layer1/1/conv2", "layer2/1/conv2", "layer3/1/conv2", "layer4/1/conv2"]
model_vis = GetWeightAndActivation(model, layer_ls)
# given an input, obtain the activation maps of the specified layers
activations, preds = model_vis.get_activations(inputs)
# plot the activation maps (e.g. into TensorBoard)
plot_weights_and_activations(writer, activations, tag="Input {}/Activations: ".format(0))
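plot_weights_and_activations is not defined above. As a hedged sketch of what such a helper might look like, assuming writer exposes the standard SummaryWriter add_image interface; the channel layout and tag naming are assumptions, not the project's actual implementation:

import torchvision

def plot_weights_and_activations(writer, activations, tag=""):
    # one grid of feature maps per hooked layer
    for layer_name, act in activations.items():
        # act has shape (N, C, H, W); show each channel of the first sample
        fmap = act[0].unsqueeze(1)  # (C, 1, H, W)
        grid = torchvision.utils.make_grid(
            fmap, normalize=True, scale_each=True, nrow=8)
        writer.add_image(tag + layer_name, grid, global_step=0)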
Summary
This article has collected commonly used local visualization code for deep neural networks, covering convolution kernels, weights, and activation maps, to make it easier to analyze and inspect the training process. Readers who find it useful may want to bookmark it.