
Deep-Learning-Based License Plate Detection and Recognition (PyTorch) (ResNet + Transformer)

Updated: 2023-07-26 11:30:15

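License Plate Recognition

Overview

This is a deep-learning-based license plate recognition pipeline. Vehicles are first detected directly with YOLO; only after that are dedicated networks used to detect the plate and recognize the plate number.

The plate detection network is based on ResNet18. It outputs the affine transformation matrix of the detected bounding box, so it can detect quadrilaterals of arbitrary shape.

The plate number sequence model is a ResNet18 + Transformer model that outputs the plate number sequence directly.

For data, plate detection uses the CCPD 2019 dataset. While training the detection model, synthetic plates generated by a program are pasted over the dataset images to strengthen detection.

Plate number sequence recognition is trained directly on program-generated plate images, combined with suitable image augmentation. The model is trained end to end: an image goes in, the plate number sequence comes out, and the loss is CTCLoss.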

I. Network Models

1. Plate detection network model:

The network is defined as follows:

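from torch import nn
from torchvision.models import resnet18
from einops import rearrange


class WpodNet(nn.Module):

    def __init__(self):
        """
        Plate detection network: a plain ResNet18 with only the output layer changed.
        """
        super(WpodNet, self).__init__()
        resnet = resnet18(True)  # load pretrained weights (newer torchvision prefers the weights= argument)
        backbone = list(resnet.children())
        self.backbone = nn.Sequential(
            nn.BatchNorm2d(3),
            *backbone[:3],   # conv1, bn1, relu (the max-pool at index 3 is skipped)
            *backbone[4:8],  # layer1 to layer4
        )
        self.detection = nn.Conv2d(512, 8, 3, 1, 1)

    def forward(self, x):
        features = self.backbone(x)
        out = self.detection(features)
        out = rearrange(out, 'n c h w -> n h w c')  # move the channel axis last
        return out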

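In effect, the network divides the image into cells, detects plates within cells of 16x16 pixels, and outputs the affine transformation matrix of the plate's bounding box.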
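The cell size can be checked with a little arithmetic. The sketch below is not from the original article: it applies the standard convolution output-size formula to the layers kept in WpodNet (conv1, then layer1 to layer4, with the max-pool skipped). The input size of 208 is an assumption taken from the `augment_detect(image, points, 208)` call in the data-loading section, and the helper names `conv_out` and `wpod_grid_size` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def wpod_grid_size(input_size):
    """Side length of the WpodNet output grid for a square input image."""
    # conv1 of ResNet18: 7x7, stride 2, padding 3 (the max-pool after it is skipped)
    size = conv_out(input_size, kernel=7, stride=2, padding=3)
    # layer1 keeps the resolution; layer2, layer3 and layer4 each halve it
    for _ in range(3):
        size = conv_out(size, kernel=3, stride=2, padding=1)
    # detection head: 3x3 convolution, stride 1, padding 1, resolution unchanged
    return conv_out(size, kernel=3, stride=1, padding=1)


grid = wpod_grid_size(208)   # assumed input size
print(grid, 208 // grid)     # 13 16
```

Under these assumptions a 208x208 image yields a 13x13 grid, so each cell covers 16x16 pixels and carries 8 output values.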

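2. Plate number sequence recognition network:

The backbone for plate number sequence recognition is ResNet18 + Transformer: ResNet18 encodes the image, and the Transformer then decodes the features into the corresponding characters.

The network is defined as follows: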

from torch import nn
from torchvision.models import resnet18
import torch
from einops import rearrange


class OcrNet(nn.Module):

    def __init__(self, num_class):
        super(OcrNet, self).__init__()
        resnet = resnet18(True)
        backbone = list(resnet.children())
        self.backbone = nn.Sequential(
            nn.BatchNorm2d(3),
            *backbone[:3],
            *backbone[4:8],
        )  # ResNet18 encoder
        self.decoder = nn.Sequential(
            Block(512, 8, False),
        )  # decoder built from a Transformer block (Block is defined below)
        self.out_layer = nn.Linear(512, num_class)  # linear output layer
        self.abs_pos_emb = AbsPosEmb((3, 9), 512)  # absolute positional encoding (defined below)

    def forward(self, x):
        x = self.backbone(x)
        x = rearrange(x, 'n c h w -> n (w h) c')  # flatten the feature map into a sequence
        x = x + self.abs_pos_emb()
        x = self.decoder(x)
        x = rearrange(x, 'n s v -> s n v')  # sequence-first layout, as CTCLoss expects
        return self.out_layer(x)

The Block class is as follows:

class Block(nn.Module):
    r"""
    Args:
        embed_dim: number of features per token vector.
        num_head: number of attention heads.
        is_mask: whether to apply a mask. If so, each token can only see the
            tokens before it, not the ones after it.

    Shape:
        - Input: N,S,V (batch, sequence length, features per token)
        - Output: same shape as the input

    Examples::
        # >>> m = Block(720, 12, False)
        # >>> x = torch.randn(4, 13, 720)
        # >>> output = m(x)
        # >>> print(output.shape)
        # torch.Size([4, 13, 720])
    """

    def __init__(self, embed_dim, num_head, is_mask):
        super(Block, self).__init__()
        self.ln_1 = nn.LayerNorm(embed_dim)
        self.attention = SelfAttention(embed_dim, num_head, is_mask)
        self.ln_2 = nn.LayerNorm(embed_dim)
        self.feed_forward = nn.Sequential(
            nn.Linear(embed_dim, embed_dim * 6),
            nn.ReLU(),
            nn.Linear(embed_dim * 6, embed_dim)
        )

    def forward(self, x):
        # multi-head self-attention
        attention = self.attention(self.ln_1(x))
        # residual connection
        x = attention + x
        x = self.ln_2(x)
        # feed-forward part
        h = self.feed_forward(x)
        x = h + x  # residual connection
        return x

The positional encoding is as follows:

class AbsPosEmb(nn.Module):

    def __init__(
            self,
            fmap_size,
            dim_head
    ):
        super().__init__()
        height, width = fmap_size
        scale = dim_head ** -0.5
        # one learnable embedding per row and one per column
        self.height = nn.Parameter(torch.randn(height, dim_head) * scale)
        self.width = nn.Parameter(torch.randn(width, dim_head) * scale)

    def forward(self):
        # broadcast-add row and column embeddings into an (h, w, d) grid
        emb = rearrange(self.height, 'h d -> h () d') + rearrange(self.width, 'w d -> () w d')
        # flatten in the same (w h) order used by OcrNet.forward
        emb = rearrange(emb, ' h w d -> (w h) d')
        return emb
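Both OcrNet.forward and AbsPosEmb.forward flatten the feature map with the einops pattern `(w h)`, in which the first axis inside the parentheses varies slowest. The sketch below is an illustration in plain Python, not code from the original article, and the helper name `flatten_w_h` is hypothetical. It lists the order in which the cells of the 3x9 feature map end up in the sequence.

```python
def flatten_w_h(height, width):
    """Cells (h, w) in the order produced by the einops grouping '(w h)'."""
    # w is the slow axis and h the fast axis, so cell (h, w) lands at index w * height + h
    return [(h, w) for w in range(width) for h in range(height)]


order = flatten_w_h(3, 9)
print(len(order))   # 27
print(order[:4])    # [(0, 0), (1, 0), (2, 0), (0, 1)]
```

The 27-step sequence therefore walks the feature map column by column, from left to right, which matches the reading direction of the characters on a plate.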

The self-attention used by the Block class is as follows:

class SelfAttention(nn.Module):
    r"""Multi-head self-attention.

    Args:
        embed_dim: number of features per token vector.
        num_head: number of attention heads.
        is_mask: whether to apply a mask. If so, each token can only see the
            tokens before it, not the ones after it.

    Shape:
        - Input: N,S,V (batch, sequence length, features per token)
        - Output: same shape as the input

    Examples::
        # >>> m = SelfAttention(720, 12)
        # >>> x = torch.randn(4, 13, 720)
        # >>> output = m(x)
        # >>> print(output.shape)
        # torch.Size([4, 13, 720])
    """

    def __init__(self, embed_dim, num_head, is_mask=True):
        super(SelfAttention, self).__init__()
        assert embed_dim % num_head == 0
        self.num_head = num_head
        self.is_mask = is_mask
        self.linear1 = nn.Linear(embed_dim, 3 * embed_dim)
        self.linear2 = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):
        # x has shape N,S,V
        x = self.linear1(x)  # shape becomes N,S,3V
        n, s, v = x.shape
        # split into heads: shape N,S,H,3V/H
        x = x.reshape(n, s, self.num_head, -1)
        # swap axes: shape N,H,S,3V/H
        x = torch.transpose(x, 1, 2)
        # split into Q, K, V, each of shape N,H,S,V/H
        query, key, value = torch.chunk(x, 3, -1)
        dk = value.shape[-1] ** 0.5
        # attention scores
        w = torch.matmul(query, key.transpose(-1, -2)) / dk  # w has shape N,H,S,S
        if self.is_mask:
            # build the lower-triangular (causal) mask
            mask = torch.tril(torch.ones(w.shape[-1], w.shape[-1])).to(w.device)
            w = w * mask - 1e10 * (1 - mask)
        w = torch.softmax(w, dim=-1)  # softmax normalization
        attention = torch.matmul(w, value)  # weighted sum of the value vectors, shape N,H,S,V/H
        # swap axes back to N,S,H,V/H
        attention = attention.permute(0, 2, 1, 3)
        n, s, h, v = attention.shape
        # merge H and V/H, i.e. concatenate the results of all heads: shape N,S,V
        attention = attention.reshape(n, s, h * v)
        return self.linear2(attention)  # final linear layer
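The masking step `w * mask - 1e10 * (1 - mask)` followed by a softmax may be easier to follow on a tiny example. The sketch below is an illustration in plain Python rather than the article's tensor code; the function name `causal_softmax` is hypothetical, and it handles a single square score matrix instead of a batch of heads.

```python
import math


def causal_softmax(scores):
    """Apply a lower-triangular mask to a square score matrix, then softmax each row."""
    size = len(scores)
    # mask[i][j] is 1 where position i may attend to position j (j <= i), else 0
    mask = [[1.0 if j <= i else 0.0 for j in range(size)] for i in range(size)]
    weights = []
    for row, mask_row in zip(scores, mask):
        # same arithmetic as the article: keep visible scores, push hidden ones to -1e10
        masked = [s * m - 1e10 * (1 - m) for s, m in zip(row, mask_row)]
        peak = max(masked)
        exps = [math.exp(value - peak) for value in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


weights = causal_softmax([[0.0, 0.0, 0.0] for _ in range(3)])
print(weights[1])   # [0.5, 0.5, 0.0]
```

With equal scores, each position spreads its attention evenly over itself and the positions before it, and positions after it receive zero weight. OcrNet builds its Block with `is_mask` set to False, so this masking is not applied in the recognition network itself.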

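II. Data Loading

1. Loading plate number data

A set of plate numbers is first generated by a program.

Data augmentation is then applied, mainly:

Random smudging

Gaussian blur

Affine transformation, with the result pasted into a larger image

Cropping the plate out after slightly shifting the four corners of the quadrilateral at random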
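The article does not show its plate generator or augmentation code, so the following is only a minimal sketch of two of the steps above: drawing a random plate number and randomly shifting the four corners of the crop quadrilateral. The character sets, the 144x48 plate size and the function names `random_plate` and `jitter_corners` are all assumptions made for illustration, and the plate format is simplified to one province character, one letter and five alphanumerics.

```python
import random

# Assumed, simplified alphabet; the original generator is not shown in the article.
PROVINCES = "京津沪渝冀豫云辽黑湘皖鲁新苏浙赣鄂桂甘晋蒙陕吉闽贵粤青藏川宁琼"
LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # I and O are left out
DIGITS = "0123456789"


def random_plate(rng):
    """Return a random 7-character plate string."""
    tail = "".join(rng.choice(LETTERS + DIGITS) for _ in range(5))
    return rng.choice(PROVINCES) + rng.choice(LETTERS) + tail


def jitter_corners(corners, max_shift, rng):
    """Shift every (x, y) corner by at most max_shift pixels along each axis."""
    return [
        (x + rng.uniform(-max_shift, max_shift), y + rng.uniform(-max_shift, max_shift))
        for x, y in corners
    ]


rng = random.Random(0)
print(random_plate(rng))
# Corners of an assumed 144x48 plate: top-left, top-right, bottom-right, bottom-left
print(jitter_corners([(0, 0), (144, 0), (144, 48), (0, 48)], 3, rng))
```

Cropping with slightly perturbed corners exposes the recognizer to the imperfect boxes that a detection network is likely to produce at inference time.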

The plate number sequence recognition network is then trained directly on these images:

loss_func = nn.CTCLoss(blank=0, zero_infinity=True)

optimizer = torch.optim.Adam(self.parameters(), lr=0.00001)

The optimizer is Adam and the loss function is CTCLoss.
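A network trained with CTCLoss emits one class prediction per time step, including the blank class, so its output has to be collapsed into a plate string at inference time. The article does not show that step. The sketch below assumes the common greedy decoding rule (take the best class per step, merge consecutive repeats, drop blanks) with `blank=0` as in the loss above; the function name `ctc_greedy_decode` is hypothetical. Note also that `nn.CTCLoss` expects log-probabilities of shape (T, N, C), so a log-softmax over the class axis is normally applied to the network output before the loss.

```python
def ctc_greedy_decode(scores, blank=0):
    """Greedy CTC decoding for one sample.

    scores: list of per-time-step lists of class scores.
    Returns the decoded list of class indices.
    """
    # best class at every time step
    best = [max(range(len(step)), key=step.__getitem__) for step in scores]
    decoded = []
    previous = blank
    for index in best:
        # keep a class only when it differs from the previous step and is not blank
        if index != previous and index != blank:
            decoded.append(index)
        previous = index
    return decoded


path = [1, 1, 0, 2, 2, 0, 1]
scores = [[1.0 if c == p else 0.0 for c in range(3)] for p in path]
print(ctc_greedy_decode(scores))   # [1, 2, 1]
```

The decoded indices would then be mapped back to characters through whatever class table the training labels used.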

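2. Loading plate detection data

The data comes from the CCPD dataset. During loading, a generated plate is randomly pasted over the plate position in the original image, to train the network's ability to detect plates.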

if random.random() < 0.5:
    plate, _ = self.draw()
    plate = cv2.cvtColor(plate, cv2.COLOR_RGB2BGR)
    plate = self.smudge(plate)  # random smudging
    image = enhance.apply_plate(image, points, plate)  # paste the plate image into the dataset image
[x1, y1, x2, y2, x4, y4, x3, y3] = points
points = [x1, x2, x3, x4, y1, y2, y3, y4]
image, pts = enhance.augment_detect(image, points, 208)
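The two lines that rebuild `points` are easy to misread: they unpack an interleaved corner list in which the third stored pair is named `(x4, y4)` and the fourth `(x3, y3)`, then regroup it as all x coordinates followed by all y coordinates. The sketch below isolates just that reordering with made-up coordinates; the function name `to_xs_ys` is hypothetical.

```python
def to_xs_ys(points):
    """Reorder [x1, y1, x2, y2, x4, y4, x3, y3] into [x1, x2, x3, x4, y1, y2, y3, y4]."""
    # the last two stored corners are swapped by the unpacking names
    x1, y1, x2, y2, x4, y4, x3, y3 = points
    return [x1, x2, x3, x4, y1, y2, y3, y4]


print(to_xs_ys([10, 20, 110, 22, 108, 60, 8, 58]))
# [10, 110, 8, 108, 20, 22, 58, 60]
```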

III. Training

The two networks are simply trained separately.

The loss for the detection network is computed as follows:
