Semantic Segmentation Loss Functions (1): Cross-Entropy Loss
I have been working on semantic segmentation projects recently, and while hunting for loss functions I found that the write-ups online each have their own merits but rarely explain how to actually use them, so I am recording here some of my own experience with loss functions during training. I work with the PyTorch framework, so this whole series will be implemented in PyTorch.
First up is the cross-entropy loss. Semantic segmentation is really a per-pixel classification problem, so anyone who has done image classification should already be familiar with the cross-entropy loss.
PyTorch ships with a ready-made cross-entropy loss; you only need to call it:
loss_func = nn.CrossEntropyLoss()
The loss functions provided in the module are all written as classes: declare an instance once up front and you can call it like a function later.
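For instance, the snippet below shows the instance being called after declaration, and that the one-off functional form torch.nn.functional.cross_entropy computes the same value; this is just a minimal sketch, and which style you use is a matter of taste:
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_func = nn.CrossEntropyLoss()   # declare once...
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
loss = loss_func(input, target)     # ...call later like a function

# The functional form gives the same value:
print(torch.allclose(loss, F.cross_entropy(input, target)))  # True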
PyTorch's implementation of the cross-entropy loss:
class CrossEntropyLoss(_WeightedLoss):
r"""This criterion computes the cross entropy loss between input and target.
It is useful when training a classification problem with `C` classes.
If provided, the optional argument :attr:`weight` should be a 1D `Tensor`
assigning weight to each of the classes.
This is particularly useful when you have an unbalanced training set.
The `input` is expected to contain raw, unnormalized scores for each class.
`input` has to be a Tensor of size either :math:`(minibatch, C)` or
:math:`(minibatch, C, d_1, d_2, ..., d_K)` with :math:`K \geq 1` for the
`K`-dimensional case. The latter is useful for higher dimension inputs, such
as computing cross entropy loss per-pixel for 2D images.
The `target` that this criterion expects should contain either:
- Class indices in the range :math:`[0, C-1]` where :math:`C` is the number of classes; if
`ignore_index` is specified, this loss also accepts this class index (this index
may not necessarily be in the class range). The unreduced (i.e. with :attr:`reduction`
set to ``'none'``) loss for this case can be described as:
.. math::
\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
l_n = - w_{y_n} \log \frac{\exp(x_{n,y_n})}{\sum_{c=1}^C \exp(x_{n,c})}
\cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}
where :math:`x` is the input, :math:`y` is the target, :math:`w` is the weight,
:math:`C` is the number of classes, and :math:`N` spans the minibatch dimension as well as
:math:`d_1, ..., d_k` for the `K`-dimensional case. If
:attr:`reduction` is not ``'none'`` (default ``'mean'``), then
.. math::
\ell(x, y) = \begin{cases}
\sum_{n=1}^N \frac{1}{\sum_{n=1}^N w_{y_n} \cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}} l_n, &
\text{if reduction} = \text{`mean';}\\
\sum_{n=1}^N l_n, &
\text{if reduction} = \text{`sum'.}
\end{cases}
Note that this case is equivalent to the combination of :class:`~LogSoftmax` and
:class:`~NLLLoss`.
- Probabilities for each class; useful when labels beyond a single class per minibatch item
are required, such as for blended labels, label smoothing, etc. The unreduced (i.e. with
:attr:`reduction` set to ``'none'``) loss for this case can be described as:
.. math::
\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
l_n = - \sum_{c=1}^C w_c \log \frac{\exp(x_{n,c})}{\sum_{i=1}^C \exp(x_{n,i})} y_{n,c}
where :math:`x` is the input, :math:`y` is the target, :math:`w` is the weight,
:math:`C` is the number of classes, and :math:`N` spans the minibatch dimension as well as
:math:`d_1, ..., d_k` for the `K`-dimensional case. If
:attr:`reduction` is not ``'none'`` (default ``'mean'``), then
.. math::
\ell(x, y) = \begin{cases}
\frac{\sum_{n=1}^N l_n}{N}, &
\text{if reduction} = \text{`mean';}\\
\sum_{n=1}^N l_n, &
\text{if reduction} = \text{`sum'.}
\end{cases}
.. note::
The performance of this criterion is generally better when `target` contains class
indices, as this allows for optimized computation. Consider providing `target` as
class probabilities only when a single class label per minibatch item is too restrictive.
Args:
weight (Tensor, optional): a manual rescaling weight given to each class.
If given, has to be a Tensor of size `C`
size_average (bool, optional): Deprecated (see :attr:`reduction`). By default,
the losses are averaged over each loss element in the batch. Note that for
some losses, there are multiple elements per sample. If the field :attr:`size_average`
is set to ``False``, the losses are instead summed for each minibatch. Ignored
when :attr:`reduce` is ``False``. Default: ``True``
ignore_index (int, optional): Specifies a target value that is ignored
and does not contribute to the input gradient. When :attr:`size_average` is
``True``, the loss is averaged over non-ignored targets. Note that
:attr:`ignore_index` is only applicable when the target contains class indices.
reduce (bool, optional): Deprecated (see :attr:`reduction`). By default, the
losses are averaged or summed over observations for each minibatch depending
on :attr:`size_average`. When :attr:`reduce` is ``False``, returns a loss per
batch element instead and ignores :attr:`size_average`. Default: ``True``
reduction (string, optional): Specifies the reduction to apply to the output:
``'none'`` | ``'mean'`` | ``'sum'``. ``'none'``: no reduction will
be applied, ``'mean'``: the weighted mean of the output is taken,
``'sum'``: the output will be summed. Note: :attr:`size_average`
and :attr:`reduce` are in the process of being deprecated, and in
the meantime, specifying either of those two args will override
:attr:`reduction`. Default: ``'mean'``
label_smoothing (float, optional): A float in [0.0, 1.0]. Specifies the amount
of smoothing when computing the loss, where 0.0 means no smoothing. The targets
become a mixture of the original ground truth and a uniform distribution as described in
`Rethinking the Inception Architecture for Computer Vision <https://arxiv.org/abs/1512.00567>`__. Default: :math:`0.0`.
Shape:
- Input: :math:`(N, C)` where `C = number of classes`, or
:math:`(N, C, d_1, d_2, ..., d_K)` with :math:`K \geq 1`
in the case of `K`-dimensional loss.
- Target: If containing class indices, shape :math:`(N)` where each value is
:math:`0 \leq \text{targets}[i] \leq C-1`, or :math:`(N, d_1, d_2, ..., d_K)` with
:math:`K \geq 1` in the case of K-dimensional loss. If containing class probabilities,
same shape as the input.
- Output: If :attr:`reduction` is ``'none'``, shape :math:`(N)` or
:math:`(N, d_1, d_2, ..., d_K)` with :math:`K \geq 1` in the case of K-dimensional loss.
Otherwise, scalar.
Examples::
>>> # Example of target with class indices
>>> loss = nn.CrossEntropyLoss()
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(5)
>>> output = loss(input, target)
>>> output.backward()
>>>
>>> # Example of target with class probabilities
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randn(3, 5).softmax(dim=1)
>>> output = loss(input, target)
>>> output.backward()
"""
__constants__ = ['ignore_index', 'reduction', 'label_smoothing']
ignore_index: int
label_smoothing: float

def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100,
             reduce=None, reduction: str = 'mean', label_smoothing: float = 0.0) -> None:
    super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
    self.ignore_index = ignore_index
    self.label_smoothing = label_smoothing

def forward(self, input: Tensor, target: Tensor) -> Tensor:
    return F.cross_entropy(input, target, weight=self.weight,
                           ignore_index=self.ignore_index, reduction=self.reduction,
                           label_smoothing=self.label_smoothing)
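As the docstring notes, with class-index targets CrossEntropyLoss is equivalent to LogSoftmax combined with NLLLoss. A minimal sketch to verify this for yourself (the tensors are made up):
import torch
import torch.nn as nn

logits = torch.randn(3, 5)
target = torch.tensor([3, 4, 1])

ce = nn.CrossEntropyLoss()(logits, target)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)
print(torch.allclose(ce, nll))  # True: the two computations match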
When declaring the loss function you can pass in some parameters. One of the more important ones is weight: Optional[Tensor] = None: weight gives each class its own weight in the computation, and it is a tensor whose length equals the number of classes. Another is label_smoothing; label smoothing comes built into the cross-entropy loss. With label_smoothing = 0.1, the target becomes a mixture of the one-hot label and a uniform distribution, so in a binary problem the true class becomes 1 - 0.1 + 0.1/2 = 0.95 and the other class becomes 0.1/2 = 0.05. This makes the loss smoother and easier to converge, avoiding the excessively large loss that a misclassified label would otherwise produce.
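A minimal sketch of both parameters together (the five classes and the weight values below are made up for illustration):
import torch
import torch.nn as nn

# Hypothetical per-class weights, e.g. to up-weight rare foreground classes.
class_weights = torch.tensor([0.5, 1.0, 2.0, 2.0, 4.0])

loss_func = nn.CrossEntropyLoss(weight=class_weights, label_smoothing=0.1)

input = torch.randn(3, 5, requires_grad=True)
target = torch.tensor([3, 4, 1])
loss = loss_func(input, target)
loss.backward()
print(loss)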
When using it, that is, when computing the loss value, you need two arguments, input and target, both tensors: input is your model's prediction and target is the ground-truth annotation. For example:
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
output = loss_func(input, target)
print("input:", input)
print("target:", target)
print("loss:", output)
input: tensor([[1.6738,0.0526,0.6329,-0.8809,1.4822],
[-0.5908,1.5717,1.3402,0.4227,-0.3498],
[-0.3359,-2.3797,-1.6206,-2.3070,0.6010]], requires_grad=True)
target: tensor([3,4,1])
loss: tensor(3.2306, grad_fn=<NllLossBackward0>)
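To see where this number comes from, you can reproduce it by hand from the first formula in the docstring: take the log-softmax over the class dimension and average the negative log-probability at each sample's target index. A quick check, plugging in the printed values:
import torch

input = torch.tensor([[1.6738, 0.0526, 0.6329, -0.8809, 1.4822],
                      [-0.5908, 1.5717, 1.3402, 0.4227, -0.3498],
                      [-0.3359, -2.3797, -1.6206, -2.3070, 0.6010]])
target = torch.tensor([3, 4, 1])

log_probs = torch.log_softmax(input, dim=1)  # log of the softmax probabilities
picked = log_probs[torch.arange(3), target]  # log-probability of each sample's true class
print(-picked.mean())                        # tensor(3.2306), matching the loss above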
This example is essentially a five-class classification problem: input is the output of the model's final fully connected layer, or, for a fully convolutional network, its final output.
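For segmentation, the same criterion is applied per pixel through the K-dimensional case described in the docstring: input has shape (N, C, H, W) and target has shape (N, H, W) of class indices. A minimal sketch with made-up shapes:
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()

# Fake network output for a batch of 2 images, 5 classes, 4x4 pixels.
logits = torch.randn(2, 5, 4, 4, requires_grad=True)
# Ground-truth mask: one class index per pixel.
mask = torch.randint(0, 5, (2, 4, 4), dtype=torch.long)

loss = loss_func(logits, mask)   # averaged over all pixels by default
loss.backward()
print(loss)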
Note here: if you take the argmax of the classification output probabilities before computing the loss, the computation will fail. For example:
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()
# input = torch.randn(3, 5, requires_grad=True)
input = torch.randn(3, requires_grad=True)   # class dimension collapsed, as after argmax
print("input:", input)
target = torch.empty(3, dtype=torch.long).random_(5)
print("target:", target)
output = loss_func(input, target)
print("loss:", output)
input: tensor([-0.3463,1.2289,0.2517], requires_grad=True)
target: tensor([3,4,3])
Traceback (most recent call last):
File "/home/lwf/Project/MRI-Segmentation/tets.py", line 19,in<module>
output = loss_func(input, target)
File "/home/lwf/anaconda3/envs/torch3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102,in _call_impl
return forward_call(*input,**kwargs)
File "/home/lwf/anaconda3/envs/torch3.7/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1152,in forward
label_smoothing=lf.label_smoothing)
File "/home/lwf/anaconda3/envs/torch3.7/lib/python3.7/site-packages/torch/nn/functional.py", line 2846,in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: Expected floating point type for target with class probabilities, got Long
In other words, the input must keep a score for every class, i.e. shape (N, C), or (N, C, d_1, ..., d_K) for segmentation; once argmax collapses the class dimension, the loss can no longer be computed.
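The fix is simply to give the loss the raw logits and apply argmax only afterwards, when you need a hard prediction for metrics or visualization. A minimal sketch:
import torch
import torch.nn as nn

loss_func = nn.CrossEntropyLoss()
logits = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)

loss = loss_func(logits, target)   # the loss takes the full (N, C) logits
loss.backward()

pred = logits.argmax(dim=1)        # argmax only for the hard prediction
print(loss, pred)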