Note
In this article I have discussed the various types of activation functions and the types of problems one might encounter while using each of them.
I would suggest beginning with a ReLU function and exploring other functions as you move further. You can also design your own activation functions, giving a non-linearity component to your network.
Recall that the inputs x0, x1, x2, ..., xn and the weights w0, w1, w2, ..., wn are multiplied together and added with a bias term to form our input.
Clearly, w indicates how much weight or strength we want to give the incoming input, and we can think of b as an offset value, so that x*w has to reach the offset value before having an effect.
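As a quick illustration, here is a minimal sketch of this weighted sum in code; the input values, weights, and bias below are made-up numbers, not from the article:

import numpy as np

# Hypothetical inputs x0..x2, weights w0..w2 and a bias term
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
b = 0.25

# z = w0*x0 + w1*x1 + ... + wn*xn + b
z = np.dot(w, x) + b
print(z)  # roughly -0.67 for these made-up numbers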
So far we have seen the inputs, so now what is an activation function?
An activation function is used to set the boundaries for the overall output value. For example, let z = X*w + b be the output of the previous layer; it will then be sent to the activation function to limit its value between 0 and 1 (if it is a binary classification problem).
Finally, the output from the activation function moves to the next hidden layer and the same process is repeated. This forward movement of information is known as the forward propagation.
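To make forward propagation concrete, here is a minimal sketch of one forward pass through a single hidden layer; the layer sizes, the random weights, and the choice of sigmoid are illustrative assumptions, not something fixed by the article:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # inputs to the layer
W = np.random.randn(2, 3)        # one row of weights per hidden neuron
b = np.zeros(2)                  # one bias per hidden neuron

z = W @ x + b                    # weighted sum plus bias
a = sigmoid(z)                   # activation output, passed on to the next layer
print(a)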
What if the output generated is far away from the actual value? Using the output from the forward propagation, the error is calculated. Based on this error value, the weights and bias of the neurons are updated. This process is known as back-propagation.
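For intuition only, here is a heavily simplified sketch of a single back-propagation update for one sigmoid neuron; the squared-error loss and the learning rate are assumptions made for this example, not something specified in the article:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2])        # one training example
y_true = 1.0                     # its actual value
w = np.array([0.8, 0.1])
b = 0.0
lr = 0.1                         # learning rate

# Forward pass
z = np.dot(w, x) + b
y_pred = sigmoid(z)

# Backward pass for the squared error (y_pred - y_true)**2
dL_dy = 2.0 * (y_pred - y_true)
dy_dz = y_pred * (1.0 - y_pred)  # derivative of the sigmoid
grad_w = dL_dy * dy_dz * x
grad_b = dL_dy * dy_dz

# Update the weights and bias based on the error
w = w - lr * grad_w
b = b - lr * grad_b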
A neural network without an activation function is essentially just a linear regression model.
Some Activation Functions
1. Step Function
If the value of z < 0, the output is 0; if z >= 0, the output is 1.
This sort of function is useful for classification, but it is rarely used as an activation function because it is too abrupt: small changes in the input are not reflected in the output.
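A direct translation of this rule into code, in the same style as the ReLU snippet later in the article, might look like this:

def step_function(z):
    # 1 for non-negative inputs, 0 otherwise
    if z >= 0:
        return 1
    else:
        return 0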
2. Sigmoid Function
The next activation function that we are going to look at is the Sigmoid function. It is one of the most widely used non-linear activation functions. Sigmoid transforms values into the range between 0 and 1.
A noteworthy point here is that unlike the binary step and linear functions, sigmoid is a non-linear function. This essentially means that when I have multiple neurons with the sigmoid function as their activation function, the output is non-linear as well.
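A simple implementation, assuming NumPy for the exponential:

import numpy as np

def sigmoid_function(z):
    # Squashes any real value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))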
3. Hyperbolic Tangent (tanh(z))
The tanh function is very similar to the sigmoid function. The only difference is that it is symmetric around the origin.
The range of values in this case is from -1 to 1. Thus the inputs to the next layers will not always be of the same sign.
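Since NumPy already provides tanh, a thin wrapper is enough:

import numpy as np

def tanh_function(z):
    # Squashes any real value into the range (-1, 1), symmetric around the origin
    return np.tanh(z)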
4. Rectified Linear Unit (ReLU)
This is actually a relatively simple function: max(0, z).
def relu_function(x):
    if x < 0:
        return 0
    else:
        return x
ReLU has been found to have very good performance, especially when dealing with the issue of the vanishing gradient.
5. Leaky Rectified Linear Unit
The Leaky ReLU function is nothing but an improved version of the ReLU function. As we saw, for the ReLU function the gradient is 0 for x < 0, which would deactivate the neurons in that region.
Leaky ReLU is defined to address this problem. Instead of defining the ReLU function as 0 for negative values of x, we define it as an extremely small linear component of x. Here is the mathematical expression:
f(x) = { 0.01x, x < 0
         x,     x >= 0 }
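In code, keeping the same 0.01 slope as in the expression above:

def leaky_relu_function(x, alpha=0.01):
    # A small linear slope for negative inputs instead of a hard zero
    if x >= 0:
        return x
    else:
        return alpha * x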
6. Softmax Function
The Softmax function is often described as a combination of multiple sigmoids. We know that sigmoid returns values between 0 and 1, which can be treated as probabilities of a data point belonging to a particular class. Thus sigmoid is widely used for binary classification problems.
softmax(z_i) = exp(z_i) / (exp(z_1) + exp(z_2) + ... + exp(z_k)), for i = 1, 2, ..., k (k = number of categories)
The Softmax function calculates the probability distribution of the event over k different events.
This means that the function will calculate the probability of each target over all possible targets.
import numpy as np

def softmax_function(x):
    z = np.exp(x)
    z_ = z / z.sum()
    return z_
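For example, applied to a hypothetical vector of raw scores (the numbers below are made up), the outputs sum to 1 and can be read as class probabilities:

scores = np.array([2.0, 1.0, 0.1])    # made-up raw scores for 3 classes
print(softmax_function(scores))       # approximately [0.659 0.242 0.099]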
Choosing the right Activation Function
Now that we have seen so many activation functions, we need some logic/heuristics to know which activation function should be used in which situation. Good or bad, there is no rule of thumb.
However, depending upon the properties of the problem, we might be able to make a better choice for easier and quicker convergence of the network.
Sigmoid functions and their combinations generally work better in the case of classifiers
Sigmoids and tanh functions are sometimes avoided due to the vanishing gradient problem
The ReLU function is a general activation function and is used in most cases these days
If we encounter a case of dead neurons in our networks, the leaky ReLU function is the best choice
Always keep in mind that the ReLU function should only be used in the hidden layers
As a rule of thumb, you can begin with using the ReLU function and then move over to other activation functions in case ReLU doesn't provide optimum results.