Learning TensorFlow: Learning Rate Decay (learning rate decay)
Learning Rate Decay (learning rate decay)
When training a neural network, the learning rate controls how fast the parameters are updated. If the learning rate is too small, parameter updates become very slow; if it is too large, the search oscillates and the parameters hover around the optimum without settling.
For this reason, learning rate decay is introduced: the learning rate is gradually reduced as training proceeds.
The learning rate decay methods implemented in TensorFlow are described below. Each function returns the decayed learning rate.
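In practice the decayed rate is not just inspected but fed to an optimizer. Below is a minimal sketch of the usual pattern (the toy quadratic loss is hypothetical, purely for illustration): create a global_step variable once, build the decay op from it, and pass global_step to minimize() so the optimizer increments it after every update.

import tensorflow as tf

w = tf.Variable(5.0)
loss = tf.square(w)  # hypothetical stand-in for a real model's loss

global_step = tf.Variable(0, trainable=False, name='global_step')
learning_rate = tf.train.exponential_decay(
    learning_rate=0.5, global_step=global_step,
    decay_steps=10, decay_rate=0.9, staircase=True)

# Passing global_step to minimize() makes the optimizer increment it
# once per training step, which in turn drives the decay schedule.
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(
    loss, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(30):
        _, lr, step = sess.run([train_op, learning_rate, global_step])
    print('step:', step, 'lr:', lr)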
Piecewise Constant Decay
Parameters:
x: a 0-D scalar Tensor.
boundaries: the boundaries between intervals, given as a tensor or list.
values: the learning rate values to use on the intervals defined by boundaries.
name: name of the operation, defaults to 'PiecewiseConstant'.
Piecewise constant decay sets a different constant value on each predefined interval: the first value serves as the initial learning rate and the remaining values as the successively decayed rates. For example, with boundaries=[10, 20, 30] and values=[0.1, 0.07, 0.025, 0.0125] as in the example below, steps 0-10 use 0.1, steps 11-20 use 0.07, steps 21-30 use 0.025, and all later steps use 0.0125.
Example:
# piecewise_constant: step-wise (staircase) decay
import matplotlib.pyplot as plt
import tensorflow as tf

# (this variable is shadowed by the Python loop variable below)
global_step = tf.Variable(0, name='global_step', trainable=False)
boundaries = [10, 20, 30]
learning_rates = [0.1, 0.07, 0.025, 0.0125]
y = []
N = 40
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for global_step in range(N):
        # constant learning rate within each interval defined by boundaries
        learning_rate = tf.train.piecewise_constant(
            global_step, boundaries=boundaries, values=learning_rates)
        lr = sess.run([learning_rate])
        y.append(lr[0])

x = range(N)
plt.plot(x, y, 'r-', linewidth=2)
plt.title('piecewise_constant')
plt.show()
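One caveat about the snippet above: it builds a new piecewise_constant op on every loop iteration, so the graph keeps growing. That is harmless for a quick plot but not idiomatic. Below is a sketch of the build-once alternative, feeding the step in through a placeholder (the name step_ph is my own choice):

import tensorflow as tf

step_ph = tf.placeholder(tf.int32, shape=[], name='step_ph')
learning_rate = tf.train.piecewise_constant(
    step_ph, boundaries=[10, 20, 30], values=[0.1, 0.07, 0.025, 0.0125])

with tf.Session() as sess:
    # A single op in the graph; evaluate it once per step value.
    rates = [sess.run(learning_rate, feed_dict={step_ph: s}) for s in range(40)]
print(rates[:12])  # 0.1 through step 10, then 0.07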
Exponential Decay
Exponential decay is the most commonly used decay method.
Parameters:
learning_rate: the initial learning rate.
global_step: the global step used in the decay computation; must be non-negative.
decay_steps: the number of decay steps; must be positive. Determines the decay period.
decay_rate: the decay rate.
staircase: if True, the learning rate decays at discrete intervals, i.e. staircase decay (the rate is held constant for a stretch of steps, e.g. within the same epoch); if False, standard continuous exponential decay.
name: name of the operation, defaults to 'ExponentialDecay'. (optional)
The exponentially decayed learning rate is computed as:
decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
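The only difference the staircase option makes is that global_step / decay_steps is floored before exponentiating, so the rate drops in jumps once every decay_steps steps. A small pure-Python sketch of the formula above (assuming Python 3 division):

def exponential_decay(lr, global_step, decay_steps, decay_rate, staircase=False):
    # staircase=True floors the exponent, so the rate only changes
    # once every decay_steps steps; otherwise the decay is continuous.
    p = global_step / decay_steps
    if staircase:
        p = global_step // decay_steps
    return lr * decay_rate ** p

print(exponential_decay(0.5, 15, 10, 0.9))                  # ~0.427 (continuous)
print(exponential_decay(0.5, 15, 10, 0.9, staircase=True))  # 0.45 (the rate for steps 10-19)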
Advantages: simple and direct, and it converges quickly.
Example: staircase decay versus continuous exponential decay:
import matplotlib.pyplot as plt
import tensorflow as tf

global_step = tf.Variable(0, name='global_step', trainable=False)
y = []
z = []
N = 200
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for global_step in range(N):
        # staircase decay
        learning_rate1 = tf.train.exponential_decay(
            learning_rate=0.5, global_step=global_step, decay_steps=10, decay_rate=0.9, staircase=True)
        # standard (continuous) exponential decay
        learning_rate2 = tf.train.exponential_decay(
            learning_rate=0.5, global_step=global_step, decay_steps=10, decay_rate=0.9, staircase=False)
        lr1 = sess.run([learning_rate1])
        lr2 = sess.run([learning_rate2])
        y.append(lr1[0])
        z.append(lr2[0])

x = range(N)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_ylim([0, 0.55])
plt.plot(x, y, 'r-', linewidth=2)
plt.plot(x, z, 'g-', linewidth=2)
plt.title('exponential_decay')
ax.set_xlabel('step')
ax.set_ylabel('learning rate')
plt.show()
As the figure shows, red is the staircase form and green is the continuous exponential form.
Natural Exponential Decay
The parameters are the same as for exponential_decay; in particular:
staircase: if True, discrete staircase decay (the learning rate is held constant for a stretch of steps, e.g. within the same epoch); if False, standard continuous decay.
name: name of the operation, defaults to 'ExponentialTimeDecay'.
natural_exp_decay has a similar form to exponential_decay, but with base e. Natural exponential decay is much faster than exponential decay and is generally used for easily trained networks that should converge quickly.
The natural-exponential-decay learning rate is computed as:
decayed_learning_rate = learning_rate * exp(-decay_rate * global_step / decay_steps)
Example: the staircase and continuous forms of natural exponential decay, compared with exponential decay:
#!/usr/bin/python
# coding:utf-8
import matplotlib.pyplot as plt
import tensorflow as tf

global_step = tf.Variable(0, name='global_step', trainable=False)
y = []
z = []
w = []
N = 200
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for global_step in range(N):
        # staircase natural exponential decay
        learning_rate1 = tf.train.natural_exp_decay(
            learning_rate=0.5, global_step=global_step, decay_steps=10, decay_rate=0.9, staircase=True)
        # standard (continuous) natural exponential decay
        learning_rate2 = tf.train.natural_exp_decay(
            learning_rate=0.5, global_step=global_step, decay_steps=10, decay_rate=0.9, staircase=False)
        # exponential decay, for comparison
        learning_rate3 = tf.train.exponential_decay(
            learning_rate=0.5, global_step=global_step, decay_steps=10, decay_rate=0.9, staircase=False)
        lr1 = sess.run([learning_rate1])
        lr2 = sess.run([learning_rate2])
        lr3 = sess.run([learning_rate3])
        y.append(lr1[0])
        z.append(lr2[0])
        w.append(lr3[0])

x = range(N)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_ylim([0, 0.55])
plt.plot(x, y, 'r-', linewidth=2)  # red: staircase natural exp decay
plt.plot(x, z, 'g-', linewidth=2)  # green: continuous natural exp decay
plt.plot(x, w, 'b-', linewidth=2)  # blue: exponential decay
plt.title('natural_exp_decay')
ax.set_xlabel('step')
ax.set_ylabel('learning rate')
plt.show()
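As a quick numeric check of the "much faster" claim, evaluate both formulas one decay period in (global_step = decay_steps, so the exponent p is 1) with the settings used above:

import math

lr, decay_rate, p = 0.5, 0.9, 1.0  # p = global_step / decay_steps

print(lr * decay_rate ** p)             # exponential decay:   0.45
print(lr * math.exp(-decay_rate * p))   # natural exp decay:  ~0.203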