The class implements the AdaDelta method.
\[ \begin{array}{l} g_\theta=(1-\lambda){(\frac{ \partial f(\cdot) }{\partial \theta })}^2+\lambda g_\theta\\ d_\theta=\alpha\frac{\sqrt{s_\theta+\epsilon}}{\sqrt{g_\theta+\epsilon}}\frac{ \partial f(\cdot) }{\partial \theta }\\ s_\theta=(1-\lambda){(d_\theta)}^2+\lambda s_\theta \end{array} \]
where \( \frac{ \partial f(\cdot) }{\partial \theta } \) is a negative descent direction (e.g., the gradient) with respect to \(\theta\), \(\lambda\) is a decay factor, \(\epsilon\) is used to avoid division by zero, \( \alpha \) is a built-in learning rate, and \(d_\theta\) is a corrected negative descent direction.
Reference: Matthew D. Zeiler, ADADELTA: An Adaptive Learning Rate Method, arXiv:1212.5701
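The update rule in the class description can be sketched in stand-alone C++ as follows. This is a minimal illustration under stated assumptions, not the code in AdaDeltaUpdater.cpp: the type `AdaDeltaSketch` and its members are hypothetical names, the accumulators \(g_\theta\) and \(s_\theta\) are assumed to start at zero, and the final step \(\theta \leftarrow \theta - d_\theta\) is assumed from the description of \(d_\theta\) as a negative descent direction.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone sketch of the three AdaDelta equations.
// It does not use or reproduce Shogun's AdaDeltaUpdater.
struct AdaDeltaSketch
{
    double alpha;           // built-in learning rate
    double epsilon;         // avoids division by zero
    double lambda;          // decay factor
    std::vector<double> g;  // running average of squared gradients (g_theta)
    std::vector<double> s;  // running average of squared corrected directions (s_theta)

    AdaDeltaSketch(std::size_t dim, double alpha_, double epsilon_, double lambda_)
        : alpha(alpha_), epsilon(epsilon_), lambda(lambda_),
          g(dim, 0.0), s(dim, 0.0)  // assumption: accumulators start at zero
    {
    }

    // Applies the three equations to coordinate idx and returns d_theta.
    double corrected_direction(double gradient, std::size_t idx)
    {
        g[idx] = (1.0 - lambda) * gradient * gradient + lambda * g[idx];
        const double d = alpha * std::sqrt(s[idx] + epsilon)
                         / std::sqrt(g[idx] + epsilon) * gradient;
        s[idx] = (1.0 - lambda) * d * d + lambda * s[idx];
        return d;
    }

    // Assumption: the variable moves against the negative descent direction,
    // i.e. theta <- theta - d_theta, updated in place.
    void update(std::vector<double>& theta, const std::vector<double>& gradient)
    {
        for (std::size_t i = 0; i < theta.size(); ++i)
            theta[i] -= corrected_direction(gradient[i], i);
    }
};
```

Keeping one accumulator pair per coordinate is what makes the step size adaptive: each coordinate is scaled by its own gradient history rather than by a single global rate.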
Defined at line 58 of file AdaDeltaUpdater.h.
Public Member Functions | |
AdaDeltaUpdater () | |
AdaDeltaUpdater (float64_t learning_rate, float64_t epsilon, float64_t decay_factor) | |
virtual | ~AdaDeltaUpdater () |
virtual void | set_learning_rate (float64_t learning_rate) |
virtual void | set_epsilon (float64_t epsilon) |
virtual void | set_decay_factor (float64_t decay_factor) |
virtual void | update_context (CMinimizerContext *context) |
virtual void | load_from_context (CMinimizerContext *context) |
virtual void | update_variable (SGVector< float64_t > variable_reference, SGVector< float64_t > raw_negative_descend_direction, float64_t learning_rate) |
virtual void | set_descend_correction (DescendCorrection *correction) |
virtual bool | enables_descend_correction () |
Protected Member Functions | |
virtual float64_t | get_negative_descend_direction (float64_t variable, float64_t gradient, index_t idx, float64_t learning_rate) |
AdaDeltaUpdater ( )
Defined at line 36 of file AdaDeltaUpdater.cpp.
AdaDeltaUpdater (float64_t learning_rate, float64_t epsilon, float64_t decay_factor)
Parameterized Constructor
learning_rate | built-in learning rate (\( \alpha \)) |
epsilon | \( \epsilon \), used to avoid division by zero |
decay_factor | decay factor (\( \lambda \)) |
Defined at line 42 of file AdaDeltaUpdater.cpp.
|
virtual |
Defined at line 73 of file AdaDeltaUpdater.cpp.
|
virtual inherited |
Is descent correction enabled?
Defined at line 145 of file DescendUpdaterWithCorrection.h.
|
protected virtual |
Get the negative descent direction given the current variable and gradient.
It is called from update_variable().
variable | current variable (e.g., \(\theta\)) |
gradient | current gradient (e.g., \( \frac{ \partial f(\cdot) }{\partial \theta }\)) |
idx | the index of the variable |
learning_rate | learning rate (for AdaDelta, learning_rate is NOT used because there is a built-in learning rate) |
Implements DescendUpdaterWithCorrection.
Defined at line 124 of file AdaDeltaUpdater.cpp.
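As a worked illustration of a single call on one coordinate, the class-level equations can be evaluated by hand. The numbers below are assumptions chosen for readability, not library defaults: \(g_\theta = s_\theta = 0\) before the call, gradient \(2\), \(\lambda = 0.95\), \(\epsilon = 10^{-6}\), and \(\alpha = 1\).
\[ \begin{array}{l} g_\theta=(1-0.95)\cdot 2^2+0.95\cdot 0=0.2\\ d_\theta=1\cdot\frac{\sqrt{0+10^{-6}}}{\sqrt{0.2+10^{-6}}}\cdot 2\approx 4.47\times 10^{-3}\\ s_\theta=(1-0.95)\cdot{(d_\theta)}^2+0.95\cdot 0\approx 1.0\times 10^{-6} \end{array} \]
Only the built-in \( \alpha \) appears in these expressions, which is consistent with the learning_rate argument being ignored.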
|
virtual |
Load the given context object to restore mutable variables. This is usually used when restoring serialized state.
This method will be called by FirstOrderMinimizer::load_from_context(CMinimizerContext* context)
Reimplemented from DescendUpdaterWithCorrection.
Defined at line 106 of file AdaDeltaUpdater.cpp.
|
virtual |
|
virtual inherited |
Set the type of descent correction
correction | the type of descent correction |
Defined at line 135 of file DescendUpdaterWithCorrection.h.
|
virtual |
|
virtual |
|
virtual |
Update a context object to store mutable variables
This method will be called by FirstOrderMinimizer::save_to_context()
context | a context object |
Reimplemented from DescendUpdaterWithCorrection.
Defined at line 86 of file AdaDeltaUpdater.cpp.
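The pairing of update_context() and load_from_context() can be pictured as a save and restore of the updater's mutable state. The sketch below is a stand-alone illustration under several assumptions: `ContextSketch` is a hypothetical stand-in for CMinimizerContext, the key names are invented, and the state is assumed to consist of the two accumulators \( g_\theta \) and \( s_\theta \). It does not reproduce the actual Shogun code.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for CMinimizerContext: named vectors of doubles.
using ContextSketch = std::map<std::string, std::vector<double>>;

// Hypothetical holder for the mutable state of an AdaDelta-style updater.
struct AdaDeltaStateSketch
{
    std::vector<double> g;  // g_theta accumulator
    std::vector<double> s;  // s_theta accumulator

    // Copies the mutable state into the context (the "save" direction).
    void update_context(ContextSketch& context) const
    {
        context["g_theta"] = g;  // key names are assumptions
        context["s_theta"] = s;
    }

    // Restores the mutable state from the context (the "load" direction).
    // Entries missing from the context leave the current state untouched.
    void load_from_context(const ContextSketch& context)
    {
        const auto g_it = context.find("g_theta");
        const auto s_it = context.find("s_theta");
        if (g_it != context.end())
            g = g_it->second;
        if (s_it != context.end())
            s = s_it->second;
    }
};
```

Hyperparameters such as the learning rate are left out of the context here because they are set through the constructor or setters; whether the library stores them as well is not stated in this page.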
|
virtual |
Update the target variable based on the given negative descent direction
Note that this method updates the target variable in place. It is called by FirstOrderMinimizer::minimize()
variable_reference | a reference to the target variable |
raw_negative_descend_direction | the negative descent direction given the current value |
learning_rate | learning rate |
Reimplemented from DescendUpdaterWithCorrection.
Defined at line 140 of file AdaDeltaUpdater.cpp.
|
protected |
learning rate \( \alpha \) used at each iteration
Defined at line 141 of file AdaDeltaUpdater.h.
|
protected inherited |
descent correction object
Defined at line 165 of file DescendUpdaterWithCorrection.h.
|
protected |
decay term ( \( \lambda \))
Defined at line 147 of file AdaDeltaUpdater.h.
|
protected |
\( \epsilon \)
Defined at line 144 of file AdaDeltaUpdater.h.
\( g_\theta \)
Defined at line 150 of file AdaDeltaUpdater.h.
\( s_\theta \)
Defined at line 153 of file AdaDeltaUpdater.h.