SHOGUN
4.1.0
The class implements the stochastic variance reduced gradient (SVRG) minimizer.
Reference: Johnson, Rie, and Tong Zhang. "Accelerating stochastic gradient descent using predictive variance reduction." Advances in Neural Information Processing Systems. 2013.
Definition at line 47 of file SVRGMinimizer.h.
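As a hedged illustration of the SVRG scheme from the paper cited above (a standalone sketch, not the Shogun implementation), the function below minimizes a 1-D least-squares cost with variance-reduced stochastic updates. The name `svrg_least_squares` and all parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone 1-D SVRG sketch (not the Shogun implementation): minimize
// f(w) = (1/n) * sum_i (w*x_i - y_i)^2 using the variance-reduced update
//   w <- w - eta * (g_i(w) - g_i(w_snap) + mu),
// where mu is the full (average) gradient at the snapshot w_snap.
double svrg_least_squares(const std::vector<double>& x,
                          const std::vector<double>& y,
                          double eta, int outer, int inner)
{
    const std::size_t n = x.size();
    auto grad_i = [&](double w, std::size_t i) {
        return 2.0 * (w * x[i] - y[i]) * x[i];  // per-sample gradient
    };
    double w = 0.0;
    for (int s = 0; s < outer; ++s)
    {
        // take a snapshot and average all sample gradients at it
        double w_snap = w;
        double mu = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            mu += grad_i(w_snap, i);
        mu /= static_cast<double>(n);
        // inner stochastic steps with variance-reduced gradients
        for (int t = 0; t < inner; ++t)
        {
            std::size_t i = t % n;  // cyclic sampling for determinism
            w -= eta * (grad_i(w, i) - grad_i(w_snap, i) + mu);
        }
    }
    return w;
}
```

Because every per-sample gradient vanishes at the exact solution, the variance-reduced steps contract toward it without a decaying learning rate, which is the point of the correction term `- g_i(w_snap) + mu`.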
Public Member Functions
SVRGMinimizer ()
SVRGMinimizer (FirstOrderSAGCostFunction *fun)
virtual ~SVRGMinimizer ()
virtual float64_t minimize ()
virtual void set_sgd_number_passes (int32_t sgd_passes)
virtual void set_average_update_interval (int32_t interval)
virtual bool supports_batch_update () const
virtual void set_gradient_updater (DescendUpdater *gradient_updater)
virtual void set_number_passes (int32_t num_passes)
virtual void load_from_context (CMinimizerContext *context)
virtual void set_learning_rate (LearningRate *learning_rate)
virtual int32_t get_iteration_counter ()
virtual void set_cost_function (FirstOrderCostFunction *fun)
virtual CMinimizerContext * save_to_context ()
virtual void set_penalty_weight (float64_t penalty_weight)
virtual void set_penalty_type (Penalty *penalty_type)
Protected Member Functions
virtual void init_minimization ()
virtual void do_proximal_operation (SGVector< float64_t > variable_reference)
virtual void update_context (CMinimizerContext *context)
virtual float64_t get_penalty (SGVector< float64_t > var)
virtual void update_gradient (SGVector< float64_t > gradient, SGVector< float64_t > var)
Protected Attributes
int32_t m_num_sgd_passes
int32_t m_svrg_interval
SGVector< float64_t > m_average_gradient
SGVector< float64_t > m_previous_variable
DescendUpdater * m_gradient_updater
int32_t m_num_passes
int32_t m_cur_passes
int32_t m_iter_counter
LearningRate * m_learning_rate
FirstOrderCostFunction * m_fun
Penalty * m_penalty_type
float64_t m_penalty_weight
SVRGMinimizer ()
Default constructor
Definition at line 37 of file SVRGMinimizer.cpp.
SVRGMinimizer (FirstOrderSAGCostFunction *fun)
Constructor with a given stochastic cost function

~SVRGMinimizer () [virtual]
Destructor
Definition at line 43 of file SVRGMinimizer.cpp.
void do_proximal_operation (SGVector< float64_t > variable_reference) [protected, virtual, inherited]
Do the proximal update in place
variable_reference | the variable to be updated |
Definition at line 167 of file FirstOrderStochasticMinimizer.h.
int32_t get_iteration_counter () [virtual, inherited]
How many samples/mini-batches has the minimizer used?
Definition at line 160 of file FirstOrderStochasticMinimizer.h.
float64_t get_penalty (SGVector< float64_t > var) [protected, inherited]
Get the penalty of the given target variable. For the L2 penalty, the target variable is \(w\) and the penalty value is \(\lambda \frac{w^T w}{2}\), where \(\lambda\) is the penalty weight.
var | the variable used in regularization |
Definition at line 164 of file FirstOrderMinimizer.h.
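The L2 penalty value above maps directly to code. The helper below is a hypothetical standalone illustration of the formula \(\lambda \frac{w^T w}{2}\), not the Shogun method:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper mirroring the formula above: the L2 penalty
// lambda * (w^T w) / 2 for a target variable vector w.
double l2_penalty(const std::vector<double>& w, double lambda)
{
    double dot = 0.0;
    for (double wi : w)
        dot += wi * wi;  // accumulate w^T w
    return lambda * dot / 2.0;
}
```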
void init_minimization () [protected, virtual]
Init the minimization process
Reimplemented from FirstOrderStochasticMinimizer.
Definition at line 61 of file SVRGMinimizer.cpp.
void load_from_context (CMinimizerContext *context) [virtual, inherited]
Load the given context object to restore mutable variables. Usually used in deserialization.
context | a context object |
Reimplemented from FirstOrderMinimizer.
Reimplemented in SMDMinimizer, and SMIDASMinimizer.
Definition at line 136 of file FirstOrderStochasticMinimizer.h.
float64_t minimize () [virtual]
Do minimization and get the optimal value
Implements FirstOrderStochasticMinimizer.
Definition at line 81 of file SVRGMinimizer.cpp.
CMinimizerContext * save_to_context () [virtual, inherited]
Return a context object which stores mutable variables. Usually used in serialization.
Reimplemented in LBFGSMinimizer, and NLOPTMinimizer.
Definition at line 103 of file FirstOrderMinimizer.h.
void set_average_update_interval (int32_t interval) [virtual]
Set the interval at which stochastic sample gradients are averaged.
If there are \(n-g\) passes through the data and the interval is \(k\), stochastic sample gradients are averaged at the 0-th, k-th, 2k-th, ... pass.
Note that \(n\) is the total number of passes through the data and \(g\) is the number of passes used by SGDMinimizer to initialize variables.
interval | how often to average stochastic sample gradients |
Definition at line 94 of file SVRGMinimizer.h.
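The averaging schedule described above (passes 0, k, 2k, ...) can be stated as a small predicate. This is a hypothetical standalone illustration, not part of the Shogun API:

```cpp
#include <cassert>

// Hypothetical predicate matching the schedule above: with interval k,
// full-gradient averaging happens at passes 0, k, 2k, 3k, ...
bool averages_at_pass(int pass, int interval)
{
    return interval > 0 && pass % interval == 0;
}
```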
void set_cost_function (FirstOrderCostFunction *fun) [virtual, inherited]
Set the cost function used in the minimizer
fun | the cost function |
Definition at line 92 of file FirstOrderMinimizer.h.
void set_gradient_updater (DescendUpdater *gradient_updater) [virtual, inherited]
Set a gradient updater
gradient_updater | the gradient updater |
Definition at line 104 of file FirstOrderStochasticMinimizer.h.
void set_learning_rate (LearningRate *learning_rate) [virtual, inherited]
Set the learning rate of the minimizer
learning_rate | the learning rate or step size |
Definition at line 151 of file FirstOrderStochasticMinimizer.h.
void set_number_passes (int32_t num_passes) [virtual, inherited]
Set the number of times to go through all data points (samples). For example, num_passes=1 means going through all data points once.
Recall that a stochastic cost function \(f(w)\) can be written as \(\sum_i f_i(w)\), where \(f_i(w)\) is the differentiable function for the i-th sample.
num_passes | the number of passes |
Definition at line 125 of file FirstOrderStochasticMinimizer.h.
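The decomposition \(f(w)=\sum_i f_i(w)\) above is what makes a "pass" well defined: one pass evaluates every per-sample term once. A hypothetical standalone sketch (not the Shogun API):

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Illustration of the decomposition above: a stochastic cost
// f(w) = sum_i f_i(w), stored as one callable per sample.
// One "pass" visits each per-sample term f_i exactly once.
double full_cost(const std::vector<std::function<double(double)>>& f_i,
                 double w)
{
    double total = 0.0;
    for (const auto& f : f_i)
        total += f(w);  // accumulate each sample's contribution
    return total;
}
```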
void set_penalty_type (Penalty *penalty_type) [virtual, inherited]
Set the type of penalty, for example an L2 penalty
penalty_type | the type of penalty. If NULL is given, regularization is not enabled. |
Definition at line 137 of file FirstOrderMinimizer.h.
void set_penalty_weight (float64_t penalty_weight) [virtual, inherited]
Set the weight of the penalty
penalty_weight | the weight of the penalty, which must be positive |
Definition at line 126 of file FirstOrderMinimizer.h.
void set_sgd_number_passes (int32_t sgd_passes) [virtual]
Set the number of passes through the data made by SGDMinimizer to initialize variables before SVRG minimization
sgd_passes | the number of passes through the data using SGDMinimizer |
Definition at line 72 of file SVRGMinimizer.h.
bool supports_batch_update () const [virtual, inherited]
Does the minimizer support batch updates?
Implements FirstOrderMinimizer.
Definition at line 98 of file FirstOrderStochasticMinimizer.h.
void update_context (CMinimizerContext *context) [protected, virtual, inherited]
Update a context object to store mutable variables
context | a context object |
Reimplemented from FirstOrderMinimizer.
Reimplemented in SMDMinimizer, and SMIDASMinimizer.
Definition at line 187 of file FirstOrderStochasticMinimizer.h.
void update_gradient (SGVector< float64_t > gradient, SGVector< float64_t > var) [protected, virtual, inherited]
Add the gradient of the penalty wrt the target variables to the unpenalized gradient. For least squares with an L2 penalty,
\[ L2f(w)=f(w) + L2(w) \]
where \( f(w)=\sum_i (y_i-w^T x_i)^2 \) is the least-squares cost function and \( L2(w)=\lambda \frac{w^T w}{2} \) is the L2 penalty.
The target variable is \(w\), the unpenalized gradient is \(\frac{\partial f(w)}{\partial w}\), and the gradient of the penalty wrt the target variable is \(\frac{\partial L2(w)}{\partial w}\).
gradient | unpenalized gradient wrt its target variable |
var | the target variable |
Definition at line 190 of file FirstOrderMinimizer.h.
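For the least-squares-plus-L2 case described above, the penalized gradient can be sketched in one dimension. The function below is a hypothetical standalone illustration, not the Shogun method:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hedged 1-D sketch of the update described above: add the L2-penalty
// gradient lambda*w to the unpenalized least-squares gradient
//   d/dw sum_i (y_i - w*x_i)^2 = -2 * sum_i (y_i - w*x_i) * x_i.
double penalized_gradient(const std::vector<double>& x,
                          const std::vector<double>& y,
                          double w, double lambda)
{
    double g = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        g += -2.0 * (y[i] - w * x[i]) * x[i];  // unpenalized gradient
    return g + lambda * w;  // plus gradient of lambda * w^2 / 2
}
```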
SGVector< float64_t > m_average_gradient [protected]
Used to store the average gradient
Definition at line 119 of file SVRGMinimizer.h.
int32_t m_cur_passes [protected, inherited]
Current pass through the data
Definition at line 215 of file FirstOrderStochasticMinimizer.h.
FirstOrderCostFunction * m_fun [protected, inherited]
Cost function
Definition at line 205 of file FirstOrderMinimizer.h.
DescendUpdater * m_gradient_updater [protected, inherited]
The gradient update step
Definition at line 200 of file FirstOrderStochasticMinimizer.h.
int32_t m_iter_counter [protected, inherited]
Number of used samples/mini-batches
Definition at line 218 of file FirstOrderStochasticMinimizer.h.
LearningRate * m_learning_rate [protected, inherited]
Learning rate object
Definition at line 221 of file FirstOrderStochasticMinimizer.h.
int32_t m_num_passes [protected, inherited]
Number of passes through the data
Definition at line 212 of file FirstOrderStochasticMinimizer.h.
int32_t m_num_sgd_passes [protected]
The number of passes through the data using SGD before the SVRG update
Definition at line 113 of file SVRGMinimizer.h.
Penalty * m_penalty_type [protected, inherited]
The type of penalty
Definition at line 208 of file FirstOrderMinimizer.h.
float64_t m_penalty_weight [protected, inherited]
The weight of the penalty
Definition at line 211 of file FirstOrderMinimizer.h.
SGVector< float64_t > m_previous_variable [protected]
Used to store the previous result
Definition at line 122 of file SVRGMinimizer.h.
int32_t m_svrg_interval [protected]
Interval at which to average stochastic sample gradients
Definition at line 116 of file SVRGMinimizer.h.