SHOGUN  4.1.0
FirstOrderSAGCostFunction Class Reference [abstract]

Detailed Description

This class describes a stochastic cost function for stochastic average minimizers.

The cost function must be written as a finite sum of sample-specific costs. For example, the least squares cost function,

\[ f(w)=\frac{ \sum_i^n{ (y_i-w^T x_i)^2 } }{2} \]

where \(n\) is the sample size, \((y_i,x_i)\) is the i-th sample, \(y_i\) is the label, and \(x_i\) is the feature vector.

A stochastic average minimizer uses the average of sample gradients (see get_average_gradient()) to reduce the variance of stochastic gradients.
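To make the contract concrete, here is a minimal sketch of a least-squares cost of this form. It uses plain `double`/`std::vector<double>` as hypothetical stand-ins for `float64_t`/`SGVector<float64_t>`; a real implementation would instead subclass FirstOrderSAGCostFunction and use the Shogun types.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of f(w) = (1/2) * sum_i (y_i - w^T x_i)^2 implementing the
// same contract as FirstOrderSAGCostFunction (names are illustrative).
class LeastSquaresSAGCost {
public:
    LeastSquaresSAGCost(std::vector<std::vector<double>> x,
                        std::vector<double> y, std::vector<double> w)
        : x_(std::move(x)), y_(std::move(y)), w_(std::move(w)), idx_(-1) {}

    int sample_size() const { return static_cast<int>(x_.size()); }

    // f(w) = (1/2) * sum_i (y_i - w^T x_i)^2
    double cost() const {
        double c = 0.0;
        for (int i = 0; i < sample_size(); ++i) {
            double r = residual(i);
            c += r * r;
        }
        return 0.5 * c;
    }

    void begin_sample() { idx_ = -1; }          // reset the sample sequence
    bool next_sample() { return ++idx_ < sample_size(); }

    // d f_i / d w = -(y_i - w^T x_i) * x_i for the current sample i
    std::vector<double> gradient() const {
        std::vector<double> g(w_.size());
        double r = residual(idx_);
        for (std::size_t j = 0; j < w_.size(); ++j)
            g[j] = -r * x_[idx_][j];
        return g;
    }

    // mean of the per-sample gradients over all n samples
    std::vector<double> average_gradient() const {
        std::vector<double> g(w_.size(), 0.0);
        for (int i = 0; i < sample_size(); ++i) {
            double r = residual(i);
            for (std::size_t j = 0; j < w_.size(); ++j)
                g[j] -= r * x_[i][j];
        }
        for (double& v : g) v /= sample_size();
        return g;
    }

    std::vector<double>& variable_reference() { return w_; }

private:
    double residual(int i) const {               // y_i - w^T x_i
        double dot = 0.0;
        for (std::size_t j = 0; j < w_.size(); ++j)
            dot += w_[j] * x_[i][j];
        return y_[i] - dot;
    }

    std::vector<std::vector<double>> x_;
    std::vector<double> y_;
    std::vector<double> w_;
    int idx_;
};
```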

Well-known stochastic average methods include:

- SVRG: Johnson, Rie, and Tong Zhang. "Accelerating stochastic gradient descent using predictive variance reduction." Advances in Neural Information Processing Systems. 2013.
- SAG: Schmidt, Mark, Nicolas Le Roux, and Francis Bach. "Minimizing finite sums with the stochastic average gradient." arXiv preprint arXiv:1309.2388 (2013).
- SAGA: Defazio, Aaron, Francis Bach, and Simon Lacoste-Julien. "SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives." Advances in Neural Information Processing Systems. 2014.
- SDCA: Shalev-Shwartz, Shai, and Tong Zhang. "Stochastic dual coordinate ascent methods for regularized loss." The Journal of Machine Learning Research 14.1 (2013): 567-599.

Definition at line 70 of file FirstOrderSAGCostFunction.h.

Inheritance diagram for FirstOrderSAGCostFunction:
[inheritance graph not shown]

Public Member Functions

virtual int32_t get_sample_size ()=0
 
virtual SGVector< float64_t > get_average_gradient ()=0
 
virtual SGVector< float64_t > get_gradient ()=0
 
virtual float64_t get_cost ()=0
 
virtual void begin_sample ()=0
 
virtual bool next_sample ()=0
 
virtual SGVector< float64_t > obtain_variable_reference ()=0
 

Member Function Documentation

virtual void begin_sample ( )
pure virtual, inherited

Initialize to generate a sample sequence

virtual SGVector<float64_t> get_average_gradient ( )
pure virtual

Get the AVERAGE gradient value wrt target variables

Note that the average gradient is the mean of the sample gradients from get_gradient() when samples are drawn uniformly at random.

WARNING This method returns \( \frac{\sum_i^n{ \frac{\partial f_i(w) }{\partial w} }}{n}\)

For least squares, that is the value of \( \frac{\frac{\partial f(w) }{\partial w}}{n} \) given \(w\) is known, where \(f(w)=\frac{ \sum_i^n{ (y_i-w^T x_i)^2 } }{2}\)

Returns
the average gradient of the target variables
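As a sanity check of this relationship, the following standalone sketch (plain doubles, not Shogun types) shows for a one-dimensional least-squares cost that the average gradient is simply the mean of the per-sample gradients:

```cpp
#include <cstddef>
#include <vector>

// For f(w) = (1/2) * sum_i (y_i - w*x_i)^2 in one dimension,
// the per-sample gradient is d f_i / d w = -(y_i - w*x_i) * x_i.
double sample_grad(double w, double x, double y) {
    return -(y - w * x) * x;
}

// The average gradient is the mean of the per-sample gradients.
double average_grad(double w, const std::vector<double>& x,
                    const std::vector<double>& y) {
    double g = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        g += sample_grad(w, x[i], y[i]);
    return g / static_cast<double>(x.size());
}
```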
virtual float64_t get_cost ( )
pure virtual

Get the cost given the current target variables

For least squares cost function, that is the value of \(f(w)\).

Returns
the cost

Implements FirstOrderStochasticCostFunction.

virtual SGVector<float64_t> get_gradient ( )
pure virtual

Get the SAMPLE gradient value wrt target variables

WARNING This method returns \( \frac{\partial f_i(w) }{\partial w} \), not \(\sum_i^n{ \frac{\partial f_i(w) }{\partial w} }\)

For the least squares cost function, that is the value of \(\frac{\partial f_i(w) }{\partial w}\) given \(w\) is known, where the index \(i\) is obtained by next_sample()

Returns
the sample gradient of the target variables

Implements FirstOrderStochasticCostFunction.

virtual int32_t get_sample_size ( )
pure virtual

Get the sample size

Returns
the sample size
virtual bool next_sample ( )
pure virtual, inherited

Get next sample

Returns
false if the end of the sample sequence has been reached
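The intended usage pattern of begin_sample()/next_sample()/get_gradient() can be sketched as follows; `ToyCost` is a hypothetical stand-in for a class implementing this interface, with plain doubles in place of the Shogun types:

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in illustrating the sampling protocol; a real cost
// function would implement FirstOrderSAGCostFunction instead.
struct ToyCost {
    std::vector<double> grads;   // pretend per-sample gradients
    int idx = -1;
    void begin_sample() { idx = -1; }          // reset the sequence
    bool next_sample() {                        // advance; false at the end
        return ++idx < static_cast<int>(grads.size());
    }
    double get_gradient() const { return grads[idx]; }
};

// One full pass over the sample sequence: call begin_sample() once,
// then next_sample() until it returns false.
double sum_sample_gradients(ToyCost& c) {
    double total = 0.0;
    c.begin_sample();
    while (c.next_sample())
        total += c.get_gradient();
    return total;
}
```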
virtual SGVector<float64_t> obtain_variable_reference ( )
pure virtual, inherited

Obtain a reference to the target variables. Minimizers will modify the target variables in place.

This method will be called by FirstOrderMinimizer::minimize()

For least squares, that is \(w\)

Returns
a reference to the target variables
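The in-place contract can be illustrated with a hypothetical stand-in (plain std::vector instead of SGVector<float64_t>): the minimizer obtains the reference once and writes updated values through it, so the cost function sees the new \(w\) without any copy being made.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical holder of the target variables; a real cost function
// would return SGVector<float64_t> from obtain_variable_reference().
struct ToyVariables {
    std::vector<double> w{0.0, 0.0};
    std::vector<double>& obtain_variable_reference() { return w; }
};

// One gradient-descent step written through the obtained reference;
// the holder's own w changes in place.
void gradient_step(std::vector<double>& w, const std::vector<double>& g,
                   double learning_rate) {
    for (std::size_t j = 0; j < w.size(); ++j)
        w[j] -= learning_rate * g[j];
}
```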

The documentation for this class was generated from the following file:

FirstOrderSAGCostFunction.h

SHOGUN Machine Learning Toolbox - Project Documentation