Note: compiled from Forecasting: Principles and Practice. Python's pyramid.arima.auto_arima is also built on top of R's auto.arima.
The algorithm looks for the best (p, q) combination with a stepwise search rather than by traversing every possible combination of p and q.
A model without a constant term is also fitted: ARIMA(0,d,0);
“current model”;
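(2.c) Consider the following variations of the current model:
- vary p and/or q by ±1,
Among these variations and the original current model, the one with the smallest AICc becomes
the new current model.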
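The variation step (2.c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `neighbours`, `stepwise_search` and `toy_aicc` are hypothetical names, and `toy_aicc` is a stand-in scoring function rather than a real AICc computed from a fitted ARIMA model.

```python
def neighbours(p, q, max_p=5, max_q=5):
    """Yield every (p, q) variation where p and/or q changes by ±1."""
    for dp in (-1, 0, 1):
        for dq in (-1, 0, 1):
            if dp == 0 and dq == 0:
                continue  # that is the current model itself
            new_p, new_q = p + dp, q + dq
            if 0 <= new_p <= max_p and 0 <= new_q <= max_q:
                yield new_p, new_q


def stepwise_search(start, score, max_p=5, max_q=5):
    """Move to the best-scoring neighbour until no neighbour improves.

    `score` maps an order (p, q) to a number where lower is better
    (AICc in the text). Scores are cached so no order is scored twice.
    """
    cache = {}

    def cached(order):
        if order not in cache:
            cache[order] = score(order)
        return cache[order]

    current = start
    while True:
        best = min(
            neighbours(*current, max_p=max_p, max_q=max_q),
            key=cached,
            default=current,
        )
        if cached(best) >= cached(current):
            return current  # the current model is at least as good
        current = best  # the best variation becomes the new current model


def toy_aicc(order):
    """Hypothetical stand-in for AICc, smallest at (p, q) = (3, 1)."""
    p, q = order
    return (p - 3) ** 2 + (q - 1) ** 2


best_order = stepwise_search((2, 2), toy_aicc)
```

Because each move must strictly lower the score and the grid is finite, the loop always terminates. It may stop at a local optimum, which is the price of not traversing every combination.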
2. Brief description of the auto.arima parameters
auto_arima(y, exogenous=None,
start_p=2, d=None, start_q=2, max_p=5, max_d=2, max_q=5, start_P=1, D=None, start_Q=1, max_P=2, max_D=1,
max_Q=2, max_order=10, m=1, seasonal=True, stationary=False, information_criterion='aic', alpha=0.05, test='kpss', seasonal_test='ch', stepwise=True, n_jobs=1, start_params=None, trend='c', method=None, transparams=True,
solver='lbfgs', maxiter=50, disp=0, callback=None, offset_test_args=None, seasonal_test_args=None,
suppress_warnings=False, error_action='warn', trace=False, random=False, random_state=None, n_fits=10,
return_valid_fits=False, out_of_sample_size=0, scoring='mse', scoring_args=None, **fit_args)
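A call that sets a handful of these parameters for monthly data might look like the sketch below. The names `SEARCH_KWARGS` and `fit_monthly` are hypothetical, and the import path is the one mentioned at the top of these notes. Newer releases of the package are distributed as pmdarima, so the import may need adjusting to match the installed version.

```python
# Keyword arguments taken from the signature above; the values are
# illustrative choices for monthly data, not recommendations.
SEARCH_KWARGS = dict(
    start_p=2, start_q=2,      # where the search over p and q starts
    max_p=5, max_q=5,          # upper bounds for p and q
    d=None, D=None,            # let the unit root tests choose d and D
    seasonal=True, m=12,       # monthly data: 12 periods per season
    information_criterion="aic",
    stepwise=True,             # stepwise search instead of a full grid
    trace=False,
    error_action="warn",
    suppress_warnings=True,
)


def fit_monthly(y):
    """Fit a seasonal ARIMA to a 1-D float array `y` (hypothetical helper)."""
    # Assumption: the package is importable under its original name;
    # with a newer installation use `from pmdarima import auto_arima`.
    from pyramid.arima import auto_arima

    return auto_arima(y, **SEARCH_KWARGS)
```

Keeping the import inside the helper means the settings can be inspected or reused even where the library is not installed.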
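y : the time series to fit. It must be a one-dimensional array of floats and must not contain np.nan or np.inf;
exogenous : optional additional features, supplied alongside the time series, to help with prediction. Note that when forecasting future values of the series, the future values of these features must be provided as well.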
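start_p : int, default 2. The starting value (lower bound) of p in the automatic search.
d : int, default None. The order of non-seasonal differencing. If None, it is selected automatically, which noticeably increases the run time.
start_q : int, default 2. The starting value (lower bound) of q in the automatic search.
max_p : int, default 5. The upper bound of p in the automatic search; must be ≥ start_p.
max_d : int, default 2. The upper bound of d (the non-seasonal differencing order) in the automatic search; must be ≥ d.
max_q : int, default 5. The upper bound of q in the automatic search; must be ≥ start_q.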
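start_P : int, default 1. The lower bound of P when it is selected automatically for the seasonal model.
D : int, default None. The order of seasonal differencing. If None, it is selected automatically.
start_Q : int, default 1. The lower bound of Q when it is selected automatically for the seasonal model.
max_P : int, default 2. The upper bound of P when it is selected automatically for the seasonal model.
max_D : int, default 1. The maximum order of seasonal differencing; must be ≥ D.
max_Q : int, default 2. The upper bound of Q when it is selected automatically for the seasonal model.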
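max_order : int, default 10. If p+q ≥ max_order, the model for that combination is not fitted.
m : int, default 1. The number of periods per season, for example m=4 for quarterly data and m=12 for monthly data. If m=1, seasonal is set to False.
seasonal : bool, default True. Whether to fit a seasonal ARIMA. Note that if seasonal=True while m=1, seasonal is set to False.
stationary : bool, default False. Indicates whether the series is stationary.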
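information_criterion : str, default 'aic'. The criterion used to evaluate models, one of 'aic', 'bic', 'hqic', 'oob'.
alpha : float, default 0.05. The significance level of the test.
test : str, default 'kpss'. The type of unit root test. It is run only when the series is not flagged as stationary and d=None.
seasonal_test : str, default 'ch'. Identifies the seasonal unit root test to use.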
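stepwise : bool, default True. If True, the stepwise search described above is used. If False, the search covers a wider range of models and takes noticeably longer.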
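n_jobs : int, default 1. The number of models to fit in parallel. If -1, as many as possible are fitted in parallel.
start_params : array-like, default None. Starting parameters for ARMA(p,q).
transparams : bool, default True. If True, a transformation is applied to ensure stationarity. If False, stationarity and invertibility are not checked.
method : str. The type of likelihood function, one of {'css-mle', 'mle', 'css'}.
trend : str or iterable. Specifies the polynomial trend, that is, the coefficients of the trend polynomial.
solver : str or None, default 'lbfgs'. The solver used to fit the model. Other options include 'bfgs', 'newton', and so on.
maxiter : int, default 50. The maximum number of function evaluations.
disp : int, default 0. Controls the printing of convergence information. disp<0 means nothing is printed.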
callback : callable, optional (default=None)
Called after each iteration as callback(xk) where xk is the current
parameter vector. This is only used in non-seasonal ARIMA models.
offset_test_args : dict, optional (default=None)
The args to pass to the constructor of the offset (d) test. See pyramid.arima.stationarity for more details.
seasonal_test_args : dict, optional (default=None)
The args to pass to the constructor of the seasonal offset (D) test. See pyramid.arima.seasonality for more details.
suppress_warnings : bool, optional (default=False)
Many warnings might be thrown inside of statsmodels. If suppress_warnings is True, all of the warnings coming from ARIMA will be squelched.
error_action : str, optional (default='warn')
If unable to fit an ARIMA due to stationarity issues, whether to
warn ('warn'), raise the ValueError ('raise') or ignore ('ignore'). Note that the default behavior is to warn, and fits that fail will be returned as None. This is the recommended behavior, as statsmodels ARIMA and SARIMAX models hit bugs periodically that can cause an otherwise healthy parameter combination to fail for reasons not related to pyramid.
trace : bool, optional (default=False)
Whether to print status on the fits. Note that this can be very verbose…
random : bool, optional (default=False)
Similar to grid searches, auto_arima provides the capability to
perform a "random search" over a hyper-parameter space. If random is True, rather than perform an exhaustive search or stepwise search, only n_fits ARIMA models will be fit (stepwise must be False for this option to do anything).
random_state : int, long or numpy RandomState, optional (default=None)
The PRNG for when random=True. Ensures replicable testing and results.
n_fits : int, optional (default=10)
If random is True and a "random search" is going to be performed, n_fits is the number of ARIMA models to be fit.
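How max_order, random, random_state and n_fits interact can be illustrated without the library. The sketch below is not the library's implementation. `candidate_orders` and `random_subset` are hypothetical helpers, only the non-seasonal (p, q) part of the grid is enumerated, and Python's `random` module stands in for the numpy RandomState mentioned above.

```python
import itertools
import random


def candidate_orders(start_p=2, max_p=5, start_q=2, max_q=5, max_order=10):
    """List the (p, q) pairs a non-stepwise search would consider.

    Pairs with p + q >= max_order are dropped, as described for max_order.
    """
    grid = itertools.product(
        range(start_p, max_p + 1),
        range(start_q, max_q + 1),
    )
    return [(p, q) for p, q in grid if p + q < max_order]


def random_subset(orders, n_fits=10, random_state=None):
    """Pick at most n_fits distinct orders; a fixed seed makes it repeatable."""
    rng = random.Random(random_state)
    return rng.sample(orders, min(n_fits, len(orders)))


orders = candidate_orders()                      # exhaustive grid
to_fit = random_subset(orders, n_fits=10, random_state=42)
```

With the default bounds the grid holds 16 pairs, of which (5, 5) is dropped because 5 + 5 ≥ 10. A random search then fits only 10 of the remaining 15.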
return_valid_fits : bool, optional (default=False)
If True, will return all valid ARIMA fits in a list. If False (by default), will only return the best fit.
out_of_sample_size : int, optional (default=0)
The ARIMA class can fit only a portion of the data if specified,
in order to retain an "out of bag" sample score. This is the
number of examples from the tail of the time series to hold out
and use as validation examples. The model will not be fit on these
samples, but the observations will be added into the model's endog and exog arrays so that future forecast values originate from the end of the endogenous vector.
For instance::
y = [0, 1, 2, 3, 4, 5, 6]
out_of_sample_size = 2
> Fit on: [0, 1, 2, 3, 4]
> Score on: [5, 6]
> Append [5, 6] to end of self.arima_res_.data.endog values
scoring : str, optional (default='mse')
If performing validation (i.e., if out_of_sample_size > 0), the metric to use for scoring the out-of-sample data. One of
scoring_args : dict, optional (default=None)
A dictionary of key-word arguments to be passed to the scoring metric.
**fit_args : dict, optional (default=None)
A dictionary of keyword arguments to pass to the ARIMA.fit method.
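The hold-out behaviour described for out_of_sample_size and scoring can be reproduced by hand on the docstring's own example. In this sketch `split_holdout` and `mse` are hypothetical helpers, and the "forecast" simply repeats the last training value as a stand-in for a real ARIMA forecast.

```python
def split_holdout(y, out_of_sample_size):
    """Split y into (fit part, hold-out tail of length out_of_sample_size)."""
    if out_of_sample_size <= 0:
        return list(y), []
    return list(y[:-out_of_sample_size]), list(y[-out_of_sample_size:])


def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


y = [0, 1, 2, 3, 4, 5, 6]
train, holdout = split_holdout(y, out_of_sample_size=2)

# Stand-in forecast (assumption): repeat the last value seen during fitting.
forecast = [train[-1]] * len(holdout)
oob_score = mse(holdout, forecast)
```

Here the model would be fit on [0, 1, 2, 3, 4] and scored on [5, 6]. The stand-in forecast [4, 4] gives an out-of-sample MSE of (1 + 4) / 2 = 2.5.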