时间序列分析之ARIMA模型预测__R篇
之前⼀直⽤SAS做ARIMA模型预测,今天尝试⽤了⼀下R,发现灵活度更⾼,结果输出也更直观。现在记录⼀下如何⽤R分析ARIMA模型。
1. 处理数据
1.1. 导⼊forecast包
forecast包是⼀个封装的ARIMA统计软件包,在默认情况下,R没有预装forecast包,因此需要先安装该包
> install.packages("forecast')
导⼊依赖包zoo,再导⼊forecast包
> library("zoo")老虎怎么画好看又简单
中文打字> library("forecast")
1.2. 导⼊数据
博主使⽤的数据是⼀组航空公司的销售数据,可在此下载数据:,共有132条数据,是以⽉为单位的销售数据。
> airline <- read.table("")
> airline
V1 V2
1 1 112
2 2 118
3 3 132
4 4 129
5 5 121
6 6 135
7 7 148
8 8 148
9 9 136
10 10 119
(........)
1.3. 将数据转化为时间序列格式(ts)
由于将数据转化为时间序列格式,我们并不需要时间字段,因此只取airline数据的第⼆列,即销售数据,⼜因为该数据是以⽉为单位的,因此Period是12。
> airline2 <- ariline[2]
> airts <- ts(airline2,start=1,frequency=12)
2. 识别模型
2.1. 查看趋势图
> plot.ts(airts)
由图可见,该序列还不平稳,先做⼀次Log平滑,再做⼀次差分:
> airlog <- log(airts)
> airdiff <- diff(airlog, differences=1)
> plot.ts(airdiff)
这次看上去就⽐较平稳了,现在看看ACF和PACF的结果
2.2. 查看ACF和PACF
> acf(airdff, lag.max=30)
> acf(airdff, lag.max=30,plot=FALSE)
Autocorrelations of ries ‘airdiff’, by lag
0.00000.08330.16670.25000.33330.41670.50000.58330.66670.75000.8333
1.0000.188 -0.127 -0.154 -0.326 -0.0660.041 -0.098 -0.343 -0.109 -0.120
0.91671.00001.08331.16671.25001.33331.41671.50001.58331.66671.7500
0.1990.8330.198 -0.143 -0.110 -0.288 -0.0460.036 -0.104 -0.313 -0.106
1.83331.9167
2.00002.08332.16672.25002.33332.41672.5000
-
0.0850.1850.7140.175 -0.126 -0.077 -0.214 -0.0460.029
> pacf(airdff, lag.max=30)
> pacf(airdff, lag.max=30,plot=FALSE)
Partial autocorrelations of ries ‘airdiff’, by lag
0.08330.16670.25000.33330.41670.50000.58330.66670.75000.83330.9167
苏区0.188 -0.169 -0.101 -0.3170.018 -0.072 -0.199 -0.509 -0.171 -0.553 -0.300
1.00001.08331.16671.25001.33331.41671.50001.58331.66671.75001.8333
0.5510.010 -0.2000.164 -0.052 -0.037 -0.1080.0940.005 -0.095 -0.001
1.9167
2.00002.08332.16672.25002.33332.41672.5000
0.057 -0.074 -0.0480.0240.0730.0470.0100.033
从ACF和PACF可以看出来,该序列在lag=12和lag=24处有明显的spike,说明该序列需要再做⼀次diff=12的差分。且PACF⽐ACF呈现更明显的指数平滑的趋势,因此先猜测ARIMA模型为:ARIMA(0,1,1)(0,1,1)[12].
2.3. 利⽤auto.arima
> auto.arima(airlog,trace=T)
ARIMA(2,1,2)(1,1,1)[12] : -354.4719
ARIMA(0,1,0)(0,1,0)[12] : -316.8213
ARIMA(1,1,0)(1,1,0)[12] : -356.4353
ARIMA(0,1,1)(0,1,1)[12] : -359.7679
ARIMA(0,1,1)(1,1,1)[12] : -354.9069
ARIMA(0,1,1)(0,1,0)[12] : -327.5759
ARIMA(0,1,1)(0,1,2)[12] : -357.6861
ARIMA(0,1,1)(1,1,2)[12] : -363.2418
ARIMA(1,1,1)(1,1,2)[12] : -359.6535
ARIMA(0,1,0)(1,1,2)[12] : -346.1537
ARIMA(0,1,2)(1,1,2)[12] : -361.1765
ARIMA(1,1,2)(1,1,2)[12] : 1e+20
ARIMA(0,1,1)(1,1,2)[12] : -363.2418
ARIMA(0,1,1)(2,1,2)[12] : -368.8244
ARIMA(0,1,1)(2,1,1)[12] : -368.1761
ARIMA(1,1,1)(2,1,2)[12] : -367.0903
ARIMA(0,1,0)(2,1,2)[12] : -363.7024
ARIMA(0,1,2)(2,1,2)[12] : -366.6877
ARIMA(1,1,2)(2,1,2)[12] : 1e+20
ARIMA(0,1,1)(2,1,2)[12] : -368.8244
Best model: ARIMA(0,1,1)(2,1,2)[12]
Series: airlog
ARIMA(0,1,1)(2,1,2)[12]
Coefficients:
ma1 sar1 sar2 sma1 sma2
-0.2710 -0.4764 -0.1066 -0.0098 -0.1987
农历二十四节气表< 0.09950.14320.10870.15670.1130
sigma^2 estimated as0.001188: log likelihood=231.88屏幕颜色
AIC=-369.57 AICc=-368.82 BIC=-352.9
福建电信招聘auto.arima提供的最佳模型为ARIMA(0,1,1)(2,1,2)[12],我们可以同时测试两个模型,看看哪个更适合。
3. 参数估计
> airarima1 <- arima(airlog,order=c(0,1,1),asonal=list(order=c(0,1,1),period=12),method="ML")
> airarima1
Series: airlog
ARIMA(0,1,1)(0,1,1)[12]
Coefficients:
ma1 sma1
-0.3484 -0.5622
< 0.09430.0774
sigma^2 estimated as0.001313: log likelihood=223.63
AIC=-441.26 AICc=-441.05 BIC=-432.92
开工大吉
> airarima2 <- arima(airlog,order=c(0,1,1),asonal=list(order=c(2,1,2),period=12),method="ML")
> airarima2
Series: airlog
ARIMA(0,1,1)(2,1,2)[12]
Coefficients:
ma1 sar1 sar2 sma1 sma2
-0.35461.0614 -0.1211 -1.91300.9962
< 0.09950.10940.18440.38870.3812
sigma^2 estimated as0.0009811: log likelihood=225.56
AIC=-439.12 AICc=-438.37 BIC=-422.44
两个ARIMA模型都采⽤极⼤似然⽅法估计,计算系数对应的t值:
ARIMA(0,1,1)(0,1,1)[12] :t(ma1)=-39.1791, t(sma1)=-93.8445
ARIMA(0,1,1)(2,1,2)[12] : t(ma1)=-35.8173,t(sar1)=88.68383,t(sar2)=-3.56141,t(sma1)=-12.6615,t(sma2)= 6.855526
可见两个模型的系数都是显著的,⽽ARIMA(0,1,1)(0,1,1)[12]的AIC和BIC⽐ARIMA(0,1,1)(2,1,2)[12]的要⼩,因此选择模型ARIMA(0,1,1) (0,1,1)[12]。
4. 预测
预测五年后航空公司的销售额:
> airforecast <- forecast.Arima(airarima1,h=5,level=c(99.5))
> airforecast
夜不成眠Point Forecast Lo 99.5 Hi 99.5
Jan 126.0386495.9369516.140348
Feb 125.9887625.8673806.110143
Mar 126.1454286.0071376.283719
Apr 126.1189935.9656466.272340
May 126.1596575.9926056.326709
> plot.forecast(airforecast)