Hi: Since the series is obviously nonstationary and periodic, it would seem that one should embrace a wider window over which to evaluate the DF test. I did the following:
y <- ts(Y, frequency = 12)   # looked like an annual series to me
plot(stl(y, 'periodic'))     # very informative!!

adf.test(y)

        Augmented Dickey-Fuller Test

data:  y
Dickey-Fuller = -6.1661, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

# Try a wider window of 12 lags:
adf.test(y, k = 12)

        Augmented Dickey-Fuller Test

data:  y
Dickey-Fuller = -1.061, Lag order = 12, p-value = 0.9261
alternative hypothesis: stationary

This certainly made me wonder, and since I don't know much about unit root testing, it was time to play. What should k be? I decided to plot the value of the ADF statistic and its p-value for lags k from 5 to 40 just to see what would happen.

df <- data.frame(lag = 5:40,
                 ADF.statistic = sapply(5:40, function(x) adf.test(y, k = x)$statistic),
                 ADF.pvalue = sapply(5:40, function(x) adf.test(y, k = x)$p.value))

library(ggplot2)
library(reshape2)   # provides melt()
dfm <- melt(df, id = 'lag')

# one could use geom_point() in place of geom_path()
ggplot(dfm, aes(x = lag, y = value)) +
    geom_path(size = 1) +
    facet_grid(variable ~ ., scales = 'free_y') +
    xlab('Lag number') + ylab("")

This shows pretty clearly that after about eight or nine lags, the ADF test fails to reject the null hypothesis of a unit root. The appearance of periodicity in both graphs indicates that the seasonality of the series has some impact on the value of the augmented DF statistic. Since I'm not up on the literature re this test, is it meant to be applied in the presence of seasonality? If not, I would suggest deseasonalizing the series before applying the unit root test.

I tried a few models and the best fitting one had differencing in both the non-seasonal and seasonal parts. This isn't really supported by the stl() graph, but there is an improvement in the fit if the seasonal part of the series is differenced, too. So if there is nonstationarity in both the non-seasonal and seasonal parts of the series, is the ADF test sensitive enough to pick up on this, and does it still do the right thing?
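For what it's worth, here is a minimal sketch of the "deseasonalize first" idea, not a recommended procedure. It assumes that subtracting the periodic seasonal component estimated by stl() is a reasonable way to deseasonalize, which I haven't verified against the unit root literature. Y is the vector from the original post, adf_summary() is just a helper name I made up, and the lags 5, 12 and 24 are arbitrary choices.

```r
library(tseries)  # provides adf.test()

# Data copied from the original post (180 monthly values)
Y <- c(3519,3803,4332,4251,4661,4811,4448,4451,4343,4067,4001,3934,3652,3768,
       4082,4101,4628,4898,4476,4728,4458,4004,4095,4056,3641,3966,4417,4367,
       4821,5190,4638,4904,4528,4383,4339,4327,3856,4072,4563,4561,4984,5316,
       4843,5383,4889,4681,4466,4463,4217,4322,4779,4988,5383,5591,5322,5404,
       5106,4871,4977,4706,4193,4460,4956,5022,5408,5565,5360,5490,5286,5257,
       5002,4897,4577,4764,5052,5251,5558,5931,5476,5603,5425,5177,4792,4776,
       4450,4659,5043,5233,5423,5814,5339,5474,5278,5184,4975,4751,4600,4718,
       5218,5336,5665,5900,5330,5626,5512,5293,5143,4842,4627,4981,5321,5290,
       6002,5811,5671,6102,5482,5429,5356,5167,4608,4889,5352,5441,5970,5750,
       5670,5860,5449,5401,5240,5229,4770,5006,5518,5576,6160,6121,5900,5994,
       5841,5832,5505,5573,5331,5355,6057,6055,6771,6669,6375,6666,6383,6118,
       5927,5750,5122,5398,5817,6163,6763,6835,6678,6821,6421,6338,6265,6291,
       5540,5822,6318,6268,7270,7096,6505,7039,6440,6446,6717,6320)
y <- ts(Y, frequency = 12)

# Estimate a fixed (periodic) seasonal pattern and subtract it from the series
fit <- stl(y, s.window = "periodic")
y_adj <- y - fit$time.series[, "seasonal"]

# Hypothetical helper: ADF statistic and p-value at lag k.
# suppressWarnings() hides the "p-value smaller/greater than printed" notes.
adf_summary <- function(x, k) {
  res <- suppressWarnings(adf.test(x, k = k))
  c(k = k, statistic = unname(res$statistic), p.value = res$p.value)
}

lags <- c(5, 12, 24)  # arbitrary lags chosen for comparison
raw <- t(sapply(lags, function(k) adf_summary(y, k)))
adjusted <- t(sapply(lags, function(k) adf_summary(y_adj, k)))

print(raw)       # original series
print(adjusted)  # seasonally adjusted series
```

Comparing the two tables row by row should give a rough idea of how much of the lag sensitivity comes from the seasonal pattern. If the adjusted series still flips between rejecting and not rejecting as k grows, seasonality alone is probably not the whole story.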
From my limited web searching, I didn't find any adjustments in the presence of seasonality, stationary or not, which makes me wonder...

HTH,
Dennis

On Thu, Oct 28, 2010 at 8:47 PM, Cuckovic Paik <cuckovic.p...@gmail.com> wrote:

>
> Dear Users, please help with the following DF test:
> =====
> library(tseries)
> library(timeSeries)
>
> Y=c(3519,3803,4332,4251,4661,4811,4448,4451,4343,4067,4001,3934,3652,3768
> ,4082,4101,4628,4898,4476,4728,4458,4004,4095,4056,3641,3966,4417,4367
> ,4821,5190,4638,4904,4528,4383,4339,4327,3856,4072,4563,4561,4984,5316
> ,4843,5383,4889,4681,4466,4463,4217,4322,4779,4988,5383,5591,5322,5404
> ,5106,4871,4977,4706,4193,4460,4956,5022,5408,5565,5360,5490,5286,5257
> ,5002,4897,4577,4764,5052,5251,5558,5931,5476,5603,5425,5177,4792,4776
> ,4450,4659,5043,5233,5423,5814,5339,5474,5278,5184,4975,4751,4600,4718
> ,5218,5336,5665,5900,5330,5626,5512,5293,5143,4842,4627,4981,5321,5290
> ,6002,5811,5671,6102,5482,5429,5356,5167,4608,4889,5352,5441,5970,5750
> ,5670,5860,5449,5401,5240,5229,4770,5006,5518,5576,6160,6121,5900,5994
> ,5841,5832,5505,5573,5331,5355,6057,6055,6771,6669,6375,6666,6383,6118
> ,5927,5750,5122,5398,5817,6163,6763,6835,6678,6821,6421,6338,6265,6291
> ,5540,5822,6318,6268,7270,7096,6505,7039,6440,6446,6717,6320)
>
> YY=as.timeSeries(Y)
>
> adf.test(Y)
> adf.test(YY)
> ======== Output ====
> > adf.test(Y)
>
> Augmented Dickey-Fuller Test
>
> data: Y
> Dickey-Fuller = -6.1661, Lag order = 5, p-value = 0.01
> alternative hypothesis: stationary
>
> Warning message:
> In adf.test(Y) : p-value smaller than printed p-value
> > adf.test(YY)
>
> Augmented Dickey-Fuller Test
>
> data: YY
> Dickey-Fuller = 12.4944, Lag order = 5, p-value = 0.99
> alternative hypothesis: stationary
>
> Warning message:
> In adf.test(YY) : p-value greater than printed p-value
>
>
> ==========================================
> Question: Why the two results are different?
>
> The help file says that the input series is either a numeric vector or a
> time series object. But the results are completely opposite if the
> different types of arguments are used. Thanks in advance.
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Dickey-Fuller-Test-tp3018408p3018408.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.