Hi:

Since the series is obviously nonstationary and periodic, it seems that one
should use a wider lag window when evaluating the DF test. I did the
following:

library(tseries)             # provides adf.test()
y <- ts(Y, frequency = 12)   # monthly data with what looked like an annual cycle
plot(stl(y, 'periodic'))     # very informative!!
adf.test(y)

        Augmented Dickey-Fuller Test

data:  y
Dickey-Fuller = -6.1661, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

# Try a wider window of 12 lags:
adf.test(y, k = 12)

        Augmented Dickey-Fuller Test

data:  y
Dickey-Fuller = -1.061, Lag order = 12, p-value = 0.9261
alternative hypothesis: stationary

This certainly made me wonder, and since I don't know much about unit root
testing, it was time to play. What should k be? I decided to plot the value
of the ADF statistic and its p-value for lags k from 5 to 40 just to see
what would happen.
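
As a side note on where the default of 5 comes from: as far as I can tell
from the tseries help page, adf.test() uses k = trunc((length(x) - 1)^(1/3))
when k is not supplied. The sketch below just evaluates that formula, assuming
the series has 180 observations (my count of the quoted data), and compares it
with a commonly cited rule of thumb attributed to Schwert. The function names
are my own and are not part of any package.

```r
# Hypothetical helper names; only the formulas matter here.
# Default lag order documented for tseries::adf.test()
default_adf_lag <- function(n) trunc((n - 1)^(1/3))

# Rule of thumb often attributed to Schwert: trunc(12 * (n / 100)^(1/4))
schwert_lag <- function(n) trunc(12 * (n / 100)^(1/4))

n_obs <- 180                     # assumed length of the series
print(default_adf_lag(n_obs))    # matches the "Lag order = 5" reported above
print(schwert_lag(n_obs))        # a noticeably larger lag order
```

Neither value is "the right k"; they only show that reasonable-looking rules
can land on either side of the point where the test result flips.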

library(ggplot2)
library(reshape2)   # provides melt()

# Run the test once per lag, then pull out the statistic and p-value
lags <- 5:40
tests <- lapply(lags, function(k) adf.test(y, k = k))
df <- data.frame(lag = lags,
                 ADF.statistic = sapply(tests, function(res) unname(res$statistic)),
                 ADF.pvalue = sapply(tests, function(res) res$p.value))
dfm <- melt(df, id = 'lag')
# one could use geom_point() in place of geom_path()
ggplot(dfm, aes(x = lag, y = value)) + geom_path(size = 1) +
    facet_grid(variable ~ ., scales = 'free_y') +
    xlab('Lag number') + ylab("")

This shows pretty clearly that beyond about eight or nine lags, the ADF
test fails to reject the null hypothesis of a unit root. The apparent
periodicity in both graphs suggests that the seasonality of the series
affects the value of the augmented DF statistic. Since I'm not up on the
literature on this test, is it meant to be applied in the presence of
seasonality? If not, I would suggest deseasonalizing the series before
applying the unit root test.
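
Here is a minimal sketch of what I mean by deseasonalizing first. It uses a
synthetic monthly series (not the original data) so that it runs on its own,
subtracts the seasonal component estimated by stl(), and only calls adf.test()
if tseries is installed. The simulated series and the names y_sim and y_adj
are assumptions made for illustration.

```r
# Synthetic monthly series (NOT the poster's data):
# level + random walk with drift + a fixed seasonal pattern
set.seed(1)
n <- 180
seasonal <- rep(500 * sin(2 * pi * (1:12) / 12), length.out = n)
y_sim <- ts(4000 + cumsum(rnorm(n, mean = 10, sd = 50)) + seasonal,
            frequency = 12)

# Estimate a periodic seasonal component and subtract it
fit <- stl(y_sim, s.window = "periodic")
y_adj <- y_sim - fit$time.series[, "seasonal"]

# Run the unit root test on the seasonally adjusted series, if tseries is available
if (requireNamespace("tseries", quietly = TRUE)) {
  print(tseries::adf.test(y_adj))
  print(tseries::adf.test(y_adj, k = 12))
}
```

With the real data one would replace y_sim by y and see whether the
disagreement between k = 5 and k = 12 persists after the adjustment.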

I tried a few models and the best fitting one had differencing in both the
non-seasonal and seasonal parts. This isn't really supported by the stl()
graph, but there is an improvement in the fit if the seasonal part of the
series is differenced, too. So if there is nonstationarity in both the
non-seasonal and seasonal parts of the series, is the ADF test sensitive
enough to pick up on this, and does it still do the right thing? From my
limited web searching, I didn't find any adjustments for the presence of
seasonality, stationary or not, which makes me wonder...
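
For concreteness, differencing in both parts amounts to applying
(1 - B)(1 - B^12) to the series, which base R can do with two calls to diff().
The sketch below again uses a synthetic monthly series (not the original data,
and not the models I actually fitted); the names y_sim, d_seasonal and d_both
are made up for the example.

```r
# Synthetic monthly series (NOT the poster's data)
set.seed(2)
n <- 180
seasonal <- rep(500 * cos(2 * pi * (1:12) / 12), length.out = n)
y_sim <- ts(4000 + cumsum(rnorm(n, mean = 10, sd = 50)) + seasonal,
            frequency = 12)

d_seasonal <- diff(y_sim, lag = 12)   # seasonal difference: (1 - B^12) y_t
d_both <- diff(d_seasonal)            # then a regular difference: (1 - B)(1 - B^12) y_t

print(length(d_both))                 # 13 observations are lost in total

# Unit root test on the doubly differenced series, if tseries is available
if (requireNamespace("tseries", quietly = TRUE)) {
  print(tseries::adf.test(d_both))
}
```

Whether the ADF test on the undifferenced series can detect the need for the
seasonal difference is exactly the part I'm unsure about.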

HTH,
Dennis

On Thu, Oct 28, 2010 at 8:47 PM, Cuckovic Paik <cuckovic.p...@gmail.com> wrote:

>
> Dear Users, please help with the following DF test:
> =====
> library(tseries)
> library(timeSeries)
>
> Y=c(3519,3803,4332,4251,4661,4811,4448,4451,4343,4067,4001,3934,3652,3768
> ,4082,4101,4628,4898,4476,4728,4458,4004,4095,4056,3641,3966,4417,4367
> ,4821,5190,4638,4904,4528,4383,4339,4327,3856,4072,4563,4561,4984,5316
> ,4843,5383,4889,4681,4466,4463,4217,4322,4779,4988,5383,5591,5322,5404
> ,5106,4871,4977,4706,4193,4460,4956,5022,5408,5565,5360,5490,5286,5257
> ,5002,4897,4577,4764,5052,5251,5558,5931,5476,5603,5425,5177,4792,4776
> ,4450,4659,5043,5233,5423,5814,5339,5474,5278,5184,4975,4751,4600,4718
> ,5218,5336,5665,5900,5330,5626,5512,5293,5143,4842,4627,4981,5321,5290
> ,6002,5811,5671,6102,5482,5429,5356,5167,4608,4889,5352,5441,5970,5750
> ,5670,5860,5449,5401,5240,5229,4770,5006,5518,5576,6160,6121,5900,5994
> ,5841,5832,5505,5573,5331,5355,6057,6055,6771,6669,6375,6666,6383,6118
> ,5927,5750,5122,5398,5817,6163,6763,6835,6678,6821,6421,6338,6265,6291
> ,5540,5822,6318,6268,7270,7096,6505,7039,6440,6446,6717,6320)
>
> YY=as.timeSeries(Y)
>
> adf.test(Y)
> adf.test(YY)
> ========  Output ====
> > adf.test(Y)
>
>        Augmented Dickey-Fuller Test
>
> data:  Y
> Dickey-Fuller = -6.1661, Lag order = 5, p-value = 0.01
> alternative hypothesis: stationary
>
> Warning message:
> In adf.test(Y) : p-value smaller than printed p-value
> > adf.test(YY)
>
>        Augmented Dickey-Fuller Test
>
> data:  YY
> Dickey-Fuller = 12.4944, Lag order = 5, p-value = 0.99
> alternative hypothesis: stationary
>
> Warning message:
> In adf.test(YY) : p-value greater than printed p-value
> >
> ==========================================
> Question: Why are the two results different?
>
> The help file says that the input series is either a numeric vector or a
> time series object. But the results are completely opposite if the
> different types of arguments are used. Thanks in advance.
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Dickey-Fuller-Test-tp3018408p3018408.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
