Hello,

That's a statistics question, but it's also about using an R function.

The Ljung-Box test isn't meant to be used in this context, that is, to test the residuals of an OLS fit y = bX + e. It is used to test the time independence of the original series or of the residuals of an ARMA(p, q) fit.

In both cases you are right, 'x' is a series.
'lag' can be explained as follows: you have a time series and want to know if the value observed today depends on what was observed in the past. Then, a linear regression of "today" on "yesterday" could be

X[t] = b[1]*X[t-1] + e[t], e ~ Normal(0, sigma^2)

A linear regression on two time units in the past would be

X[t] = b[1]*X[t-1] + b[2]*X[t-2] + e[t], e ~ Normal(0, sigma^2)

etc. This is a regression of the series on itself lagged by a certain number of time units: the present is regressed on the past. The function ar() fits this kind of model to a time series. In the first case the order is p=1; in the second, p=2.
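
As a rough illustration of the two regressions above, here is a minimal sketch using ar(). The series x is simulated, not your data, and the coefficient 0.6 and the length 200 are arbitrary choices of mine. Setting aic = FALSE forces the order to be exactly order.max instead of letting ar() choose it.

```r
set.seed(1)

# Simulated AR(1) series; purely hypothetical data for illustration
x <- arima.sim(model = list(ar = 0.6), n = 200)

# Fix the order at p = 1 and p = 2 instead of letting AIC select it
fit1 <- ar(x, order.max = 1, aic = FALSE)
fit2 <- ar(x, order.max = 2, aic = FALSE)

fit1$ar   # estimate of b[1]
fit2$ar   # estimates of b[1] and b[2]
```

With the default aic = TRUE, ar() would instead pick the order itself, up to order.max.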

Now, in the first case, is there second order serial correlation? Test the residuals with lag=2 and fitdf=1 (the value of p). Third order? Use lag=3, fitdf=p=1, etc.
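
A sketch of that residual test, again on simulated data (the series and its parameters are assumptions for illustration only). Note that Box.test() defaults to type = "Box-Pierce", so the Ljung-Box version has to be requested explicitly. The degrees of freedom of the test are lag - fitdf.

```r
set.seed(1)

# Simulated AR(1) series; hypothetical data for illustration
x <- arima.sim(model = list(ar = 0.6), n = 200)

# Fit an AR model of order p = 1
fit <- ar(x, order.max = 1, aic = FALSE)

# The first p residuals are NA, so drop them before testing
res <- fit$resid[!is.na(fit$resid)]

# Second order serial correlation in the residuals: lag = 2, fitdf = p = 1
bt2 <- Box.test(res, lag = 2, type = "Ljung-Box", fitdf = 1)

# Third order: lag = 3, fitdf = p = 1
bt3 <- Box.test(res, lag = 3, type = "Ljung-Box", fitdf = 1)

bt2
bt3
```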

You are NOT fitting this type of model, so the Ljung-Box test is misused. Test the original series with default parameters, lag=1. If there is serial correlation, fit an AR (Auto-Regressive) model with ar(). See the help page ?ar. And see a statistician with experience in time series. It's a world of its own; I haven't even mentioned seasonality, or almost anything else about time series.
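
Something along these lines, as a sketch only. The series y below is simulated as a stand-in for 24 annual values from 1984 to 2007, and the 0.05 cutoff is just a conventional choice, not a recommendation. Box.test() already defaults to lag = 1 and fitdf = 0, but type = "Ljung-Box" must be given because the default type is Box-Pierce.

```r
set.seed(42)

# Hypothetical stand-in for a short annual series (24 values, 1984-2007)
y <- ts(arima.sim(model = list(ar = 0.5), n = 24), start = 1984)

# Ljung-Box test on the original series; lag = 1 and fitdf = 0 are the defaults
bt0 <- Box.test(y, lag = 1, type = "Ljung-Box")
bt0

# If serial correlation is indicated, fit an AR model; ar() chooses the order by AIC
if (bt0$p.value < 0.05) {
  fit <- ar(y)
  print(fit)
}
```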

Do ask someone near you.

Hope this helps,

Rui Barradas
On 26-06-2012 19:01, Steven Winter wrote:
I fit a simple linear model y = bX to a data set today, and that produced 24 
residuals (I have 24 data points, one for each year from 1984-2007). I would 
like to test the time-independence of the residuals of my model, and I was 
recommended by my supervisor to use the Ljung-Box test. The Box.test function 
in R takes 4 arguments:

x      a numeric vector or univariate time series.
lag    the statistic will be based on lag autocorrelation coefficients.
type   test to be performed: partial matching is used.
fitdf  number of degrees of freedom to be subtracted if x is a series of residuals.

Unfortunately, I never took a statistics class where I learned the Ljung-Box test, and information 
about it online is hard to find. What does "lag" mean, and what value would you guys 
recommend I use for the test? Also, what does "fitdf" represent, and what would the value 
for that parameter be in my case? Finally, the value of x is a vector of my 24 residuals, correct?

Thank you all so much. I apologize for the basic nature of the question.

Steven

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

