Sorry, but this gives me the shivers!
Are all your time series linear? For each model you should check the residuals and their squares to see if they are uncorrelated (Box-Ljung chi-squared test). Another useful check is to test for a trend in the coefficient of variation of the residuals. If the series is linear then AIC (or preferably BIC) is an excellent measure of predictive performance.

If these tests fail your auto-model has not worked - in fact if auto-model has an ar value of 10 then you can bet this is not a linear series. For most series the polynomials have max ar=3 and max ma=3. So using ar=10 isn't really a good idea - you're way overfitting, and this is only masking something else such as long memory.

Cross-validation issues

For series with a nonlinear aspect, arbitrarily splitting the series at whatever point you feel like is not such a good idea, as it interferes with the dynamics. If your series is linear then the proposal isn't too bad, because any induced discontinuity will be damped out by the ar process in a finite number of steps.

There are 2 types of predictive power here. The first is based on a fixed point (usually the last) y(t_n) - this is the conditional forecast. The second is the predictive power at any point - this is the conditional forecast marginalised across all points. For a linear series (a la Box-Jenkins) the variance of each is the same. For nonlinear series this is not the case. You need to decide which is required and construct your sub-samples accordingly. The suggested cross-validation is a type of marginal forecast.

To make a reasonable effort at the correct approach you should look up block bootstrap methods for time series. The key to these is that they pick blocks that match the thing you're trying to measure. If, for example, you were computing the autocorrelation of y_t vs. y_(t-1) then the blocks are made of pairs of adjacent values. For seasonal data you must block so that whole seasons are kept together.
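A rough sketch of the residual checks mentioned at the top of this message, using only base R. The simulated AR(1) series, the fitted order and the choice of lag = 10 are my assumptions for illustration, not anything taken from your data; substitute your own series and fitted model.

```r
# Illustration only: a simulated AR(1) series stands in for real data.
set.seed(1)
y <- arima.sim(model = list(ar = 0.6), n = 300)

# Fit a low-order model (order chosen to match the simulation).
fit <- arima(y, order = c(1, 0, 0))
res <- residuals(fit)

# Ljung-Box test on the residuals; fitdf = 1 removes one degree of
# freedom for the single estimated AR coefficient.
lb_res <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 1)

# Same test on the squared residuals, to look for leftover
# nonlinear (e.g. ARCH-type) structure.
lb_sq <- Box.test(res^2, lag = 10, type = "Ljung-Box")

print(lb_res)
print(lb_sq)
```

Small p-values from either test would suggest the residuals are not white noise, in which case I would not trust AIC/BIC comparisons for that series.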
Finally, your Australian colleague Rob Hyndman is a good source for bootstrapping time series - his website may have details of work he did on electricity demand which you might find useful.

Gerard

From: Gad Abraham <gabra...@csse.unimelb.edu.au>
Sent by: r-help-boun...@r-project.org
To: Evan DeCorte <evan...@gwu.edu>
cc: r-help@r-project.org
Date: 14/12/2008 01:50
Subject: Re: [R] Testing predictive power of ARIMA model

Evan DeCorte wrote:
> Thanks for the great feedback. Conceptually I understand how you would go about testing out of sample performance. It seems like accuracy() would be the best way to test out-of-sample forecast performance and will help to automate the construction of statistics I would have calculated on my own.
>
> However, the real question now is how do you loop through a time series and automatically split a time series into training and testing sets. I know how I would do it for individual sets but to do so manually over a large number of time series seems excessively burdensome.

You don't have to do it manually. For example, if you want to do 10-fold cross-validation, and you have a time series of length n, then split it into 10 blocks of length n/10, e.g. using the index

i <- rep(1:10, each=n/10)

(assuming n is divisible by 10), loop 10 times using 9 blocks as training and 1 block as test (a different test block each time) and measure the MSE for each repetition. Repeat this for all your time series.

--
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabra...@csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
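For what it's worth, the 10-fold block scheme described in the quoted message could be sketched along these lines in base R. To keep it short I have swapped the ARIMA fit for an AR(1) fitted by least squares on (y_t, y_(t-1)) pairs, so that training on non-contiguous blocks is at least well defined; that swap, the helper name block_cv_mse and the simulated series are all my own assumptions for illustration. Note that the first pair of each test block still uses a lagged value from the neighbouring training block, so the folds are not fully independent.

```r
# Hypothetical helper: k-fold MSE over contiguous blocks for an AR(1)
# fitted by least squares on (y_t, y_{t-1}) pairs.
block_cv_mse <- function(y, k = 10) {
  y <- as.numeric(y)
  n <- length(y)
  # One row per adjacent pair, so each block keeps its pairs intact.
  d <- data.frame(y = y[-1], lag1 = y[-n])
  m <- nrow(d)
  # Contiguous block labels 1..k; the last block may be shorter
  # when m is not divisible by k.
  fold <- rep(seq_len(k), each = ceiling(m / k))[seq_len(m)]
  mse <- numeric(k)
  for (j in seq_len(k)) {
    fit <- lm(y ~ lag1, data = d[fold != j, ])      # train on k-1 blocks
    pred <- predict(fit, newdata = d[fold == j, ])  # predict held-out block
    mse[j] <- mean((d$y[fold == j] - pred)^2)
  }
  mse
}

# Illustration only: two simulated series stand in for the real ones.
set.seed(42)
series_list <- list(
  a = arima.sim(model = list(ar = 0.6), n = 201),
  b = arima.sim(model = list(ar = -0.3), n = 201)
)

# Mean cross-validated MSE for every series, with no manual splitting.
cv_results <- sapply(series_list, function(s) mean(block_cv_mse(s)))
print(cv_results)
```

Looping with sapply() over a list is what removes the manual work; for nonlinear series the caveats about splitting given earlier in this thread still apply.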