Good evening!

I have a question regarding the forecast package and time series analysis.
My code:

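library(forecast)

x <- c(253, 252, 275, 275, 272, 254, 272, 252, 249, 300, 244, 258, 255, 285, 301,
       278, 279, 304, 275, 276, 313, 292, 302, 322, 281, 298, 305, 295, 286, 327, 286,
       270, 289, 293, 287, 267, 267, 288, 304, 273, 264, 254, 263, 265, 278)

# My model: ARIMA(1,1,2) with one seasonal difference (period 12)
l <- arima(x, order = c(1, 1, 2),
           seasonal = list(order = c(0, 1, 0), period = 12))
# Model selected automatically by auto.arima
k <- auto.arima(x)

# Standard deviation of the residuals of each model
sd(l$residuals)
sd(k$residuals)

# One-step-ahead prediction (with standard error) from each model
predict(l, n.ahead = 1)
predict(k, n.ahead = 1)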

1. I understand that auto.arima finds the best time series model by choosing 
the one with the smallest AIC, AICc or BIC among the competing models, but my 
model has a smaller AIC than the one auto.arima selects. At the same time, the 
sd of the residuals of my model is bigger, as is the se of the predicted 
value. Why? Am I missing something? 
Which of these two models would you choose, and why? 
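
To make the comparison in question 1 concrete, here is a minimal sketch (base R 
only) that collects the AIC, the estimated innovation variance, the residual sd 
and the number of observations actually used for a fitted model. The helper 
name summarise_fit is made up for illustration and is not part of any package; 
it is only applied to my manual model here, on the assumption that the object 
returned by auto.arima exposes the same components.

```r
# Series from the post
x <- c(253, 252, 275, 275, 272, 254, 272, 252, 249, 300, 244, 258, 255, 285, 301,
       278, 279, 304, 275, 276, 313, 292, 302, 322, 281, 298, 305, 295, 286, 327, 286,
       270, 289, 293, 287, 267, 267, 288, 304, 273, 264, 254, 263, 265, 278)

# Hypothetical helper (illustration only): one row of fit statistics
summarise_fit <- function(fit) {
  res <- as.numeric(fit$residuals)
  data.frame(
    aic         = fit$aic,
    sigma2      = fit$sigma2,    # estimated innovation variance
    resid_sd    = sd(res),       # sd over all returned residuals
    n_used      = fit$nobs,      # observations left after differencing
    n_residuals = length(res)    # residuals returned by the fit
  )
}

# The manual model from the post: ARIMA(1,1,2) with one seasonal difference
fit_manual <- arima(x, order = c(1, 1, 2),
                    seasonal = list(order = c(0, 1, 0), period = 12))
out <- summarise_fit(fit_manual)
print(out)
```

Here n_used is smaller than length(x) because the regular and seasonal 
differences each remove observations before the likelihood is computed. Calling 
summarise_fit(k) on the auto.arima fit should give the matching row for the 
other model.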
  

2. This question is more theoretical.

m <- sample(10:20, 10, replace = TRUE)  # male series, 10 months
f <- sample(10:20, 10, replace = TRUE)  # female series, 10 months
t <- m + f                              # whole-sample series
s <- rbind(m, f, t)
s

Say I have a panel sample and take m to be the monthly average juice 
consumption of the men in the sample, f the monthly average for the women, and 
t the average for the whole sample. For the whole-sample mean I have a 
confidence interval of, say, +/-2 each month (with a sample of 2000 
individuals). If I compute a confidence interval for the male population only 
(say 1000 of the 2000), it will certainly be wider, because I now have only 
1000 men from which to estimate mean consumption for the whole male 
population. So the confidence interval for mean male consumption is wider than 
for the whole sample (because N drops from 2000 to 1000). Now, if I tried to 
predict next month's consumption for both of my time series (male and whole 
sample), the prediction would not "care" that the means were estimated from 
2000 people in one case and 1000 in the other. Am I right?

Imagine that each of the 10 months I sampled above has such a confidence 
interval of +/-3. How would a forecast incorporate the fact that my mean 
consumption is not measured by a census but estimated from a sample, so that 
each number is only an estimate of the true consumption, known to within a 
confidence interval?

Is there a good reference text on incorporating the confidence intervals of 
past values when forecasting future values? 
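
The sample-size reasoning above can be sketched numerically. This assumes 
simple random sampling, a normal approximation, and the same person-level 
standard deviation in the male subsample as in the whole sample. The sd is 
back-solved from the +/-2 figure and is purely illustrative, and ci_halfwidth 
is a made-up helper name.

```r
# Hypothetical helper: half-width of a normal-approximation CI for a mean
ci_halfwidth <- function(sd, n, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  z * sd / sqrt(n)
}

n_all  <- 2000   # whole sample (from the text)
n_male <- 1000   # male subsample (from the text)

# Person-level sd implied by a +/-2 interval at n = 2000
# (illustrative assumption: the same sd holds in the male subsample)
sd_implied <- 2 * sqrt(n_all) / qnorm(0.975)

hw_all  <- ci_halfwidth(sd_implied, n_all)
hw_male <- ci_halfwidth(sd_implied, n_male)
print(c(all = hw_all, male = hw_male, ratio = hw_male / hw_all))
```

Under these assumptions the male interval is wider by a factor of sqrt(2), 
that is roughly +/-2.83 instead of +/-2.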

Thank you and have a great day!




       