My question concerns a Generalized Additive Model (GAM). Experts caution against overfitting the data, which can give inaccurate results. I am not a statistician (my background is in Computer Science). Perhaps some kind soul would take a look and check the model for overfitting.
The study estimated the ebb and flow of traffic through a voting place. Just one voting place was studied; the election was the U.S. mid-term election about a year ago.

Procedure: The voting day was divided into five-minute bins, and the number of voters arriving in each bin was recorded. The voting day was 13 hours long, giving 156 bins. See http://tinyurl.com/36vzop for the scatterplot. There is rather high random variation, due in part to the fact that the bin width was intentionally kept narrow in order to capture finer timing information.

http://tinyurl.com/3xjsyo displays the fitted curve. A GAM was used, with the loess smoother (locally weighted regression) and the default span. http://tinyurl.com/34av6l gives the scatterplot and the fitted curve together. The two seem to match reasonably well.

However, when I tried to generate the standard errors, things went awry. (Please see http://tinyurl.com/38ej2t ) There are three curves, seemingly the fitted curve and the curves for plus and minus two standard errors. The shapes seem okay, but the y values are badly off.

Question: Have I overfitted the data? Feedback?

Tom

Thomas L. Jones, PhD, Computer Science

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
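
For anyone who wants to reproduce the binning step described in the procedure, here is a minimal sketch in Python (it is not the R code used for the study). The arrival times are synthetic, and the names bin_arrivals and relative_noise are made up for illustration. It also assumes that arrivals within a bin are roughly Poisson, in which case the standard deviation of a count is about the square root of its mean; that assumption is mine and has not been checked against the real data.

```python
import math
import random

DAY_MINUTES = 13 * 60                  # 13-hour voting day (from the post)
BIN_MINUTES = 5                        # five-minute bins (from the post)
N_BINS = DAY_MINUTES // BIN_MINUTES    # 780 / 5 = 156 bins


def bin_arrivals(arrival_minutes, bin_minutes=BIN_MINUTES, day_minutes=DAY_MINUTES):
    """Count arrivals per bin.

    arrival_minutes: times in minutes since the polls opened.
    Times outside [0, day_minutes) are ignored.
    """
    counts = [0] * (day_minutes // bin_minutes)
    for t in arrival_minutes:
        if 0 <= t < day_minutes:
            counts[int(t // bin_minutes)] += 1
    return counts


def relative_noise(mean_count):
    """Poisson rule of thumb (an assumption): sd / mean = 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(mean_count)


# Hypothetical data: 1000 arrivals at random whole seconds, NOT the study's data.
rng = random.Random(0)
arrivals = [rng.randrange(DAY_MINUTES * 60) / 60.0 for _ in range(1000)]
counts = bin_arrivals(arrivals)

mean_count = sum(counts) / len(counts)
print("bins:", len(counts))
print("mean count per bin:", round(mean_count, 2))
print("approx. relative noise per bin:", round(relative_noise(mean_count), 2))
```

Under the Poisson assumption, halving the bin width halves the mean count per bin and raises the relative noise by a factor of sqrt(2), which is the trade-off between timing detail and scatter mentioned in the procedure.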