Sorry, let me correct one sentence: "Here, by "overfitting" I mean that GCV was significantly SMALLER than the mean square error of prediction of the validation data, which was randomly selected and not used for regression."
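A minimal sketch of this comparison, for illustration only: it fits the model structure from the thread, gam(y~x1+s(x2)), to simulated data and sets the minimised GCV score beside the mean square prediction error on a randomly held-out validation set, for gamma=1.0 and 1.4. The simulated data, the seed, the sample size of 60, the one-third hold-out and the helper name compare_scores are assumptions made for the example and are not taken from the real analysis.

```r
library(mgcv)

set.seed(1)                      # hypothetical seed, for reproducibility only
n   <- 60                        # hypothetical sample size
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- 2 * dat$x1 + sin(2 * pi * dat$x2) + rnorm(n, sd = 0.5)  # simulated response

# Randomly hold out one third of the rows; they are not used for fitting
held_out <- sample(n, size = n %/% 3)
train <- dat[-held_out, ]
valid <- dat[held_out, ]

# Fit the model from the thread for a given gamma and return both scores
compare_scores <- function(g) {
  fit  <- gam(y ~ x1 + s(x2), data = train, gamma = g)
  pred <- predict(fit, newdata = valid)
  c(gamma = g,
    GCV = unname(fit$gcv.ubre),                 # minimised GCV score
    validation_MSE = mean((valid$y - pred)^2),  # error on the held-out rows
    edf = sum(fit$edf),                         # total effective degrees of freedom
    sp = unname(fit$sp))                        # estimated smoothing parameter
}

scores <- rbind(compare_scores(1.0), compare_scores(1.4))
print(scores)
```

If the GCV column is much smaller than the validation_MSE column, that is the situation described as "overfitting" above. A single random split is noisy at these sample sizes, so repeating the split several times would give a steadier picture.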
> Thank you for the valuable advice.
> I'm sorry, Dr. N. Wood, that I mistakenly sent this reply to your
> personal e-mail address first.
>
> I will use the "min.sp" argument when the data size is very small. I'd
> like to know whether there are any criteria for selecting "min.sp".
>
> I compared gamma=1.0 and 1.4, and I could see the smoothing effect of
> increasing gamma by comparing the edf and the smoothing parameter. But it
> was not enough to suppress the overfitting when the data size was small.
>
> Here, by "overfitting" I mean that GCV was significantly larger
> than the mean square error of prediction of the validation data, which
> was randomly selected and not used for regression.
>
> Best Wishes,
> Ariyo
>
> 2007/10/3, Simon Wood <[EMAIL PROTECTED]>:
> > On Wednesday 03 October 2007 10:49, Ariyo Kanno wrote:
> > > I appreciate your quick reply.
> > > I am using a model of the following structure:
> > >
> > > fit <- gam(y~x1+s(x2))
> > >
> > > where y, x1, and x2 are quantitative variables.
> > > So the response distribution is assumed to be gaussian (the default).
> > >
> > > Now I understand that the data size was too small.
> > -- Well, the 10 end is definitely too small, but you can get quite
> > reasonable estimates of a single smoothing parameter from 30+ gaussian data.
> > -- You can force smoother models by either setting the smoothing parameter
> > yourself using the `sp' argument to `gam', or by using the `min.sp' argument
> > to set a lower bound on the smoothing parameter.
> > -- I'm surprised that `gamma' had no effect - how high did you try?
> >
> > best,
> > Simon
> >
> > > Thank you.
> > >
> > > Best Wishes,
> > >
> > > Ariyo
> > >
> > > 2007/10/3, Simon Wood <[EMAIL PROTECTED]>:
> > > > What sort of model structure are you using? In particular, what is the
> > > > response distribution? For poisson and binomial, overfitting can be
> > > > a sign of overdispersion, and quasipoisson or quasibinomial may be better.
> > > > Also I would not expect to get useful smoothing parameter estimates from
> > > > 10 data!
> > > >
> > > > best,
> > > > Simon
> > > >
> > > > On Wednesday 03 October 2007 06:55, 神野有生 wrote:
> > > > > Dear listers,
> > > > >
> > > > > I'm using gam (from mgcv) for semi-parametric regression on small and
> > > > > noisy datasets (10 to 200 observations), and I am facing a problem of
> > > > > overfitting.
> > > > >
> > > > > According to the book (Simon N. Wood, Generalized Additive Models: An
> > > > > Introduction with R), overfitting can be avoided by inflating the
> > > > > effective degrees of freedom in the GCV evaluation with an increased
> > > > > "gamma" value (e.g. 1.4). But in my case, it didn't make a significant
> > > > > change in the results.
> > > > >
> > > > > The only way I've found to suppress overfitting is to set the basis
> > > > > dimension "k" to very low values (3 to 5). However, I don't think this
> > > > > is reasonable because knot selection will then become an important issue.
> > > > >
> > > > > Is there any other means of avoiding overfitting when analyzing small
> > > > > datasets?
> > > > >
> > > > > Thank you for your help in advance,
> > > > > Ariyo Kanno
> > > > >
> > > > > --
> > > > > Ariyo Kanno
> > > > > 1st-year doctoral student at
> > > > > Institute of Environmental Studies,
> > > > > The University of Tokyo
> > > > >
> > > > > ______________________________________________
> > > > > R-help@r-project.org mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide
> > > > > http://www.R-project.org/posting-guide.html and provide commented,
> > > > > minimal, self-contained, reproducible code.
> > > > --
> > > > Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> > > > +44 1225 386603 www.maths.bath.ac.uk/~sw283