Sincere thanks for both replies.

0. I agree; I'm waiting for my copy of a regression book to arrive. Meanwhile, I'm trying to read up on Google.
1. My bad, I'm using Gaussian noise.

2. I didn't have x^3 because that coefficient happens to be zero in this fitting.

3. I used lines() because I wanted to superimpose the curve from the regression on top of my first plot of the original data points (x,y). I'm not sure how to use plot(f, x1 = NA) after my first plot(). The examples I managed to find on Google all use plot() followed by lines(). [In MATLAB, I'd just say "hold" in between these calls.] Also, I'm forced to call win.graph() before my first plot() to see the first plot. Is that normal?

4. I really could use some guidance on this part. I need to use rcs() to fit points in a high-dimensional space and I'm trying to understand and use it correctly. I started by testing it on just the x,y dimensions so that I can visually evaluate the fit. I tried y=x, y=x^2 etc., adding Gaussian noise each time (to the y). I plot the original x,y and x,y', where y' is calculated using the coefficients returned by the rcs fit. I find that the regression curve differs from the actual points by as much as 10^5 with 3 knots and roughly -10^5 with 4 knots as I make y=x^2, y=x^3.... If this is NOT a good way to test the fitting, could you please tell me a better way?

Respectfully,
sp

--- On Tue, 12/23/08, Frank E Harrell Jr <f.harr...@vanderbilt.edu> wrote:

> From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
> Subject: Re: [R] newbie problem using Design.rcs
> To: "David Winsemius" <dwinsem...@comcast.net>
> Cc: to_rent_2...@yahoo.com, r-help@r-project.org
> Date: Tuesday, December 23, 2008, 9:41 AM
>
> In addition to David's excellent response, I'll add
> that your problems seem to be statistical and not
> programming ones. I recommend that you spend a significant
> amount of time with a good regression text or course before
> using the software.
> Also, with Design you can find out the
> algebraic form of the fit:
>
> f <- ols(y ~ rcs(x,3), data=mydata)
> Function(f)
>
> Frank
>
>
> David Winsemius wrote:
> >
> > On Dec 22, 2008, at 11:38 PM, sp wrote:
> >
> >> Hi,
> >>
> >> I read data from a file. I'm trying to
> >> understand how to use Design.rcs by using simple test data
> >> first. I use 1000 integer values (1,...,1000) for x (the
> >> predictor) with some noise (x+.02*x) and I set the response
> >> variable y=x. Then, I try rcs and ols as follows:
> >>
> > Not sure what sort of noise that is.
> >
> >> m = ( sqrt(y1) ~ ( rcs(x1,3) ) ); #I tried without sqrt also
> >> f = ols(m, data=data_train.df);
> >> print(f);
> >>
> >> [I plot original x1,y1 vectors and the regression as in
> >> y <- coef2[1] + coef2[2]*x1 + coef2[3]*x1*x1]
> >
> > That does not look as though it would capture the
> > structure of a restricted **cubic** spline. The usual method
> > in Design for plotting a model prediction would be:
> >
> > plot(f, x1 = NA)
> >
> >>
> >>
> >> But this gives me a VERY bad fit:
> >> "
> >
> > Can you give some hint why you consider this to be a
> > "VERY bad fit"? It appears a rather good fit to
> > me, despite the test case apparently not being constructed
> > with any curvature, which is what the rcs modeling strategy
> > should be detecting.
> >
>
> --
> Frank E Harrell Jr   Professor and Chair
> School of Medicine
> Department of Biostatistics
> Vanderbilt University
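
On the overlay question in point 3 and the hand-built polynomial in the quoted code, here is a rough sketch of one way to draw the fitted curve over the raw points with plot() followed by lines(). It is not taken from the thread. It assumes the rms package (the successor to Design) is installed, and the data frame d, the noise level, and the choice of 4 knots are made up for illustration. The curve comes from predict() and Function() rather than from coef[1] + coef[2]*x + coef[3]*x*x, because the second rcs coefficient multiplies a truncated-cubic basis term, not x^2.

```r
# Sketch only: assumes the rms package (successor to Design) is installed.
library(rms)

set.seed(1)                                   # reproducible simulated data
d <- data.frame(x = 1:1000)                   # hypothetical test data
d$y <- d$x^2 + rnorm(nrow(d), sd = 5000)      # y = x^2 plus Gaussian noise

f <- ols(y ~ rcs(x, 4), data = d)             # 4-knot restricted cubic spline fit

plot(d$x, d$y, col = "grey")                  # first plot: the raw points
lines(d$x, predict(f, newdata = d), lwd = 2)  # overlay the fitted values

g <- Function(f)                              # fitted equation as an R function of x
print(g)                                      # shows the truncated-cubic terms

# The two routes to fitted values should agree closely
# (up to rounding of the coefficients stored in g).
print(max(abs(g(d$x) - predict(f, newdata = d))))
```

If the overlaid curve tracks the points here but the hand-computed y' does not, the discrepancy is probably in how the coefficients were turned into a curve rather than in the fit itself.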