Thank you very much. It worked. I think I need to digest this further later. Thanks again for the help.
On Thu, Jul 16, 2015 at 4:51 PM, William Dunlap <wdun...@tibco.com> wrote: > This might do what you want: > > OPoly <- function(x, degree=1, weight=1, coefs=NULL, rangeX=NULL){ > weight <- round(weight,0)# weight need to be integer > if(length(weight)!=length(x)) { > weight <- rep(1,length(x)) > } > if (is.null(rangeX)) { > rangeX <- range(x) > } > p <- poly(4*(rep(x,weight)-mean(rangeX))/diff(rangeX), degree=degree, > coefs=coefs) > # why t(t(...))? That strips the attributes. > Z <- t( t(p[cumsum(weight),]) * sqrt(attr(p,"coefs")$norm2[-seq(2)]) )[, > degree, drop=FALSE] > class(Z) <- "OPoly" > attr(Z, "coefs") <- attr(p, "coefs") > attr(Z, "rangeX") <- rangeX > Z > } > > makepredictcall.OPoly<-function(var,call) > { > if (is.call(call)) { > if (identical(call[[1]], quote(OPoly))) { > if (!is.null(tmp <- attr(var, "coefs"))) { > call$coefs <- tmp > } > if (!is.null(tmp <- attr(var, "rangeX"))) { > call$rangeX <- tmp > } > call$weight <- 1 # weight not relevant in predictions > } > } > call > } > > d <- data.frame(Y=1:8, X=log(1:8), Weight=1:8) > fit <- lm(data=d, Y ~ OPoly(X, degree=2, weight=Weight)) > predict(fit)[c(3,8)] > predict(fit, newdata=data.frame(X=d$X[c(3,8)])) # same result > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Thu, Jul 16, 2015 at 4:39 PM, William Dunlap <wdun...@tibco.com> wrote: > >> OPoly<-function(x,degree=1,weight=1){ >> weight=round(weight,0)# weight need to be integer >> if(length(weight)!=length(x))weight=rep(1,length(x)) >> p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree) >> Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[- >> seq(2)]))[,degree]) >> class(Z)<-"OPoly";Z >> } >> >> You need to make OPoly to have optional argument(s) that give >> the original-regressor-dependent information to OPoly and then >> have it return, as attributes, the value of those arguments. >> makepredictcall >> will take the attributes and attach them to the call in predvars so >> predict uses values derived from the original regressors, not value >> derived >> from the data to be predicted from. >> >> Take a look at a pair like makepredictcall.scale() and scale() for an >> example: >> scale has optional arguments 'center' and 'scale' that it returns as >> attributes >> and makepredictcall.scale adds those to the call to scale that it is >> given. >> Thus when you predict, the scale and center arguments come from the >> original data, not from the data you are predicting from. >> >> >> >> >> >> >> Bill Dunlap >> TIBCO Software >> wdunlap tibco.com >> >> On Thu, Jul 16, 2015 at 3:43 PM, Kunshan Yin <yinkuns...@gmail.com> >> wrote: >> >>> Thanks Bill for your quick reply. >>> >>> I tried your solution and it did work for the simple user defined >>> function xploly. But when I try with other function, it gave me error again: >>> >>> OPoly<-function(x,degree=1,weight=1){ >>> weight=round(weight,0)# weight need to be integer >>> if(length(weight)!=length(x))weight=rep(1,length(x)) >>> p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree) >>> >>> Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree]) >>> class(Z)<-"OPoly";Z >>> } >>> >>> ##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps >>> the x to range[-2,2] then do QR, then return the results with sqrt(norm2). >>> Comparing with poly, this transformation will make the model coefficients >>> within a similar range as other variables, the R poly routine will usually >>> give you a very large coefficients. I did not find such routine in R, so I >>> have to define this as user defined function. >>> ####### >>> >>> I also have following function as you suggested: >>> >>> makepredictcall.OPoly<-function(var,call) >>> { >>> if (is.call(call)) { >>> if (identical(call[[1]], quote(OPoly))) { >>> if (!is.null(tmp <- attr(var, "coefs"))) { >>> call$coefs <- tmp >>> } >>> } >>> } >>> call >>> } >>> >>> >>> But I still got error for following: >>> >>> > g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family = Gamma) >>> >>> > predict(g3,dc)Error in poly(4 * (rep(x, weight) - >>> > mean(range(x)))/diff(range(x)), degree) : >>> missing values are not allowed in 'poly' >>> >>> I thought it might be due to the /diff(range(x) in the function. But >>> even I remove that part, it will still give me error. Any idea? >>> >>> Many thanks in advance. >>> >>> Alex >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdun...@tibco.com> >>> wrote: >>> >>>> Read about the 'makepredictcall' generic function. There is a method, >>>> makepredictcall.poly(), for poly() that attaches the polynomial >>>> coefficients >>>> used during the fitting procedure to the call to poly() that predict() >>>> makes. >>>> You ought to supply a similar method for your xpoly(), and xpoly() >>>> needs to return an object of a a new class that will cause that method to >>>> be used. >>>> >>>> E.g., >>>> >>>> xpoly <- function(x,degree=1,...){ ret <- poly(x,degree=degree,...); >>>> class(ret) <- "xpoly" ; ret } >>>> makepredictcall.xpoly <- function (var, call) >>>> { >>>> if (is.call(call)) { >>>> if (identical(call[[1]], quote(xpoly))) { >>>> if (!is.null(tmp <- attr(var, "coefs"))) { >>>> call$coefs <- tmp >>>> } >>>> } >>>> } >>>> call >>>> } >>>> >>>> g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) >>>> predict(g2,dc) >>>> # 1 2 3 4 >>>> 5 >>>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 >>>> #-0.01398928608 >>>> # 6 7 8 9 >>>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608 >>>> >>>> You can see the effects of makepredictcall() in the 'terms' component >>>> of glm's output. The 'variables' attribute of it gives the original >>>> function >>>> calls and the 'predvars' attribute gives the calls to be used for >>>> prediction: >>>> > attr(g2$terms, "variables") >>>> list(lot1, log(u), xpoly(u, 1)) >>>> > attr(g2$terms, "predvars") >>>> list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 = c(1, >>>> 9, 8850)))) >>>> >>>> >>>> >>>> Bill Dunlap >>>> TIBCO Software >>>> wdunlap tibco.com >>>> >>>> On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkuns...@gmail.com> >>>> wrote: >>>> >>>>> Hello, I have a question about the formula and the user defined >>>>> function: >>>>> >>>>> I can do following: >>>>> ###Case 1: >>>>> > clotting <- data.frame( >>>>> + u = c(5,10,15,20,30,40,60,80,100), >>>>> + lot1 = c(118,58,42,35,27,25,21,19,18), >>>>> + lot2 = c(69,35,26,21,18,16,13,12,12)) >>>>> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma) >>>>> > dc=clotting >>>>> > dc$u=1 >>>>> > predict(g1,dc) >>>>> 1 2 3 4 5 >>>>> 6 7 8 9 >>>>> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 >>>>> -0.01398929 -0.01398929 -0.01398929 >>>>> >>>>> However, if I just simply wrap the poly as a user defined function ( in >>>>> reality I would have my own more complex function) then I will get >>>>> error: >>>>> ###Case 2: >>>>> > xpoly<-function(x,degree=1){poly(x,degree)} >>>>> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma) >>>>> > predict(g2,dc) >>>>> Error in poly(x, degree) : >>>>> 'degree' must be less than number of unique points >>>>> >>>>> It seems that the predict always treat the user defined function in the >>>>> formula with I(). My question is how can I get the results for Case2 >>>>> same >>>>> as case1? >>>>> >>>>> Anyone can have any idea about this? >>>>> >>>>> Thank you very much. >>>>> >>>>> Alex >>>>> >>>>> [[alternative HTML version deleted]] >>>>> >>>>> ______________________________________________ >>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>> PLEASE do read the posting guide >>>>> http://www.R-project.org/posting-guide.html >>>>> and provide commented, minimal, self-contained, reproducible code. >>>>> >>>> >>>> >>> >> > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.