These are all field measured values. For a little background here, I have field measurements of SO4, NO3 and NH4. I used these variables in an atmospheric chemistry model to calculate PBW on a line-by-line basis.
To bypass the use of the complex atmospheric chemistry model in the future, I want to develop a regression equation based on the current results I have. Also, I know the atmospheric chemistry model requires SO4, NO3 and NH4 to estimate PBW. So I am using the same as IVs for the regression model.

Aparna

-----Original Message-----
From: Dimitri Liakhovitski [mailto:ld7...@gmail.com]
Sent: Tuesday, April 21, 2009 9:31 AM
To: Vemuri, Aparna
Subject: Re: [R] Fitting linear models

Aparna, why are your IVs so highly intercorrelated? It's not a good sign...

On Tue, Apr 21, 2009 at 12:29 PM, Dimitri Liakhovitski <ld7...@gmail.com> wrote:
> But if the multicollinearity is so strong, then I am wondering why it
> worked in the data frame as opposed to 4 separate vectors? It should
> not make any difference...
> Dimitri
>
> On Tue, Apr 21, 2009 at 12:21 PM, Vemuri, Aparna <avem...@epri.com> wrote:
>> Thanks Dimitri! Following exactly what you did, I wrote all my individual
>> variable vectors to a data frame and used lm(formula,data) and this time it
>> works for me too.
>>
>> Marc, your theory is correct. The NH4 variable shares a strong correlation
>> with one of the IVs as well as with the DV.
>>
>>         SO4      NO3      NH4      PBW
>> SO4  1       -0.0867   0.999    0.999
>> NO3 -0.0867   1       -0.0527  -0.0938
>> NH4  0.999   -0.0527   1        0.999
>> PBW  0.999   -0.0938   0.999    1
>>
>> Aparna
>>
>> -----Original Message-----
>> From: Dimitri Liakhovitski [mailto:ld7...@gmail.com]
>> Sent: Tuesday, April 21, 2009 9:02 AM
>> To: Vemuri, Aparna
>> Cc: r-help@r-project.org; David Winsemius
>> Subject: Re: [R] Fitting linear models
>>
>> I am not sure what the problem is.
>> I found no errors:
>>
>> data<-read.csv(file.choose()) # I had to change your file extension to .csv first
>> dim(data)
>> names(data)
>>
>> lapply(data,function(x){sum(is.na(x))})
>> lm.model.1<-lm(PBW~SO4+NO3+NH4,data)
>> lm.model.2<-lm(PBW~SO4+NH4+NO3,data)
>> print(lm.model.1) # Getting nice results
>> print(lm.model.2) # Getting same results
>>
>> # Another method (gets exactly the same results):
>> library(Design)
>> ols.model.1<-ols(PBW~SO4+NO3+NH4,data)
>> ols.model.2<-ols(PBW~SO4+NH4+NO3,data)
>>
>> Dimitri
>>
>> On Tue, Apr 21, 2009 at 11:50 AM, Vemuri, Aparna <avem...@epri.com> wrote:
>>> Attached are the first hundred rows of my data in comma separated format.
>>> Forcing the regression line through the origin still does not give a
>>> coefficient on the last independent variable. Also, I don't mind if there
>>> is a coefficient on the dependent axis. I just want all of the variables to
>>> have coefficients in the regression equation or at least a consistent
>>> result, irrespective of the order of input information.
>>>
>>> -----Original Message-----
>>> From: David Winsemius [mailto:dwinsem...@comcast.net]
>>> Sent: Tuesday, April 21, 2009 8:38 AM
>>> To: Vemuri, Aparna
>>> Cc: r-help@r-project.org
>>> Subject: Re: [R] Fitting linear models
>>>
>>> On Apr 21, 2009, at 11:12 AM, Vemuri, Aparna wrote:
>>>
>>>> David,
>>>> Thanks for the suggestions. No, I did not label my dependent
>>>> variable "function".
>>>
>>> That was from my error in reading your call to lm. In my defense I am
>>> reasonably sure the proper assignment to arguments is lm(formula= ...)
>>> rather than lm(function= ...).
>>>>
>>>> My dependent variable PBW and all the independent variables are
>>>> continuous variables. It is especially troubling since the order in
>>>> which I input independent variables determines whether or not it
>>>> gets a coefficient.
>>>> Like I already mentioned, I checked the
>>>> correlation matrix and picked the variables with moderate to high
>>>> correlation with the independent variable. So I guess it is not so
>>>> naïve to expect a regression coefficient on all of them.
>>>>
>>>> Dimitri
>>>> model1<-lm(PBW~SO4+NO3+NH4), gives me the same result as before.
>>>
>>> Did you get the expected results with:
>>> model1<-lm(formula=PBW~SO4+NO3+NH4+0)
>>>
>>> You could, of course, provide either the data or the results of str()
>>> applied to each of the variables and then we could all stop guessing.
>>>
>>>>
>>>> Aparna
>>>>
>>>>>
>>>>> I am using the lm() function in R to fit a dependent variable to a
>>>>> set of 3 to 5 independent variables. For this, I used the following
>>>>> commands:
>>>>>
>>>>>> model1<-lm(function=PBW~SO4+NO3+NH4)
>>>>> Coefficients:
>>>>> (Intercept)          SO4          NO3          NH4
>>>>>     0.01323      0.01968      0.01856           NA
>>>>>
>>>>> and
>>>>>
>>>>>> model2<-lm(function=PBW~SO4+NO3+NH4+Na+Cl)
>>>>>
>>>>> Coefficients:
>>>>> (Intercept)          SO4          NO3          NH4           Na           Cl
>>>>>  -0.0006987   -0.0119750   -0.0295042    0.0842989    0.1344751           NA
>>>>>
>>>>> In both cases, the last independent variable has a coefficient of NA
>>>>> in the result. I say last variable because, when I change the order of
>>>>> the variables, the coefficient changes (see below). Can anyone point me
>>>>> to the reason R behaves this way? Is there any way for me to force R to
>>>>> use all the variables? I checked the correlation matrices to make sure
>>>>> there is no orthogonality between the variables.
>>>>
>>>> You really did not name your dependent variable "function" did you?
>>>> Please stop that.
>>>>
>>>> Just a guess, ... since you have not provided enough information to do
>>>> otherwise, ... Are all of those variables 1/0 dummy variables?
>>>> If so and if you want to have an output that satisfies your need for
>>>> labeling the coefficients as you naively anticipate, then put "0+" at
>>>> the beginning of the formula or "-1" at the end, so that the intercept
>>>> will disappear and then all variables will get labeled as you expect.
>>> --
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>
>> --
>> Dimitri Liakhovitski
>> MarketTools, Inc.
>> dimitri.liakhovit...@markettools.com
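
For readers following the thread, here is a minimal sketch of the behaviour being discussed: lm() reporting NA for whichever collinear predictor comes last in the formula. It uses simulated data, not the poster's measurements. The coefficients 0.375 and 0.29, the sample size, and the noise level are all assumptions chosen for illustration. The sketch assumes NH4 is an exact linear combination of SO4 and NO3. The correlations of 0.999 reported in the thread suggest strong collinearity, but the thread does not establish that the dependence was exact.

```r
# Simulated data only; every numeric constant below is an illustrative assumption.
set.seed(1)
n   <- 100
SO4 <- runif(n, 0.5, 10)
NO3 <- runif(n, 0.1, 5)

# Assumption: NH4 is an exact linear combination of SO4 and NO3,
# so the design matrix is rank-deficient.
NH4 <- 0.375 * SO4 + 0.29 * NO3
PBW <- 0.02 * SO4 + 0.018 * NO3 + rnorm(n, sd = 0.001)

dat <- data.frame(SO4, NO3, NH4, PBW)

# Same predictors, different order in the formula.
fit1 <- lm(PBW ~ SO4 + NO3 + NH4, data = dat)
fit2 <- lm(PBW ~ SO4 + NH4 + NO3, data = dat)

print(coef(fit1))  # the coefficient for NH4 is NA
print(coef(fit2))  # the coefficient for NO3 is NA

# alias() reports which term is a linear combination of the others.
print(alias(fit1))

# The correlation matrix shows how strongly the predictors move together.
print(round(cor(dat), 3))

# Both fits span the same column space, so the fitted values agree.
print(all.equal(fitted(fit1), fitted(fit2)))
```

In this exact-dependence case the NA does not indicate a failed fit. The dropped term carries no information beyond the terms already in the model, so which one is dropped depends only on the order in the formula.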