Hello,

I have a statistical problem that I am using R for, but I cannot make sense of the results. I am trying to use multiple regression to explore which variables (weather conditions) have the greatest effect on a local atmospheric variable. The data are taken from a database (Z1) that has 20391 data points.

A simplified version of the data I'm looking at is given below. My problem is that the regression coefficients and the standardised regression coefficients disagree in sign. Intuitively I would expect both to have the same sign, but for many of the parameters they do not.
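My expectation of matching signs comes from the purely additive case, where (as far as I understand it) each standardised coefficient is the raw coefficient multiplied by sd(x)/sd(y), which is always positive. A small check of that on simulated data follows; the variables a, b and y are made up for illustration and are not taken from Z1.

```r
# Simulated data only: a, b and y are hypothetical, not columns of Z1.
set.seed(42)
n <- 200
a <- rnorm(n, mean = 50, sd = 8)
b <- rnorm(n, mean = 3, sd = 1.5)
y <- 5 + 2 * a - 4 * b + rnorm(n, sd = 10)
sim <- data.frame(y, a, b)

# Additive model (no interactions or powers), raw and standardised
fit.raw <- lm(y ~ a + b, data = sim)
fit.sc  <- lm(y ~ a + b, data = data.frame(scale(sim)))

B    <- coef(fit.raw)[c("a", "b")]
Beta <- coef(fit.sc)[c("a", "b")]
sds  <- sapply(sim[c("a", "b")], sd)

# In the additive case, Beta should equal B * sd(x) / sd(y)
all.equal(Beta, B * sds / sd(sim$y))
```

Note that this check only covers a model without interaction or squared terms, unlike the model I am actually fitting.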

I am aware that some people strongly discourage the use of standardised regression coefficients, but I would nevertheless like to see the results, not least because they have made me doubt the unstandardised values of B that R has given me.

The code I have used, and some of the data, is as follows (once the database has been imported from SQL, and outliers removed).



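# Keep only the columns of interest and give them readable names
Z1sub <- Z1[, c(2, 5, 7, 11, 12, 13, 15, 16)]
colnames(Z1sub) <- c("temp", "hum", "wind", "press", "rain", "s.rad", "mean1", "sd1")

attach(Z1sub)
names(Z1sub)


# Full three-way interaction of hum, wind and rain, plus quadratic terms
Model1d <- lm(mean1 ~ hum*wind*rain + I(hum^2) + I(wind^2) + I(rain^2))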

summary(Model1d)

Call:
lm(formula = mean1 ~ hum * wind * rain + I(hum^2) + I(wind^2) +
   I(rain^2))

Residuals:
    Min       1Q   Median       3Q      Max
-1230.64   -63.17    18.51    97.85  1275.73

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)   -9.243e+02  5.689e+01 -16.246  < 2e-16 ***
hum            2.835e+01  1.468e+00  19.312  < 2e-16 ***
wind           1.236e+02  4.832e+00  25.587  < 2e-16 ***
rain          -3.144e+03  7.635e+02  -4.118 3.84e-05 ***
I(hum^2)      -1.953e-01  9.393e-03 -20.793  < 2e-16 ***
I(wind^2)      6.914e-01  2.174e-01   3.181  0.00147 **
I(rain^2)      2.730e+02  3.265e+01   8.362  < 2e-16 ***
hum:wind      -1.782e+00  5.448e-02 -32.706  < 2e-16 ***
hum:rain       2.798e+01  8.410e+00   3.327  0.00088 ***
wind:rain      6.018e+02  2.146e+02   2.805  0.00504 **
hum:wind:rain -6.606e+00  2.401e+00  -2.751  0.00594 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 180.5 on 20337 degrees of freedom
Multiple R-squared: 0.2394,     Adjusted R-squared: 0.239
F-statistic: 640.2 on 10 and 20337 DF,  p-value: < 2.2e-16





To calculate the standardised coefficients, I used the following:

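# scale() centres each column on its mean and divides it by its standard deviation
Z1sub.scaled <- data.frame(scale(Z1sub[, c("temp", "hum", "wind", "press", "rain", "s.rad", "mean1", "sd1")]))

# Attaching the scaled data masks the columns of Z1sub attached earlier
attach(Z1sub.scaled)
names(Z1sub.scaled)


# Same model formula as Model1d, now fitted to the scaled variables
Model1d.sc <- lm(mean1 ~ hum*wind*rain + I(hum^2) + I(wind^2) + I(rain^2))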

summary(Model1d.sc)

Call:
lm(formula = mean1 ~ hum * wind * rain + I(hum^2) + I(wind^2) +
   I(rain^2))

Residuals:
    Min       1Q   Median       3Q      Max
-5.94713 -0.30527  0.08946  0.47287  6.16503

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.0806858  0.0096614   8.351  < 2e-16 ***
hum           -0.4581509  0.0073456 -62.371  < 2e-16 ***
wind          -0.1995316  0.0073767 -27.049  < 2e-16 ***
rain          -0.1806894  0.0158037 -11.433  < 2e-16 ***
I(hum^2)      -0.1120435  0.0053885 -20.793  < 2e-16 ***
I(wind^2)      0.0172870  0.0054346   3.181  0.00147 **
I(rain^2)      0.0040575  0.0004853   8.362  < 2e-16 ***
hum:wind      -0.2188729  0.0066659 -32.835  < 2e-16 ***
hum:rain       0.0267420  0.0146201   1.829  0.06740 .
wind:rain      0.0365615  0.0122335   2.989  0.00281 **
hum:wind:rain -0.0438790  0.0159479  -2.751  0.00594 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8723 on 20337 degrees of freedom
Multiple R-squared: 0.2394,     Adjusted R-squared: 0.239
F-statistic: 640.2 on 10 and 20337 DF,  p-value: < 2.2e-16



So, for instance, humidity (hum) has B = 28.35 +/- 1.468 but Beta = -0.4581509 +/- 0.0073456, which is concerning. Is this normal, or is there an error in my code that has caused this contradiction?
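In case it helps anyone reproduce this without the database, here is a minimal sketch on simulated data; x1, x2 and y are hypothetical stand-ins and not columns of Z1. One thing I notice is that scale() centres the variables as well as rescaling them, so with an interaction in the model the x1 coefficient in the raw fit is the slope at x2 = 0, whereas in the scaled fit it is the slope at the mean of x2. I am not certain this is what is happening with my own data.

```r
# Simulated data only: x1, x2 and y are hypothetical, not columns of Z1.
set.seed(1)
n  <- 500
x1 <- rnorm(n, mean = 70, sd = 10)
x2 <- rnorm(n, mean = 5,  sd = 2)
# True slope of y on x1 is 3 - 1 * x2: positive at x2 = 0, negative at x2 = 5
y  <- 10 + 3 * x1 + 40 * x2 - 1 * x1 * x2 + rnorm(n, sd = 20)
d  <- data.frame(y, x1, x2)

# Raw fit: the x1 coefficient is the slope of y on x1 when x2 is 0
fit.raw <- lm(y ~ x1 * x2, data = d)

# scale() centres each column on its mean and divides by its standard deviation
d.sc   <- data.frame(scale(d))
fit.sc <- lm(y ~ x1 * x2, data = d.sc)

# Scaled fit: the x1 coefficient is the slope when x2 is at its mean
coef(fit.raw)
coef(fit.sc)

# The highest-order term and the overall fit do not depend on the centring
all.equal(summary(fit.raw)$r.squared, summary(fit.sc)$r.squared)
all.equal(coef(summary(fit.raw))["x1:x2", "t value"],
          coef(summary(fit.sc))["x1:x2", "t value"])
```

The simulation is built so that the x1 slope changes sign between x2 = 0 and the mean of x2, which mirrors the pattern in my output: R-squared and the t value of hum:wind:rain are the same in both fits, while the lower-order coefficients differ.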

Many thanks,

James.


----------------------
JC Matthews
School of Chemistry
Bristol University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
