Hi:

On Wed, Dec 15, 2010 at 4:24 AM, beatlebg <rhelpfo...@gmail.com> wrote:

>
> I am trying to perform a separate linear regression for each level of
> 'VARIABLE2'. I figured out that there are different ways to do this. The
> first uses the following code (data are given at the end of this message):
> reg <- lapply(split(TRY, TRY$VARIABLE2), function(X) lm(X2 ~ X3, data = X))
> lapply(reg, summary)
>
> Which produces the following:
>
> $`1`
>
> Call:
> lm(formula = X2 ~ X3, data = X)
>
> Residuals:
>     Min       1Q   Median       3Q      Max
> -1.24233 -0.30028  0.03706  0.46170  1.12408
>
> Coefficients:
>            Estimate Std. Error t value Pr(>|t|)
> (Intercept)   3.0705     0.2323  13.215 5.95e-15 ***
> X3            0.4744     0.2640   1.797   0.0813 .
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.5752 on 34 degrees of freedom
> Multiple R-squared: 0.08672,    Adjusted R-squared: 0.05986
> F-statistic: 3.228 on 1 and 34 DF,  p-value: 0.08126
>                                         ^^^^^^^^^^^
>
> $`2`
>
> Call:
> lm(formula = X2 ~ X3, data = X)
>
> Residuals:
>    Min      1Q  Median      3Q     Max
> -1.1358 -0.6403  0.2505  0.4055  1.2088
>
> Coefficients:
>            Estimate Std. Error t value Pr(>|t|)
> (Intercept)   2.5859     0.2968   8.713 4.53e-10 ***
> X3            0.4957     0.3435   1.443    0.158
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.6765 on 33 degrees of freedom
> Multiple R-squared: 0.05937,    Adjusted R-squared: 0.03086
> F-statistic: 2.083 on 1 and 33 DF,  p-value: 0.1584
>                                         ^^^^^^^^
>
> $`3`
>
> Call:
> lm(formula = X2 ~ X3, data = X)
>
> Residuals:
>     Min       1Q   Median       3Q      Max
> -1.70021 -0.66049 -0.00138  0.81210  1.26162
>
> Coefficients:
>            Estimate Std. Error t value Pr(>|t|)
> (Intercept)   1.9473     0.3522   5.529 2.73e-06 ***
> X3            0.8515     0.3954   2.154   0.0378 *
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.8979 on 37 degrees of freedom
> Multiple R-squared: 0.1114,     Adjusted R-squared: 0.08739
> F-statistic: 4.639 on 1 and 37 DF,  p-value: 0.03784
>                                           ^^^^^^^^
> It should also be possible to use the lmList function, but remarkably, I
> get the same estimates but different Std. Errors. I used the following
> code:
>
>
> modlst <- lmList(X2 ~ X3 | VARIABLE2, TRY)
> summary(modlst)
>
> Which produces
>
> Call:
>  Model: X2 ~ X3 | VARIABLE2
>   Data: TRY
>
> Coefficients:
>   (Intercept)
>  Estimate Std. Error   t value     Pr(>|t|)
> 1 3.070507  0.2969014 10.341841 0.000000e+00
> 2 2.585938  0.3224380  8.019952 1.665779e-12
> 3 1.947292  0.2882936  6.754546 8.454271e-10
>   X3
>   Estimate Std. Error  t value    Pr(>|t|)
> 1 0.4744112  0.3373931 1.406108 0.162672738
> 2 0.4957349  0.3731949 1.328354 0.186968753
> 3 0.8515270  0.3236325 2.631154 0.009803152
>
> Residual standard error: 0.7350239 on 104 degrees of freedom
>

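^^^^^^^^^^^^^^^^^^^^^^^^^^
The 104 residual degrees of freedom are the sum of the three per-group
values: 34 + 33 + 37 = 104.

The residual variance reported by lmList() is pooled across all the groups,
which it treats as parts of the same data frame, whereas each separate lm()
fit uses only the residuals of its own group. The coefficient estimates are
therefore the same, but the standard errors, t values and p-values differ.
Read the lmList() help page carefully to understand what it is meant to do.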
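
As a rough check of the pooling explanation, the sketch below recomputes the
pooled residual standard error from the three per-group values printed in the
summary(lm) output quoted above, then rescales the separate-fit intercept
standard errors by the ratio of pooled to per-group sigma. The variable names
are made up for illustration, and the rescaling assumes that the residual
variance estimate is the only thing that differs between the two approaches.

```r
# Per-group residual standard errors and residual df, copied from the
# summary(lm) output quoted above (rounded as printed there).
sigma_grp <- c(`1` = 0.5752, `2` = 0.6765, `3` = 0.8979)
df_grp    <- c(`1` = 34,     `2` = 33,     `3` = 37)

# Pooled estimate: add up the residual sums of squares (df * sigma^2)
# and divide by the total residual degrees of freedom (104).
pooled <- sqrt(sum(df_grp * sigma_grp^2) / sum(df_grp))

# Separate-fit standard errors of (Intercept), also from the quoted output.
se_int_lm <- c(`1` = 0.2323, `2` = 0.2968, `3` = 0.3522)

# Swap each group's own sigma for the pooled one.
se_int_pooled <- se_int_lm * pooled / sigma_grp

print(round(pooled, 3))
print(round(se_int_pooled, 3))
```

The results should land close to the residual standard error of 0.735 and the
intercept standard errors of about 0.297, 0.322 and 0.288 that summary(modlst)
reports; small discrepancies come from rounding in the printed inputs.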


> I do not understand the difference between these two methods or what
> causes the difference in Std. Errors. Which method is preferable? I
> checked the results with another software program, and those results
> corresponded with the first method...
>

Which is preferable depends on your goals. If you intend each subgroup of
data to be analyzed independently, then your lapply() approach is
appropriate; if the groups are meant to be part of the same data set (e.g.,
if you want to perform comparisons that involve the different subgroups),
then the lmList() approach seems more appropriate, at least with respect to
the purpose for which lmList() is intended. How you perceive the connections
between the grouped data frames matters.
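
One way to see what treating the groups as one data set means is to fit a
single lm() with group-specific intercepts and slopes. The sketch below uses
simulated data (the data frame d and its columns g, x and y are invented for
illustration and are not the TRY data) and base R only. It is meant as an
illustration of pooling under those assumptions, not as a description of how
lmList() is implemented.

```r
set.seed(1)  # simulated stand-in data, NOT the poster's TRY data frame
d <- data.frame(g = factor(rep(1:3, each = 30)), x = runif(90))
d$y <- 2 + 0.5 * d$x + rnorm(90, sd = rep(c(0.5, 0.7, 0.9), each = 30))

# Independent fits: each group gets its own residual variance.
sep <- lapply(split(d, d$g), function(s) lm(y ~ x, data = s))

# One joint fit with a separate intercept and slope per group:
# same coefficient estimates, but one residual variance for all groups.
joint <- lm(y ~ 0 + g + g:x, data = d)

print(sapply(sep, coef))   # per-group (Intercept) and x
print(coef(joint))         # g1, g2, g3, then g1:x, g2:x, g3:x
print(sapply(sep, sigma))  # three separate residual standard errors
print(sigma(joint))        # single pooled residual standard error
```

In the joint fit the coefficients match the separate fits while sigma(joint)
is the pooled value, and it is this shared error estimate that allows tests
comparing groups to be carried out within one model.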

HTH,
Dennis

>
> I really hope someone can explain where I made a mistake. Thank you.
>
>
>
> data.frame: TRY:
>
>   VARIABLE2        X2         X3
> 1           1 2.3025851 1.00000000
> 2           1 3.8286414 1.00000000
> 3           1 4.3820266 1.00000000
> 4           1 3.6375862 1.00000000
> 5           1 3.7841896 1.00000000
> 6           1 3.4965076 1.00000000
> 7           1 2.8332133 1.00000000
> 8           1 3.6375862 1.00000000
> 9           1 4.0775374 1.00000000
> 10          1 3.4339872 1.00000000
> 11          1 3.5263605 1.00000000
> 12          1 3.0445224 1.00000000
> 13          1 2.8332133 1.00000000
> 14          1 2.7725887 1.00000000
> 15          1 3.0910425 1.00000000
> 16          1 4.1108739 1.00000000
> 17          1 3.2958369 1.00000000
> 18          1 2.7080502 1.00000000
> 19          1 2.9957323 1.00000000
> 20          1 3.6375862 1.00000000
> 21          1 3.8918203 1.00000000
> 22          1 3.8712010 1.00000000
> 23          1 3.4011974 1.00000000
> 24          1 3.2958369 1.00000000
> 25          1 4.1271344 1.00000000
> 26          1 4.1588831 1.00000000
> 27          1 4.1271344 0.90476190
> 28          1 3.8712010 0.66666667
> 29          1 4.5108595 0.66666667
> 30          1 3.9120230 0.33333333
> 31          1 3.6375862 0.23809524
> 32          1 3.4339872 0.04761905
> 33          1 2.8903718 0.00000000
> 34          1 2.8903718 0.00000000
> 35          1 2.8332133 0.00000000
> 36          1 1.9459101 0.00000000
> 37          2 2.0794415 1.00000000
> 38          2 3.4657359 1.00000000
> 39          2 3.9889840 1.00000000
> 40          2 3.4339872 1.00000000
> 41          2 3.4011974 1.00000000
> 42          2 3.3322045 1.00000000
> 43          2 2.8903718 1.00000000
> 44          2 3.3672958 1.00000000
> 45          2 3.3322045 1.00000000
> 46          2 3.4339872 1.00000000
> 47          2 3.4011974 1.00000000
> 48          2 3.2958369 1.00000000
> 49          2 2.8332133 1.00000000
> 50          2 3.3322045 1.00000000
> 51          2 3.3672958 1.00000000
> 52          2 3.6635616 1.00000000
> 53          2 2.8903718 1.00000000
> 54          2 1.9459101 1.00000000
> 55          2 2.0794415 1.00000000
> 56          2 2.3025851 1.00000000
> 57          2 2.4849066 1.00000000
> 58          2 2.0794415 1.00000000
> 59          2 2.3978953 1.00000000
> 60          2 2.4849066 1.00000000
> 61          2 4.2904594 1.00000000
> 62          2 3.9889840 0.57142857
> 63          2 3.6109179 0.52380952
> 64          2 3.5553481 0.33333333
> 65          2 3.1780538 0.33333333
> 66          2 3.1780538 0.33333333
> 67          2 2.7725887 0.33333333
> 68          2 3.1354942 0.19047619
> 69          2 1.7917595 0.09523810
> 70          2 1.9459101 0.19047619
> 71          2 1.6094379 0.00000000
> 72          3 2.3978953 1.00000000
> 73          3 2.4849066 1.00000000
> 74          3 1.6094379 1.00000000
> 75          3 1.3862944 1.00000000
> 76          3 1.7917595 1.00000000
> 77          3 1.0986123 1.00000000
> 78          3 2.0794415 1.00000000
> 79          3 1.3862944 1.00000000
> 80          3 1.9459101 1.00000000
> 81          3 3.1780538 1.00000000
> 82          3 2.1972246 1.00000000
> 83          3 2.4849066 1.00000000
> 84          3 2.6390573 1.00000000
> 85          3 3.6109179 1.00000000
> 86          3 2.3978953 1.00000000
> 87          3 2.1972246 1.00000000
> 88          3 1.6094379 1.00000000
> 89          3 3.0910425 1.00000000
> 90          3 3.6888795 1.00000000
> 91          3 3.3672958 1.00000000
> 92          3 3.4011974 1.00000000
> 93          3 2.4849066 1.00000000
> 94          3 3.4657359 1.00000000
> 95          3 4.0604430 1.00000000
> 96          3 3.6635616 1.00000000
> 97          3 3.6109179 1.00000000
> 98          3 3.8286414 1.00000000
> 99          3 3.6375862 1.00000000
> 100         3 3.7135721 1.00000000
> 101         3 3.8918203 0.80952381
> 102         3 3.7376696 0.85714286
> 103         3 3.0445224 0.66666667
> 104         3 3.2958369 0.33333333
> 105         3 2.7080502 0.00000000
> 106         3 1.9459101 0.00000000
> 107         3 2.4849066 0.04761905
> 108         3 1.9459101 0.00000000
> 109         3 0.6931472 0.00000000
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/lmList-and-lapply-lm-different-std-errors-tp3088903p3088903.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
