Rolf Turner <r.tur...@auckland.ac.nz> writes: > Oh for Pete's sake!
No, just for me. > Computers use floating point arithmetic. Your residual standard error in > case 2 (i.e. 1.44e-14) *is* 0, but floating point arithmetic can't quite see > that this is so. Yes, and that's fine. When I put together a lattice plot to display several hundred slope coefficients, I don't need to distinguish between 1.44e-14 and 0. Both are visually 'zero', and accurately reflect the lack of relationship. My problem came when viewing a lattice plot of several hundred adj. R sq values, and viewing a handful of very high values in cases where there is no actual relationship. In some cases R did what I expected, and gave me a NaN which didn't plot. In other cases, it gave me a very large number, which did plot, and was quite confusing in context. Anyways, it will be easy to add a check as you suggest. Thanks for your time, Tyler > Put in a check for the RSE being 0, and ``over- ride'' the adjusted R > squared to be NA (or NaN, or whatever floats your boat) in such > instances. The all.equal() function might be useful to you: > >> x <- 1.44e-14 >> all.equal(x,0) > [1] TRUE > > (Caution: Trap for Young Players: If x and y are ``really'' different, > then all.equal(x,y) doesn't return FALSE as you might expect, but rather > a description of the difference between x and y --- which may be complicated > if x and y are complicated objects. The function isTRUE() is useful here.) > > cheers, > > Rolf Turner > > > On 21/01/2009, at 9:21 AM, tyler wrote: > >> Hi, >> >> I'm analyzing a large number of simulations using lm(), a sample of the >> resulting data is pasted below. In some simulations, the response >> variable doesn't vary, ie: >> >>> tmp[[2]]$richness >> [1] 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 >> 40 >> >> When I analyze this using R version 2.8.0 (2008-10-20) on a linux >> cluster, I get an appropriate result: >> >> >> ## begin R ## >> >> summary(lm(richness ~ het, data = tmp[[2]])) >> >> Call: >> lm(formula = richness ~ het, data = tmp[[2]]) >> >> Residuals: >> Min 1Q Median 3Q Max >> 0 0 0 0 0 >> >> Coefficients: >> Estimate Std. Error t value Pr(>|t|) >> (Intercept) 40 0 Inf <2e-16 *** >> het 0 0 NA NA >> --- >> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 >> >> Residual standard error: 0 on 23 degrees of freedom >> Multiple R-squared: , Adjusted R-squared: >> F-statistic: on 1 and 23 DF, p-value: NA >> >> ## end R ## >> >> This is good, as when I extract the Adjusted R-squared and slope I get >> NaN and 0, which are easily identified in my aggregate analysis, so I >> can deal with them appropriately. >> >> However, this isn't always the case: >> >> ## begin R ## >> >> tmp[[1]]$richness >> [1] 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 >> 40 >> [26] 40 40 40 40 40 40 40 40 40 40 40 >> >> summary(lm(richness ~ het, data = tmp[[1]])) >> >> Call: >> lm(formula = richness ~ het, data = tmp[[1]]) >> >> Residuals: >> Min 1Q Median 3Q Max >> -8.265e-14 1.689e-15 2.384e-15 2.946e-15 4.022e-15 >> >> Coefficients: >> Estimate Std. Error t value Pr(>|t|) >> (Intercept) 4.000e+01 8.418e-15 4.752e+15 <2e-16 *** >> het 1.495e-14 4.723e-14 3.160e-01 0.754 >> --- >> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 >> >> Residual standard error: 1.44e-14 on 34 degrees of freedom >> Multiple R-squared: 0.5112, Adjusted R-squared: 0.4968 >> F-statistic: 35.56 on 1 and 34 DF, p-value: 9.609e-07 >> >> ## end R ## >> >> This is a problem, as when I plot the adj. R sq as part of an aggregate >> analysis of a large number of simulations, it appears to be a very >> strong regression. I wouldn't have caught this except it was >> exceptionally high for the simulation parameters. It also differs by >> more than rounding error from the results with R 2.8.1 running on my >> laptop (Debian GNU/Linux), i.e., adj. R sq 0.5042 vs 0.4968. >> Furthermore, on my laptop, none of the analyses produce a NaN adj. R sq, >> even for data that do produce that result on the cluster. >> >> Both my laptop and the linux cluster have na.action set to na.omit. Is >> there something else I can do to ensure that lm() returns slope == 0 >> and adj.R.sq == NaN when the response variable is fixed? >> >> Thanks for any suggestions, >> >> Tyler >> >> Data follows: >> >> `tmp` <- >> list(structure(list(richness = c(40, 40, >> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, >> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, >> 40, 40), range = c(0.655084651733024, 0.579667533660137, 0.433092220907644, >> 0.62937198839679, 0.787891987978164, 0.623511540624239, 0.542744487102066, >> 0.905937570175433, 0.806802881350753, 0.680413208666325, 0.873426339019084, >> 0.699982832956593, 0.697716600618959, 0.952729864926405, 0.782938474636578, >> 1.03899695305995, 0.715075858219333, 0.579749205792549, 1.20648999819246, >> 0.648677938600964, 0.651883559714785, 0.997318331273967, 0.926368116052012, >> 0.91001274146868, 1.20737951037620, 1.12006560586723, 1.09806272133903, >> 0.9750792390176, 0.356496202035743, 0.612018080768747, 0.701905693862144, >> 0.735857916053381, 0.991787489781244, 1.07247435214078, 0.60061903319766, >> 0.699733090379818), het = c(0.154538307084452, 0.143186508136608, >> 0.0690948358402777, 0.132337152911839, 0.169037344105692, 0.117783183361602, >> 0.117524251767612, 0.221161206774407, 0.204574928003633, 0.170571000779693, >> 0.204489357007294, 0.131749663515638, 0.154127894997213, 0.232672587431942, >> 0.198610891796736, 0.260497696582693, 0.129028191256682, 0.128717975847452, >> 0.254300896783617, 0.113546727236817, 0.142220347446853, 0.24828642688332, >> 0.194340945175726, 0.190782985783610, 0.214676796387244, 0.252940213066992, >> 0.22362832797347, 0.182423482989676, 0.0602332226418674, 0.145400861749859, >> 0.141297315445974, 0.139798699247632, 0.222815139716421, 0.211971297234962, >> 0.120813579628747, 0.150590744533818), n.rich = c(40, 40, 40, >> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, >> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, >> 40)), .Names = c("richness", "range", "het", "n.rich")), >> structure(list(richness = c(40, 40, >> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, >> 40, 40, 40, 40, 40, 40, 40), range = c(0.753203162648624, 0.599708526308711, >> 0.714477274087683, 0.892359682406808, 0.868440625159371, 0.753239521511417, >> 1.20164969658467, 1.20462111558583, 1.13142122690491, 0.95241921975703, >> 1.13214481653550, 0.827528954009827, 1.14827745443481, 0.936048043180592, >> 0.874649332193952, 1.38844778296649, 0.985016220913809, 1.18166853164661, >> 0.784679773255876, 0.94894149080785, 0.770312904574722, 1.10203660758219, >> 1.15624067277321, 0.692776967548628, 0.79343712876973), >> het = c(0.170481207967181, >> 0.108265674755723, 0.123316519598517, 0.220631611141464, 0.160460967122565, >> 0.145032358811883, 0.293678286125082, 0.284769842125969, 0.258637372765782, >> 0.18303781265474, 0.265304220319150, 0.194784967445680, 0.248055723803990, >> 0.204658616507612, 0.167203828355069, 0.287030735881294, 0.247639113771915, >> 0.269348295820692, 0.111409735752589, 0.209076579513581, 0.176890183224181, >> 0.249378876987384, 0.260323833307383, 0.177061093736427, 0.172263958005774 >> ), n.rich = c(40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, >> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40)), .Names = c >> ("richness", >> "range", "het", "n.rich"))) >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ###################################################################### > Attention: > This e-mail message is privileged and confidential. If you are not the > intended recipient please delete the message and notify the sender. > Any views or opinions presented are solely those of the author. > > This e-mail has been scanned and cleared by MailMarshal > www.marshalsoftware.com > ###################################################################### > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- I never loan my books, for people never return them. The only books remaining in my library are those I’ve borrowed from others. --unknown ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.