Pedro Mardones wrote:
Agreed. I tried disabling the check that causes R to protest, and then it gives similar DF but not quite the same statistics, quite possibly due to numerical instabilities in one or both systems. (You can easily try yourself, just do anova.mlm <- stats::anova.mlm and edit the qr() call inside.)Thanks for the reply. The SAS output is attached but seems to me that doesn't correspond to the wihtin-row contrasts as you suggested. By the way, yes the data are highly correlated, in fact each row correspond to the first part of a signal vector. Thanks anyway.... PM
> anova(lm(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test), test = "Wilks") Analysis of Variance TableDf Wilks approx F num Df den Df Pr(>F) (Intercept) 1 0.002537 1887.24 5 24 <2e-16 *** GROUP 2 0.62 1.29 10 48 0.2616 Residuals 28 ---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The GLM Procedure Multivariate Analysis of Variance E = Error SSCP Matrix y1 y2 y3 y4 y5 y1 0.0353518799 0.035256904 0.0351327804 0.0349749601 0.0347868018 y2 0.035256904 0.0351627227 0.0350395053 0.0348827098 0.0346956744 y3 0.0351327804 0.0350395053 0.0349173343 0.0347617352 0.0345760232 y4 0.0349749601 0.0348827098 0.0347617352 0.0346075203 0.0344233531 y5 0.0347868018 0.0346956744 0.0345760232 0.0344233531 0.0342409225 Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r| DF = 28 y1 y2 y3 y4 y5 y1 1.000000 0.999992 0.999967 0.999921 0.999852 <.0001 <.0001 <.0001 <.0001 y2 0.999992 1.000000 0.999991 0.999963 0.999911 <.0001 <.0001 <.0001 <.0001 y3 0.999967 0.999991 1.000000 0.999990 0.999958 <.0001 <.0001 <.0001 <.0001 y4 0.999921 0.999963 0.999990 1.000000 0.999989 <.0001 <.0001 <.0001 <.0001 y5 0.999852 0.999911 0.999958 0.999989 1.000000 <.0001 <.0001 <.0001 <.0001 The SAS System 10:33 Wednesday, August 13, 2008 8 The GLM Procedure Multivariate Analysis of Variance H = Type III SSCP Matrix for group y1 y2 y3 y4 y5 y1 0.0023822408 0.002365848 0.0023471328 0.0023261249 0.0023030993 y2 0.002365848 0.0023495679 0.0023309816 0.0023101183 0.0022872511 y3 0.0023471328 0.0023309816 0.0023125426 0.0022918453 0.0022691608 y4 0.0023261249 0.0023101183 0.0022918453 0.0022713359 0.0022488593 y5 0.0023030993 0.0022872511 0.0022691608 0.0022488593 0.0022266141 Characteristic Roots and Vectors of: E Inverse * H, where H = Type III SSCP Matrix for group E = Error SSCP Matrix Characteristic Characteristic Vector V'EV=1 Root Percent y1 y2 y3 y4 y5 0.41840103 71.72 -7542.628 17131.814 5347.394 -31627.317 16700.100 0.16496011 28.28 -4180.854 -4413.446 32096.035 -35545.204 12040.697 0.00000001 0.00 -41004.875 107291.004 -95905.664 32641.189 -3028.470 0.00000000 0.00 -416.226 -111.206 410.721 295.193 -171.953 0.00000000 0.00 -14678.651 5787.997 54718.250 -69055.249 23218.580 MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall group Effect H = Type III SSCP Matrix for group E = Error SSCP Matrix S=2 M=1 N=11 Statistic Value F Value Num DF Den DF Pr > F Wilks' Lambda 0.60518744 1.37 10 48 0.2227 Pillai's Trace 0.43658228 1.40 10 50 0.2095 Hotelling-Lawley Trace 0.58336114 1.37 10 33.362 0.2385 Roy's Greatest Root 0.41840103 2.09 5 25 0.1000 On Wed, Aug 13, 2008 at 4:34 AM, Peter Dalgaard <[EMAIL PROTECTED]> wrote:Pedro Mardones wrote:Dear R-users; Previously I posted a question about the problem of rank deficiency in summary.manova. As somebody suggested, I'm attaching a small part of the data set. #*************************************************** "test" <- structure(.Data = list(structure(.Data = c(rep(1,3),rep(2,18),rep(3,10)), levels = c("1", "2", "3"), class = "factor") ,c(0.181829,0.090159,0.115824,0.112804,0.134650,0.249136,0.163144,0.122012,0.157554,0.126283, 0.105344,0.125125,0.126232,0.084317,0.092836,0.108546,0.159165,0.121620,0.142326,0.122770, 0.117480,0.153762,0.156551,0.185058,0.161651,0.182331,0.139531,0.188101,0.103196,0.116877,0.113733) ,c(0.181445,0.090254,0.115840,0.112863,0.134610,0.249003,0.163116,0.122135,0.157206,0.126129, 0.105302,0.124917,0.126243,0.084455,0.092818,0.108458,0.158769,0.121244,0.141981,0.122595, 0.117556,0.153507,0.156308,0.184644,0.161421,0.181999,0.139376,0.187708,0.103126,0.116615,0.113746) ,c(0.181058,0.090426,0.115926,0.113022,0.134632,0.248845,0.163140,0.122331,0.156871,0.126023, 0.105335,0.124757,0.126325,0.084690,0.092885,0.108455,0.158386,0.120913,0.141676,0.122492, 0.117707,0.153293,0.156095,0.184242,0.161214,0.181670,0.139271,0.187318,0.103129,0.116421,0.113826) ,c(0.180692,0.090704,0.116110,0.113319,0.134745,0.248678,0.163256,0.122637,0.156581,0.125998, 0.105479,0.124686,0.126514,0.085066,0.093088,0.108587,0.158040,0.120674,0.141446,0.122488, 0.117972,0.153150,0.155954,0.183885,0.161063,0.181383,0.139251,0.186956,0.103232,0.116351,0.114001) ,c(0.180353,0.091088,0.116392,0.113753,0.134965,0.248520,0.163475,0.123046,0.156354,0.126067, 0.105726,0.124713,0.126821,0.085584,0.093432,0.108858,0.157742,0.120533,0.141309,0.122595, 0.118340,0.153088,0.155897,0.183582,0.160975,0.181143,0.139314,0.186636,0.103449,0.116415,0.114275) ) ,names = c("GROUP", "Y1", "Y2", "Y3", "Y4","Y5") ,row.names = seq(1:31) ,class = "data.frame" ) summary(manova(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test), test = "Wilks") #Error in summary.manova(manova(cbind(Y1, Y2, Y3, Y4, Y5) ~ GROUP, test), : residuals have rank 3 < 5 #*************************************************** What I don't understand is why SAS returns no errors using PROC GLM for the same data set. Is because PROC GLM doesn't take into account problems of rank deficiency? So, should I trust manova instead of PROC GLM output? I know it can be a touchy question but I would like to receive some insights. Thanks PM ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.What you have here is extremely correlated data:(V <- estVar(lm(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test)))Y1 Y2 Y3 Y4 Y5 Y1 0.001262567 0.001259177 0.001254746 0.001249106 0.001242385 Y2 0.001259177 0.001255814 0.001251416 0.001245812 0.001239132 Y3 0.001254746 0.001251416 0.001247055 0.001241494 0.001234861 Y4 0.001249106 0.001245812 0.001241494 0.001235983 0.001229405 Y5 0.001242385 0.001239132 0.001234861 0.001229405 0.001222889eigen(V)$values [1] 6.224077e-03 2.313066e-07 3.499837e-10 4.259125e-12 1.334146e-12 $vectors [,1] [,2] [,3] [,4] [,5] [1,] 0.4503756 0.61213579 0.5204920 -0.3485941 0.1732681 [2,] 0.4491807 0.32333236 -0.1873653 0.5929444 -0.5540795 [3,] 0.4476157 0.01442094 -0.5498688 0.1272921 0.6934503 [4,] 0.4456201 -0.31202109 -0.3198606 -0.6557557 -0.4144143 [5,] 0.4432397 -0.65052351 0.5378809 0.2840428 0.1017918 Notice the more than 9 orders of magnitude between the eigenvalues. I think that what is happening is that what SAS calls MANOVA is actually looking at within-row contrasts, which effectively removes the largest eigenvalue. In R, the equivalent would beanova(lm(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test), X=~1, test = "Wilks")Analysis of Variance Table Contrasts orthogonal to ~1 Df Wilks approx F num Df den Df Pr(>F) (Intercept) 1 0.037 164.873 4 25 <2e-16 *** GROUP 2 0.701 1.215 8 50 0.3098 Residuals 28 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 or (this could be computationally more precice, but in fact it gives the same result)anova(lm(cbind(Y2,Y3,Y4,Y5)-Y1~GROUP, test), test = "Wilks")-- O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
-- O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.