On Wed, 2009-11-04 at 18:19 +0000, Federico Calboli wrote: > On 4 Nov 2009, at 18:11, Gavin Simpson wrote: > > Is there a particular reason for choosing a VGLM here? My reading of > > your post suggests the response is an univariate, ordered factor and > > VGLMs are especially for multivariate responses. In which case, can > > you > > not use polr() in package MASS that comes with R or the lms() function > > in the rms package (available from CRAN). > > I have used polr() but: > > 1) seems to be even more reluctant to give me at least the t value for > my marker: for the very same data I used to understand what was going > on vglm does give a t value for x (my marker), polr does not. As much > as my data is *dire* I still need to use it and get as many p-values > as I can (ideally one for each marker). The goodness of the model is > going to be assessed, perversely, on the whole sheabang of p-values > and their aderence to a null distribution. > > Additionally, while extracting the t value is a piece of cake with > polr(), the p-value I get a nowhere close to a null distribution.
Yes - I see that polr() also doesn't produce p-values in the output from summary. You can use it to get "a" p-value for x if you use a LRT; fit the model using polr() with and without 'x' and then supply both fitted models to the anova() method for polr objects. > > I will try lms() and hope for the best. Sorry, I meant the lrm() function in package rms G > > > > I haven't really used either of these functions in earnest, but one or > > both may provide the p-values you desire, out of the box. > > I hope so! > > Thanks, > > F > > > > > > HTH > > > > G > > > >> > >> My response variable is the severity of diseases, going from 0 to 5 > >> (the > >> severity is actually an ordered factor). > >> > >> The independent variables are: 1 genetic marker, time of medical > >> observation, > >> age, sex. What I *need* is a p-value for the genetic marker. > >> Because I have ~1.5 > >> million markers I'd rather not faffing around too much. > >> > >> My model is: > >> > >>> mod.vglm = vglm(disease.status ~ x + time + age + sex, family = > >> cumulative(par = T)) > >> > >> where x is my genetic marker, coded as 0/1/2, time is days of > >> medical observation. > >> > >>> summary(mod.vglm) works: > >> > >> Call: > >> vglm(formula = disease.status ~ x + time + age + sex, family = > >> cumulative(par = T)) > >> > >> Pearson Residuals: > >> Min 1Q Median 3Q Max > >> logit(P[Y<=1]) -0.6642 -0.28704 -0.18329 -0.11681 3.8919 > >> logit(P[Y<=2]) -2.5580 -0.48080 -0.23315 0.47388 2.5983 > >> logit(P[Y<=3]) -2.1565 -0.56961 0.22089 0.44349 10.7964 > >> logit(P[Y<=4]) -3.3175 0.13064 0.20117 0.43176 12.5233 > >> > >> Coefficients: > >> Value Std. Error t value > >> (Intercept):1 -2.4460e+00 4.2791e-01 -5.7162 > >> (Intercept):2 -7.1078e-01 4.1628e-01 -1.7074 > >> (Intercept):3 3.7619e-01 4.1545e-01 0.9055 > >> (Intercept):4 1.7467e+00 4.2092e-01 4.1496 > >> x 4.1421e-01 1.9762e-01 2.0959 > >> time -3.6021e-04 3.0387e-05 -11.8540 > >> age -2.6115e-05 9.2504e-06 -2.8232 > >> sexM 1.0188e-01 1.2491e-01 0.8156 > >> > >> Number of linear predictors: 4 > >> > >> Names of linear predictors: > >> logit(P[Y<=1]), logit(P[Y<=2]), logit(P[Y<=3]), logit(P[Y<=4]) > >> > >> Dispersion Parameter for cumulative family: 1 > >> > >> Residual Deviance: 2475.937 on 3460 degrees of freedom > >> > >> Log-likelihood: -1237.969 on 3460 degrees of freedom > >> > >> ####################### > >> > >> So here are my questions: > >> > >> 1) I need to get the t value for x, so I can use "1 - pt(tvalue,1)" > >> to find some > >> sort of probability value for x. That's not trivial. Additionally, > >> I assume df > >> for x is 1, hence I plan to use "1 - pt(tvalue,1)", though I might > >> well be > >> wrong. In any case getting the darned t value seems impossible > >> > >> 2) because of the difficulty of getting (1), it there a way of > >> getting vglm() to > >> spit out a p-value for x please? > >> > >> I do recon many people might scoff at my crass desire for a p- > >> value, but I'm > >> dealing with some dire phenotype in a whole genome analysis where > >> the *only* > >> thing that matters are p-values. I have to be quite unsophysticated > >> I'm afraid. > >> > >> Best, > >> > >> Federico > >> > >> > > -- > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > Dr. Gavin Simpson [t] +44 (0)20 7679 0522 > > ECRC, UCL Geography, [f] +44 (0)20 7679 0565 > > Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk > > Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ > > UK. WC1E 6BT. [w] http://www.freshwaters.org.uk > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > > > -- > Federico C. F. Calboli > Department of Epidemiology and Public Health > Imperial College, St. Mary's Campus > Norfolk Place, London W2 1PG > > Tel +44 (0)20 75941602 Fax +44 (0)20 75943193 > > f.calboli [.a.t] imperial.ac.uk > f.calboli [.a.t] gmail.com > > > > > -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.