Hello,

I have a problem creating a dummy-variable matrix for glmnet with the contrasts function. I have a factor with 4 levels. When I create dummy variables, I expect n-1 of them for a factor with n levels (in this case 3), so that each contrast is against the baseline level. This is also what the help file for 'contrasts' describes. The problem is that the function returns a matrix with n columns (i.e. one per level) rather than n-1, so there is no baseline level for comparison.

My questions are:

1. How can I create a matrix with n-1 dummy variables? Was I supposed to request contr.treatment contrasts explicitly?

2. If that is not possible, how should I interpret the hazard ratios in the Cox regression I am fitting? (I use glmnet for variable selection and then fit a Cox regression.) That is, if I get an HR of 3 for the variable 300 MG, what does it mean? The hazard is 3 times higher than what?

Here is some code to reproduce the issue:

# Create a 4-level example factor
trt <- factor(sample(c("PLACEBO", "300 MG", "600 MG", "1200 MG"),
              100, replace = TRUE))

# Use contrasts to get the identity matrix of dummy variables to be used in glmnet
trt2 <- contrasts(trt, contrasts = FALSE)

Results (as you can see, all levels are represented in the identity matrix):

> levels(trt)
[1] "1200 MG" "300 MG"  "600 MG"  "PLACEBO"
> print(trt2)
        1200 MG 300 MG 600 MG PLACEBO
1200 MG       1      0      0       0
300 MG        0      1      0       0
600 MG        0      0      1       0
PLACEBO       0      0      0       1

Thank you,
Erel

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
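
For comparison with the first question, here is a minimal sketch of one way the n-1 coding might be obtained. It rests on assumptions that are not in the post above: that R's default treatment contrasts for unordered factors are in effect, that PLACEBO is the intended baseline (set here with relevel), and that a fixed seed is acceptable for reproducibility. It has not been checked against glmnet itself.

```r
# Same 4-level example factor as in the post (seed added only for reproducibility)
set.seed(1)
trt <- factor(sample(c("PLACEBO", "300 MG", "600 MG", "1200 MG"),
                     100, replace = TRUE))

# Assumption: PLACEBO is the intended reference level
trt <- relevel(trt, ref = "PLACEBO")

# With the default contrasts = TRUE, contrasts() returns the coding matrix
# with one column fewer than the number of levels
cm <- contrasts(trt)
print(cm)

# One row per observation: build the design matrix and drop the intercept,
# leaving only the n-1 dummy columns
x <- model.matrix(~ trt)[, -1]
print(dim(x))
```

Under this coding each remaining column is an indicator for one dose versus the baseline level, so a coefficient for "300 MG" would be read relative to PLACEBO.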