Off topic! This is a statistical question, not an R question. Post it on a statistics site such as stats.stackexchange.com.
However, your post suggests that you are completely out of your depth here (0/1 responses suggest that GLM modeling via logistic regression is called for). Remote internet advice is unlikely to fill the gap between what you seem to need and what you seem to know. I strongly suggest that you find a local statistical expert to help if you wish to avoid producing nonsense.

(Once you have figured out what you need to do, questions about how to use R tools to do it are of course appropriate.)

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom."
Clifford Stoll

On Sun, May 17, 2015 at 1:06 PM, Kristi Glover <kristi.glo...@hotmail.com> wrote:
> Hi R users,
> I am trying to reduce my independent variables before I run models. My
> dependent variable is presence (TRUE) only, with no absence (FALSE)
> records, whereas I have more than 20 independent variables, and they are
> highly correlated. I am trying to reduce the number of independent
> variables. I found that PCA is used for feature selection.
> But I realized that the PCA feature selection uses the dependent variable
> (as a linear model) together with the independent variables to select
> variables based on the variation explained. For me, the dependent data
> are only "1". Therefore, I could not run it.
>
> Would you give me some suggestions on how to reduce the variables to a
> certain number? I have attached sample data. In this data set, the
> dependent variable is "sp" and the other 20 variables are the independent
> variables.
>
> dat<-structure(list(sp = c(1L, 1L, 1L, 1L, 1L), var1 = c(32L, 222L,
> 134L, 114L, 121L), var2 = c(188L, 175L, 167L, 166L, 167L), var3 = c(123L,
> 129L, 136L, 138L, 137L), var4 = c(40L, 35L, 37L, 38L, 37L), var5 = c(6756L,
> 8080L, 7856L, 7899L, 7891L), var6 = c(334L, 352L, 341L, 340L,
> 341L), var7 = c(29L, -9L, -18L, -22L, -20L), var8 = c(305L, 361L,
> 359L, 362L, 361L), var9 = c(108L, 217L, 167L, 166L, 166L), var10 = c(237L,
> 67L, 61L, 59L, 60L), var11 = c(270L, 276L, 265L, 264L, 264L),
>     var12 = c(97L, 67L, 61L, 59L, 60L), var13 = c(1491L, 916L,
>     1245L, 1282L, 1250L), var14 = c(168L, 127L, 154L, 155L, 154L
>     ), var15 = c(99L, 43L, 67L, 70L, 68L), var16 = c(15L, 32L,
>     22L, 21L, 21L), var17 = c(432L, 313L, 390L, 400L, 392L),
>     var18 = c(308L, 148L, 254L, 269L, 257L), var19 = c(332L,
>     213L, 269L, 277L, 271L), var20 = c(430L, 148L, 254L, 269L,
>     257L)), .Names = c("sp", "var1", "var2", "var3", "var4",
> "var5", "var6", "var7", "var8", "var9", "var10", "var11", "var12",
> "var13", "var14", "var15", "var16", "var17", "var18", "var19",
> "var20"), class = "data.frame", row.names = c(NA, -5L))
>
> thanks
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
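
On the purely mechanical R side of the quoted question: base R's prcomp() computes principal components from the predictor columns alone and never looks at a response, so an all-"1" "sp" column does not prevent it from running. The lines below are only a minimal sketch of that mechanics, using six of the posted columns as an arbitrary illustrative subset. The choice of columns and of two components is an assumption made for the example, and whether PCA is a sensible step for presence-only data remains the statistical question discussed above.

```r
# Illustrative subset of the posted predictors; "sp" is deliberately left out
# because prcomp() does not use a response variable.
x <- data.frame(
  var1  = c(32, 222, 134, 114, 121),
  var2  = c(188, 175, 167, 166, 167),
  var3  = c(123, 129, 136, 138, 137),
  var13 = c(1491, 916, 1245, 1282, 1250),
  var17 = c(432, 313, 390, 400, 392),
  var18 = c(308, 148, 254, 269, 257)
)

# Pairwise correlations give a first look at how redundant the predictors are.
print(round(cor(x), 2))

# Principal components of the centred and scaled predictors.
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Proportion of total variance carried by each component.
print(summary(pca)$importance["Proportion of Variance", ])

# Scores on the first two components (two is an arbitrary choice here).
scores <- pca$x[, 1:2]
print(scores)
```

With only five rows, at most four components can carry any variance, so a sample this small cannot say much about the structure of 20 variables.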