There is probably an easier way to do this, but

> set.seed(42)
> mydf <- data.frame(t(replicate(100, sample(c("red", "blue",
+     "green", "yellow", NA), 4))))
> colnames(mydf) <- c("rank1", "rank2", "rank3", "rank4")
> head(mydf)
   rank1  rank2  rank3 rank4
1   <NA> yellow    red  blue
2 yellow  green   <NA>   red
3 yellow  green   blue  <NA>
4   <NA>   blue yellow green
5   <NA>    red   blue green
6   <NA>    red  green  blue
> lvls <- levels(mydf$rank1)
> # convert color factors to numeric
> for (i in seq_along(mydf)) mydf[,i] <- as.numeric(mydf[,i])
> # stack the columns
> mydf2 <- stack(mydf)
> # convert rank factor to numeric
> mydf2$ind <- as.numeric(mydf2$ind)
> # add row numbers
> mydf2 <- data.frame(rows=1:100, mydf2)
> # Create table
> mytbl <- xtabs(ind~rows+values, mydf2)
> # convert to data frame
> mydf3 <- data.frame(unclass(mytbl))
> colnames(mydf3) <- lvls
> head(mydf3)
  blue green red yellow
1    4     0   3      2
2    0     2   4      1
3    3     2   0      1
4    2     4   0      3
5    3     4   2      0
6    4     3   2      0

David C

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Simon Kiss
Sent: Friday, August 15, 2014 3:58 PM
To: r-help@r-project.org
Subject: Re: [R] Turn Rank Ordering Into Numerical Scores By Transposing A Data Frame

Both the suggestions I got work very well, but what I didn't realize is that NA values would cause serious problems. Where there is a missing value, using the argument na.last=NA to order just returns the order of the factor levels but excludes the missing values, and I have no idea where those occur, or rather which of those variables were actually missing.

Have I explained this problem sufficiently? I didn't think it would cause such a problem so I didn't include it in the original problem definition.

Yours, Simon

On Jul 25, 2014, at 4:58 PM, David L Carlson <dcarl...@tamu.edu> wrote:

> I think this gets what you want. But your data are not reproducible since
> they are randomly drawn without setting a seed and the two data sets have no
> relationship to one another.
>
> > set.seed(42)
> > mydf <- data.frame(t(replicate(100, sample(c("red", "blue",
> +     "green", "yellow")))))
> > colnames(mydf) <- c("rank1", "rank2", "rank3", "rank4")
> > mydf2 <- data.frame(t(apply(mydf, 1, order)))
> > colnames(mydf2) <- levels(mydf$rank1)
> > head(mydf)
>    rank1  rank2  rank3 rank4
> 1 yellow  green    red  blue
> 2  green   blue yellow   red
> 3  green yellow    red  blue
> 4 yellow    red  green  blue
> 5 yellow    red  green  blue
> 6 yellow    red   blue green
> > head(mydf2)
>   blue green red yellow
> 1    4     2   3      1
> 2    2     1   4      3
> 3    4     1   3      2
> 4    4     3   2      1
> 5    4     3   2      1
> 6    3     4   2      1
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Simon Kiss
> Sent: Friday, July 25, 2014 2:34 PM
> To: r-help@r-project.org
> Subject: [R] Turn Rank Ordering Into Numerical Scores By Transposing A Data Frame
>
> Hello:
> I have data that looks like mydf, below. It is the results of a survey where
> participants were to put a number of statements (in this case colours) in
> their order of preference. In this case, the rank number is the variable, and
> the factor level for each respondent is which colour they assigned to that
> rank. I would like to find a way to effectively transpose the data frame so
> that it looks like mydf2, also below, where the colours the participants were
> able to choose are the variables and the variable score is what that person
> ranked that variable.
>
> Ultimately what I would like to do is a factor analysis on these items, so
> I'd like to be able to see if people ranked red and yellow higher together
> but ranked green and blue together lower, that sort of thing.
> I have played around with different variations of t(), melt(), ifelse() and
> if() but can't find a solution.
> Thank you
> Simon
> #Reproducible code
> mydf<-data.frame(rank1=sample(c('red', 'blue', 'green', 'yellow'), replace=TRUE, size=100),
>   rank2=sample(c('red', 'blue', 'green', 'yellow'), replace=TRUE, size=100),
>   rank3=sample(c('red', 'blue', 'green', 'yellow'), replace=TRUE, size=100),
>   rank4=sample(c('red', 'blue', 'green', 'yellow'), replace=TRUE, size=100))
>
> mydf2<-data.frame(red=sample(c(1,2,3,4), replace=TRUE,size=100),
>   blue=sample(c(1,2,3,4), replace=TRUE,size=100),
>   green=sample(c(1,2,3,4), replace=TRUE,size=100),
>   yellow=sample(c(1,2,3,4), replace=TRUE,size=100))
> *********************************
> Simon J. Kiss, PhD
> Assistant Professor, Wilfrid Laurier University
> 73 George Street
> Brantford, Ontario, Canada
> N3T 2C9
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606
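
For the NA case raised in this thread, a more direct route may be to look up each colour's position within a respondent's row using match(), which returns NA for a colour that respondent did not rank. The following is only a minimal sketch under assumptions: the helper name rank_scores and the three-row toy data are hypothetical (the toy rows copy the first three rows of the head(mydf) output with NAs shown earlier), and each colour is assumed to appear at most once per row. It works on character or factor columns alike, which matters because data.frame() no longer converts strings to factors by default since R 4.0.0.

```r
# Hypothetical helper (not from the thread): one score column per item,
# holding the rank position at which each respondent placed that item.
rank_scores <- function(df, items) {
  # as.matrix() turns character or factor columns into a character matrix
  m <- as.matrix(df)
  # For each row, match() returns the position of every item in that row,
  # or NA when the respondent did not rank the item at all
  scores <- t(apply(m, 1, function(r) match(items, r)))
  colnames(scores) <- items
  as.data.frame(scores)
}

# Toy data (assumption): three respondents, each with one missing rank
toy <- data.frame(
  rank1 = c(NA, "yellow", "yellow"),
  rank2 = c("yellow", "green", "green"),
  rank3 = c("red", NA, "blue"),
  rank4 = c("blue", "red", NA),
  stringsAsFactors = FALSE
)
items <- c("blue", "green", "red", "yellow")

res <- rank_scores(toy, items)
print(res)
```

Unlike the xtabs() version, unranked colours come back as NA rather than 0, so they cannot be mistaken for a score. If zeros are preferred, res[is.na(res)] <- 0 converts them afterwards.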