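Hi,

# A data.frame can hold columns of different types; a matrix holds a single type.
set.seed(24)
dat1 <- data.frame(X = sample(letters, 20, replace = TRUE),
                   Y = sample(1:40, 20, replace = TRUE),
                   stringsAsFactors = FALSE)
mat1 <- as.matrix(dat1)

# In the data.frame each column keeps its own class.
sapply(dat1, class)
#          X           Y
#"character"   "integer"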
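
To see the same coercion on a smaller scale, here is a minimal sketch. The names `df`, `m`, and `y_num` and the values are made up for illustration and are not part of the example above. It assumes only base R.

```r
# Hypothetical three-row example: one character column, one integer column.
df <- data.frame(X = c("a", "b", "c"),
                 Y = c(10L, 20L, 30L),
                 stringsAsFactors = FALSE)

# as.matrix() must pick one type for every cell, so the integers become strings.
m <- as.matrix(df)
typeof(m)

# To do arithmetic on Y again, convert that column back explicitly.
y_num <- as.integer(m[, "Y"])
sum(y_num)
```

With mixed column types, staying in the data.frame avoids this round trip.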
# In the matrix every column has been coerced to character.
sapply(split(mat1, col(mat1)), class)
#          1           2
#"character" "character"

# Converting back to a data.frame does not restore the integer column.
# (Factor columns appear here because stringsAsFactors defaulted to TRUE before R 4.0.0.)
str(as.data.frame(mat1))
#'data.frame':    20 obs. of  2 variables:
# $ X: Factor w/ 14 levels "b","d","f","g",..: 5 3 11 8 10 14 5 12 13 4 ...
# $ Y: Factor w/ 14 levels "10","13","15",..: 12 5 9 13 14 8 12 6 7 4 ...

If all of your data are of the same type, operations on a matrix are generally
faster than on a data.frame. For example, rowSums() below takes roughly half the
elapsed time on the matrix:

set.seed(245)
mat2 <- matrix(sample(1:50, 3*1e7, replace = TRUE), ncol = 3)
dat2 <- as.data.frame(mat2)

system.time(res1 <- rowSums(mat2))
#   user  system elapsed
#  0.132   0.016   0.201
system.time(res2 <- rowSums(dat2))
#   user  system elapsed
#  0.376   0.056   0.447

identical(res1, res2)
#[1] TRUE

A.K.

----- Original Message -----
From: Anika Masters <anika.mast...@gmail.com>
To: R help <r-help@r-project.org>
Cc:
Sent: Thursday, June 27, 2013 2:26 PM
Subject: [R] when to use & pros/cons of dataframe vs. matrix?

When "should" I use a dataframe vs. a matrix? What are the pros and cons?
If I have data of all the same type, am I usually better off using a
matrix and not a dataframe?
What are the advantages, if any, of using a dataframe vs. a matrix?
(rownames and column names perhaps?)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.