On Wed, Oct 27, 2010 at 4:03 AM, Ivan Calandra <ivan.calan...@uni-hamburg.de> wrote:
> Hi,
>
> Gabor gave you a great answer already. But I would add a few precisions.
> Someone please correct me if I'm wrong.
>
> Arrays are matrices with more than 2 dimensions. Put the other way: matrices
> are arrays with only 2 dimensions.

Arrays can have any number of dimensions, including 1, 2, 3, etc.

> # a 2d array is a matrix. It's composed of a vector plus two dimensions.
> m <- array(1:4, c(2, 2))
> dput(m)
structure(1:4, .Dim = c(2L, 2L))
> class(m)
[1] "matrix"
> is.array(m)
[1] TRUE

> # a 1d array is a vector plus a single dimension
> a1 <- array(1:4, 4)
> dput(a1)
structure(1:4, .Dim = 4L)
> dim(a1)
[1] 4
> class(a1)
[1] "array"
> is.array(a1)
[1] TRUE

> # if we remove the dimension part it's no longer an array but just a vector
> nota <- a1
> dim(nota) <- NULL
> dput(nota)
1:4
> is.array(nota)
[1] FALSE
> is.vector(nota)
[1] TRUE

>
> I would also add these:
> - the components of a vector have to be of the same mode (character,
> numeric, integer...)

However, a list with no attributes is a vector too, so this is a vector:

> vl <- list(sin, 3, "a")
> is.vector(vl)
[1] TRUE

A vector may not have attributes (other than names), so arrays and factors
are not vectors, although they are composed from vectors.

> - which implies that the components of matrices and arrays have to be also
> of the same mode (which might lead to some coercion of your data if you
> don't pay attention to it).
>
> Factors are character data, but coded as numeric mode. Each number is
> associated with a given string, the so-called levels. Here is an example:
> my.fac <- factor(c("something", "other", "more", "something", "other",
> "more"))

A factor is composed of an integer vector plus a levels attribute (called
.Label internally), as in this code:

> fac <- factor(c("b", "a", "b"))
> dput(fac)
structure(c(2L, 1L, 2L), .Label = c("a", "b"), class = "factor")
> levels(fac)
[1] "a" "b"

> my.fac
> [1] something other more something other more
> Levels: more other something
> mode(my.fac)
> [1] "numeric" ## coded as numeric even though you gave character
> strings!
> class(my.fac)
> [1] "factor"
> levels(my.fac)
> [1] "more" "other" "something"
> as.numeric(my.fac)
> [1] 3 2 1 3 2 1 ## internal representation
> as.character(my.fac)
> [1] "something" "other" "more" "something" "other" "more" ##
> what you think it is!
>
> I found that the book "Data Manipulation with R" by Phil Spector (2008)
> explains all these object modes and classes quite well, even
> though I wouldn't have understood completely by reading only this book (not
> that I have yet completely mastered this topic...)
>
> HTH,
> Ivan
>
>
>
> On 10/27/2010 02:49, Gabor Grothendieck wrote:
>>
>> On Tue, Oct 26, 2010 at 8:37 PM, Matt Curcio <matt.curcio...@gmail.com>
>> wrote:
>>>
>>> Hi All,
>>> I am learning R and having a little trouble with the usage and proper
>>> definitions of data.frames vs. matrix vs vectors. I have read many R
>>> tutorials, and looked over umpteen 'cheat' sheets and have found that
>>> no one has articulated a really good definition of the differences
>>> between 'data.frames', 'matrix', and 'arrays' and even 'factors'. I
>>> realize that I might have missed someone's R tutorial, and actually
>>> would like to receive 'your' most concise or most useful tutorial.
>>> Any help would be appreciated.
>>>
>>> My particular favorite explanation and helpful hint is from the
>>> 'R-Inferno'. Don't get me wrong... I think this pdf is great and
>>> some tables are excellent. Overall it is a very good primer but this
>>> one section leaves me puzzled. This quote belies the lack of hard and
>>> fast rules for what and when to use 'data.frames', 'matrix', and
>>> 'arrays'. It discusses ways in which to simplify your work.
>>>
>>> Here are a few possibilities for simplifying:
>>> • Don’t use a list when an atomic vector will do.
>>> • Don’t use a data frame when a matrix will do.
>>> • Don’t try to use an atomic vector when a list is needed.
>>> • Don’t try to use a matrix when a data frame is needed.
>>>
>>> Cheers,
>>> Matt C
>>
>> Look at their internal representations and it will become clearer. v,
>> a vector, has length 6. m, a matrix, is actually the same as the
>> vector v except it has dimensions too. Since m is just a vector with
>> dimensions, m has length 6 as well. L is a list and has length 2
>> because it's a vector each of whose components is itself a vector. DF
>> is a data frame and is the same as L except its 2 components must each
>> have the same length and it must have row and column names. If you
>> don't assign the row and column names they are automatically generated
>> as we can see. Note that row.names = c(NA, -3L) is a short form for
>> row names of 1:3 and .Names internally refers to column names.
>>
>> > v <- 1:6 # vector
>> > dput(v)
>> 1:6
>> > m <- v; dim(m) <- 2:3 # m is a matrix since we added dimensions
>> > dput(m)
>> structure(1:6, .Dim = 2:3)
>> > L <- list(1:3, 4:6)
>> > dput(L)
>> list(1:3, 4:6)
>> > DF <- data.frame(1:3, 4:6)
>> > dput(DF)
>> structure(list(X1.3 = 1:3, X4.6 = 4:6), .Names = c("X1.3", "X4.6"
>> ), row.names = c(NA, -3L), class = "data.frame")
>>
>
> --
> Ivan CALANDRA
> PhD Student
> University of Hamburg
> Biozentrum Grindel und Zoologisches Museum
> Abt. Säugetiere
> Martin-Luther-King-Platz 3
> D-20146 Hamburg, GERMANY
> +49(0)40 42838 6231
> ivan.calan...@uni-hamburg.de
>
> **********
> http://www.for771.uni-bonn.de
> http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
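
The points made in this thread can be checked in one short base R session. What follows is only a sketch, not code from any of the posts: the object names (v, m, back, f, DF, nv) and the "units" attribute are made up for illustration, and it uses predicates such as is.matrix() rather than printed dput() output, since the exact printed form may differ between R versions.

```r
# Assumption: base R only; all object names are illustrative.

# A matrix is a vector plus a dim attribute.
v <- 1:6
m <- v
dim(m) <- c(2L, 3L)
is.matrix(m)               # TRUE
length(m)                  # 6, same as the underlying vector

# Removing dim gives back the plain vector.
back <- m
dim(back) <- NULL
identical(back, v)         # TRUE

# A factor is integer codes plus a levels attribute.
f <- factor(c("b", "a", "b"))
as.integer(f)              # 2 1 2
levels(f)                  # "a" "b"
levels(f)[as.integer(f)]   # "b" "a" "b", same as as.character(f)

# A data frame is a list of equal-length columns.
DF <- data.frame(x = 1:3, y = 4:6)
is.list(DF)                # TRUE
length(DF)                 # 2, the number of columns

# is.vector() tolerates names but no other attributes.
nv <- c(a = 1, b = 2)
named_is_vector <- is.vector(nv)   # TRUE
attr(nv, "units") <- "cm"          # hypothetical extra attribute
is.vector(nv)                      # FALSE
```

The last block shows why arrays and factors fail is.vector(): they carry dim or levels/class attributes, whereas a names attribute alone does not disqualify an object.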