On Mar 7, 2010, at 3:20 PM, Don MacQueen wrote:
And just a small follow-up. To find out what class each column is,
you want
lapply(a,class)
$x1
[1] "numeric"
$x2
[1] "factor"
$x3
[1] "factor"
With regard to your solution, and why it works, it is my
understanding that data frames are in some sense actually lists,
each column corresponding to one element in a list.
Hence, lapply() works column-wise on data frames.
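To make the list view concrete, here is a minimal sketch in R. The small
data frame `d` is made up for illustration and is not from the thread. It
also contrasts lapply() with apply(), which, per its documentation, coerces
a data frame with as.matrix() before calling the function; with mixed
column types that gives a character matrix, so the factor class is lost.

```r
# Hypothetical toy data frame, for illustration only
d <- data.frame(x1 = c(1.5, 2.5, 3.5),
                x2 = factor(c("a", "b", "a")))

is.list(d)        # a data frame is a list under the hood
length(d)         # number of list elements, i.e. columns
lapply(d, class)  # visits one column (list element) at a time

# apply() works on arrays, so it first converts d with as.matrix();
# mixing numeric and factor columns gives a character matrix
m <- as.matrix(d)
typeof(m)
apply(d, 2, class)  # each column of the matrix is a character vector
```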
Also for this reason it's pretty easy to convert back and forth
between data frames and lists, provided, of course, that each
element of the list has an appropriate structure; see this example:
data.frame( list(a=1:2, b=3:4) )
a b
1 1 3
2 2 4
data.frame( list(a=1:2, b=3:7) )
Error in data.frame(a = 1:2, b = 3:7, check.names = FALSE,
stringsAsFactors = TRUE) :
arguments imply differing number of rows: 2, 5
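The round trip in both directions can be sketched as follows. The object
names (`dfr`, `lst`, `back`, `msg`, `recycled`) are made up for this
example. The last step shows one of the details behind "appropriate
structure": as far as the data.frame() documentation goes, shorter vectors
are recycled when they fit a whole number of times, so the lengths need not
be strictly equal.

```r
dfr <- data.frame(a = 1:2, b = 3:4)

lst <- as.list(dfr)        # data frame -> plain list of columns
is.data.frame(lst)
back <- as.data.frame(lst) # plain list -> data frame again
all.equal(back, dfr)

# Unequal lengths that do not fit: capture the error message
# instead of stopping the script
msg <- tryCatch(data.frame(list(a = 1:2, b = 3:7)),
                error = function(e) conditionMessage(e))
msg

# Lengths 2 and 4: the shorter column is recycled to match
recycled <- data.frame(list(a = 1:2, b = 3:6))
nrow(recycled)
```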
No doubt there are subtle details, but don't ask me to provide
details on what exactly the "some sense" is!
It's not that complicated:
> class(dfrm)
[1] "data.frame"
> is.list(dfrm)
[1] TRUE
>
> dput(dfrm)
structure(list(a = 1:2, b = 3:4), .Names = c("a", "b"), row.names = c(NA,
-2L), class = "data.frame")
# Let's do some violence to this dataframe ...
> class(dfrm) <- "list"
> dfrm
$a
[1] 1 2
$b
[1] 3 4
attr(,"row.names")
[1] 1 2
> is.data.frame(dfrm)
[1] FALSE
> is.data.frame(as.data.frame(dfrm))
[1] TRUE
> dput(dfrm)
structure(list(a = 1:2, b = 3:4), .Names = c("a", "b"), row.names = c(NA,
-2L))
# Now let's restore it to its original data.frame-ish state:
> class(dfrm) <- "data.frame"
> dput(dfrm)
structure(list(a = 1:2, b = 3:4), .Names = c("a", "b"), row.names = c(NA,
-2L), class = "data.frame")
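A gentler way to look at the same thing, sketched under the assumption that
dfrm was created as data.frame(a = 1:2, b = 3:4) (the thread does not show
its definition): unclass() returns a copy without the class attribute and
leaves the original untouched. The name `plain` is made up for this example.

```r
# Assumed definition of dfrm; not shown in the original message
dfrm <- data.frame(a = 1:2, b = 3:4)

typeof(dfrm)             # the underlying storage type is "list"
names(attributes(dfrm))  # names, class and row.names sit on top of that list

plain <- unclass(dfrm)   # copy with the class attribute dropped
is.list(plain)
is.data.frame(plain)     # no longer dispatches as a data frame
is.data.frame(dfrm)      # the original is unchanged
```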
-Don
At 12:07 PM +0200 3/7/10, Tal Galili wrote:
Hi all,
Let's say I have a data.frame and want to turn each of its columns
into a factor.
My instinct would be to use as.factor with apply. But this doesn't
work: the result is a character matrix rather than a data.frame of
factors.
I found another solution for how to achieve this, but I would also
like to understand - *WHY* does it work this way?
Here is an example script:
a <- data.frame(x1 = rnorm(100),
                x2 = sample(c("a","b"), 100, replace = TRUE),
                x3 = factor(c(rep("a",50), rep("b",50))))
apply(a, 2, class) # why is column 3 not a factor ?
a[,3] # since it IS a factor.
a2 <- apply(a, 2, as.factor) # won't work - why not ?
a2[,3] # Why was this just turned into a character ???
# A solution
a2 <- lapply(a, as.factor)
a3 <- as.data.frame(a2)
str(a3)
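A possible one-step variant of this solution, assuming the goal is to keep
the result as a data frame: assigning into `[]` replaces every column while
keeping the data.frame class and row names. The name `a_fac` and the small
stand-in data frame are made up for this sketch.

```r
# Smaller stand-in for the data frame in the script (illustrative values)
a <- data.frame(x1 = c(0.3, -1.2, 0.3, 2.1),
                x2 = c("a", "b", "b", "a"),
                x3 = factor(c("a", "a", "b", "b")))

a_fac <- a                       # hypothetical name; work on a copy
a_fac[] <- lapply(a, as.factor)  # replace all columns, keep the data frame

is.data.frame(a_fac)
sapply(a_fac, class)
levels(a_fac$x1)  # the distinct numeric values become the factor levels
```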
Thanks,
Tal
----------------Contact Details:----------------
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
------------------------------------------------
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov
David Winsemius, MD
West Hartford, CT