You can help yourself on this list by not letting your email client choose
the message format (HTML in this case). HTML gets corrupted on this mailing
list, which leads to frequent misunderstandings. Learn how to make your
email client send plain text.

If you go back to your first line and look at str(data), you will see that
read.csv automatically converted the gender column to a factor for you.
In your later attempt to convert it, you assumed that factor() would draw
on the underlying integer codes. It does not: a factor "acts" like
character data, so none of the specified levels ("1" or "2") were found in
it and every value became NA.
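
To see this for yourself, here is a minimal sketch. The vector is a made-up
stand-in for your gender column (my assumption, not your data):

```r
# Hypothetical stand-in for the column as read.csv delivered it
gender <- factor(c("F", "M", "M", "F"))

# The integer codes do exist underneath the labels...
codes <- as.integer(gender)

# ...but factor() matches the requested levels against as.character(gender),
# so levels = 1:2 looks for the strings "1" and "2" and finds neither
bad <- factor(gender, levels = 1:2, labels = c("F", "M"))

print(codes)
print(all(is.na(bad)))
```
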
If you want to control the levels used in the factor (as I usually prefer
to do), then pass either as.is=TRUE or stringsAsFactors=FALSE to read.csv
so that no factors are created automatically. Then specify character
values for your levels instead of second-guessing R.
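
Something along these lines should work. This is only a sketch: the inline
CSV text is invented so the example runs without your file. The
strip.white=TRUE argument is my own addition, because the levels you posted
include " F" and " M" with a leading space. Also note that from R 4.0.0
onward stringsAsFactors defaults to FALSE, so stating it explicitly matters
mostly for older versions.

```r
# Invented CSV content; some values carry a leading space like the posted levels
csv_text <- "id,gender\n1,F\n2, M\n3,M\n4, F\n"

# Keep gender as character (as.is = TRUE would do the same) and
# strip the stray whitespace around unquoted fields while reading
dat <- read.csv(text = csv_text, stringsAsFactors = FALSE, strip.white = TRUE)
str(dat$gender)

# Build the factor from character levels that actually occur in the data
dat$gender <- factor(dat$gender, levels = c("F", "M"))
print(table(dat$gender, useNA = "ifany"))
```
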
Note that there is a bit of an art to reading the help files, such as
?read.csv
and you should start to practice it. When you do read that help file, you
will find that there are a lot of parameters to the "read.table" function,
and rather fewer specified for the read.csv definition. The reason is that
the read.csv function simply calls the read.table function with different
default values for certain parameters (header, sep and so on). You can set
any of the other parameters that read.table expects in your call to
read.csv and they will be passed on to read.table.
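
A small sketch of that pass-through, using na.strings as the example
argument and a made-up two-row table (both are my choices for
illustration):

```r
# read.csv names only a handful of arguments plus "..."
csv_args <- names(formals(read.csv))
print(csv_args)

# na.strings is not among them, but read.table knows it
table_args <- names(formals(read.table))
print("na.strings" %in% table_args)

# so it can be supplied to read.csv and is handed on through "..."
dat2 <- read.csv(text = "x,y\n1,a\n-99,b\n", na.strings = "-99")
print(dat2)
```
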
Oh, and one other thing: functions are objects in R much like data, and
there is a function called "data" that comes with R. While defining your
own object called "data" works in this case, it is good practice to learn
not to re-use object names like that, since it can make reading your code
confusing at the very least.

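
A short illustration of why it "works in this case"; the name survey at
the end is only a suggestion of mine:

```r
# data() is a function that ships with R (package utils)
print(is.function(utils::data))

# Creating a data frame with the same name in the workspace is allowed
data <- data.frame(x = 1:3)

# When a function is asked for, R skips non-function objects of that name,
# so the original function is still found
still_there <- is.function(get("data", mode = "function"))
print(still_there)

# A distinct name avoids the ambiguity for anyone reading the code
survey <- data
rm(data)
```
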
On Sat, 11 Jul 2015, Dagmar Jurankov? wrote:
Hello everybody, I have a problem with R.
I uploaded a questionnaire saved as csv into R and I tried to test
independence between two variables.
> data <- read.csv("C:/Users/Me/Desktop/data.csv")
> View(data)
> df = read.csv("C:/Users/Me/Desktop/data.csv")
> ls()
[1] "df" "data"
> attributes(data$gender)
$levels
[1] " F" " M" "F" "M"
$class
[1] "factor"
I changed my variable "gender" into a factor using:
data$gender=factor(data$gender, levels=c(1:2), labels= c( "F", "M"),
exclude= NA, nmax= NA).
Then I wrote data$gender and the only thing I got was:
[1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
<NA> <NA> <NA> <NA> <NA> <NA>
[21] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
<NA> <NA> <NA> <NA> <NA> <NA>
[41] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
<NA> <NA> <NA> <NA> <NA> <NA>
[61] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Levels: F M
Does anybody know why?
-My csv doc in the column gender is filled out properly. (M=Male, F= Female)
-My imported dataset in R is complete (all values)
I have done this with a different Excel document and it worked without
any problems. I am really clueless. I can't go further and compare
the variables and do t-tests without this working.
Could someone please help me out?
Thank you.
[[alternative HTML version deleted]]
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k