Replace your NA's column by column, not all at once. In your first example, which has the form ifelse(condition, numbers, data.frame), the second and third arguments are replicated to the length of the first. A data.frame's length is the number of columns it has, so ifelse repeats its columns, which is not what you want. Also, the 2nd and 3rd arguments to ifelse should be of the same type, since the output will be a vector that takes some values from each. If they don't have the same type, the output will be of some type that can hold values from both. That type is often character or list, which again is not what you want.
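To make the two points above concrete, here is a small sketch. The data frame d and its column names are made up for illustration; they are not from your script.

```r
# Hypothetical two-column data frame, invented for this illustration
d <- data.frame(x = c(1, NA, 3), s = c("a", "b", "c"),
                stringsAsFactors = FALSE)

# A data frame is a list of columns, so its length is the column count
n_cols <- length(d)   # 2, not the 3 rows or the 6 cells

# 'yes' is numeric and 'no' is character: the result has to hold both,
# so the numbers are coerced to character
mixed <- ifelse(c(TRUE, FALSE, TRUE), c(1, 2, 3), c("a", "b", "c"))
class(mixed)          # "character"

# Replacing NA's one column at a time keeps each column's own type
d$x[is.na(d$x)] <- 0
class(d$x)            # "numeric"
```

The last two lines are the pattern the rest of this message builds on: index one column, replace its NA's with a value of that column's type, and move on to the next.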
Your second example code used unlist(data.frame). data.frames contain columns of various classes, and unlist(data.frame) creates a vector with a single class. That class is chosen to retain the information, if not the format, of the columns in the data.frame. It is generally not a useful thing to do unless all columns have the same class.

You showed some code but not data, so I'll make up something like what you described:

  df <- data.frame(stringsAsFactors=FALSE,
      Number1 = c(1, 2, 3, NA, 5, 6),
      Number2 = c(11, 12, 13, 14, 14, NA),
      String = c("one","two",NA,"four","five","six"),
      Factor = factor(c("Group A", NA, "Group A", "Group B", "Group B", "Group B")))

Look at its structure with

  > str(df)
  'data.frame':   6 obs. of  4 variables:
   $ Number1: num  1 2 3 NA 5 6
   $ Number2: num  11 12 13 14 14 NA
   $ String : chr  "one" "two" NA "four" ...
   $ Factor : Factor w/ 2 levels "Group A","Group B": 1 NA 1 2 2 2

To do the sort of conversion you want, try something like

  f <- function(d) {
     for(i in seq_along(d)) {
        di <- d[[i]]
        di[is.na(di)] <- if (is.numeric(di)) { # could use switch instead of if-then-else
            if (i==2) {
               0
            } else {
               1
            }
        } else if (is.factor(di)) {
            levels(di)[1] # I don't know what you want here
        } else if (is.character(di)) {
            "Unknown"
        }
        d[[i]] <- di
     }
     d
  }

That would give you

  > str(f(df))
  'data.frame':   6 obs. of  4 variables:
   $ Number1: num  1 2 3 1 5 6
   $ Number2: num  11 12 13 14 14 0
   $ String : chr  "one" "two" "Unknown" "four" ...
   $ Factor : Factor w/ 2 levels "Group A","Group B": 1 1 1 2 2 2

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Romano
> Sent: Thursday, November 15, 2012 7:58 AM
> To: Bert Gunter
> Cc: r-help@r-project.org
> Subject: Re: [R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers
>
> Thanks for the suggestion, Bert; I just re-read the introduction with
> particular attention to the sections you mentioned, but I don't see how any
> of it bears on my question. Namely -- to rephrase: What constraints are
> there on the form of the "yes" and "no" values required by ifelse? The
> introduction doesn't really speak to this, and the help documentation seems
> to suggest that as long the shapes of the test, "yes" values, and "no"
> values agree, that would be sufficient -- I don't see anything that
> specifies that any of these should be of a particular data type. My
> example, however, seems to indicate that the "yes" and "no" values can't be
> a mixture of characters and numbers, and I'm trying to figure out what the
> underlying constraints are on ifelse.
>
> Thanks again,
> David
>
> On Thu, Nov 15, 2012 at 6:46 AM, Bert Gunter <gunter.ber...@gene.com> wrote:
>
> > David:
> >
> > You seem to be getting lost in basic R tasks. Have you read the Intro
> > to R tutorial? If not, do so, as this should tell you how to do what
> > you need. If so, re-read the sections on indexing ("["), replacement,
> > and NA's. Also read about character vectors and factors.
> >
> > -- Bert
> >
> > On Thu, Nov 15, 2012 at 3:19 AM, David Romano <drom...@stanford.edu> wrote:
> > > Hi everyone,
> > >
> > > I have a data frame one of whose columns is a character vector and the rest
> > > are numeric, and in debugging a script, I noticed that an ifelse call seems
> > > to be coercing the character column to a numeric column, and producing
> > > unintended values as a result. Roughly, here's what I tried to do:
> > >
> > > df: a data frame with, say, the first column as a character column and the
> > > second and third columns numeric.
> > >
> > > also: NA's occur only in the numeric columns, and if they occur in one,
> > > they occur in the other as well.
> > >
> > > I wanted to replace the NA's in column 2 with 0's and the ones in column 3
> > > with 1's, so first I did this:
> > >
> > >> na.replacements <-ifelse(col(df)==2,0,1).
> > >
> > > Then I used a second ifelse call to try to remove the NA's as I wanted,
> > > first by doing this:
> > >
> > >> clean.df <- ifelse(is.na(df), na.replacements, df),
> > >
> > > which produced a list of lists vaguely resembling df, with the NA's mostly
> > > intact, and so then I tried this:
> > >
> > >> clean.df <- ifelse(is.na(df), na.replacements, unlist(df)),
> > >
> > > which seems to work if all the columns are numeric, but otherwise changes
> > > strings to numbers.
> > >
> > > I can't make sense of the help documentation enough to clear this up, but
> > > my guess is that the "yes" and "no" values passed to ifelse need to be
> > > vectors, in which case it seems I'll have to use another approach entirely,
> > > but even if is not the case and lists are acceptable, I'm not sure how to
> > > convert a mixed-mode data frame into a vector-like list of elements (which
> > > I would hope would work).
> > >
> > > I'd be grateful for any suggestions!
> > >
> > > Thanks,
> > > David Romano
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> >
> > Bert Gunter
> > Genentech Nonclinical Biostatistics
> >
> > Internal Contact Info:
> > Phone: 467-7374
> > Website:
> > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm