If I could add one comment: the above solution leaves NAs if you are not
recoding all of the values (any value without a match is assigned NA).
The fix is simple enough:

df1 <- data.frame(V1=1:3,V2=c(paste(LETTERS[1],LETTERS[1:3],sep='')),
                  stringsAsFactors = FALSE)
unique(df1$V2)
[1] "AA" "AB" "AC"

df1$new <- c("d","e")[match(df1$V2, unique(df1$V2)[-1])]
df1
  V1 V2  new
1  1 AA <NA>
2  2 AB    d
3  3 AC    e

na.loc <- which(df1$new %in% NA)
df1$new[na.loc] <- df1$V2[na.loc]
df1
  V1 V2 new
1  1 AA  AA
2  2 AB   d
3  3 AC   e
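
For what it's worth, the match step and the NA fill-in above can probably be
folded into one small helper. This is only a minimal sketch, assuming the
recodes are kept in a named character vector (old value as the name, new value
as the element). The names lookup and recode_partial are made up for
illustration and do not come from base R or any package.

```r
# Sketch only: `lookup` and `recode_partial` are hypothetical names.
df1 <- data.frame(V1 = 1:3,
                  V2 = paste(LETTERS[1], LETTERS[1:3], sep = ""),
                  stringsAsFactors = FALSE)

# Old value -> new value; "AA" is deliberately left out.
lookup <- c(AB = "d", AC = "e")

recode_partial <- function(x, lookup) {
  idx <- match(x, names(lookup))      # NA where x has no entry in lookup
  out <- unname(lookup[idx])          # recoded values, NA for unmatched ones
  out[is.na(idx)] <- x[is.na(idx)]    # fall back to the original value
  out
}

df1$new <- recode_partial(df1$V2, lookup)
df1
```

With the example data this should give the same new column as above (AA, d, e),
and values that are not in the lookup simply pass through unchanged.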


Thanks again.

On Wed, Mar 28, 2012 at 3:52 PM, David Winsemius <dwinsem...@comcast.net> wrote:

>
> On Mar 28, 2012, at 6:40 PM, Trevor Davies wrote:
>
>> Thank you, works perfectly.
>
> Good. There is also a recode function in package 'car' (IIRC) which
> attempts to replicate the syntax of the same command in SPSS. But once I
> figured out how to use match() as an index with "[", I have never needed to
> use that method. (Which is not to say that I am not in debt to John Fox...
> his scatter3d function is a real gem.)
>
> --
> david.
>
>> On Wed, Mar 28, 2012 at 3:11 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>>
>>
>>> On Mar 28, 2012, at 5:26 PM, Trevor Davies wrote:
>>>
>>>> I've looked but I cannot find a more elegant solution.
>>>>
>>>> I would like to be able to scan through a data.frame and remove multiple
>>>> and various instances of certain contents.
>>>>
>>>> A trivial example is below.  It works, it just seems like there should
>>>> be a one line solution.
>>>>
>>>> #Example data:
>>>> a <- data.frame(V1=1:3,V2=c(paste(LETTERS[1],LETTERS[1:3],sep='')),
>>>>                 stringsAsFactors = FALSE)
>>>>
>>>> #> a
>>>> # V1 V2
>>>> #1  1 AA
>>>> #2  2 AB
>>>> #3  3 AC
>>>>
>>>> #Cumbersome solution (which would be even more cumbersome with real
>>>> data)
>>>>
>>>> indices.of.aa <- which(a$V2 %in% "AA")
>>>> indices.of.ab <- which(a$V2 %in% "AB")
>>>> indices.of.ac <- which(a$V2 %in% "AC")
>>>> a$V2 <- replace(a$V2, indices.of.aa, "c")
>>>> a$V2 <- replace(a$V2, indices.of.ab, "d")
>>>> a$V2 <- replace(a$V2, indices.of.ac, "e")
>>>>
>>>>
>>> Use match:
>>>
>>> df1 <- data.frame(V1=1:3,V2=c(paste(LETTERS[1],LETTERS[1:3],sep='')),
>>>                stringsAsFactors = FALSE)
>>>
>>> > unique(df1$V2)
>>> [1] "AA" "AB" "AC"
>>>
>>> > df1$new <- c("c","d","e")[match(df1$V2, unique(df1$V2))]
>>> > df1
>>>   V1 V2 new
>>> 1  1 AA   c
>>> 2  2 AB   d
>>> 3  3 AC   e
>>>
>>>> ## output
>>>> #> a
>>>> #  V1 V2
>>>> #1  1  c
>>>> #2  2  d
>>>> #3  3  e
>>>>
>>>> I know with the trivial example above there are extremely simple solutions
>>>> but my data.frame is a few thousand rows.
>>>> Thanks all.
>>>> Trevor
>>>>
>>>>      [[alternative HTML version deleted]]
>>>>
>>>>
>>>
>>> ---
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>>
>>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>
