Re: [R] recoding data with loops

Donald Braman Mon, 19 May 2008 17:35:39 -0700

Many, many thanks Erik!  For anyone who is searching around looking for a
way to recode in R, here's the full code Erik provided:


var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM",
              "HREVDIS2")  ## my original list of variables
## generate 100 records of random numbers sampled from 1:7
mdf <- data.frame(replicate(length(var_list),
                            sample(7, 100, replace = TRUE)))
names(mdf)  ## unnecessary, but helpful to see R's default names
names(mdf) <- var_list  ## substitutes my variable names
mdf   ## lovely!


library(car)  ## provides recode()

## these are the variables I want to reverse code
reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
## this generates the names of the reversed variables by tacking on an "R"
reversed_varnames <- paste("R", reverse_me_varnames, sep = "")

## this applies the recode function to all the variables I want to recode
## and stores them in the new "R___" variables
## (newer car releases appear to call the last argument as.factor; see ?recode)
mdf[reversed_varnames] <-
    lapply(mdf[reverse_me_varnames],
        function(x) recode(x, recodes = "5:7=NA; 1=4; 2=3; 3=2; 4=1;",
            as.factor.result = FALSE))
mdf  ## lovely!
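
For readers without the car package, the same reverse coding can probably be
sketched in base R. This is only an illustration, not part of Erik's code:
reverse_code() is a hypothetical helper name, the seed is arbitrary, and the
data is regenerated so the snippet runs on its own.

```r
## Hypothetical helper (not from car): 1..4 become 4..1, anything else NA
reverse_code <- function(x) ifelse(x %in% 1:4, 5L - x, NA_integer_)

set.seed(1)  ## arbitrary seed, only so the example is repeatable
var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")
mdf <- data.frame(replicate(length(var_list), sample(7, 100, replace = TRUE)))
names(mdf) <- var_list

reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
reversed_varnames <- paste("R", reverse_me_varnames, sep = "")

## same lapply pattern as above, with the base R helper in place of recode()
mdf[reversed_varnames] <- lapply(mdf[reverse_me_varnames], reverse_code)

head(mdf[c("HEQUAL", "RHEQUAL")])  ## compare original and reversed values
```

The arithmetic 5 - x only works because the valid codes run from 1 to 4; a
recode string is more flexible when the mapping is irregular.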


I really like that R doesn't even need to use loops to do this -- seems very
efficient to me!
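
For comparison, a loop can also do the job as long as columns are indexed
with [[ ]] rather than $, since $ does not accept computed names. A rough
sketch with a small made-up data frame (df, old_names and new_names are
illustrative names), using base R arithmetic instead of car's recode so that
it runs without extra packages:

```r
## Small made-up data frame, for illustration only
df <- data.frame(HEQUAL = c(1, 2, 3, 4, 6), EWEALTH = c(7, 5, 3, 1, 2))

old_names <- c("HEQUAL")
new_names <- paste("R", old_names, sep = "")

for (i in seq_along(old_names)) {
    x <- df[[old_names[i]]]          ## [[ ]] accepts a computed column name
    df[[new_names[i]]] <- ifelse(x %in% 1:4, 5 - x, NA)
}

t1 <- "HEQUAL"
df$t1      ## NULL: $ looks for a column literally named "t1"
df[[t1]]   ## the HEQUAL column
df$RHEQUAL ## the reverse-coded column created by the loop
```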




On Mon, May 19, 2008 at 6:49 PM, Erik Iverson <[EMAIL PROTECTED]>
wrote:

> Got it, I did not know of the 'recode' function in car.
>
> So you would like to recode those specific columns then?  Once again, we
> can do it without a loop, this time with the help of a function called
> lapply, which applies a function to each item in a list in turn.
>
> Try:
>
> reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
> reversed_varnames <-paste("R", reverse_me_varnames, sep = "")
>
> ## See ?paste
>
> mdf[reversed_varnames] <-
>  lapply(mdf[reverse_me_varnames],
>         function(x) recode(x, recodes = "5:7=NA; 1=4; 2=3; 3=2; 4=1;",
>                as.factor.result = FALSE))
>
> Now what does this actually mean?  To the left of '<-' are simply the new
> columns of our data.frame.  We then use lapply to apply a function to each
> element of a list.  The first argument to lapply is that list.  In this
> case, it is simply the columns of the data.frame you want reversed.  A
> data.frame is a list in R.  See ?list and ?data.frame. Then, the next
> argument to lapply is the function that we want to perform on each element
> in our list.  So, we create a function that accepts as input a variable I
> simply call 'x'.  This 'x' is going to be an item from the list we passed
> lapply, which is one of the columns of mdf named in 'reverse_me_varnames'.
>
> We then use the recode function in the car package to recode x, in a
> similar way to what you tried before.  This function of x we define will get
> called three times in the above example, once for each of
> reverse_me_varnames.  lapply returns the three recoded columns as a list,
> which is then assigned to the three newly-named columns on the left-hand
> side of the <- operator.
>
> To see why what you tried before did not work, with the for loop, try:
>
> mdf$HEQUAL
>
> contrasted with
>
> t1 <- c("HEQUAL")
> mdf$t1
>
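> From the help for ?Extract, $ does not allow 'computed' indices, so mdf$t1
> looks for a column literally named "t1" and returns NULL.  Use mdf[[t1]] or
> mdf[t1] when the column name is stored in a variable.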
>
> I hope this helps!
>
> Erik
>
>
> Donald Braman wrote:
>
>> Erik,
>>
>> Your example was just what I needed to generate the data -- many, many
>> thanks!  The names() function was something I had not grasped fully. I now
>> have this and it works very nicely:
>>
>> var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM",
>> "HREVDIS2")
>> mdf <- data.frame(replicate(length(var_list), sample(7,100, replace =
>> TRUE))) ## generate random data
>> names(mdf) ## default names
>> names(mdf) <- var_list ## use our names
>> mdf
>>
>> I'm still trying to figure out how to recode (using the car package) data
>> into new variables using a similar loop. Basically, I'm not sure how to call
>> the variable name and append it to the dataframe name in a loop.  In Stata
>> I'd do this using single quotes, but clearly that's not how R works.  I
>> tried several variations on this:
>>
>> reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
>> reversed_varnames <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
>> for(i in 1:length(reverse_me_varnames))
>>  {mdf$reversed_varnames[i] <- recode(mdf$reverse_me_varnames[i], '5:7=NA;
>> 1=4; 2=3; 3=2; 4=1;', as.factor.result=FALSE)
>>
>> While I don't get an error message, the data don't change.  Any advice on
>> reverse coding non-contiguous variables?
>>
>>
>>
>> On Mon, May 19, 2008 at 4:12 PM, Donald Braman <[EMAIL PROTECTED]> wrote:
>>
>>    Many thanks --
>>
>>    You are right; I had rnorm() and sample() mixed up in my code. I'll
>>    work on generating a normal ordinal sample next.
>>
>>    Cheers, Don
>>
>>
>>    On Mon, May 19, 2008 at 4:07 PM, Erik Iverson
>>    <[EMAIL PROTECTED]> wrote:
>>
>>        Hello -
>>
>>
>>        Donald Braman wrote:
>>
>>            # I'm new to R and am trying to get the hang of how it handles
>>            # dataframes & loops. If anyone can help me with some simple
>>            tasks,
>>            # I'd be much obliged.
>>
>>            # First, i'd like to generate some random data in a dataframe
>>            # to efficiently illustrate what I'm up to.
>>            # let's say I have six variables as listed below (I really
>>            # have hundreds, but a few will illustrate the point).
>>            # I want to generate my dataframe (mdf)
>>            # with the 6 variables X 100 values with rnorm(7).
>>            # How do I do this?  I tried many variations on the following:
>>
>>            var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1",
>>            "EDISCRIM",
>>            "HREVDIS2")
>>            for(i in 1:length(var_list)) {var_list[1] <- rnorm(100)}
>>            mdf <- data.frame(cbind(varlist[1:length(var_list)])
>>            mdf
>>
>>        There are many ways to do this. Do you mean that you want 6
>>        columns, 100 observations in each column, each a sample from a
>>        normal distribution with mean = 7 and sd = 1?  You can do this
>>        without looping in one of several ways.  If you are coming from
>>        a SAS environment (my guess since you talk of looping over
>>        data.frames), you may be used to looping through a data object.
>>         In R, you can usually avoid this since many functions are
>>        vectorized, or take a 'whole object' approach.
>>
>>
>>        var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1",
>>        "EDISCRIM", "HREVDIS2")
>>
>>        mdf <- data.frame(replicate(6, rnorm(100, 7))) ## generate
>>        random data
>>        names(mdf) ## default names
>>        names(mdf) <- var_list ## use our names
>>
>>
>>
>>            # Then, I'd like to recode the variables that begin with the
>>            letter "H".
>>            # I've tried many variations of the following, but to no avail:
>>
>>            reverse_list <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
>>            reversed_list <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
>>            for(i in 1:length(reverse_list))
>>             {mdf[ ,e_reversed_list][[i]] <- recode(mdf[
>>            ,e_reverse_list][[i]],
>>            '5:99=NA; 1=4; 2=3; 3=2; 4=1; ', as.factor.result=FALSE)
>>
>>
>>        I'm not quite sure what you are after here.  What do you mean by
>>        recode? What package is your 'recode' function located in?
>>
>>        It appears that you may be under the impression that the
>>        data.frame contains integers, but it certainly will not, since it
>>        was generated with rnorm.  sample() can generate samples of the
>>        type you may be after, for example,
>>
>>         > sample(7, 100, replace = TRUE)
>>
>>        Best,
>>        Erik Iverson
>>
>>
>>
>>
>>    --    Donald Braman
>>    http://www.law.gwu.edu/Faculty/profile.aspx?id=10123
>>    http://research.yale.edu/culturalcognition
>>    http://ssrn.com/author=286206
>>
>>
>


