Re: [R] recoding data with loops

Erik Iverson Mon, 19 May 2008 16:48:01 -0700

Got it, I did not know of the 'recode' function in car.

So you would like to recode those specific columns then? Once again, we can do it without a loop, this time with the help of a function called lapply, which applies a function to each item in a list in turn.


Try:

reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
reversed_varnames <- paste("R", reverse_me_varnames, sep = "")

## See ?paste

mdf[reversed_varnames] <-
  lapply(mdf[reverse_me_varnames],
         function(x) recode(x, recodes = "5:7=NA; 1=4; 2=3; 3=2; 4=1;",
                as.factor.result = FALSE))

Now what does this actually mean? To the left of '<-' are simply the new columns of our data.frame. We then want to use lapply to apply some function to a list of objects. The first argument to lapply is that list. In this case, it is simply the columns of the data.frame you want reversed. A data.frame is a list in R. See ?list and ?data.frame.

Then, the next argument to lapply is a function that we want to perform on each element in our list. So, we create a function that accepts as input a variable I simply call 'x'. This 'x' is going to be an item from the list we passed lapply, which is one of the columns of mdf named in 'reverse_me_varnames'.

We then use the recode function in the car package to recode x, in a similar way to what you tried before. This function of x we define will get called three times in the above example, once for each of reverse_me_varnames. It will then assign those three new columns to the left-hand side of the <- operator, which are three newly-named columns.
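If you want to try the same pattern without loading car, here is a minimal self-contained sketch. Note the assumptions: the lookup vector 'rev_map' and the helper 'reverse_codes' are names made up for this illustration, and they stand in for car's recode rather than reproduce it. The seed is arbitrary.

```r
## Self-contained sketch: a base-R lookup vector stands in for car::recode.
set.seed(1)  # arbitrary seed, only so the example is repeatable
var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")
mdf <- data.frame(replicate(length(var_list), sample(7, 100, replace = TRUE)))
names(mdf) <- var_list

reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
reversed_varnames <- paste("R", reverse_me_varnames, sep = "")

## Position k of rev_map holds the new code for old value k,
## i.e. 1=4; 2=3; 3=2; 4=1; 5:7=NA (hypothetical name).
rev_map <- c(4, 3, 2, 1, NA, NA, NA)

## Hypothetical helper: look up each old value in rev_map.
reverse_codes <- function(x) rev_map[x]

## lapply calls reverse_codes once per selected column and returns a list,
## which is assigned to the three new columns in one step.
mdf[reversed_varnames] <- lapply(mdf[reverse_me_varnames], reverse_codes)

## Cross-tabulate old against new codes for one column to check the mapping.
table(mdf$HEQUAL, mdf$RHEQUAL, useNA = "ifany")
```

The lookup-vector trick only works because the old codes are the integers 1 to 7; for anything less regular, recode is the more general tool.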


To see why what you tried before did not work, with the for loop, try:

mdf$HEQUAL

contrasted with

t1 <- c("HEQUAL")
mdf$t1

From the help for ?Extract, $ does not allow 'computed' indices.
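If you do want to keep a loop, the usual way around this is to index with [[ ]], which does accept a name stored in a variable. A small sketch with a made-up three-row data.frame (the toy values, and the '5 - x' shortcut that only reverses codes 1 to 4, are assumptions for illustration, not your real recode rule):

```r
## Toy data, invented for this example.
mdf <- data.frame(HEQUAL = c(1, 2, 3), HREVDIS1 = c(4, 3, 2))
t1 <- "HEQUAL"

mdf$t1      # looks for a column literally named "t1"; there is none, so NULL
mdf[[t1]]   # uses the value stored in t1, so this returns the HEQUAL column

## A loop over column names, using [[ ]] on both sides of the assignment.
for (nm in c("HEQUAL", "HREVDIS1")) {
  new_nm <- paste("R", nm, sep = "")
  mdf[[new_nm]] <- 5 - mdf[[nm]]   # reverses 1..4 only: 1=4; 2=3; 3=2; 4=1
}
mdf
```

This does the same job as the lapply version, just one column at a time.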

I hope this helps!

Erik


Donald Braman wrote:

Erik,

Your example was just what I needed to generate the data -- many, many thanks! The names() function was something I had not grasped fully. I now have this and it works very nicely:

var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")
mdf <- data.frame(replicate(length(var_list), sample(7, 100, replace = TRUE))) ## generate random data

names(mdf) ## default names
names(mdf) <- var_list ## use our names
mdf

I'm still trying to figure out how to recode (using the car package) data into new variables using a similar loop. Basically, I'm not sure how to call the variable name and append it to the dataframe name in a loop. In Stata I'd do this using single quotes, but clearly that's not how R works. I tried several variations on this:


reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
reversed_varnames <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
for(i in 1:length(reverse_me_varnames))
{mdf$reversed_varnames[i] <- recode(mdf$reverse_me_varnames[i],
    '5:7=NA; 1=4; 2=3; 3=2; 4=1;', as.factor.result=FALSE)}

While I don't get an error message, the data don't change. Any advice on reverse coding non-contiguous variables?

On Mon, May 19, 2008 at 4:12 PM, Donald Braman wrote:


    Many thanks --

    You are right; I had rnorm() and sample() mixed up in my code. I'll
    work on generating a normal ordinal sample next.

    Cheers, Don


    On Mon, May 19, 2008 at 4:07 PM, Erik Iverson wrote:

        Hello -


        Donald Braman wrote:

            # I'm new to R and am trying to get the hang of how it handles
            # dataframes & loops. If anyone can help me with some simple tasks,
            # I'd be much obliged.

            # First, I'd like to generate some random data in a dataframe
            # to efficiently illustrate what I'm up to.
            # let's say I have six variables as listed below (I really
            # have hundreds, but a few will illustrate the point).
            # I want to generate my dataframe (mdf)
            # with the 6 variables X 100 values with rnorm(7).
            # How do I do this?  I tried many variations on the following:

            var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1",
            "EDISCRIM",
            "HREVDIS2")
            for(i in 1:length(var_list)) {var_list[1] <- rnorm(100)}
            mdf <- data.frame(cbind(varlist[1:length(var_list)])
            mdf

        There are many ways to do this. Do you mean that you want 6
        columns, 100 observations in each column, each a sample from a
        normal distribution with mean = 7 and sd = 1?  You can do this
        without looping in one of several ways.  If you are coming from
        a SAS environment (my guess since you talk of looping over
        data.frames), you may be used to looping through a data object.
         In R, you can usually avoid this since many functions are
        vectorized, or take a 'whole object' approach.


        var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1",
        "EDISCRIM", "HREVDIS2")

        mdf <- data.frame(replicate(6, rnorm(100, 7))) ## generate random data
        names(mdf) ## default names
        names(mdf) <- var_list ## use our names



            # Then, I'd like to recode the variables that begin with the letter "H".
            # I've tried many variations of the following, but to no avail:

            reverse_list <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
            reversed_list <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
            for(i in 1:length(reverse_list))
             {mdf[ ,e_reversed_list][[i]] <- recode(mdf[
            ,e_reverse_list][[i]],
            '5:99=NA; 1=4; 2=3; 3=2; 4=1; ', as.factor.result=FALSE)


        I'm not quite sure what you are after here.  What do you mean by
        recode? What package is your 'recode' function located in?

        It appears that you may be under the impression that the
        data.frame contains integers, but it certainly will not, since it
        was generated with rnorm. sample() can generate samples of the
        type you may be after, for example,

         > sample(7, 100, replace = TRUE)

        Best,
        Erik Iverson

--
Donald Braman
http://www.law.gwu.edu/Faculty/profile.aspx?id=10123
http://research.yale.edu/culturalcognition
http://ssrn.com/author=286206


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
