Hi Sarah, I ran the same code from your reply email. For makegroup2, the results are 0 in place of NA.

> makegroup1 <- function(x,y) {
+ group <- numeric(length(x))
+ group[x <= 1990 & y > 1990] <- 1
+ group[x <= 1991 & y > 1991] <- 2
+ group[x <= 1992 & y > 1992] <- 3
+ group
+ }
> makegroup2 <- function(x, y) {
+ ifelse(x <= 1990 & y > 1990, 1,
+ ifelse(x <= 1991 & y > 1991, 2,
+ ifelse(x <= 1992 & y > 1992, 3, 0)))
+ }
> makegroup1(df$begin,df$end)
 [1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
> makegroup2(df$begin,df$end)
 [1] 1 2 3 0 0 2 3 0 0 0 3 0 0 0 0

A. K.

----- Original Message -----
From: Sarah Goslee <sarah.gos...@gmail.com>
To: g...@asu.edu
Cc: "r-help@r-project.org" <r-help@r-project.org>
Sent: Tuesday, May 8, 2012 2:33 PM
Subject: Re: [R] grouping function

Hi,

On Tue, May 8, 2012 at 2:17 PM, Geoffrey Smith <g...@asu.edu> wrote:
> Hello, I would like to write a function that makes a grouping variable for
> some panel data. The grouping variable is made conditional on the begin
> year and the end year. Here is the code I have written so far.
>
> name <- c(rep('Frank',5), rep('Tony',5), rep('Edward',5));
> begin <- c(seq(1990,1994), seq(1991,1995), seq(1992,1996));
> end <- c(seq(1995,1999), seq(1995,1999), seq(1996,2000));
>
> df <- data.frame(name, begin, end);
> df;

Thanks for providing reproducible data. Two minor points: you don't
need ; at the end of lines, and calling your data frame df is
confusing because there's a df() function.

> #This is the part I am stuck on;
>
> makegroup <- function(x,y) {
>   group <- 0
>   if (x <= 1990 & y > 1990) {group==1}
>   if (x <= 1991 & y > 1991) {group==2}
>   if (x <= 1992 & y > 1992) {group==3}
>   return(x,y)
> }
>
> makegroup(df$begin,df$end);
>
> #I am looking for output where each observation belongs to a group
> conditional on the begin year and end year. I would also like to use a for
> loop for programming accuracy as well;

This isn't a clear specification: 1990, 1994 for instance fits into
all three groups. Do you want to extend this to more start years, or
are you only interested in those three?

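To make the overlap concrete, here is a minimal sketch (not part of the original messages) that counts how many of the three conditions each observation satisfies. It rebuilds the begin and end vectors from the question; the names fits and n_groups are hypothetical, chosen only for this illustration.

```r
# Data from the original question.
begin <- c(1990:1994, 1991:1995, 1992:1996)
end   <- c(1995:1999, 1995:1999, 1996:2000)

# One logical column per group condition (fits is a hypothetical name).
fits <- cbind(g1 = begin <= 1990 & end > 1990,
              g2 = begin <= 1991 & end > 1991,
              g3 = begin <= 1992 & end > 1992)

# Number of groups each observation qualifies for.
n_groups <- rowSums(fits)
n_groups
# 3 2 1 0 0 2 1 0 0 0 1 0 0 0 0
```

Any count above 1 marks an observation whose group depends on which rule is given priority, so the first row (begin 1990) is ambiguous across all three groups.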
Assuming end is always >= start, you don't even need to consider the
end years in your grouping. Here are two methods, one that "looks
like" your pseudocode, and one that is more R-ish. They give different
results because of different handling of cases that fit all three
groups. Rearranging the statements in makegroup1() from broadest to
most restrictive would make it give the same result as makegroup2().

makegroup1 <- function(x,y) {
  group <- numeric(length(x))
  group[x <= 1990 & y > 1990] <- 1
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1992 & y > 1992] <- 3
  group
}

makegroup2 <- function(x, y) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, 0)))
}

> makegroup1(df$begin,df$end)
 [1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
> makegroup2(df$begin,df$end)
 [1]  1  2  3 NA NA  2  3 NA NA NA  3 NA NA NA NA
> df

But really, it's a better idea to develop an unambiguous statement of
your desired output.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
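
As a follow-up to the remark about reordering the statements in makegroup1(), here is a minimal sketch (not from the original messages) of what that reordering might look like. The name makegroup1_reordered is hypothetical; makegroup2() is copied from the reply, and the data are rebuilt from the question.

```r
# Data from the original question.
begin <- c(1990:1994, 1991:1995, 1992:1996)
end   <- c(1995:1999, 1995:1999, 1996:2000)

# Nested ifelse() version, as given in the reply: the first matching
# condition wins.
makegroup2 <- function(x, y) {
  ifelse(x <= 1990 & y > 1990, 1,
         ifelse(x <= 1991 & y > 1991, 2,
                ifelse(x <= 1992 & y > 1992, 3, 0)))
}

# Hypothetical variant of makegroup1(): the same three assignments, but
# applied in reverse order so that group 1 is written last and therefore
# takes priority over groups 2 and 3.
makegroup1_reordered <- function(x, y) {
  group <- numeric(length(x))
  group[x <= 1992 & y > 1992] <- 3
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1990 & y > 1990] <- 1
  group
}

makegroup1_reordered(begin, end)
# 1 2 3 0 0 2 3 0 0 0 3 0 0 0 0

# Check that both approaches agree on this data.
all(makegroup1_reordered(begin, end) == makegroup2(begin, end))
```

With indexed assignment the last matching statement wins, while with nested ifelse() the first matching condition wins, which is why reversing the order of the assignments brings the two functions into agreement.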