gravityflyer <gravityflyer <at> yahoo.com> writes:

>
> Hi everyone,
>
> I've got a dataset with 12,000 observations. One of the variables
> (cleary$D1) is for an individual's country, coded 1 - 15. I'd like to create
> a dummy variable for the Baltic states which are coded 4,6, and 7. In other
> words, as a dummy variable Baltic states would be coded 1, else 0. I've
> attempted the following for loop:
>
> dummy <- matrix(NA, nrow=nrow(cleary), ncol=1)
> for (i in 1:length(cleary$D1)){
> if (cleary$D1 == 4){dummy[i] = 1}
> else {dummy[i] = 0}
> }
>
> Unfortunately it generates the following error:
>
> 1: In if (cleary$D1 == 4) { ... :
> the condition has length > 1 and only the first element will be used
>
> Another options I've tried is the following:
>
> binary <- vector(length=length(cleary$D1))
> for (i in 1:length(cleary$D1)) {
> if (cleary$D1 == 4 | cleary$D1 == 6 | cleary$D1 == 7 ) {binary[i] = 1}
> else {binary[i] = 0}
> }
>
> Unfortunately it simply responds with "syntax error".
>
> Any thoughts would be greatly appreciated!
>
Be aware that R is a vectorised programming language, so your for loop is
completely unnecessary. This is what I'd do:

dummy <- rep(0, nrow(cleary))
dummy[cleary$D1 %in% c(4,6,7)] <- 1

This is your dummy variable.

Below is a working (though VERY inefficient) version of your for loop:

binary <- vector(length=length(cleary$D1))
for (i in 1:length(cleary$D1)) {
    if (cleary$D1[i] == 4 | cleary$D1[i] == 6 | cleary$D1[i] == 7 ) {
        binary[i] = 1
    } else {
        binary[i] = 0
    }
}

Now try to figure out:
- what is the difference between your for() loop and mine?
- which code is simpler (and better), the vectorised one or the for() loop?

I hope it helps,
Adrian
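
For anyone who wants to check that the two approaches agree, here is a minimal sketch. The data frame is a made-up stand-in for the poster's data (the real cleary is not available here), and the name baltic is only an illustrative helper. It builds the dummy with %in%, builds it again with an element-wise loop, and compares the two.

```r
# Hypothetical stand-in for the poster's data: 12,000 rows, country codes 1-15
set.seed(1)
cleary <- data.frame(D1 = sample(1:15, 12000, replace = TRUE))

# Country codes for the Baltic states, as given in the question
baltic <- c(4, 6, 7)

# Vectorised version: %in% returns one TRUE/FALSE per row,
# and as.integer() converts TRUE/FALSE to 1/0
dummy <- as.integer(cleary$D1 %in% baltic)

# Element-wise loop for comparison: note the [i], so that if() sees
# a single value rather than the whole column
binary <- numeric(nrow(cleary))
for (i in seq_along(cleary$D1)) {
  if (cleary$D1[i] %in% baltic) binary[i] <- 1
}

# Both approaches should produce the same 0/1 values
print(identical(dummy, as.integer(binary)))

# Cross-tabulate country code against the dummy as a sanity check
print(table(cleary$D1, dummy))
```

One thing to keep in mind with this approach: %in% returns FALSE for NA, so rows with a missing country code would end up as 0 rather than NA. If that matters for your data, handle the missing values separately.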