Try the following:

out <- tapply(data1$ID, list(data1$ID, data1$Year), length)
out[is.na(out)] <- 0
out
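Note that "out" is a table of IDs by years rather than a per-row column. If the count is needed on every row of data1 (the NinYear column from the question), one possibility that is not part of the original reply is ave(), which applies a function within groups and returns a vector aligned with the rows. A minimal sketch, assuming a few made-up rows in the same shape as data1:

```r
# Made-up sample rows in the same shape as data1 (illustrative values only)
data1 <- data.frame(
  ID   = c(209, 209, 213, 213, 213, 234),
  Year = c(1971, 1971, 1951, 1953, 1953, 1953)
)

# Cross-tabulation as suggested above: rows are IDs, columns are Years
out <- tapply(data1$ID, list(data1$ID, data1$Year), length)
out[is.na(out)] <- 0

# Per-row count: for each row, the number of rows sharing its ID and Year.
# seq_along() only supplies a vector of the right length; FUN = length
# returns the group size, which ave() recycles across the group's rows.
data1$NinYear <- ave(seq_along(data1$ID), data1$ID, data1$Year, FUN = length)
data1
```

Both calls are vectorised over the grouping, so neither needs an explicit for loop over the rows.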

I hope it helps.

Best,
Dimitris

Tom La Bone wrote:
Assume that I have the dataframe "data1", which is listed at the end of this message. I want to count the number of lines that each person has for each year. For example, the person with ID=213 has 15 entries (NinYear) for 1953. The following bit of code calculates NinYear:

for (i in 1:length(data1$ID)) {
  data1$NinYear[i] <- length(data1[data1$Year==data1$Year[i] & data1$ID==data1$ID[i],1])
}

This seems to work but is horribly slow (some files I am working with have over 500,000 lines). Can anyone suggest a faster way of doing this, perhaps a way that does not use a for loop?

Thanks.

Tom

ID   Year  NinYear
209  1971  0
209  1971  0
213  1951  0
213  1951  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1953  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1954  0
213  1955  0
213  1955  0
234  1953  0
234  1953  0
234  1953  0
234  1953  0
234  1953  0
234  1958  0
234  1958  0
234  1965  0
234  1965  0
234  1965  0
249  1952  0
249  1952  0

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014