Hello everybody, I have a data frame in R similar to the one below. My real 'df' data frame is much bigger than this one, but I have simplified things as much as possible to avoid confusion.
So here's the data frame.

id <- c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3)
a <- c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
b <- c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
c <- c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2)
d <- c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2)
e <- c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)
df <- data.frame(id,a,b,c,d,e)
df

Basically, what I would like to do is get the occurrences of numbers for each column (a,b,c,d,e) and for each id group (1,2,3), where the grouping is given by my column 'id'.

So, for column 'a' and id number '1', the code would be something like this:

as.numeric(table(df[1:10,2]))

The results are:

[1] 3 7

Just to briefly explain my results: in column 'a', looking only at those records which have number '1' in column 'id', number 1 occurred 3 times and number 3 occurred 7 times. Number 2 did not occur at all, so it does not appear in the output.

Here is another example. For column 'a' and id number '2':

as.numeric(table(df[11:20,2]))

After running the code the results are:

[1] 4 3 3

That is, in column 'a', looking only at those observations which have number '2' in column 'id', number 1 occurred 4 times, number 2 occurred 3 times and number 3 occurred 3 times.

Last example: for column 'e' and id number '3' the code would be:

as.numeric(table(df[21:30,6]))

With the results:

[1] 1 4 5

...meaning that number '1' occurred once, number '2' occurred four times and number '3' occurred 5 times.

So this is what I would like to do: calculate the occurrences of numbers for each custom-defined subset, and then collect these values into a data frame.
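For reference, one possible way to generalize the three hand-written examples above is sketched below. It is only a sketch under some assumptions: the grouping column is always called 'id', every other column is a value column, and a long result with one row per column/id/value combination is acceptable. The names value_cols, lv and counts are made up for this illustration. It uses only base R (table, factor, lapply, as.data.frame, rbind), so it does not depend on hard-coded row ranges or column positions.

```r
# Example data, identical to the data frame above
id <- rep(c(1, 2, 3), each = 10)
a <- c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
b <- c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
c <- c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2)
d <- c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2)
e <- c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)
df <- data.frame(id, a, b, c, d, e)

# Assumption: every column except 'id' holds values to be counted,
# however many such columns there are
value_cols <- setdiff(names(df), "id")

# Shared set of levels, so a value that never occurs in a subset is
# reported with a count of 0 instead of being dropped
lv <- sort(unique(unlist(df[value_cols])))

# For each value column, cross-tabulate id against value and turn the
# table into a long data frame; then stack the pieces
counts <- do.call(rbind, lapply(value_cols, function(col) {
  tab <- table(id = df$id, value = factor(df[[col]], levels = lv))
  out <- as.data.frame(tab, stringsAsFactors = FALSE)
  cbind(column = col, out, stringsAsFactors = FALSE)
}))

# Counts for column 'a' within id 1 (values 1, 2, 3)
subset(counts, column == "a" & id == "1")
```

With the example data, the last line should show Freq values 3, 0 and 7 for the values 1, 2 and 3, which matches the first hand-written example apart from the explicit zero for the value 2. Because the columns are taken from names(df) and the groups from df$id, the same code should keep working when rows or columns are added.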
I know it is NOT a difficult task, but the PROBLEM is that I will have to change the input 'df' data frame on a regular basis, so both the overall number of rows and the number of columns might CHANGE over time.

What I have done so far is separate the 'df' data frame by columns, like this:

for (z in 2:ncol(df)) assign(paste("df", z, sep = "."), df[, z])

So df.2 refers to df$a, df.3 equals df$b, df.4 equals df$c, etc. But I'm really stuck now and I don't know how to move forward, that is, how to get the occurrences for each column and each group of ids.

Do you have any ideas?

Best regards,
Laszlo
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.