On Wed, Aug 27, 2008 at 12:11 PM, Josip Dasovic <[EMAIL PROTECTED]> wrote:
> Hello:
>
> As someone making the move from STATA to R, I'm finding it difficult at times 
> to perform basic tasks in R, so forgive me if I've missed an obvious and 
> easily obtained solution to my problem.   I've searched the help guides and 
> the archives and have not been able to find a solution that works.
>
> I have a data frame with thousands of observations that looks something like 
> this:
>
> YEAR MONTH DAY COUNTRY   REGION     PROVINCE            CITY
> 1994     1  22 Sri Lanka South Asia Northern (Province) Pungudutivu
> 1994     1  25 Sri Lanka South Asia Central (Province)  Kandy
> 1994     2  26 Sri Lanka South Asia Central (Province)  Kandy
> 1994     2  28 Sri Lanka South Asia Eastern (Province)  Wakianeri
> 1994     6  28 Sri Lanka South Asia Eastern (Province)  Valachenai
> 1994     6  31 Sri Lanka South Asia Central (Province)  Kandy
> 1995     3   1 Sri Lanka South Asia North (Province)    Kilinochchi
> 1995     3   6 Sri Lanka South Asia Western (Province)  Colombo
> 1995     7  15 Sri Lanka South Asia Northern (Province) Mankulam
> 1995     7  23 Sri Lanka South Asia Northern (Province) Point Pedro
> 1995     9  25 Sri Lanka South Asia Northern (Province) Kilali
> ...
>
> What I would like to do is to calculate the total number of observations by 
> unique combinations of the values of (some of the) variables above.
>
> For example, I would like to know how many observations (i.e. rows) have the 
> values YEAR==1994 and MONTH==1.
>
> In the end, I'd like a table that looks like this:
>
> YEAR MONTH #OBS
> 1994     1    2
> 1994     2    2
> 1994     3    0
> 1994     4    0
> 1994     5    0
> 1994     6    2
> 1994     7    0
> 1994     8    0
> 1994     9    0
> 1994    10    0
> 1994    11    0
> 1994    12    0
> 1995     1    0
> 1995     2    0
> 1995     3    2
> 1995     4    0
> ...
>
> I do need to fill out the table with all the possible combinations, even 
> where there are no observations with that combination in the data set.
> I think that aggregate is probably the way to go, but there doesn't seem
> to be an appropriate summary function (FUN) available.  Thanks in advance
> for any help in this matter,


For this, and other related problems, you might want to look at the
reshape package - http://had.co.nz/reshape
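
For the specific table asked for above, here is a minimal sketch in base R
(not the reshape package itself). It assumes the data frame is called
`events`, which is a hypothetical name, and uses only the YEAR and MONTH
values from the sample rows quoted above. The idea is to turn both columns
into factors with every possible level declared, so that table() also
reports the combinations that never occur:

```r
# Hypothetical data frame name; values copied from the quoted sample rows.
events <- data.frame(
  YEAR  = c(1994, 1994, 1994, 1994, 1994, 1994, 1995, 1995, 1995, 1995, 1995),
  MONTH = c(1, 1, 2, 2, 6, 6, 3, 3, 7, 7, 9)
)

# Declare all possible levels so that empty combinations are kept.
events$YEAR  <- factor(events$YEAR,  levels = 1994:1995)
events$MONTH <- factor(events$MONTH, levels = 1:12)

# Cross-tabulate, then flatten to one row per YEAR/MONTH combination.
counts <- as.data.frame(table(YEAR = events$YEAR, MONTH = events$MONTH),
                        responseName = "OBS")

# Sort by year, then by month (factors sort in their declared level order).
counts <- counts[order(counts$YEAR, counts$MONTH), ]
rownames(counts) <- NULL
print(counts)
```

The year range and the column name OBS are assumptions for illustration;
with the full data set the levels would need to cover every year of
interest.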

Hadley

-- 
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
