On Nov 3, 2011, at 12:28 PM, Stefano Sofia wrote:

Dear R users,
I have got the following data frame, called my_df:

   gender day_birth month_birth year_birth labour
1       F        22          10       2001      1
2       M        29          10       2001      2
3       M         1          11       2001      1
4       F         3          11       2001      1
5       M         3          11       2001      2
6       F         4          11       2001      1
7       F         4          11       2001      2
8       F         5          12       2001      2
9       M        22          14       2001      2
10      F        29          13       2001      2
...

I need to count data in different ways:

1. count the births for each day (having 0 when necessary) independently from the value of the "labour" column

xtabs sometimes gives better results. If you want all 31 days, then make day_birth a factor with levels=1:31.

> xtabs(  ~ day_birth + month_birth + year_birth, data=dat)
, , year_birth = 2001

         month_birth
day_birth 10 11 12 13 14
       1   0  1  0  0  0
       3   0  2  0  0  0
       4   0  2  0  0  0
       5   0  0  1  0  0
       22  1  0  0  0  1
       29  1  0  0  1  0
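
The factor suggestion above could be sketched roughly as follows. This is only an illustration: the data frame is rebuilt from the rows shown in the question under the name `dat` used in the replies, and the column `day_f` is a hypothetical helper name, not something from the original post.

```r
# Rows copied from the question; 'dat' is the name used in the replies
dat <- data.frame(
  gender      = c("F", "M", "M", "F", "M", "F", "F", "F", "M", "F"),
  day_birth   = c(22, 29, 1, 3, 3, 4, 4, 5, 22, 29),
  month_birth = c(10, 10, 11, 11, 11, 11, 11, 12, 14, 13),
  year_birth  = 2001,
  labour      = c(1, 2, 1, 1, 2, 1, 2, 2, 2, 2)
)

# 'day_f' is a hypothetical helper column: declaring all 31 days as levels
# makes xtabs keep the days without births as rows of zeros
dat$day_f <- factor(dat$day_birth, levels = 1:31)

tab <- xtabs(~ day_f + month_birth, data = dat)
dim(tab)   # 31 rows (days) by 5 columns (months present in the data)
```

The same idea should apply to month_birth if all twelve months are wanted as columns.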


2. count the births for each day (having 0 when necessary), divided by the value of "labour" (which can have two values, 1 or 2)

Cannot figure out what is being asked here. What to do with the two values? Just count them? This would give a partitioned count

> xtabs( labour==1 ~ day_birth + month_birth , data=dat)
         month_birth
day_birth 10 11 12 13 14
       1   0  1  0  0  0
       3   0  1  0  0  0
       4   0  1  0  0  0
       5   0  0  0  0  0
       22  1  0  0  0  0
       29  0  0  0  0  0
> xtabs( labour==2 ~ day_birth + month_birth , data=dat)
         month_birth
day_birth 10 11 12 13 14
       1   0  0  0  0  0
       3   0  1  0  0  0
       4   0  1  0  0  0
       5   0  0  1  0  0
       22  0  0  0  0  1
       29  1  0  0  1  0
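
If "divided by" means "split by", one possible alternative to the two separate calls is to put labour into the formula as a third dimension. The sketch below assumes that reading of the question; `tab3` is a hypothetical name, and the data are the rows shown in the question.

```r
# Rows copied from the question (only the columns needed here)
dat <- data.frame(
  day_birth   = c(22, 29, 1, 3, 3, 4, 4, 5, 22, 29),
  month_birth = c(10, 10, 11, 11, 11, 11, 11, 12, 14, 13),
  labour      = c(1, 2, 1, 1, 2, 1, 2, 2, 2, 2)
)

# One three-way table: day x month x labour
tab3 <- xtabs(~ day_birth + month_birth + labour, data = dat)

tab3[, , "1"]   # counts for labour == 1
tab3[, , "2"]   # counts for labour == 2

# Flat layout with day and month as rows and labour as columns
ftable(tab3, row.vars = 1:2)
```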



3. count the births for each day of all the years (i.e. the 22nd of October of all the years present in the data frame) independently from the value of "labour"

If I understand correctly:

> xtabs(  ~ day_birth + month_birth + year_birth, data=dat)
, , year_birth = 2001

         month_birth
day_birth 10 11 12 13 14
       1   0  1  0  0  0
       3   0  2  0  0  0
       4   0  2  0  0  0
       5   0  0  1  0  0
       22  1  0  0  0  1
       29  1  0  0  1  0
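
If the aim is instead to pool the same calendar day across years, a possible sketch is to leave year_birth out of the formula, or equivalently to sum the three-way table over its year dimension. The sample data contain only 2001, so the extra 2002 row below is purely hypothetical and added only to make the pooling visible; `pooled`, `by_year` and `same` are illustrative names.

```r
# Rows copied from the question (only the columns needed here)
dat <- data.frame(
  day_birth   = c(22, 29, 1, 3, 3, 4, 4, 5, 22, 29),
  month_birth = c(10, 10, 11, 11, 11, 11, 11, 12, 14, 13),
  year_birth  = 2001
)

# HYPOTHETICAL extra row, not in the original post: a birth on 22 October 2002
dat <- rbind(dat, data.frame(day_birth = 22, month_birth = 10, year_birth = 2002))

# Dropping year_birth from the formula pools all years together
pooled <- xtabs(~ day_birth + month_birth, data = dat)

# Same result obtained by summing the per-year table over the year dimension
by_year <- xtabs(~ day_birth + month_birth + year_birth, data = dat)
same    <- margin.table(by_year, c(1, 2))

pooled["22", "10"]   # both the 2001 and the hypothetical 2002 birth are counted
```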


4. count the births for each day of all the years (i.e. the 22nd of October of all the years present in the data frame), divided by the value of "labour"

Again confusing. Do you mean to use separate tables for labour==1 and labour==2? Perhaps some context explaining what these values represent would help. Some of us are "concrete". The results of xtabs are tables and can be divided like matrices.
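
As a rough illustration of dividing tables like matrices, the sketch below computes the share of births with labour == 1 in each day/month cell. Whether this is what the question intends is an assumption; `tab3`, `total` and `share1` are hypothetical names, and the data are the rows shown in the question.

```r
# Rows copied from the question (only the columns needed here)
dat <- data.frame(
  day_birth   = c(22, 29, 1, 3, 3, 4, 4, 5, 22, 29),
  month_birth = c(10, 10, 11, 11, 11, 11, 11, 12, 14, 13),
  labour      = c(1, 2, 1, 1, 2, 1, 2, 2, 2, 2)
)

tab3  <- xtabs(~ day_birth + month_birth + labour, data = dat)
total <- margin.table(tab3, c(1, 2))   # all births per day/month cell

# Element-wise division, exactly as with ordinary matrices;
# cells with no births at all give 0/0, i.e. NaN
share1 <- tab3[, , "1"] / total
share1
```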


I tried with the command

table(my_df$year_birth, my_df$month_birth, my_df$day_birth)

which (partially) answers question number 1 (I am not able to get 0 for the days with no births).

Is there a smart way to do that without invoking too many loops?

thank you for your help


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
