Your sample data has startdate and enddate columns as though these values might vary throughout the data set, but your description suggests that those values are fixed for the whole dataset.

If those values are really constant, then

x$Year <- 1995L + 12 * ( ( x$month - 1L ) %/% 12L )
x$Month <- ( x$month - 1L ) %% 12L + 1

should be enough.

If in fact your startdate can be different for different IDs, then you would need to wrap this up a bit:

############ base R
x$startdate <- as.Date( as.character( x$startdate ) )
x$Y <- as.integer( format( DF$startdate, format = "%Y" ) )
xlist <- split( x, x$id )
xlist1 <- lapply( xlist
                , FUN = function( DF ) {
                    DF$Year <- Y + 12 * ( ( DF$month - 1L ) %/% 12L )
                    DF$Month <- ( DF$month - 1L ) %% 12L + 1
                    DF
                  }
                )
x1 <- unsplit( xlist1, x$id )

###### dplyr
library(dplyr)
x2 <- (   x
      %>% mutate( Y = as.integer( format( startdate, format = "%Y" ) ) )
      %>% group_by( id )
      %>% mutate( Year = Y + 12L * ( ( month - 1L ) %/% 12L )
                , Month = ( month - 1L ) %% 12L + 1L
                )
      %>% ungroup
      )
#########

On Mon, 2 Sep 2019, Miluji Sb wrote:

Dear all,

I have a panel data with a large number of groups and the cumulative number
of months (1 - 372) for January 1995 to December 2005. My goal is to
extract the corresponding month and year for each observation.

I tried the following;

###
x %>%
 group_by(id) %>%
 do( data.frame(., Date= seq(.$startdate,
                             as.Date('1975-01-01'), by = '1 month')))

However, I get the following error;

###
Error in seq.default(.$startdate, as.Date("1975-01-01"), by = "1 month") :
 'from' must be of length 1

Essentially, I want to convert the month number (1-372) to January 1975 to
December 2005 by ID. Is that possible? Any help will be highly appreciated.

### Data ###
x <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), month =
c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), startdate = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1975-01-01", class = "factor"),
   enddate = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
"2005-12-31", class = "factor")), class = "data.frame", row.names = c(NA,
-9L))

Cross-posted in Statalist - however, Stata seems to have a very inflexible
date-time structure.

Best regards,

Milu

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to