For sequential analysis of event sequences, I want to calculate a series of lagged versions of a (numeric or character) variable. The simple function below does this, but I can't see how to generalize it to the case where there is also a factor variable (by) and the lags must be calculated separately for each level of that factor. Can anyone help?

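# produce k lagged versions of a numeric or character variable
lags <- function(x, k=1, prefix='lag', by) {
  if (missing(by)) {
    n <- length(x)
    res <- data.frame(lag0=x)
    # seq_len(k) rather than 1:k, so that k=0 returns just lag0
    for (i in seq_len(k)) {
      # shift x down by i places, padding the top with NA;
      # min()/max() keep the column at length n even when i >= n
      res <- cbind(res, c(rep(NA, min(i, n)), x[seq_len(max(n-i, 0))]))
    }
    colnames(res) <- paste0(prefix, 0:k)
    return(res)
  }
  else {
    stop('by not yet implemented')
  }
}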

# tests
> events <- sample(letters[1:4], 10, replace=TRUE)
> lags(events)
   lag0 lag1
1     c <NA>
2     a    c
3     b    a
4     d    b
5     d    d
6     c    d
7     d    c
8     c    d
9     c    c
10    d    c
> lags(events, 3)
   lag0 lag1 lag2 lag3
1     c <NA> <NA> <NA>
2     a    c <NA> <NA>
3     b    a    c <NA>
4     d    b    a    c
5     d    d    b    a
6     c    d    d    b
7     d    c    d    d
8     c    d    c    d
9     c    c    d    c
10    d    c    c    d
>

# similar, with by=sub variable

> events2 <- data.frame(sub=rep(1:2, each=5),
+                       event=sample(letters[1:4], 10, replace=TRUE),
+                       stringsAsFactors=FALSE)
> events2
   sub event
1    1     b
2    1     d
3    1     d
4    1     c
5    1     b
6    2     b
7    2     b
8    2     b
9    2     d
10   2     a

> # do it separately for each sub ...
> (lg <- lapply(split(events2$event, events2$sub), lags, 2))
$`1`
  lag0 lag1 lag2
1    b <NA> <NA>
2    d    b <NA>
3    d    d    b
4    c    d    d
5    b    c    d

$`2`
  lag0 lag1 lag2
1    b <NA> <NA>
2    b    b <NA>
3    b    b    b
4    d    b    b
5    a    d    b

This gives roughly what I want, but I need the 'sub' variable to appear explicitly in the result. Binding the pieces back together only encodes it in the row names:

> do.call(rbind, lg)
    lag0 lag1 lag2
1.1    b <NA> <NA>
1.2    d    b <NA>
1.3    d    d    b
1.4    c    d    d
1.5    b    c    d
2.1    b <NA> <NA>
2.2    b    b <NA>
2.3    b    b    b
2.4    d    b    b
2.5    a    d    b
>
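One direction I can imagine, as a rough sketch only: compute each lag within the levels of the grouping variable using base R's ave(), which applies a function per group and returns the values in the original row order. The names lag_one() and lags_by(), and the generic 'by' column name in the result, are hypothetical and used just for illustration; the example data are the events2 values printed above, typed in by hand.

```r
# Sketch only: lag_one() and lags_by() are hypothetical helper names.

# shift a vector down by i places, padding the top with NA
lag_one <- function(x, i) {
  n <- length(x)
  c(rep(NA, min(i, n)), x[seq_len(max(n - i, 0))])
}

# k lagged versions of x, computed separately within each level of 'by'
lags_by <- function(x, by, k = 1, prefix = 'lag') {
  res <- data.frame(by = by, lag0 = x, stringsAsFactors = FALSE)
  for (i in seq_len(k)) {
    # ave() splits x by 'by', lags each piece, and reassembles the
    # result in the original row order
    res[[paste0(prefix, i)]] <- ave(x, by, FUN = function(v) lag_one(v, i))
  }
  names(res)[2] <- paste0(prefix, 0)
  res
}

# the events2 data shown above, entered explicitly
events2 <- data.frame(sub = rep(1:2, each = 5),
                      event = c('b', 'd', 'd', 'c', 'b',
                                'b', 'b', 'b', 'd', 'a'),
                      stringsAsFactors = FALSE)
out <- lags_by(events2$event, events2$sub, k = 2)
print(out)
```

Because ave() preserves row order, this should also cope with data in which the levels of the grouping variable are interleaved rather than stacked, which the split/rbind route does not.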

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help