Stephan Kolassa wrote: > > Dear guRus, > > is there a package that calculates the Approximate Entropy (ApEn) of a > time series? > > RSiteSearch only gave me a similar question in 2004, which appears not > to have been answered: > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/28830.html > > RSeek.org didn't yield any results at all. > > Happy holidays (where appropriate), > Stephan > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > I went ahead and translated D. Kaplan's MATLAB code for this purpose into R. Warning: it's barely tested, and I'm not sure I understand what it's doing -- you're on your own from here ... (it won't work if you simply cut and paste what's here, since the examples are above the definitions of the utility functions etc.) ## http://www.macalester.edu/~kaplan/hrv/doc/funs/apen.html ## Approximate Entropy ## Syntax ## entropy = apen( pre, post, r ); ## Arguments ## pre An embedding of data. ## post The images of the data in the embedding. ## r The filter factor, which sets the length scale over which to compute the approximate entropy. ## Returned Values ## entropy The numerical value of the approximate entropy. ## Description ## The "approximate entropy" was introduced by Pincus to quantify the ## creation of information in a time series. A low value of the ## entropy indicates that the time series is deterministic; a high ## value indicates randomness. ## The "filter factor" r is an important parameter. In principle, with ## an infinite amount of data, it should approach zero. With finite ## amounts of data, or with measurement noise, it is not always clear ## what is the best value to choose. Past work on heart rate ## variability has suggested setting r to be 0.2 times the standard ## deviation of the data. ## Another important parameter is the "embedding dimension." Again, ## there is know precise means of knowing the best such dimension, but ## previous work has used a dimension of 2. The final parameter is the ## embedding lag, which is often set to 1, but perhaps more ## appropriately is set to be the smallest lag at which the ## autocorrelation function of the time series is close to zero. ## The apen function expects the data to be presented in a specific ## format. Working with a time series tseries, the following steps ## will compute the approximate entropy, with an embedding dimension ## of 2 and a lag of 1. ## edim = 2; ## lag = 1; ## edata = lagembed(tseries,edim,lag); ## [pre,post] = getimage(edata,lag); ## r = 0.2*std(tseries); ## apen(pre,post,r); edim <- 2 lag <- 1 edata <- lagembed(tseries,edim,lag) im <- getimage(edata,lag) r <- 0.2*sd(tseries) apen(im$pre,im$post,r) apenembed <- function(tseries,edim,lag=1,relr=0.2,r) { edata <- lagembed(tseries,edim,lag) im <- getimage(edata,lag) if (missing(r)) r <- relr*sd(tseries) apen(im$pre,im$post,r) } ## References ## * SM Pincus (1991) Proc. Natl. Acad. Sci. USA 88:2297-2301 ## * D Kaplan, MI Furman, SM Pincus, SM Ryan, LA Lipsitz, AL Goldberger (1991) "Aging and the complexity of cardiovascular dynamics," Biophys.J. 59:945-949 ## See Also ## apenhr. lagembed. getimage. ## Examples lagembed <- function(ts,d,lag) { z <- embed(ts,d) z[seq(1,nrow(z),by=lag),] } getimage <- function(x,pred) { pre <- x[1:(nrow(x)-pred),] post <- x[(pred+1):nrow(x),1] list(pre=pre,post=post) } ## tseries = randn(500,1); ## should have a large approximate entropy. set.seed(1001) apenembed(rnorm(500),edim=3) ## 0.3240493 apenembed(sin(seq(1,100,by=0.2)),edim=3) ## should have a small approximate entropy since it is deterministic. ## 0.1001811 ## examples from utils web page: ts <- 1:5 x <- lagembed(ts,3,1) im <- getimage(x,1) apen <- function(pre,post,r) { ## translated from Kaplan D (1998) HRV software. ## http://www.macalester.edu/~kaplan/hrv/doc/Feb3snap.tar ## computer approximate entropy a la Steve Pincus N <- nrow(pre) p <- ncol(pre) ## number of pairs of points closer than r in pre/post space phiM <- phiMplus1 <- 0 for (k in 1:N) { ## replicate the current point foo <- matrix(rep(pre[k,],N),byrow=TRUE,nrow=N) ## calculate distance (max norm) goo <- abs(foo-pre)<=r ## which ones of them are closer than r using the max norm? closerpre <- if (p==1) goo else apply(goo,1,all) precount <- sum(closerpre) phiM <- phiM+log(precount) ## of the ones that were closer in the pre space, how many are closer ## in post also ? postcount <- sum(abs(post[closerpre] - post[k])<r) phiMplus1 <- phiMplus1 + log(postcount) ## cat(k,precount,phiM,postcount,phiMplus1,"\n") } (phiM-phiMplus1)/N } ## also see: ## http://www.cbi.dongnocchi.it/glossary/ApEn.html ## http://www.physionet.org/physiotools/ApEn/ ts <- rep(61:65,10) apenembed(ts,d=5,r=2) ## get 0 instead of 0.00189? -- View this message in context: http://www.nabble.com/Approximate-Entropy--tp21144062p21147242.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.