> -----Original Message----- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap > Sent: Thursday, May 27, 2010 12:24 PM > To: Hutchinson,David [PYR]; Tonja Krueger > Cc: r-help@r-project.org > Subject: Re: [R] Peak Over Threshold values > > Here is another version, loopless, but still > a bit clumsy (can the call to which be removed?):
Note that avoiding loops is mostly for the aesthetic effect. On my aging laptop the following loopy and vector-growing version takes 3 seconds on a million-long vector while the non-loopy one takes 0.22 seconds (timings done with plot=FALSE). That is a nice ratio but not much of a difference. f0 <- function (x = walevel, startThreshhold = 5.45, stopThreshhold = 5.3, plot = TRUE) { stopifnot(startThreshhold > stopThreshhold) inRun <- FALSE start <- integer() stop <- integer() for (i in seq_along(x)) { if (inRun) { if (x[i] < stopThreshhold) { stop[length(stop) + 1] <- i - 1 inRun <- FALSE } } else { if (x[i] > startThreshhold) { start[length(start) + 1] <- i inRun <- TRUE } } } if (inRun) stop[length(stop) + 1] <- length(x) if (length(stop) > length(start)) stop <- stop[-1] if (plot) { plot(x, cex = 0.5) abline(h = c(startThreshhold, stopThreshhold)) abline(v = start, col = "green") abline(v = stop, col = "red") } data.frame(start = start, stop = stop) } > > f <- function (x = walevel, > startThreshhold = 5.45, > stopThreshhold = 5.3, > plot = TRUE) > { > stopifnot(startThreshhold > stopThreshhold) > isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)]) > isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE) > isOverStart <- x >= startThreshhold > isOverStop <- x >= stopThreshhold > possibleStartPt <- which(isFirstInRun(isOverStart) & isOverStart) > possibleStopPt <- which(isLastInRun(isOverStop) & isOverStop) > pts <- c(possibleStartPt, possibleStopPt) > names(pts) <- rep(c("start", "stop"), > c(length(possibleStartPt), length(possibleStopPt))) > pts <- pts[order(pts)] > tmp <- isFirstInRun(names(pts)) > start <- pts[tmp & names(pts) == "start"] > stop <- pts[tmp & names(pts) == "stop"] > if (length(stop) > length(start)) { > # i.e., when first downcrossing happens before > # first upcrossing > stop <- stop[-1] > } > if (plot) { > plot(x, cex = 0.5) > abline(h = c(startThreshhold, stopThreshhold)) > abline(v = start, col = "green") > abline(v = stop, col = "red") > } > data.frame(start = start, stop = stop) > } > > # define the data in the original message and call f(). > > The isFirstInRun and isLastInRun functions do the > first part of the calculation that rle does and > avoid the cumsum(diff()) roundtrip that > cumsum(rle()$lengths) involves and their names > make it clearer what I'm trying to do. > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > > -----Original Message----- > > From: r-help-boun...@r-project.org > > [mailto:r-help-boun...@r-project.org] On Behalf Of > > Hutchinson,David [PYR] > > Sent: Thursday, May 27, 2010 10:41 AM > > To: Tonja Krueger > > Cc: r-help@r-project.org > > Subject: Re: [R] Peak Over Threshold values > > > > Perhaps not elegant, but does the job > > > > threshold <- 5.45 > > meetThreshold <- walevel >= threshold > > > > d <- rle(meetThreshold) > > startPos <- c(1, 1 + cumsum(d$lengths[-length(d$lengths)])) > > > > abv <- which(d$values == TRUE) > > p.o.t <- data.frame() > > > > for (i in seq( along = abv ) ) { > > a <- startPos[abv[i]] > > b <- a + (d$lengths[abv[i]] - 1) > > e <- which.max(walevel[a:b]) > > p.o.t <- rbind( p.o.t, data.frame( > > pos = a + e - 1, > > val = walevel[a:b][e] > > ) ) > > } > > > > plot ( > > day, > > walevel, type = 'l' > > ) > > > > points( > > p.o.t$pos, > > p.o.t$val, > > col = 'red' > > ) > > > > abline(h = threshold, lty = 2, col = 'red') > > > > -----Original Message----- > > From: r-help-boun...@r-project.org > > [mailto:r-help-boun...@r-project.org] On Behalf Of Tonja Krueger > > Sent: Thursday, May 27, 2010 1:47 AM > > To: Vito Muggeo (UniPa); Clint Bowman > > Cc: r-help@r-project.org > > Subject: [R] Peak Over Threshold values > > > > > > I'm sorry, but that's not exactly what I was looking for. > > I obviously > > didn't explain properly: > > > > Within my dataframe (df) I would like to find POT values > > that are not > > linked. In my definition two maximum values are linked if > > walevel does not > > fall below a certain value (the lower threshold value). > > > > walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, > > 5.86, 5.91, 5.91, > > 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, > > 6.11, 6.14, 6.12, > > 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, > > 5.72, 5.70, 5.65, > > 5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, > > 5.19, 5.19, 5.13, > > 5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, > > 5.22, 5.32, 5.29, > > 5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, > > 5.66, 5.68, 5.72, > > 5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, > > 5.62, 5.62, 5.61, > > 5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, > > 5.45, 5.47, 5.50, > > 5.50, 5.49, 5.43, 5.39, 5.33, 5.26) > > > > day <- c(1:100) > > > > df <- data.frame(day,walevel) > > > > For example in my dataframe a threshold value = 5.45 and > > an lower threshold > > value = 5.3 should lead to two maximum values because > > between the 2^nd and > > 3^rd peak walevel does not fall below the lower threshold value. > > > > With "clusters (...)" from "evd package", I could find > > out POT values but > > even though I set a lower threshold value for my example > > dataframe I would > > get three maximum values instead of two. > > > > library(evd) > > > > clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T) > > > > clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T) > > > > Changing r to 30 (for example) connects the 2^nd and 3^rd > > maximum events and > > gives out the right 'end' for the first extreme event but > > not for the second > > event. (Also I'd rather not use a time interval to prove > > that the events are > > linked) > > > > clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T) > > > > clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T) > > > > What went wrong??? > > > > Tonja > > -----Ursprüngliche Nachricht----- > > How about? > > hi.rle<-rle(walevel>5.79) > > lo.rle<-rle(walevel<5.36) > > plot(walevel) > > abline(h=5.8,col=2,lty=3) > > abline(h=5.35,col=3,lty=3) > > hi.lo.rle<-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths))) > > abline(v=hi.lo.rle) > > You can use the $values from the rle to sort things out. Probably > > want to ignore the end point at length(walevel). > > -- > > Clint Bowman INTERNET: cl...@ecy.wa.gov > > Air Quality Modeler INTERNET: cl...@math.utah.edu > > Department of Ecology VOICE: (360) 407-6815 > > PO Box 47600 FAX: (360) 407-7534 > > Olympia, WA 98504-7600 > > On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote: > > > dear Tonja, > > > By plotting your data > > > > > > plot(df) > > > > > > it seems to me that you are looking for a piecewise > > linear relationships. > > If > > > this is the case, have a look to the package segmented. > > You have to > > specify > > > or not the number and the starting values for the breakpoints > > > > > > library(segmented) > > > olm<-lm(walevel~day) > > > > > > #specify number and starting values for the breakpoints.. > > > oseg<-segmented(olm, seg.Z=~day, psi=c(20,50,80)) > > > plot(oseg,add=TRUE,col=2) > > > oseg$psi > > > > > > #or you can avoid to specify starting values for psi > > > oseg<-segmented(olm, seg.Z=~day, > > > psi=NA,control=seg.control(stop.if.error=FALSE,K=30)) > > > > > > plot(oseg,add=TRUE,col=2) > > > oseg$psi > > > > > > > > > best, > > > vito > > > > > > > > > Tonja Krueger ha scritto: > > >> Dear List > > >> > > >> I hope you can help me: I've got a dataframe (df) > > within which I am > > >> looking > > >> for Peak Over Threshold values as well as the length of > > the events. An > > >> event > > >> starts when walevel equals 5.8 and it should end when > > walevel equals > > >> the > > >> lower threshold value (5.35). > > >> > > >> I tried "clusters (...)" from "evd package", and varied > > r (see example) > > >> but it > > >> did not work for all events (again see example). > > >> > > >> walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, > > 5.86, 5.91, > > >> 5.91, > > >> 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, > > 6.11, 6.14, 6.12, > > >> 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, > > 5.72, 5.70, 5.65, > > >> 5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, > > 5.19, 5.19, 5.13, > > >> 5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, > > 5.22, 5.32, 5.29, > > >> 5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, > > 5.66, 5.68, 5.72, > > >> 5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, > > 5.62, 5.62, 5.61, > > >> 5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, > > 5.45, 5.47, 5.50, > > >> 5.50, 5.49, 5.43, 5.39, 5.33, 5.26) > > >> > > >> day <- c(1:100) > > >> > > >> df <- data.frame(day,walevel) > > >> > > >> library(evd) > > >> clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax > > = T, plot = T) > > >> clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, > > cmax = T, plot = T) > > >> > > >> What have I done wrong? > > >> > > >> Tonja > > >> ______________________________________________ > > >> R-help@r-project.org mailing list > > >> [1]https://stat.ethz.ch/mailman/listinfo/r-help > > >> PLEASE do read the posting guide > > >> [2]http://www.R-project.org/posting-guide.html > > >> and provide commented, minimal, self-contained, > > reproducible code. > > >> > > > > > > > > > > GRATIS: Movie-Flat mit über 300 Top-Videos. Für WEB.DE Nutzer > > dauerhaft kostenlos! Jetzt freischalten unter > > http://movieflat.web.de > > > > References > > > > 1. > > file://localhost/../jump.htm?goto=https%3A%2F%2Fstat.ethz.ch%2 > Fmailman%2Flistinfo%2Fr-help > > 2. > > file://localhost/../jump.htm?goto=http%3A%2F%2Fwww.R-project.o > rg%2Fposting-guide.html > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.