Thanks a lot for your help. That’s the time period I was looking for. I’ve got one more question: for further analyses I need the respective maximum values within these time periods (between the green and red lines). Preferably in combination with the date the maximum event happened.
Thank you in advance, Tonja -----Ursprüngliche Nachricht----- Von: William Dunlap <wdun...@tibco.com> Gesendet: 27.05.2010 22:13:21 An: "Hutchinson,David [PYR]" <david.hutchin...@ec.gc.ca>,Tonja Krueger <tonja.krue...@web.de> Betreff: RE: [R] Peak Over Threshold values >> -----Original Message----- >> From: r-help-boun...@r-project.org >> [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap >> Sent: Thursday, May 27, 2010 12:24 PM >> To: Hutchinson,David [PYR]; Tonja Krueger >> Cc: r-help@r-project.org >> Subject: Re: [R] Peak Over Threshold values >> >> Here is another version, loopless, but still >> a bit clumsy (can the call to which be removed?): > >Note that avoiding loops is mostly for >the aesthetic effect. On my aging laptop >the following loopy and vector-growing version >takes 3 seconds on a million-long vector >while the non-loopy one takes 0.22 seconds >(timings done with plot=FALSE). That is >a nice ratio but not much of a difference. > >f0 <- function (x = walevel, > startThreshhold = 5.45, > stopThreshhold = 5.3, > plot = TRUE) >{ > stopifnot(startThreshhold > stopThreshhold) > inRun <- FALSE > start <- integer() > stop <- integer() > for (i in seq_along(x)) { > if (inRun) { > if (x[i] < stopThreshhold) { > stop[length(stop) + 1] <- i - 1 > inRun <- FALSE > } > } > else { > if (x[i] > startThreshhold) { > start[length(start) + 1] <- i > inRun <- TRUE > } > } > } > if (inRun) > stop[length(stop) + 1] <- length(x) > if (length(stop) > length(start)) > stop <- stop[-1] > if (plot) { > plot(x, cex = 0.5) > abline(h = c(startThreshhold, stopThreshhold)) > abline(v = start, col = "green") > abline(v = stop, col = "red") > } > data.frame(start = start, stop = stop) >} > >> >> f <- function (x = walevel, >> startThreshhold = 5.45, >> stopThreshhold = 5.3, >> plot = TRUE) >> { >> stopifnot(startThreshhold > stopThreshhold) >> isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)]) >> isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE) >> isOverStart <- x >= startThreshhold >> isOverStop <- x >= stopThreshhold >> possibleStartPt <- which(isFirstInRun(isOverStart) & isOverStart) >> possibleStopPt <- which(isLastInRun(isOverStop) & isOverStop) >> pts <- c(possibleStartPt, possibleStopPt) >> names(pts) <- rep(c("start", "stop"), >> c(length(possibleStartPt), length(possibleStopPt))) >> pts <- pts[order(pts)] >> tmp <- isFirstInRun(names(pts)) >> start <- pts[tmp & names(pts) == "start"] >> stop <- pts[tmp & names(pts) == "stop"] >> if (length(stop) > length(start)) { >> # i.e., when first downcrossing happens before >> # first upcrossing >> stop <- stop[-1] >> } >> if (plot) { >> plot(x, cex = 0.5) >> abline(h = c(startThreshhold, stopThreshhold)) >> abline(v = start, col = "green") >> abline(v = stop, col = "red") >> } >> data.frame(start = start, stop = stop) >> } >> >> # define the data in the original message and call f(). >> >> The isFirstInRun and isLastInRun functions do the >> first part of the calculation that rle does and >> avoid the cumsum(diff()) roundtrip that >> cumsum(rle()$lengths) involves and their names >> make it clearer what I'm trying to do. >> >> Bill Dunlap >> Spotfire, TIBCO Software >> wdunlap tibco.com >> >> > -----Original Message----- >> > From: r-help-boun...@r-project.org >> > [mailto:r-help-boun...@r-project.org] On Behalf Of >> > Hutchinson,David [PYR] >> > Sent: Thursday, May 27, 2010 10:41 AM >> > To: Tonja Krueger >> > Cc: r-help@r-project.org >> > Subject: Re: [R] Peak Over Threshold values >> > >> > Perhaps not elegant, but does the job >> > >> > threshold <- 5.45 >> > meetThreshold <- walevel >= threshold >> > >> > d <- rle(meetThreshold) >> > startPos <- c(1, 1 + cumsum(d$lengths[-length(d$lengths)])) >> > >> > abv <- which(d$values == TRUE) >> > p.o.t <- data.frame() >> > >> > for (i in seq( along = abv ) ) { >> > a <- startPos[abv[i]] >> > b <- a + (d$lengths[abv[i]] - 1) >> > e <- which.max(walevel[a:b]) >> > p.o.t <- rbind( p.o.t, data.frame( >> > pos = a + e - 1, >> > val = walevel[a:b][e] >> > ) ) >> > } >> > >> > plot ( >> > day, >> > walevel, type = 'l' >> > ) >> > >> > points( >> > p.o.t$pos, >> > p.o.t$val, >> > col = 'red' >> > ) >> > >> > abline(h = threshold, lty = 2, col = 'red') >> > >> > -----Original Message----- >> > From: r-help-boun...@r-project.org >> > [mailto:r-help-boun...@r-project.org] On Behalf Of Tonja Krueger >> > Sent: Thursday, May 27, 2010 1:47 AM >> > To: Vito Muggeo (UniPa); Clint Bowman >> > Cc: r-help@r-project.org >> > Subject: [R] Peak Over Threshold values >> > >> > >> > I'm sorry, but that's not exactly what I was looking for. >> > I obviously >> > didn't explain properly: >> > >> > Within my dataframe (df) I would like to find POT values >> > that are not >> > linked. In my definition two maximum values are linked if >> > walevel does not >> > fall below a certain value (the lower threshold value). >> > >> > walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, >> > 5.86, 5.91, 5.91, >> > 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, >> > 6.11, 6.14, 6.12, >> > 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, >> > 5.72, 5.70, 5.65, >> > 5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, >> > 5.19, 5.19, 5.13, >> > 5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, >> > 5.22, 5.32, 5.29, >> > 5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, >> > 5.66, 5.68, 5.72, >> > 5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, >> > 5.62, 5.62, 5.61, >> > 5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, >> > 5.45, 5.47, 5.50, >> > 5.50, 5.49, 5.43, 5.39, 5.33, 5.26) >> > >> > day <- c(1:100) >> > >> > df <- data.frame(day,walevel) >> > >> > For example in my dataframe a threshold value = 5.45 and >> > an lower threshold >> > value = 5.3 should lead to two maximum values because >> > between the 2^nd and >> > 3^rd peak walevel does not fall below the lower threshold value. >> > >> > With "clusters (...)" from "evd package", I could find >> > out POT values but >> > even though I set a lower threshold value for my example >> > dataframe I would >> > get three maximum values instead of two. >> > >> > library(evd) >> > >> > clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, cmax=T) >> > >> > clusters(df$walevel,u =5.45, ulow= 5.3, r = 1, plot=T) >> > >> > Changing r to 30 (for example) connects the 2^nd and 3^rd >> > maximum events and >> > gives out the right 'end' for the first extreme event but >> > not for the second >> > event. (Also I'd rather not use a time interval to prove >> > that the events are >> > linked) >> > >> > clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, cmax=T) >> > >> > clusters(df$walevel,u =5.45, ulow= 5.3, r = 30, plot=T) >> > >> > What went wrong??? >> > >> > Tonja >> > -----Ursprüngliche Nachricht----- >> > How about? >> > hi.rle<-rle(walevel>5.79) >> > lo.rle<-rle(walevel<5.36) >> > plot(walevel) >> > abline(h=5.8,col=2,lty=3) >> > abline(h=5.35,col=3,lty=3) >> > hi.lo.rle<-sort(c(cumsum(hi.rle$lengths),cumsum(lo.rle$lengths))) >> > abline(v=hi.lo.rle) >> > You can use the $values from the rle to sort things out. Probably >> > want to ignore the end point at length(walevel). >> > -- >> > Clint Bowman INTERNET: cl...@ecy.wa.gov >> > Air Quality Modeler INTERNET: cl...@math.utah.edu >> > Department of Ecology VOICE: (360) 407-6815 >> > PO Box 47600 FAX: (360) 407-7534 >> > Olympia, WA 98504-7600 >> > On Wed, 26 May 2010, Vito Muggeo (UniPa) wrote: >> > > dear Tonja, >> > > By plotting your data >> > > >> > > plot(df) >> > > >> > > it seems to me that you are looking for a piecewise >> > linear relationships. >> > If >> > > this is the case, have a look to the package segmented. >> > You have to >> > specify >> > > or not the number and the starting values for the breakpoints >> > > >> > > library(segmented) >> > > olm<-lm(walevel~day) >> > > >> > > #specify number and starting values for the breakpoints.. >> > > oseg<-segmented(olm, seg.Z=~day, psi=c(20,50,80)) >> > > plot(oseg,add=TRUE,col=2) >> > > oseg$psi >> > > >> > > #or you can avoid to specify starting values for psi >> > > oseg<-segmented(olm, seg.Z=~day, >> > > psi=NA,control=seg.control(stop.if.error=FALSE,K=30)) >> > > >> > > plot(oseg,add=TRUE,col=2) >> > > oseg$psi >> > > >> > > >> > > best, >> > > vito >> > > >> > > >> > > Tonja Krueger ha scritto: >> > >> Dear List >> > >> >> > >> I hope you can help me: I've got a dataframe (df) >> > within which I am >> > >> looking >> > >> for Peak Over Threshold values as well as the length of >> > the events. An >> > >> event >> > >> starts when walevel equals 5.8 and it should end when >> > walevel equals >> > >> the >> > >> lower threshold value (5.35). >> > >> >> > >> I tried "clusters (...)" from "evd package", and varied >> > r (see example) >> > >> but it >> > >> did not work for all events (again see example). >> > >> >> > >> walevel <- c(5.75, 5.75, 5.75, 5.76, 5.80, 5.82, 5.85, >> > 5.86, 5.91, >> > >> 5.91, >> > >> 5.88, 5.92, 5.99, 6.02, 6.00, 6.08, 6.11, 6.10, 6.10, >> > 6.11, 6.14, 6.12, >> > >> 6.15, 6.17, 6.15, 6.08, 6.01, 5.95, 5.89, 5.83, 5.77, >> > 5.72, 5.70, 5.65, >> > >> 5.59, 5.51, 5.43, 5.33, 5.30, 5.25, 5.22, 5.21, 5.19, >> > 5.19, 5.19, 5.13, >> > >> 5.15, 5.07, 5.04, 5.02, 4.99, 5.05, 5.07, 5.12, 5.17, >> > 5.22, 5.32, 5.29, >> > >> 5.33, 5.33, 5.36, 5.37, 5.42, 5.43, 5.48, 5.55, 5.57, >> > 5.66, 5.68, 5.72, >> > >> 5.77, 5.79, 5.81, 5.80, 5.80, 5.77, 5.72, 5.70, 5.66, >> > 5.62, 5.62, 5.61, >> > >> 5.59, 5.57, 5.51, 5.47, 5.41, 5.39, 5.40, 5.40, 5.42, >> > 5.45, 5.47, 5.50, >> > >> 5.50, 5.49, 5.43, 5.39, 5.33, 5.26) >> > >> >> > >> day <- c(1:100) >> > >> >> > >> df <- data.frame(day,walevel) >> > >> >> > >> library(evd) >> > >> clusters(df$walevel, u = 5.80, r = 1, ulow = 5.35, cmax >> > = T, plot = T) >> > >> clusters(df$walevel, u = 5.80, r = 50, ulow = 5.35, >> > cmax = T, plot = T) >> > >> >> > >> What have I done wrong? >> > >> >> > >> Tonja >> > >> ______________________________________________ >> > >> R-help@r-project.org mailing list >> > >> [1]https://stat.ethz.ch/mailman/listinfo/r-help >> > >> PLEASE do read the posting guide >> > >> [2]http://www.R-project.org/posting-guide.html >> > >> and provide commented, minimal, self-contained, >> > reproducible code. >> > >> >> > > >> > > >> > >> > GRATIS: Movie-Flat mit über 300 Top-Videos. Für WEB.DE Nutzer >> > dauerhaft kostenlos! Jetzt freischalten unter >> > http://movieflat.web.de >> > >> > References >> > >> > 1. >> > file://localhost/../jump.htm?goto=https%3A%2F%2Fstat.ethz.ch%2 >> Fmailman%2Flistinfo%2Fr-help >> > 2. >> > file://localhost/../jump.htm?goto=http%3A%2F%2Fwww.R-project.o >> rg%2Fposting-guide.html >> > ______________________________________________ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> > ______________________________________________ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> ___________________________________________________________ GRATIS für alle WEB.DE Nutzer: Die maxdome Movie-FLAT! Jetzt freischalten unter http://movieflat.web.de ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.