On Jan 12, 2010, at 2:13 AM, Benedikt Drosse wrote:

Dear R-Users,

I have a question concerning the determination of breakpoints and comparison of slopes from broken-line regression models. Although this is rather a standard problem in data analysis, all information I gathered so far, did not answer my questions.

I added a subsetted example of my data. Basically it is a timeseries of recorded phenotypes in three different groups of plants. You will see in the plot that the phenotype between groups behaves differently over time. Plotted data of Group 3 look like a nice linear relationship between phenotype (y) and time (x). Group 1 seems to have 1 breakpoint (at x=31) and Group 3 has 2 breakpoints (at x=c(14,31)) in the data.

1. I am very interested in finding the breakpoints (x-values), where the slope of a regression line changes. In the plot I estimated the break-points by eye, subsetted the data and performed linear regression on the subsets for each group. However, this is not the way I want to go, since I would like to have some reasonable criteria (better than my eye) to determine breakpoints.

2. Additionally I am interested in testing, whether the slopes of (partial-)regression lines between groups are different from each other.

Considering my questions, the 'segmented' package should exactly fit my needs. The problem I have is, that on the one hand I would also like to compare slopes of regression lines, which do not have breakpoints (e.g group 3 of the example) to lines, which have breakpoints (group 1 and 2). This seems to be not possible with the 'segmented' package. On the other hand the 'segmented' package seems to have a poor error- reporting, so it is very hard for me to figure out, what is going wrong in the analysis (see example).

Can you think of any other way/procedure/package I could use to analyze my data? Hopefully you can also give me a clue on how to proceed with the analysis, when the error of the segmented()- function in the creation of the covariance-matrix occurs.

I would be grateful for any comment on how to determine breakpoints and compare slopes of regression lines in a different way than subsetting the data by hand and regress the subsetted parts as I did it for the example plot. Thanks to anyone, who has an idea, already. Sorry for the long example code.


Have you looked at the capabilities of the 'strucchange' package?

http://cran.r-project.org/web/packages/strucchange/index.html

--
David.
Best regards,
Benedikt




#Start of Code
library(car)
library(segmented)

#Data file:
 #Timescale:
x <- c(0,0,0,3,3,3,3,3,3,6,6,6,9,9,9,14,14,14,17,17,17,20,20,20,24,24,24,28,28,28,31,31,31,34,34,34,38,38,38,45,45,45,0,0,0,3,3,3,3,3,3,6,6,6,9,9,9,14,14,14,17,17,17,20,20,20,24,24,24,28,28,28,31,31,31,34,34,34,38,38,38,45,45,45,0,0,0,3,3,3,7,7,7,9,9,9,13,13,13,13,13,16,16,16,20,20,20,24,24,24,28,28,28,34,34,34,41,41,41,50,50,50)

 #y-data for three groups (t1, t2, t3):
t1 <- c (41,35,38,43,47,46,50,41,47,74,61,58,72,87,84,89,114,93,113,99,105,106,98,200,170,145,153,187,202,195,198,177,195,288,274,233,320,348,428,545,520,549 ) t2 <- c (44,40,38,57,42,47,42,57,40,71,75,64,70,83,88,115,95,106,132,145,115,192,172,190,273,300,250,403,392,343,400,402,444 ,NA,NA,NA,NA,NA,NA,430,430,459) t3 <- c (41,35,38,39,43,47,58,33,45,67,60,70,64,78,78,67,62,78,79,71,93,86,82,105,96,102,131,126,124,112,100,133,170,161,163,154,178,185 )

group <- as.factor(c(rep(1, times=length(t1)), rep(2, times=length(t2)), rep(3, times=length(t3))))

   data <- data.frame(group, x=x, y=c(t1,t2,t3))

#Groupwise plot of datapoints vs time:
scatterplot(y~x*group, data=data, smooth=FALSE, reg.line=FALSE, legend.coords="topleft")


#Regressions for group 1 with 1 breakpoints (red, dashed line)
reg_1.1 <- lm(data[which(data$x<=30 & data$group==1), 3]~data[which(data$x<=30 & data$group==1),2])
   abline(reg_1.1, col="black")

reg_1.2 <- lm(data[which(data$x>=30 & data$group==1), 3]~data[which(data$x>=30 & data$group==1),2])
     abline(reg_1.2, col="black")


#Regressions for group 2 with 2 breakpoints (black, solid line)
reg_2.1 <- lm(data[which(data$x<=14 & data$group==2), 3]~data[which(data$x<=14 & data$group==2),2])
   abline(reg_2.1, col="red", lty=2)

reg_2.2 <- lm(data[which(data$x>=14 & data$x<45 & data$group==2), 3]~data[which(data$x>=14 & data$x<45 & data$group==2),2])
     abline(reg_2.2, col="red", lty=2)

reg_2.3 <- lm(data[which(data$x>=31 & data$x<=45 & data $group==2),3]~data[which(data$x>=31 & data$x<=45 & data$group==2),2])
     abline(reg_2.3, col="red", lty=2)


#Regression for group 3 without breakpoints (green, solid line)
reg_3.1 <- lm(data[which(data$group==3),3]~data[which(data $group==3),2])
   abline(reg_3.1, col="green", lty=1)


#Segmented package to estimate breakpoints and determine slopes for group2 and 3:
 subset_data <- subset(data, subset=data$group!=3)
   names(subset_data) <- c("sub_group", "sub_x", "sub_y")
     attach(subset_data)

 X <- model.matrix(~0+sub_group)*sub_x

 time.1 <- X[,1]
 time.2 <- X[,2]
 time.3 <- X[,3]

 olm <- lm(sub_y~0 + sub_group + time.1 + time.2)
os <- segmented(olm, seg.Z= ~ time.1 + time.2, psi=list(time. 1=c(14,30), time.2=40))

#End of code

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to