Hello, I would like to run a series of regressions on my data (response variable over time):
  1) regression from T1 to T2
  2) regression from T1 through T3
  3) regression from T1 through T4, etc.

I have been struggling to find a way to do this through commands, as opposed to cutting up the data manually (my dataset has over 6000 rows/observations). An illustrative dataset can be created as follows:

    dat <- structure(list(Years = c(0, 0, 0, 0.36, 0.36, 0.36, 0.67, 0.67, 0.67, 0.74, 0.74, 0.74),
                          Obs = c(0, 0, 0, 2.3, 1.9, 2.1, 4.5, 4.5, 4.6, 5.3, 5.5, 5.6)),
                     .Names = c("Years", "Obs"),
                     row.names = c(NA, -12L),
                     class = "data.frame")

I was trying to use a loop to create subsets of the data corresponding to the required sets of time intervals (e.g. T1 to T2, T1 through T3, etc.), but I am having trouble generating a new variable to index time (instead of the decimal values). I figured that indexing time would allow me to use a loop to generate the required subsets of data.

I can work out how many time periods I have and assign a sequential number to each:

    Years <- unique(dat$Years)
    Yrs_count <- seq(from = 1, to = length(Years), by = 1)

And then I can combine these into a data frame:

    Yrs_combo <- data.frame(Years, Yrs_count)

However, how do I combine this data frame with my larger dataset, which has a different number of rows? This is just an intermediate step in the process, and some of you might suggest an entirely different route. For now, I can create this new time index manually:

    dat2 <- structure(list(Years = c(0, 0, 0, 0.36, 0.36, 0.36, 0.67, 0.67, 0.67, 0.74, 0.74, 0.74),
                           Obs = c(0, 0, 0, 2.3, 1.9, 2.1, 4.5, 4.5, 4.6, 5.3, 5.5, 5.6),
                           Yrs_count = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4)),
                      .Names = c("Years", "Obs", "Yrs_count"),
                      row.names = c(NA, -12L),
                      class = "data.frame")

The next question is: how can I index temporary files in a loop that I use for extracting the needed data?
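On the intermediate step of attaching a sequential time index to the full dataset, here is a minimal sketch of two base-R options, assuming the time values in the data are exactly the ones listed in the lookup (no rounding differences). The names `dat_merged` and the rebuilt `dat` are only for illustration.

```r
# Illustrative data from the post, rebuilt with data.frame() for brevity
dat <- data.frame(
  Years = c(0, 0, 0, 0.36, 0.36, 0.36, 0.67, 0.67, 0.67, 0.74, 0.74, 0.74),
  Obs   = c(0, 0, 0, 2.3, 1.9, 2.1, 4.5, 4.5, 4.6, 5.3, 5.5, 5.6)
)

# Option 1: match() returns, for each row, the position of its time value
# in the sorted vector of distinct time values
Years <- sort(unique(dat$Years))
dat$Yrs_count <- match(dat$Years, Years)

# Option 2: merge() a small lookup table onto the data by the shared column;
# the differing numbers of rows are not a problem for merge()
Yrs_combo  <- data.frame(Years = Years, Yrs_count = seq_along(Years))
dat_merged <- merge(dat[, c("Years", "Obs")], Yrs_combo, by = "Years")
```

Both options give every row the index of its time period. `merge()` sorts its result by the key column, which leaves the row order unchanged here only because the example data are already sorted by `Years`.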
I thought I might need two loops: one to identify the length of the time series, the other to accumulate the data from T1 through the identified end point. Maybe something like:

    for (i in 1:max(dat2$Yrs_count)) {
      for (j in 1:i) {
        keyj <- dat2[, 3] == j
        dat2j <- dat2[keyj, ]
        # here is where I want to create a temporary file to accumulate
        # the different dat2j's I create in this inside loop
      }
      # here is where I want to save the file for future use in my regressions
    }

I hope this example is clear enough. My apologies if it isn't, and I thank the R community for any ideas, tips, or directions to information that might be helpful.

Best,
-Philippe

--
Philippe Hensel, PhD
NOAA National Geodetic Survey
NGS ECO <http://www.ngs.noaa.gov/web/science_edu/ecosystems_climate/>
N/NGS2 SSMC3 #8859
1315 East-West Hwy
Silver Spring MD 20910
(301) 713 3198 x 137
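For the cumulative subsets described above (T1 to T2, T1 through T3, and so on), one possible route avoids temporary files altogether: subset on the time index with `<=` and keep the fitted models in a list. This is only a sketch. It assumes a simple linear model of `Obs` on `Years` fitted with `lm()`, which the post does not specify, and the names `fits`, `slopes`, and `n_periods` are illustrative.

```r
# Illustrative data with the time index, as in dat2 from the post
dat2 <- data.frame(
  Years     = c(0, 0, 0, 0.36, 0.36, 0.36, 0.67, 0.67, 0.67, 0.74, 0.74, 0.74),
  Obs       = c(0, 0, 0, 2.3, 1.9, 2.1, 4.5, 4.5, 4.6, 5.3, 5.5, 5.6),
  Yrs_count = rep(1:4, each = 3)
)

n_periods <- max(dat2$Yrs_count)

# One fit per end point k = 2, ..., n_periods, each using periods 1 through k.
# k = 1 is skipped because a single time point cannot identify a slope.
# The model formula is an assumption made for this sketch.
fits <- lapply(2:n_periods, function(k) {
  lm(Obs ~ Years, data = dat2[dat2$Yrs_count <= k, ])
})
names(fits) <- paste0("T1_to_T", 2:n_periods)

# Collect the slope estimate from each cumulative fit
slopes <- sapply(fits, function(fit) coef(fit)[["Years"]])
```

Each element of `fits` is an ordinary `lm` object, so `summary(fits[["T1_to_T3"]])` or `coef()` can be applied to it later without re-creating the subsets.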