On Tue, 2009-06-23 at 14:18 -0400, Paul Simonin wrote: > Hello R Users, > I have a question regarding fitting a model with GAM{mgcv}. I have data > from several predictor (X) variables I wish to use to develop a model to > predict one Y variable. I am working with ecological data, so have data > collected many times (about 20) over the course of two years. Plotting > data independently for each date there appears to be relationships > between Y (fish density) and at least several X variables (temperature > and light). However, the actual value of X variables (e.g., temperature) > changes with date/season. In other words, fish distribution is likely > related to temperature, but available temperatures change through the > season. Thus, when data from all dates are combined to create a model > from the entire dataset, I think I need to include some type of > metric/variable/interaction term to account for this date relationship. > I have written the following code using a "by" term:
A by smooth like this will produce a varying coefficient model, where a separate smooth is fitted to the levels of the by variable. As unique datecodes becomes large, this may be inefficient for the purposes you have in mind - which as per several of your recent emails appears, is to fit an interaction. In mgcv, interactions can be fitted using a bivariate smooth of the two variables of interest. Without your data or information on the structure of datecode, it is difficult to suggest exactly how to proceed (and perhaps why you haven't received replies to your postings), but if datecode represents the day-of-year or the month (expressed as a numeric) in which sampling took place, then an interaction between datecode and temperature could be achieved using a tensor product smooth: te(temperature, datecode, bs = c("cr","cc")) I use a tensor product as we do not expect isotropy and the two variables are on different scales. The bs species that temperature is a cubic regression spline and that datecode is a cyclic cubic regression spline so the end points join (i.e. you want smooth transition from December to January). If your datecode variable does not got from month 1 (day-of-year 1) to month 12 (day-of-year 365(6)) then you might want to specify the end-points so this smooth covers the full year. In your call to gam, add argument 'knots'. ## for datecode === day-of-year one might use knots = list(datecode = c(1, 365)) ## for datecode === month, one might use knots = list(datecode = c(1, 12)) i.e. with recent versions of mgcv (1.5-x branch) you can specify only the start and end points of the smooth and gam fills in the knots up to the chosen or default value of 'k' (the number of knots). The te() smooth also needs version 1.5-5 of mgcv as there was a bug in the code when setting up "cc" smooths in te() smooths that is fixed in that version. > > Distribution.s.temp.logwm2.deltaT<-gam(yoyras~s(temp,by=datecode)+s(logwm2,by=datecode)+s(DeltaT,by=datecode),data=AllData) > This seems completely over-specified - you are asking the model to estimate separate smooths for each value of datecode for each of three covariates. You'll need lots of data to fit such a model. > > However, I am not convinced this is the correct way to account for this > relationship. What do you think? Is there another way to include this in > my model? Maybe I should simply include date ("datecode") as another > term in the model? > > I also believe there may be an interaction between temperature and > light (logwm2), and based on what I have read the "by" method may be the > best way to include this. Correct? In that case, temp, datecode and light will all be related via interaction. te() smooths can take more than two covariates, so you could try fitting a smooth on all three, but you are starting off with quite a complicated model here. I think, in general, interactions are fitted via multivariate smooths of the variables of interest, not using by variables. HTH G > > Thank you for any input, tips, or advice you may be able to offer. I am > new to R, so especially grateful! > > Thanks again, > Paul Simonin > (PhD student) > > PS- If you would like additional information let me know. Also, if this > question is inappropriate for the help list please let me know. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
signature.asc
Description: This is a digitally signed message part
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.