[R] Plotting prediction curves and CIs from GAM models

Adriaan de Jong Tue, 14 Feb 2023 07:59:34 -0800

Dear List member,
My data are from 30 years of opportunistic counting of migratory Eurasian 
Curlew (Numenius arquata) during the core breeding season when the local 
population is supposed to be stable. My main objective is the trend in numbers 
over the years, but information about sighting efficiency over the days of 
season (DoS) and the time of the day (ToD) is also desired (they are not just 
nuisance variables). Observations were made while driving along the very same 
25 km road in rural northern Sweden. This driving was for everyday life 
purposes, not for the sake of this study, i.e. the data are virtually 
zero-cost, zero-effort and zero-emission. For each counting event (N=1020), I 
registered date, midpoint-time (5 AM - 9 PM) and observed number of curlews. 
The date was used to create variables Year (integer) and Day-of-Season 
(integer, May 1 = 1 to June 14, leap-year adjusted). The ToD variable was 
expressed as decimal hours (e.g. 8.15)  I chose to use GAM-family models to 
describe the Count vs Year, DoS and ToD relationship (subset listed below, full 
dataset available at request). A overdisp_fun check of the Poisson distribution 
GAMs showed that a shift to negative binomial distribution (.. nb()..) GAMs was 
appropriate.


Preliminary model selection favoured the following model (Total = count result):

mod1<-gam(Total~s(Year) +
                   te(DoS,ToD, k=5),
           family=nb(),
           data=Trend1,
           method="ML")

The model explained 38.8% of overall deviance. Model check-ups (gam.check, 
qq.gam and overdisp_fun) were all satisfactory.

plot.gam(mod1) (and plot(mod1)) produced a s(Year)~Year curve with confidence 
interval lines and a ToD vs DoS "topography" plot with CI lines (PDF-copies 
attached). From what I understand, these curves/lines show estimated smoother 
and tensor values, respectively. These are useful plots for scientists, but not 
really what I want to present to non-academic ornithologists.

In the next step, I used

preddata<-predict.gam(newdata=Trend1, mod1, type="response")

to produce a predicted dataset for the same (original) data and

plot(preddata~Trend1$Year, xlab = "Year", ylab="Predicted Eurasian Curlew 
counts")

to visualize the trend in numbers over the study period (scatterplot attached).

Here is where my questions start.

1. How can I fit a GAM-type trendline in this graph and how can I add upper and 
lower CI-limits to this trendline? It's the complex GAM structure that confuses 
me. Are these model components in the model output somewhere? Will this 
trendline be "rinsed" from the effects of DoS and ToD? If not, how can I 
find/create one?

2. How can I produce a "topography"-plot of the predicted count numbers over 
ToD vs DoS, similar to the one for the tensor estimates produced by plot.gam?


Thanks in advance for any references, comments and suggestions.
Have a nice day,
Adriaan de Jong
Swedish University of Agricultural Sciences


Data structure

Year   DoS   ToD Total
1993     6      9.25   7
1993   11    12.50   4
.
.
.
2022    40    7.75    3
---
När du skickar e-post till SLU så innebär detta att SLU behandlar dina 
personuppgifter. För att läsa mer om hur detta går till, klicka här 
<https://www.slu.se/om-slu/kontakta-slu/personuppgifter/>
E-mailing SLU will result in SLU processing your personal data. For more 
information on how this is done, click here 
<https://www.slu.se/en/about-slu/contact-slu/personal-data/>

AskR1.pdf
Description: AskR1.pdf

AskR2.pdf
Description: AskR2.pdf

AskR3.pdf
Description: AskR3.pdf

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Plotting prediction curves and CIs from GAM models

Reply via email to