All of your points are accepted, and I also give you credit for
reading the "formula" page better than I.
On Jun 19, 2009, at 10:08 AM, Gavin Simpson wrote:
On Fri, 2009-06-19 at 09:24 -0400, David Winsemius wrote:
On Jun 19, 2009, at 9:00 AM, onyourmark wrote:
<snip />
means and also, I see
data=crs$dataset[,c(1:59,922)]
I have read that the data argument is optional here:
"an optional data frame, list or environment (or object coercible by
as.data.frame to a data frame) containing the variables in the model.
If not found in data, the variables are taken from environment(formula),
typically the environment from which glm is called"
When they say "data", is that meant to include the dependent variable
as well?
Yes.
It has to be defined in 'data' or the environment of 'formula', so it
depends on what the OP meant by "meant to include". You can include it
in 'data' but don't have to.
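The lookup order in the help text quoted above ("If not found in data, ...")
can be illustrated with a small sketch. The objects 'd', 'm_in' and 'm_out'
are hypothetical names made up for this illustration, and the snippet assumes
it is run at top level, so that the environment of the formula is the global
workspace:

```r
set.seed(1)
# Hypothetical toy data: 'd' carries its own response column Y
d <- data.frame(x = rnorm(50), Y = rpois(50, 2))
# A different Y sitting in the workspace
Y <- rpois(50, 10)

# Y is looked up in 'data' first, so the workspace copy is ignored
m_in <- glm(Y ~ x, data = d, family = poisson)
all(m_in$y == d$Y)

# With Y dropped from 'data', the Y in environment(formula) is used instead
m_out <- glm(Y ~ x, data = d["x"], family = poisson)
all(m_out$y == Y)
```

So the response may be supplied either way, but a copy in 'data' takes
precedence over one of the same name in the workspace.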
In other words, in the above statement 'value' is the dependent
variable and it is also column 922 in the data set.
Is this correct?
Yes.
No - you can't say that it is variable 922, or even any of 1:59 or 922
for the reasons mentioned above.
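set.seed(123)
dat <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
## Y lives in the global workspace, not in 'dat'
Y <- rpois(100, 2)
## 'data' holds only columns A and C; Y is found via environment(formula)
mod <- glm(Y ~ ., data = dat[, c(1, 3)], family = poisson)
mod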
If all you have is this:
mod <- glm(Y ~ ., data = dat[,c(1,3)], family = poisson)
You can't say anything more about Y than that it is either in 'dat' or
in the environment of 'formula', which in this case is the global
workspace.
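One way to find out which of the two it is would be to check both places
directly. This is only a sketch reusing the toy data from the example above;
'sub', 'f', 'in_data' and 'in_env' are names introduced here for illustration:

```r
set.seed(123)
dat <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
Y <- rpois(100, 2)

sub <- dat[, c(1, 3)]  # the data frame actually passed as 'data'
f <- Y ~ .             # the formula remembers the environment it was created in

# Is Y a column of 'data'?
in_data <- "Y" %in% names(sub)
# Is Y visible from the environment of the formula?
in_env <- exists("Y", envir = environment(f))

mod <- glm(f, data = sub, family = poisson)
# The '.' expands to the columns of 'data' only, here A and C
names(coef(mod))
```

Here 'in_data' is FALSE and 'in_env' is TRUE, so the response was taken from
the workspace, and it is not column 922 (or any other column) of 'data'.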
G
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help