Dear R users,

I would like to employ count data as covariates while fitting a logistic regression model. My question is:

Do I violate any assumption of logistic regression (or, more generally, of generalized linear models) by using non-negative integer count variables as independent variables?

I found many references in the literature on how to use count data as the outcome, but none on using them as covariates; see, for example, the very clear paper: "N E Breslow (1996) Generalized Linear Models: Checking Assumptions and Strengthening Conclusions, Congresso Nazionale Societa Italiana di Biometria, Cortona June 1995", available at
http://biostat.georgiahealth.edu/~dryu/course/stat9110spring12/land16_ref.pdf.

Loosely speaking, it seems that the GLM assumptions may be expressed as follows:

iid residuals;
the link function must correctly represent the relationship between the dependent and independent variables;
absence of outliers.
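
To make the last two points concrete, here is a minimal sketch of how they could be checked for a single count covariate. The data are simulated (not my real data), the variable names are made up for illustration, the 4/n cut-off for Cook's distance is only a common rule of thumb, and the quadratic term is just one of several possible ways to probe the linearity implied by the link.

```r
# Hypothetical simulated data: one count covariate and a binary outcome
set.seed(42)
n <- 100
x <- rpois(n, lambda = 5)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.2 * x))

# Logistic regression with the count covariate entered linearly
fit <- glm(y ~ x, family = binomial)

# Outliers / influential points: Cook's distance per observation
cd <- cooks.distance(fit)
flagged <- which(cd > 4 / n)  # rule-of-thumb threshold, not a formal test

# Link / linearity: does a quadratic term improve the fit?
fit2 <- glm(y ~ x + I(x^2), family = binomial)
lrt <- anova(fit, fit2, test = "Chisq")
print(lrt)
```

Neither check depends on the covariate being a count; both would be applied in the same way to a continuous predictor.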

Does anybody know whether there is any other assumption or technical problem that would suggest using some other type of model to deal with count covariates?

Finally, please note that my data contain relatively few samples (<100) and that the ranges of the count variables can differ by 3-4 orders of magnitude (i.e. some variables have values in the range 0-10, while others may have values within 0-10000).
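
The sketch below illustrates the range issue on simulated data. It compares the raw covariates with standardised ones (a linear re-expression, so the fitted probabilities should agree up to numerical tolerance) and with log1p-transformed ones (a different model). The variable names are hypothetical, and the use of scale() and log1p() is my assumption about what might be tried, not something I know to be required.

```r
# Hypothetical simulated data: two count covariates on very different scales
set.seed(1)
n <- 100
small <- sample(0:10, n, replace = TRUE)
large <- sample(0:10000, n, replace = TRUE)
y <- sample(0:1, n, replace = TRUE)

# Same logistic model on raw, standardised and log1p-transformed covariates
fit_raw <- glm(y ~ small + large, family = binomial)
fit_std <- glm(y ~ scale(small) + scale(large), family = binomial)
fit_log <- glm(y ~ log1p(small) + log1p(large), family = binomial)

# Standardising only rescales the coefficients; compare fitted probabilities
same_fit <- all.equal(fitted(fit_raw), fitted(fit_std), tolerance = 1e-4)
print(same_fit)

# log1p changes the assumed functional form, so the fits can be compared by AIC
aic_table <- AIC(fit_raw, fit_log)
print(aic_table)
```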

A simple example code follows:

###########################################################

# generating simulated data
set.seed(1)  # fixed seed (arbitrary value) so the example is reproducible
var1 <- sample(0:10, 100, replace = TRUE)
var2 <- sample(0:1000, 100, replace = TRUE)
var3 <- sample(0:100000, 100, replace = TRUE)
outcome <- sample(0:1, 100, replace = TRUE)
dataset <- data.frame(outcome, var1, var2, var3)

# fitting the model: logistic regression of outcome on all three count covariates
model <- glm(outcome ~ ., family = binomial, data = dataset)

# inspecting the model
print(model)

###########################################################

Regards,

--
Vincenzo Lagani
Research Fellow
BioInformatics Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas

