[R] count data as independent variable in logistic regression
vlagani at ics.forth.gr
Tue Oct 2 18:10:51 CEST 2012
Dear R users,
I would like to use count data as covariates when fitting a
logistic regression model. My question is:
do I violate any assumption of logistic regression (and, more
generally, of generalized linear models) by using non-negative
integer count variables as independent variables?
I found a lot of references in the literature on how to use
count data as the outcome, but not as covariates; see for example the very
clear paper: "N E Breslow (1996) Generalized Linear Models: Checking
Assumptions and Strengthening Conclusions, Congresso Nazionale Societa
Italiana di Biometria, Cortona June 1995", available at
http://biostat.georgiahealth.edu/~dryu/course/stat9110spring12/land16_ref.pdf.
Loosely speaking, it seems that the GLM assumptions may be expressed as follows:
- iid residuals;
- the link function must correctly represent the relationship between
  the dependent and independent variables;
- absence of outliers.
Does anybody know whether there is any other assumption or
technical problem that would suggest using some other type
of model for dealing with count covariates?
Finally, please note that my data contain relatively few samples
(<100) and that the ranges of the count variables can differ by 3-4
orders of magnitude (i.e. some variables take values in the range 0-10,
while other variables may take values within 0-10000).
A simple code example follows:
###########################################################
#generating simulated data
var1 = sample(0:10, 100, replace = TRUE)
var2 = sample(0:1000, 100, replace = TRUE)
var3 = sample(0:100000, 100, replace = TRUE)
outcome = sample(0:1, 100, replace = TRUE)
dataset = data.frame(outcome, var1, var2, var3)
#fitting the model
model = glm(outcome ~ ., family=binomial, data = dataset)
#inspecting the model
print(model)
###########################################################
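One possible way to probe the scale and outlier concerns on the same kind of simulated data is sketched below. It fits the model once with the raw counts and once with log(1 + x) counts, compares the two by AIC, and flags potentially influential observations with Cook's distance. The log1p transform, the fixed seed and the 4/n cutoff are assumptions chosen for illustration only; they are not taken from the Breslow paper, and other choices may suit real data better.

```r
# Sketch only: raw vs. log1p-transformed count covariates in a logistic model.
set.seed(1)  # arbitrary seed, used only to make the simulation reproducible

n <- 100
dataset <- data.frame(
  outcome = sample(0:1, n, replace = TRUE),
  var1    = sample(0:10, n, replace = TRUE),
  var2    = sample(0:1000, n, replace = TRUE),
  var3    = sample(0:100000, n, replace = TRUE)
)

# Counts entered on their original scale
fit_raw <- glm(outcome ~ var1 + var2 + var3,
               family = binomial, data = dataset)

# Counts entered as log(1 + x); log1p() is defined at zero
fit_log <- glm(outcome ~ log1p(var1) + log1p(var2) + log1p(var3),
               family = binomial, data = dataset)

# Same response and same number of coefficients, so the AICs are comparable
aic_table <- AIC(fit_raw, fit_log)
print(aic_table)

# Cook's distance as a rough influence check on the raw-scale model;
# the 4/n threshold is a rule of thumb and an assumption here
cooks <- cooks.distance(fit_raw)
flagged <- which(cooks > 4 / n)
print(flagged)
```

Because the outcome here is pure noise, neither model is expected to fit better; on real data a clearly lower AIC for one of them would hint at which scale matches the logit link more closely.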
Regards,
--
Vincenzo Lagani
Research Fellow
BioInformatics Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas