[R] count data as independent variable in logistic regression

vlagani at ics.forth.gr
Tue Oct 2 18:10:51 CEST 2012


Dear R users,

I would like to employ count data as covariates while fitting a  
logistic regression model. My question is:

Do I violate any assumption of logistic regression (and, more generally,
of generalized linear models) by employing count variables, i.e.
non-negative integers, as independent variables?

I found a lot of references in the literature regarding how to use
count data as the outcome, but not as covariates; see for example the very
clear paper: "N E Breslow (1996) Generalized Linear Models: Checking  
Assumptions and Strengthening Conclusions, Congresso Nazionale Societa  
Italiana di Biometria, Cortona June 1995", available at
http://biostat.georgiahealth.edu/~dryu/course/stat9110spring12/land16_ref.pdf.

Loosely speaking, it seems that glm assumptions may be expressed as follows:

iid residuals;
the link function must correctly represent the relationship between
the dependent and independent variables;
absence of outliers.
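
To make these three points concrete, here is a minimal sketch of how each might be inspected in R. It is only an illustration: the simulated data, the names `x`, `y`, `fit` and `fit2`, the quadratic term used to probe the functional form, and the 4/n cutoff for Cook's distance are all assumptions chosen for the example, not established diagnostics for this problem.

```r
# Simulated data (assumption: stands in for a real data set)
set.seed(42)
n <- 100
x <- rpois(n, lambda = 5)                              # a count covariate
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.2 * x))  # binary outcome
fit <- glm(y ~ x, family = binomial)

# 1. Residuals: extract deviance residuals for inspection
dev_res <- residuals(fit, type = "deviance")
print(summary(dev_res))

# 2. Functional form on the link scale: compare the linear fit with one
#    that adds a quadratic term; a small p-value hints at misspecification
fit2 <- glm(y ~ x + I(x^2), family = binomial)
lr_test <- anova(fit, fit2, test = "Chisq")
print(lr_test)

# 3. Outliers / influential points: Cook's distance with a 4/n rule of thumb
cook <- cooks.distance(fit)
influential <- which(cook > 4 / n)
print(influential)
```

None of these checks depends on the covariate being a count rather than a continuous variable.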

Does anybody know whether there is any other assumption or technical
problem that would suggest using some other type of model for dealing
with count covariates?

Finally, please note that my data contain relatively few samples
(<100) and that the ranges of the count variables can differ by 3-4
orders of magnitude (i.e. some variables have values in the range 0-10,
while other variables may have values within 0-10000).

A simple example code follows:

###########################################################

# generating simulated data
var1 <- sample(0:10, 100, replace = TRUE)
var2 <- sample(0:1000, 100, replace = TRUE)
var3 <- sample(0:100000, 100, replace = TRUE)
outcome <- sample(0:1, 100, replace = TRUE)
dataset <- data.frame(outcome, var1, var2, var3)

# fitting the model
model <- glm(outcome ~ ., family = binomial, data = dataset)

# inspecting the model
print(model)

###########################################################
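
As a possible extension of the example above, the sketch below shows one way covariates spanning several orders of magnitude could be put on a comparable scale before fitting. The use of `log1p` is an assumption made purely for illustration (it is defined at zero, unlike `log`), and the names `dataset_log`, `model_raw` and `model_log` are hypothetical. Whether such a transform is appropriate depends on the data, and it changes how the coefficients are interpreted.

```r
# Same simulated data as in the example above, with a seed for reproducibility
set.seed(1)
var1 <- sample(0:10, 100, replace = TRUE)
var2 <- sample(0:1000, 100, replace = TRUE)
var3 <- sample(0:100000, 100, replace = TRUE)
outcome <- sample(0:1, 100, replace = TRUE)
dataset <- data.frame(outcome, var1, var2, var3)

# log1p(x) = log(1 + x), so zero counts map to zero instead of -Inf
dataset_log <- transform(dataset,
                         var1 = log1p(var1),
                         var2 = log1p(var2),
                         var3 = log1p(var3))

# Fit the same logistic model on the raw and on the transformed covariates
model_raw <- glm(outcome ~ ., family = binomial, data = dataset)
model_log <- glm(outcome ~ ., family = binomial, data = dataset_log)

# Compare coefficient magnitudes and AIC of the two fits
print(coef(model_raw))
print(coef(model_log))
print(AIC(model_raw, model_log))
```

On the raw scale the coefficient of `var3` is expressed per unit count and is therefore tiny, which can make the printed output hard to read; the transformed fit reports all three coefficients on a similar scale.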

Regards,

-- 
Vincenzo Lagani
Research Fellow
BioInformatics Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas


