[R] Factors in a regression using lm()
Ivan Calandra
ivan.calandra at uni-hamburg.de
Tue Oct 12 11:56:36 CEST 2010
Hi,
Your response (dependent) variable, which has to be on the left side of
the '~' in the formula, must be numeric for lm(). In your example deny is a
factor; that is what triggers the error.
The explanatory variables, on the right side of the '~', can be either
numeric or factors; lm() codes factors as dummy variables automatically.
So hir, dir, ccs and mcs being numeric and black being a factor is fine.
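To see which columns are factors and which are numeric, something along these lines may help. It is only a sketch: the data frame toy and its values are made up to mimic a few columns of Hdma, they are not the real data.

```r
# Hypothetical stand-in for a few columns of Hdma (values invented for illustration)
toy <- data.frame(
  hir   = c(0.22, 0.27, 0.25, 0.35),
  ccs   = c(5, 2, 1, 1),
  black = factor(c("no", "no", "yes", "no")),
  deny  = factor(c("no", "no", "yes", "no"))
)

# Class of every column: "numeric" or "factor"
types <- sapply(toy, class)
print(types)

# Check single variables before putting them into a formula
is.factor(toy$deny)   # the response used in the failing call
is.numeric(toy$hir)   # one of the explanatory variables
```

On the real data, str(Hdma) gives the same information, as in your post.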
There are two possibilities (not mutually exclusive):
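- you should transform your factors into numeric and vice-versa as
needed, see ?factor and ?as.numeric, as well as the stringsAsFactors
argument from ?read.table (I guess you imported your data.frame that way).
Note that as.numeric() on a factor returns the integer level codes
(here 1 and 2), not 0 and 1.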
- you should adjust your model formula. It might be that you mixed up
the variables in the formula. See ?formula
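A minimal sketch of the first option, under some assumptions: toy is an invented data frame standing in for Hdma, deny_num is a name I made up for the converted response, and the values are random, so the estimates mean nothing. The glm() line at the end is not part of the advice above; it is a commonly used alternative for a two-level response that accepts the factor directly.

```r
# Hypothetical data mimicking the structure of Hdma (random values, illustration only)
set.seed(1)
n <- 200
toy <- data.frame(
  hir   = runif(n, 0.1, 0.5),
  dir   = runif(n, 0.1, 0.6),
  black = factor(sample(c("no", "yes"), n, replace = TRUE),
                 levels = c("no", "yes")),
  deny  = factor(sample(c("no", "yes"), n, replace = TRUE, prob = c(0.85, 0.15)),
                 levels = c("no", "yes"))
)

# as.numeric(toy$deny) would give the level codes 1/2;
# comparing against a level gives a 0/1 numeric response instead
toy$deny_num <- as.numeric(toy$deny == "yes")

# Numeric response on the left; numeric and factor predictors on the right
fit <- lm(deny_num ~ hir + dir + black, data = toy)
print(summary(fit))

# Alternative (an assumption, not from the reply above): logistic regression,
# which takes the factor response as it is
fit_logit <- glm(deny ~ hir + dir + black, data = toy, family = binomial)
```

The factor black shows up in the coefficient table as blackyes, i.e. the dummy for the level "yes" relative to "no".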
HTH,
Ivan
On 10/12/2010 11:39, Gabriel Bergin wrote:
> Hi,
>
> I am trying to do a multiple regression on the dataset "Hdma", available in
> the Ecdat package.
>
> The data looks like this:
>> str(Hdma)
> 'data.frame': 2381 obs. of 13 variables:
> $ dir : num 0.221 0.265 0.372 0.32 0.36 ...
> $ hir : num 0.221 0.265 0.248 0.25 0.35 ...
> $ lvr : num 0.8 0.922 0.92 0.86 0.6 ...
> $ ccs : num 5 2 1 1 1 1 1 2 2 2 ...
> $ mcs : num 2 2 2 2 1 1 2 2 2 1 ...
> $ pbcr : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
> $ dmi : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
> $ self : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
> $ single : Factor w/ 2 levels "no","yes": 1 2 1 1 1 1 2 1 1 2 ...
> $ uria : num 3.9 3.2 3.2 4.3 3.2 ...
> $ comdominiom: num 0 0 0 0 0 0 1 0 0 0 ...
> $ black : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
> $ deny : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
>
> I would like to try a more complex regression, but even this relatively
> uncomplicated one returns an error:
>
> summary(lm(deny ~ hir + dir + ccs + mcs + black))
>
> The error I get is:
> Error in storage.mode(y)<- "double" :
> invalid to change the storage mode of a factor
> In addition: Warning message:
> In model.response(mf, "numeric") :
> using type="numeric" with a factor response will be ignored
>
> I understand that there is something wrong due to the fact that some of the
> variables are factors. But as far as I've grasped, it should be possible to
> include factor variables when using lm(). Am I in error in thinking this?
>
> Sincerely,
> Gabriel Bergin
> Undergraduate economics student
>
--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de
**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php