[R] Factors in a regression using lm()
Ivan Calandra
ivan.calandra at uni-hamburg.de
Tue Oct 12 12:06:06 CEST 2010
Oops, my bad.
I rarely do regression, so I forgot that in your case the explanatory
variables do not have to be factors; lm() accepts numeric and factor
predictors alike.
The rest stands.
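
To make the point concrete, here is a minimal sketch of the first
suggestion (recoding the factor response as numeric). It runs on a small
simulated data frame rather than the real Hdma data; the names dat and
deny_num and all the values are assumptions made up for illustration.

```r
# Simulated stand-in for Hdma (hypothetical values, not the Ecdat data)
set.seed(1)
n <- 200
dat <- data.frame(
  hir   = runif(n, 0.1, 0.5),
  black = factor(sample(c("no", "yes"), n, replace = TRUE),
                 levels = c("no", "yes")),
  deny  = factor(sample(c("no", "yes"), n, replace = TRUE),
                 levels = c("no", "yes"))
)

# Recode the two-level factor response as 0/1 so lm() gets a numeric y
dat$deny_num <- as.numeric(dat$deny == "yes")

# The factor predictor 'black' can stay a factor: lm() expands it into
# a dummy variable, which shows up as the coefficient 'blackyes'
fit <- lm(deny_num ~ hir + black, data = dat)
coef_names <- names(coef(fit))
print(coef_names)
```

With a yes/no response, fitting glm() with family = binomial is another
option, since it accepts a two-level factor response directly; whether
that suits your analysis is for you to judge.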
Ivan
On 10/12/2010 11:56, Ivan Calandra wrote:
> Hi,
>
> Your response (dependent) variable, which has to be on the left side
> of the '~' in the formula, should be numeric. In your example deny is
> a factor; that is the first problem.
> The explanatory variables, on the right side of the '~', should be
> factors. Here, hir, dir, ccs and mcs are numeric; that is the second
> problem. Only black is a factor.
>
> There are two possibilities (not mutually exclusive):
> - you should convert your factors to numeric and vice versa as
> needed; see ?factor and ?as.numeric, as well as the stringsAsFactors
> argument of ?read.table (I guess you imported your data.frame that way)
> - you should adjust your model formula. It might be that you mixed up
> the variables in the formula. See ?formula
>
> HTH,
> Ivan
>
> On 10/12/2010 11:39, Gabriel Bergin wrote:
>> Hi,
>>
>> I am trying to do a multiple regression on the dataset "Hdma",
>> available in the Ecdat package.
>>
>> The data looks like this:
>>> str(Hdma)
>> 'data.frame': 2381 obs. of 13 variables:
>> $ dir : num 0.221 0.265 0.372 0.32 0.36 ...
>> $ hir : num 0.221 0.265 0.248 0.25 0.35 ...
>> $ lvr : num 0.8 0.922 0.92 0.86 0.6 ...
>> $ ccs : num 5 2 1 1 1 1 1 2 2 2 ...
>> $ mcs : num 2 2 2 2 1 1 2 2 2 1 ...
>> $ pbcr : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>> $ dmi : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
>> $ self : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>> $ single : Factor w/ 2 levels "no","yes": 1 2 1 1 1 1 2 1 1 2 ...
>> $ uria : num 3.9 3.2 3.2 4.3 3.2 ...
>> $ comdominiom: num 0 0 0 0 0 0 1 0 0 0 ...
>> $ black : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>> $ deny : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
>>
>> I would like to try a more complex regression, but even this relatively
>> uncomplicated one returns an error:
>>
>> summary(lm(deny ~ hir + dir + ccs + mcs + black))
>>
>> The error I get is:
>> Error in storage.mode(y)<- "double" :
>> invalid to change the storage mode of a factor
>> In addition: Warning message:
>> In model.response(mf, "numeric") :
>> using type="numeric" with a factor response will be ignored
>>
>> I understand that there is something wrong due to the fact that some
>> of the variables are factors. But as far as I've grasped, it should be
>> possible to include factor variables when using lm(). Am I in error in
>> thinking this?
>>
>> Sincerely,
>> Gabriel Bergin
>> Undergraduate economics student
>>
>
--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de
**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php