[R] Factors in a regression using lm()

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Tue Oct 12 11:54:04 CEST 2010


The problem is not in the covariates but in the response variable. lm()
accepts factors as covariates, but the response must be numeric. deny is
a factor, hence you get an error.
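
A minimal sketch of two possible ways around this, not a recommendation
for this particular analysis. The data frame `toy` and the column
`deny_num` are hypothetical stand-ins, so the code runs without the
Ecdat package; only the column types mirror Hdma. The first fit recodes
the response to 0/1 so that lm() accepts it; the second keeps the factor
and uses glm() with family = binomial, which is usually the more natural
model for a yes/no outcome.

```r
# Hypothetical toy data standing in for Hdma; only the column types matter.
set.seed(1)
n <- 200
toy <- data.frame(
  hir   = runif(n, 0.1, 0.5),
  black = factor(sample(c("no", "yes"), n, replace = TRUE)),
  deny  = factor(sample(c("no", "yes"), n, replace = TRUE,
                        prob = c(0.85, 0.15)),
                 levels = c("no", "yes"))
)

# A factor covariate (black) is fine: lm() expands it into dummy variables.
# A factor response is not, so recode it to 0/1 first. This gives a
# linear probability model.
toy$deny_num <- as.numeric(toy$deny == "yes")
fit_lpm <- lm(deny_num ~ hir + black, data = toy)

# Alternative: keep the factor and fit a logistic regression instead.
# glm() with family = binomial accepts a two-level factor response and
# models the probability of the second level ("yes").
fit_logit <- glm(deny ~ hir + black, family = binomial, data = toy)

summary(fit_lpm)
summary(fit_logit)
```

With the real data, the same pattern would apply with data = Hdma and
the covariates from the original call.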

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
  

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Gabriel Bergin
> Sent: Tuesday, 12 October 2010 11:39
> To: r-help at r-project.org
> Subject: [R] Factors in a regression using lm()
> 
> Hi,
> 
> I am trying to do a multiple regression on the dataset 
> "Hdma", available in the Ecdat package.
> 
> The data looks like this:
> > str(Hdma)
> 'data.frame': 2381 obs. of  13 variables:
>  $ dir        : num  0.221 0.265 0.372 0.32 0.36 ...
>  $ hir        : num  0.221 0.265 0.248 0.25 0.35 ...
>  $ lvr        : num  0.8 0.922 0.92 0.86 0.6 ...
>  $ ccs        : num  5 2 1 1 1 1 1 2 2 2 ...
>  $ mcs        : num  2 2 2 2 1 1 2 2 2 1 ...
>  $ pbcr       : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>  $ dmi        : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
>  $ self       : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>  $ single     : Factor w/ 2 levels "no","yes": 1 2 1 1 1 1 2 1 1 2 ...
>  $ uria       : num  3.9 3.2 3.2 4.3 3.2 ...
>  $ comdominiom: num  0 0 0 0 0 0 1 0 0 0 ...
>  $ black      : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
>  $ deny       : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 1 ...
> 
> I would like to try a more complex regression, but even this 
> relatively uncomplicated one returns an error:
> 
> summary(lm(deny ~ hir + dir + ccs + mcs + black))
> 
> The error I get is:
> Error in storage.mode(y) <- "double" :
>   invalid to change the storage mode of a factor
> In addition: Warning message:
> In model.response(mf, "numeric") :
>   using type="numeric" with a factor response will be ignored
> 
> I understand that there is something wrong due to the fact 
> that some of the variables are factors. But as far as I've 
> grasped, it should be possible to include factor variables 
> when using lm(). Am I in error in thinking this?
> 
> Sincerely,
> Gabriel Bergin
> Undergraduate economics student


