[R] Using lm on data.frame with categorical data as character column results in error in plot.lm
Gerhard Burger
g@@@burger @ending from l@cdr@leidenuniv@nl
Tue Nov 13 18:24:32 CET 2018
Hi all,
I'm not sure whether the following should be considered a bug or just user
error, but here goes:
We teach our students to use the tidyverse for most of their R work, and
the following code (adapted and shortened to pinpoint the problem) results
in an error in the `plot.lm` call:
```
# gather() leaves the key column "variable" as a character vector
iris_long = tidyr::gather(iris, key = "variable", value = "value", -Species)
iris_lm = lm(value ~ Species + variable, data = iris_long)
# the error occurs in this call
stats:::plot.lm(iris_lm, which = 5)
```
whereas, if we use reshape2::melt instead of tidyr::gather, it works fine:
```
iris_long = reshape2::melt(iris)
iris_lm = lm( value ~ Species + variable, data = iris_long)
stats:::plot.lm(iris_lm, which = 5)
```
Now the only difference between the output from melt and gather is that the
resulting "variable" column is a factor column in melt, but a character
column in gather:
```
# This expectation fails because the type of the "variable" column differs
testthat::expect_identical(
  reshape2::melt(iris),
  tidyr::gather(iris, key = "variable", value = "value", -Species)
)
```
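To make the difference concrete, here is a minimal sketch that inspects the `variable` column produced by each reshaping call. The names `long_chr` and `long_fct` are only illustrative, and `id.vars = "Species"` is spelled out here rather than relying on melt's default guess.

```r
# Reshape iris to long format with both packages
# (assumes tidyr and reshape2 are installed)
long_chr <- tidyr::gather(iris, key = "variable", value = "value", -Species)
long_fct <- reshape2::melt(iris, id.vars = "Species")

# gather() stores the key column as character by default ...
class(long_chr$variable)
# ... whereas melt() stores it as a factor
class(long_fct$variable)

# Apart from the column type, the key values themselves agree
identical(as.character(long_fct$variable), long_chr$variable)
```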
This can be fixed by adding `factor_key = TRUE` to the gather call, after
which everything works fine. Are categorical variables required to be in a
factor column? `lm` seems to handle a character column fine, but `plot.lm`
fails on it... Is this something that might need a fix in `plot.lm`?
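As a rough sketch of the workaround (not a fix for `plot.lm` itself), the character column can be turned into a factor before fitting, either with a plain `factor()` call or with gather's `factor_key` argument. The `levels = unique(...)` argument and the name `iris_long_fk` are my own additions for illustration; without explicit levels, `factor()` sorts them alphabetically instead of keeping the column order.

```r
# Assumes tidyr is installed
iris_long <- tidyr::gather(iris, key = "variable", value = "value", -Species)

# Option 1: convert after the fact, keeping levels in order of first appearance
iris_long$variable <- factor(iris_long$variable,
                             levels = unique(iris_long$variable))

# Option 2: let gather() create the factor directly
iris_long_fk <- tidyr::gather(iris, key = "variable", value = "value",
                              -Species, factor_key = TRUE)

# Compare the level order produced by the two options
identical(levels(iris_long$variable), levels(iris_long_fk$variable))

# With a factor column the model fits and the diagnostic plot can be drawn
iris_lm <- lm(value ~ Species + variable, data = iris_long)
plot(iris_lm, which = 5)
```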
Any insight appreciated!
Kind regards,
Gerhard
For completeness, my sessionInfo:
```
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=nl_NL.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=nl_NL.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=nl_NL.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18 tidyr_0.8.1 crayon_1.3.4 R6_2.2.2 plyr_1.8.4 magrittr_1.5 pillar_1.3.0 rlang_0.2.2
 [9] stringi_1.2.4 reshape2_1.4.3 rstudioapi_0.7 testthat_2.0.0 tools_3.5.1 stringr_1.3.1 glue_1.3.0 purrr_0.2.5
[17] compiler_3.5.1 tidyselect_0.2.4 tibble_1.4.2
```