[R] Problem with lm Giving Wrong Results
Labone, Thomas
|@bone @end|ng |rom em@||@@c@edu
Thu Dec 2 11:50:13 CET 2021
In the code below the first and second plots should look pretty much the same, the only difference being that the first has n=1000 points and the second n=10000 points. On two of my Linux machines (info below) the second plot is a horizontal line (incorrect answer from lm), but on my Windows 10 machine and a third Linux machine it works as expected. The interesting thing is that the code works as expected for n <= 4095 but fails for n>=4096 (which equals 2^12). Can anyone else reproduce this problem? Any ideas on how to fix it?
set.seed(132)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# This works
n <- 1000# OK <= 4095
Z <- qnorm(ppoints(n))
k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
quantile(k,probs=c(0.025,0.5,0.975))
summary(k)
fit <- lm(log(k) ~ Z)
summary(fit)
gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd
plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#this does not
n <- 10000# fails >= 4096 = 2^12
Z <- qnorm(ppoints(n))
k <- sort(rlnorm(n,log(2131),log(1.61)) / rlnorm(n,log(355),log(1.61)))
quantile(k,probs=c(0.025,0.5,0.975))
summary(k)
fit <- lm(log(k) ~ Z)
summary(fit)
gm <- exp(coef(fit)[1])
gsd <- exp(coef(fit)[2])
gm
gsd
plot(Z,k,log="y",xlim=c(-4,4),ylim=c(0.1,100))
lines(Z,gm*gsd^Z,col="red")
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> sessionInfo() #for two Linux machines having problem
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.2
Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libmkl_rt.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.1.2 Matrix_1.3-4 tools_4.1.2 expm_0.999-6 grid_4.1.2 lattice_0.20-45
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> sessionInfo() # for a third Linux machine not having the problem
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.3
Matrix products: default
BLAS/LAPACK: /opt/intel/compilers_and_libraries_2020.0.166/linux/mkl/lib/intel64_lin/libmkl_rt.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.1.1 tools_4.1.1
Thomas R. LaBone
PhD student
Department of Epidemiology and Biostatistics
Arnold School of Public Health
University of South Carolina
Columbia, South Carolina USA
[[alternative HTML version deleted]]
More information about the R-help
mailing list