[R] Changing transformations in mi package
Elizabeth Hensor
E.M.A.Hensor at leeds.ac.uk
Thu May 5 09:54:43 CEST 2016
Thanks very much for your help, David. I'll contact Ben Goodrich.
-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu]
Sent: 04 May 2016 22:24
To: Elizabeth Hensor; 'r-help at r-project.org'
Subject: RE: Changing transformations in mi package
Thank you for providing a working example. I think you need to contact the package maintainer:
> maintainer("mi")
[1] "Ben Goodrich <benjamin.goodrich at columbia.edu>"
When I run your code it appears that the c column is correctly transformed to square roots, but the show() function is incorrectly indicating a log transform:
> data.missingdf@variables$c@raw_data # The raw data
[1] 4.2 7.9 NA 16.1 19.9 23.0
> sqrt(data.missingdf@variables$c@raw_data) # The square root of the raw data
[1] 2.049390 2.810694 NA 4.012481 4.460942 4.795832
> data.missingdf@variables$c@data # The transformed data - square roots, not logs
[1] 2.049390 2.810694 NA 4.012481 4.460942 4.795832
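The same comparison can be done programmatically rather than by eye. The following is a minimal sketch in base R only; the helper name `guess_transformation` and its `tol` argument are hypothetical and not part of the mi package. It assumes the stored values are the ones printed above, so a loose tolerance is used to absorb the rounding in the printed output.

```r
# Hypothetical helper (not part of mi): report which candidate transformation
# of the raw values reproduces the stored, transformed values.
guess_transformation <- function(raw, transformed, tol = 1e-6) {
  candidates <- list(
    identity = function(x) x,
    log      = function(x) log(x),
    sqrt     = function(x) sqrt(x)
  )
  # all.equal() returns TRUE or a character description of the difference,
  # so wrap it in isTRUE() to get a plain logical.
  matches <- vapply(
    candidates,
    function(f) isTRUE(all.equal(f(raw), transformed, tolerance = tol)),
    logical(1)
  )
  names(candidates)[matches]
}

# Raw values for column c and the transformed values as printed above
raw    <- c(4.2, 7.9, NA, 16.1, 19.9, 23.0)
stored <- c(2.049390, 2.810694, NA, 4.012481, 4.460942, 4.795832)

guess_transformation(raw, stored)
# [1] "sqrt"
```

With the real object, `raw` and `stored` would presumably be taken from `data.missingdf@variables$c@raw_data` and `data.missingdf@variables$c@data` respectively.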
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Elizabeth Hensor
Sent: Wednesday, May 4, 2016 5:44 AM
To: 'r-help at r-project.org'
Subject: [R] Changing transformations in mi package
Dear all,
I am an R beginner and new to the list. In preparation for using mi to impute missing values I am setting up the missing data frame and would like to specify the transformation types for some of my variables, as I will be using these transformations in my analysis models. According to the documentation the available options are "standardize" (the default), "identity", "log", "logshift" and "sqrt". I can successfully change the transformation types to "log" and "logshift", but when I attempt to change to "sqrt", this changes the type to "log" instead. I'd appreciate your help, please.
Below are details of my system and some code which replicates the issue.
> sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] lmerTest_2.0-30 truncnorm_1.0-7 mi_1.0 lme4_1.1-12 Matrix_1.2-4
[6] pls_2.5-0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.4 Formula_1.2-1 cluster_2.0.3 splines_3.2.5
[5] MASS_7.3-45 munsell_0.4.3 colorspace_1.2-6 arm_1.8-6
[9] lattice_0.20-33 minqa_1.2.4 plyr_1.8.3 nnet_7.3-12
[13] grid_3.2.5 nlme_3.1-126 gtable_0.2.0 latticeExtra_0.6-28
[17] coda_0.18-1 abind_1.4-3 survival_2.38-3 gridExtra_2.2.1
[21] RColorBrewer_1.1-2 nloptr_1.0.4 ggplot2_2.1.0 acepack_1.3-3.3
[25] rpart_4.1-10 scales_0.4.0 Hmisc_3.17-3 foreign_0.8-66
data <- data.frame(a=c(NA,2.1,3.3,4.5,5.9,6.2),b=c(2.2,NA,6.1,8.3,10.2,12.13),c=c(4.2,7.9,NA,16.1,19.9,23))
data
    a     b    c
1  NA  2.20  4.2
2 2.1    NA  7.9
3 3.3  6.10   NA
4 4.5  8.30 16.1
5 5.9 10.20 19.9
6 6.2 12.13 23.0
data.missingdf <- missing_data.frame(data)
show(data.missingdf)
Object of class missing_data.frame with 6 observations on 3 variables
There are 4 missing data patterns
Append '@patterns' to this missing_data.frame to access the corresponding pattern for every observation or perhaps use table()
  type       missing method model
a continuous 1       ppd    linear
b continuous 1       ppd    linear
c continuous 1       ppd    linear

  family   link     transformation
a gaussian identity standardize
b gaussian identity standardize
c gaussian identity standardize
#Let's say I'd like to change the transformations for a, b and c to "logshift", "log" and "sqrt" respectively
data.missingdf <- change(data.missingdf, y="a", what="transformation", to="logshift")
data.missingdf <- change(data.missingdf, y="b", what="transformation", to="log")
data.missingdf <- change(data.missingdf, y="c", what="transformation", to="sqrt")
show(data.missingdf)
Object of class missing_data.frame with 6 observations on 3 variables
There are 4 missing data patterns
Append '@patterns' to this missing_data.frame to access the corresponding pattern for every observation or perhaps use table()
  type       missing method model
a continuous 1       ppd    linear
b continuous 1       ppd    linear
c continuous 1       ppd    linear

  family   link     transformation
a gaussian identity logshift
b gaussian identity log
c gaussian identity log
#Transformation has been successfully changed for a and b, but for c has been changed to "log" instead of "sqrt"
Thanks in advance for your assistance,
Liz Hensor
Biostatistician
Leeds Institute of Rheumatic and Musculoskeletal Medicine & NIHR Leeds Musculoskeletal Biomedical Research Unit
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.