[R] Changing transformations in mi package

Elizabeth Hensor E.M.A.Hensor at leeds.ac.uk
Thu May 5 09:54:43 CEST 2016


Thanks very much for your help David. I'll contact Ben Goodrich.

-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu] 
Sent: 04 May 2016 22:24
To: Elizabeth Hensor; 'r-help at r-project.org'
Subject: RE: Changing transformations in mi package

Thank you for providing a working example. I think you need to contact the package maintainer:

> maintainer("mi")
[1] "Ben Goodrich <benjamin.goodrich at columbia.edu>"

When I run your code it appears that the c column is correctly transformed to square roots, but the show() function is incorrectly indicating a log transform:

> data.missingdf@variables$c@raw_data # The raw data
[1]  4.2  7.9   NA 16.1 19.9 23.0
> sqrt(data.missingdf@variables$c@raw_data) # The square root of the raw data
[1] 2.049390 2.810694       NA 4.012481 4.460942 4.795832
> data.missingdf@variables$c@data # The transformed data - square roots, not logs
[1] 2.049390 2.810694       NA 4.012481 4.460942 4.795832
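
The same comparison can be made without reading the numbers by eye. What follows is only a minimal sketch in base R, using the values printed above copied in by hand. The helper guess_transformation() and the vector names raw and stored are hypothetical and are not part of the mi package. The candidate list covers only "identity", "log" and "sqrt", because the thread does not give the exact definitions of "standardize" and "logshift".

```r
# Hypothetical helper (not part of mi): return the names of the candidate
# transformations that reproduce the stored values from the raw values.
guess_transformation <- function(raw, stored, tolerance = 1e-6) {
  candidates <- list(identity = identity, log = log, sqrt = sqrt)
  matches <- vapply(
    candidates,
    # all.equal() requires NA positions to coincide and then ignores them;
    # the tolerance absorbs the rounding of the printed values.
    function(f) isTRUE(all.equal(f(raw), stored, tolerance = tolerance)),
    logical(1)
  )
  names(matches)[matches]
}

# Values copied from the console output above.
raw    <- c(4.2, 7.9, NA, 16.1, 19.9, 23.0)
stored <- c(2.049390, 2.810694, NA, 4.012481, 4.460942, 4.795832)

guess_transformation(raw, stored)
```

For these values the helper returns "sqrt", which agrees with the reading above that the stored data are square roots even though show() labels the column "log".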

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Elizabeth Hensor
Sent: Wednesday, May 4, 2016 5:44 AM
To: 'r-help at r-project.org'
Subject: [R] Changing transformations in mi package

Dear all,
I am an R beginner and new to the list. In preparation for using mi to impute missing values I am setting up the missing data frame and would like to specify the transformation types for some of my variables, as I will be using these transformations in my analysis models. According to the documentation the available options are "standardize" (the default), "identity", "log", "logshift" and "sqrt". I can successfully change the transformation types to "log" and "logshift", but when I attempt to change to "sqrt", this changes the type to "log" instead. I'd appreciate your help, please.
Below are details of my system and some code which replicates the issue.

> sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lmerTest_2.0-30 truncnorm_1.0-7 mi_1.0          lme4_1.1-12     Matrix_1.2-4   
[6] pls_2.5-0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4         Formula_1.2-1       cluster_2.0.3       splines_3.2.5      
 [5] MASS_7.3-45         munsell_0.4.3       colorspace_1.2-6    arm_1.8-6          
 [9] lattice_0.20-33     minqa_1.2.4         plyr_1.8.3          nnet_7.3-12        
[13] grid_3.2.5          nlme_3.1-126        gtable_0.2.0        latticeExtra_0.6-28
[17] coda_0.18-1         abind_1.4-3         survival_2.38-3     gridExtra_2.2.1    
[21] RColorBrewer_1.1-2  nloptr_1.0.4        ggplot2_2.1.0       acepack_1.3-3.3    
[25] rpart_4.1-10        scales_0.4.0        Hmisc_3.17-3        foreign_0.8-66

data <- data.frame(a=c(NA,2.1,3.3,4.5,5.9,6.2),b=c(2.2,NA,6.1,8.3,10.2,12.13),c=c(4.2,7.9,NA,16.1,19.9,23))
data

    a     b    c
1  NA  2.20  4.2
2 2.1    NA  7.9
3 3.3  6.10   NA
4 4.5  8.30 16.1
5 5.9 10.20 19.9
6 6.2 12.13 23.0

data.missingdf <- missing_data.frame(data)
show(data.missingdf)

Object of class missing_data.frame with 6 observations on 3 variables

There are 4 missing data patterns

Append '@patterns' to this missing_data.frame to access the corresponding pattern for every observation or perhaps use table()

        type missing method  model
a continuous       1    ppd linear
b continuous       1    ppd linear
c continuous       1    ppd linear

    family     link transformation
a gaussian identity    standardize
b gaussian identity    standardize
c gaussian identity    standardize

#Let's say I'd like to change transformation for a, b and c to "logshift", "log" and "sqrt" respectively

data.missingdf <- change(data.missingdf, y="a", what="transformation", to="logshift")
data.missingdf <- change(data.missingdf, y="b", what="transformation", to="log")
data.missingdf <- change(data.missingdf, y="c", what="transformation", to="sqrt")
show(data.missingdf)

Object of class missing_data.frame with 6 observations on 3 variables

There are 4 missing data patterns

Append '@patterns' to this missing_data.frame to access the corresponding pattern for every observation or perhaps use table()

        type missing method  model
a continuous       1    ppd linear
b continuous       1    ppd linear
c continuous       1    ppd linear

    family     link transformation
a gaussian identity       logshift
b gaussian identity            log
c gaussian identity            log

#Transformation has been successfully changed for a and b, but for c has been changed to "log" instead of "sqrt"

Thanks in advance for your assistance,
Liz Hensor

Biostatistician
Leeds Institute of Rheumatic and Musculoskeletal Medicine & NIHR Leeds Musculoskeletal Biomedical Research Unit

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


