[R] error in NORM lib
Leo Gürtler
leog at anicca-vijja.de
Wed Nov 9 02:54:21 CET 2005
Dear all,
I am experiencing very strange behavior when imputing NAs with the NORM
library. I use R 2.2.0, win32.
The code is below. The same dataset was also tried with MICE and with
aregImpute() from Hmisc _without_ any problem.
The problem is as follows:
(1) Using the whole dataset results in very strange imputations: values
far beyond the maximum of the respective column (more than 200% too
large). This is reproducible and holds for the whole set of imputed NAs.
(2) Using only part of the dataset (i.e. a subset of the columns) leaves
some NAs unimputed, i.e. NAs are still in the dataset, but there is
neither a warning nor an error.
(3) Data augmentation with da.norm() fails, but not at the first step:
usually 3-5 steps run fine, then it stops (see below).
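Points (1) and (2) can be checked mechanically. The following is a
minimal sketch in base R only; the helper name check_imputation() and the
toy matrices are hypothetical and not part of the norm package. It counts,
per column, how many originally missing cells are still NA after
imputation and how many imputed cells fall outside the observed range.

```r
# Hypothetical helper: compare an imputed matrix with the original one.
check_imputation <- function(original, imputed) {
  original <- as.matrix(original)
  imputed  <- as.matrix(imputed)
  stopifnot(identical(dim(original), dim(imputed)))
  # Observed range of each column, ignoring missing values
  obs_min <- apply(original, 2, min, na.rm = TRUE)
  obs_max <- apply(original, 2, max, na.rm = TRUE)
  was_na  <- is.na(original)
  # Imputed cells that fall outside the observed range of their column
  too_low  <- sweep(imputed, 2, obs_min, "<") & was_na
  too_high <- sweep(imputed, 2, obs_max, ">") & was_na
  data.frame(
    n_missing    = colSums(was_na),
    still_na     = colSums(is.na(imputed)),
    out_of_range = colSums(too_low | too_high, na.rm = TRUE)
  )
}

# Toy data, invented purely for illustration
orig_toy <- cbind(a = c(1, 2, NA, 4), b = c(10, NA, 30, NA))
imp_toy  <- cbind(a = c(1, 2, 9, 4),  b = c(10, 20, 30, NA))
report <- check_imputation(orig_toy, imp_toy)
print(report)
```

With the objects from the code below, the call would be
check_imputation(dframe, imputed1), assuming both have the same
dimensions.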
The dataset is from educational research and should be approximately
normally distributed (there are slight deviations, but not nearly large
enough to explain the strange results).
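A rough way to back up the claim of approximate normality is to look at
skewness and excess kurtosis per column. This is only a sketch in base R;
the helper name shape_summary() and the toy data are hypothetical, and
the moment-based estimates are a quick screen, not a formal test. Values
near zero for both measures would be consistent with approximate
normality.

```r
# Hypothetical helper (not from any package): moment-based skewness and
# excess kurtosis per column, using the observed values only.
shape_summary <- function(x) {
  x <- as.matrix(x)
  t(apply(x, 2, function(col) {
    v <- col[!is.na(col)]                              # drop missing values
    z <- (v - mean(v)) / sqrt(mean((v - mean(v))^2))   # standardize
    c(n = length(v), skewness = mean(z^3), excess_kurtosis = mean(z^4) - 3)
  }))
}

# Toy data, invented for illustration: one symmetric, one right-skewed column
toy <- cbind(sym = c(-2, -1, 0, 1, 2, NA), skew = c(1, 1, 1, 2, 3, 10))
shape <- shape_summary(toy)
print(round(shape, 2))
```

For the real data the call would simply be shape_summary(dframe).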
I don't understand this, because the dataset works well with MICE,
aregImpute() and other statistical procedures, _and_ I checked the man
pages and the calls do not appear to be wrong.
So either it depends on the dataset (but why?) or it may be a bug.
I appreciate any help,
thanks,
leo gürtler
<---snip--->
library(norm)
rngseed(1234)
load(url("http://www.anicca-vijja.de/lg/dframe.Rdata")) # load object "dframe"
dim(dframe)
apply(dframe,2,function(x) sum(is.na(x))) # check how many NAs are in the dataset
#dframe <- subset(dframe,select=-c(alter,grpzugeh,is1,is4,is6,klassenstufe,mmit,vorai,vorap,voras,vorkf,vorsg,vorvb))
s1 <- prelim.norm(dframe)
s1$nmis # re-check of NAs should be identical to above
s2 <- prelim.norm(dframe[,1:32]) # see below -> NAs are still present, _not_ imputed
thetahat1 <- em.norm(s1)
theta1 <- da.norm(s1,thetahat1,steps=20,showits=TRUE) # error:
# Steps of Data Augmentation:
# 1...2...3...4...5...6...7...8...Error: NA/NaN/Inf in foreign function call (arg 2)
thetahat2 <- em.norm(s2)
( imputed1 <- imp.norm(s1,thetahat1,dframe) ) # very strange imputed values,
# almost >200% too big compared with what is expected
( imputed1.1 <- imp.norm(s1,theta1,dframe) ) # not possible, because da.norm gives no result!
( imputed2 <- imp.norm(s2,thetahat2,dframe) ) # still NAs in the matrix
# visualize the strange values
par(mfrow=c(2,1))
hist(dframe,prob=TRUE) # histogram of the dataset with NAs - original values
lines(density(na.omit(dframe)))
hist(imputed1,prob=TRUE) # histogram of the dataset with imputed values
lines(density(imputed1))
</---snip--->