[R] error in NORM lib

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Wed Nov 9 08:57:18 CET 2005


On 09-Nov-05 Prof Brian Ripley wrote:
> You really need to send such issues to the _package_ maintainer: please
> see the posting guide.  He will need a completely reproducible example.
> 
> On Wed, 9 Nov 2005, Leo Gürtler wrote:
> 
>> Dear all,
>>
>> I am seeing very strange behavior when imputing NAs with the NORM
>> library. I use R 2.2.0, win32.
>> The code is below; the same dataset was also tried with MICE and
>> aregImpute() from Hmisc _without_ any problem.

The Schafer-derived packages CAT, NORM and MIX require the data to be
passed as a *matrix*. As far as I can see from your code below, you
have passed the data as a data frame (at least, that is what the name
"dframe" suggests).

First convert dframe to a matrix before running *any* of the NORM
commands on it, e.g.

  dmatrix <- as.matrix(dframe)

and then try your NORM commands again.
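
A minimal sketch of that conversion step, using base R only. The small
data frame here is a hypothetical stand-in for the poster's "dframe";
the check assumes all columns are numeric, since as.matrix() on a data
frame with non-numeric columns would give a character matrix, which is
not what is wanted here.

```r
# Hypothetical stand-in for the poster's data frame "dframe"
dframe <- data.frame(a = c(1.2, NA, 3.4, 2.2),
                     b = c(10, 11, NA, 13))

# Convert to a matrix before calling any NORM function
dmatrix <- as.matrix(dframe)

# Stop early if the result is not a numeric matrix
stopifnot(is.matrix(dmatrix), is.numeric(dmatrix))

# Missing values survive the conversion; count them per column
colSums(is.na(dmatrix))
```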

The "matrix" requirement is in fact given in the documentation
for all three, which is essentially as in Schafer's original.
For example:

  library(norm)
  ?prelim.norm

--> "Usage:

     prelim.norm(x)

     Arguments:

       x: data matrix containing missing values. The rows of x 
          correspond to observational units, and the columns to
          variables.  Missing values are denoted by `NA'."

Note "matrix".
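
As an illustration only, the usual NORM sequence on a matrix might look
like the sketch below. The simulated matrix "x" and the seeds are
assumptions made up for this example, not the poster's data, and the
sketch says nothing about whether the poster's dataset will behave the
same way.

```r
library(norm)

set.seed(1)                        # seed for the simulated data (base R)
x <- matrix(rnorm(200), ncol = 4)  # hypothetical 50 x 4 data matrix
x[sample(length(x), 20)] <- NA     # set 20 values to NA at random

rngseed(1234)                      # seed NORM's own random number generator
s        <- prelim.norm(x)         # x is a matrix, as the documentation asks
thetahat <- em.norm(s, showits = FALSE)       # EM estimate as starting value
theta    <- da.norm(s, thetahat, steps = 20)  # 20 data augmentation steps
ximp     <- imp.norm(s, theta, x)  # pass the same matrix given to prelim.norm

sum(is.na(ximp))                   # count of NAs left after imputation
```

Note that imp.norm() is given the same matrix that went into
prelim.norm(), not a subset or a different object.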

Hoping this helps,
Ted.

>> The problem is as follows:
>>
>> (1) using the whole dataset results in very strange imputations:
>> values far beyond the maximum of the respective column (> 200%!).
>> This is reproducible and holds for the whole set of imputed NAs
>> (2) using just part (i.e. some columns) of the dataset leaves some
>> NAs unimputed, i.e. NAs are still in the dataset, but there is
>> neither a warning nor an error
>> (3) data augmentation with da.norm() fails, though not at the first
>> step; mostly 3-5 steps are ok, then it stops (see below)
>>
>> The dataset is from educational research and should be almost
>> normally distributed (slight deviations, but not heavy enough to
>> explain the strange results).
>> I don't understand this, because the dataset works well with MICE and
>> aregImpute() and other statistics _and_ I checked the man pages and
>> the calls do not seem to be wrong.
>> Thus, either it depends on the dataset (but why?) or it is maybe a
>> bug.
>>
>> I appreciate any help,
>>
>> thanks,
>>
>> leo gürtler
>>
>> <---snip--->
>>
>> library(norm)
>> rngseed(1234)
>> load(url("http://www.anicca-vijja.de/lg/dframe.Rdata"))   # load object "dframe"
>> dim(dframe)
>> apply(dframe,2,function(x) sum(is.na(x))) # check how many NAs in the dataset
>> #dframe <- subset(dframe,select=-c(alter,grpzugeh,is1,is4,is6,klassenstufe,mmit,vorai,vorap,voras,vorkf,vorsg,vorvb))
>> s1 <- prelim.norm(dframe)
>> s1$nmis   # re-check of NAs should be identical to above
>> s2 <- prelim.norm(dframe[,1:32]) # see below -> NAs still present, _not_ imputed
>> thetahat1 <- em.norm(s1)
>> theta1 <- da.norm(s1,thetahat1,steps=20,showits=TRUE)  # error:
>>   # Steps of Data Augmentation:
>>   # 1...2...3...4...5...6...7...8...Error: NA/NaN/Inf in foreign function call (arg 2)
>> thetahat2 <- em.norm(s2)
>> ( imputed1 <- imp.norm(s1,thetahat1,dframe) )    # very strange imputed values,
>>                                                  # almost >200% too big
>> ( imputed1.1 <- imp.norm(s1,theta1,dframe)  )    # not possible, because da.norm gives no result!
>> ( imputed2 <- imp.norm(s2,thetahat2,dframe) )    # still NAs in the matrix
>>
>> # visualize the strange values
>> par(mfrow=c(2,1))
>> hist(dframe,prob=TRUE)      # histogram of data set with NAs - original values
>> lines(density(na.omit(dframe)))
>> hist(imputed1,prob=TRUE)   # histogram of dataset with imputed values
>> lines(density(imputed1))
>>
>>
>> </---snip--->
>>
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 09-Nov-05                                       Time: 07:57:11
------------------------------ XFMail ------------------------------



