[BioC] wee query on pamr.knnimpute

Aedin aedin.culhane at ucd.ie
Wed Feb 18 15:09:28 MET 2004


Dear BioC,

Is the following a small bug in pamr.knnimpute?

pamr.knnimpute<- function (data, k = 10)  {
    x <- data$x
    N <- dim(x)
    p <- N[2]
    N <- N[1]
    col.nas <- apply(x, 2, is.na)
    if ((sum(col.nas) == N) > 0) {
        stop("Error: A column has all missing values!")
    }

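As written, the test compares the total number of missing values in x with
nrow(x), so pamr.knnimpute stops whenever those two numbers happen to be
equal, even if no single column is entirely NA. For example: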


> a
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1   11   NA   31   41
 [2,]    2   12   22   32   NA
 [3,]   NA   13   23   NA   43
 [4,]    4   14   NA   34   44
 [5,]    5   NA   NA   35   45
 [6,]   NA   16   26   36   46
 [7,]    7   17   27   37   47
 [8,]    8   18   28   38   48
 [9,]    9   19   29   39   49
[10,]   NA   20   NA   40   50
> col.nas <- apply(a, 2, is.na)
> col.nas
       [,1]  [,2]  [,3]  [,4]  [,5]
 [1,] FALSE FALSE  TRUE FALSE FALSE
 [2,] FALSE FALSE FALSE FALSE  TRUE
 [3,]  TRUE FALSE FALSE  TRUE FALSE
 [4,] FALSE FALSE  TRUE FALSE FALSE
 [5,] FALSE  TRUE  TRUE FALSE FALSE
 [6,]  TRUE FALSE FALSE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE
 [8,] FALSE FALSE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE FALSE FALSE
[10,]  TRUE FALSE  TRUE FALSE FALSE

> sum(col.nas)
[1] 10
> N<-dim(a)[1]
> N
[1] 10
>     if ((sum(col.nas) == N) > 0) {
+         stop("Error: A column has all missing values!")
+     }
Error: Error: A column has all missing values!
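
The same test can also miss the case it is meant to catch. A minimal sketch,
using a made-up matrix b (hypothetical, not taken from pamr) in which column 1
is entirely NA but the total NA count differs from nrow(b):

```r
# Hypothetical 4 x 3 matrix: column 1 is all NA, plus one extra NA in column 2
b <- matrix(c(NA, NA, NA, NA,
              1,  NA, 3,  4,
              5,  6,  7,  8), nrow = 4)
N <- nrow(b)
col.nas <- apply(b, 2, is.na)

total.nas <- sum(col.nas)                 # total NAs in the whole matrix
original.check <- (total.nas == N) > 0    # the test used in pamr.knnimpute

per.column <- colSums(is.na(b))           # NA count for each column
fixed.check <- any(per.column == N)       # is any column entirely NA?
```

Here total.nas is 5 and N is 4, so original.check is FALSE and stop() would
never be reached, while fixed.check is TRUE because column 1 has N missing
values.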




I've edited it to use the following rather ugly R code, which checks whether
any column contains only NA.

knnimpute <- function (data, k = 10) {
    x <- data$x
    p <- ncol(x)    # p is the number of columns of x
    N <- nrow(x)    # N is the number of rows of x

    # Number of NAs in each column
    col.nas <- apply(apply(x, 2, is.na), 2, sum)
    if (sum(is.element(col.nas, nrow(x))) > 0)
        stop("A column has all missing values!")
    if (sum(col.nas) / prod(dim(x)) > 0.15)
        stop("Greater than 15% missing values in data")  # Extra check
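
The two checks could probably be written more compactly with colSums(). A
minimal sketch, assuming x is a numeric matrix; check.missing and its max.frac
argument are hypothetical names for illustration and are not part of pamr:

```r
# check.missing is a hypothetical helper, not a pamr function.
# max.frac = 0.15 mirrors the extra 15% check in the code above.
check.missing <- function(x, max.frac = 0.15) {
    col.nas <- colSums(is.na(x))    # NA count per column
    if (any(col.nas == nrow(x)))
        stop("A column has all missing values!")
    if (sum(col.nas) / length(x) > max.frac)
        stop(sprintf("Greater than %g%% missing values in data",
                     100 * max.frac))
    invisible(col.nas)
}

# Rebuild the 10 x 5 example matrix a from earlier in this message
a <- matrix(as.numeric(1:50), nrow = 10)
a[c(3, 6, 10), 1] <- NA
a[5, 2] <- NA
a[c(1, 4, 5, 10), 3] <- NA
a[3, 4] <- NA
a[2, 5] <- NA

# With the fraction check effectively disabled, a passes the column check
na.counts <- check.missing(a, max.frac = 1)

# With the default threshold, capture the error message instead of stopping
msg <- tryCatch(check.missing(a), error = function(e) conditionMessage(e))
```

Note that the example matrix a has 10 of its 50 values missing (20%), so it
passes the all-NA column check but is rejected by the extra 15% check.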


Regards
Aedin


