[BioC] wee query on pamr.knnimpute
Aedin
aedin.culhane at ucd.ie
Wed Feb 18 15:09:28 MET 2004
Dear BioC,
Is the following a small bug in pamr.knnimpute?
pamr.knnimpute <- function (data, k = 10) {
    x <- data$x
    N <- dim(x)
    p <- N[2]
    N <- N[1]
    col.nas <- apply(x, 2, is.na)
    if ((sum(col.nas) == N) > 0) {
        stop("Error: A column has all missing values!")
    }
With this check, pamr.knnimpute stops whenever the total number of missing
values in x equals nrow(x), even when no single column is entirely NA. For
example:
> a
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1   11   NA   31   41
 [2,]    2   12   22   32   NA
 [3,]   NA   13   23   NA   43
 [4,]    4   14   NA   34   44
 [5,]    5   NA   NA   35   45
 [6,]   NA   16   26   36   46
 [7,]    7   17   27   37   47
 [8,]    8   18   28   38   48
 [9,]    9   19   29   39   49
[10,]   NA   20   NA   40   50
> col.nas <- apply(a, 2, is.na)
> col.nas
       [,1]  [,2]  [,3]  [,4]  [,5]
 [1,] FALSE FALSE  TRUE FALSE FALSE
 [2,] FALSE FALSE FALSE FALSE  TRUE
 [3,]  TRUE FALSE FALSE  TRUE FALSE
 [4,] FALSE FALSE  TRUE FALSE FALSE
 [5,] FALSE  TRUE  TRUE FALSE FALSE
 [6,]  TRUE FALSE FALSE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE
 [8,] FALSE FALSE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE FALSE FALSE
[10,]  TRUE FALSE  TRUE FALSE FALSE
> sum(col.nas)
[1] 10
> N<-dim(a)[1]
> N
[1] 10
> if ((sum(col.nas) == N) > 0) {
+ stop("Error: A column has all missing values!")
+ }
Error: Error: A column has all missing values!
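Counting the NAs column by column makes the mismatch easier to see. The sketch below rebuilds the matrix a from the session above and uses colSums(is.na(a)); the variable names na_per_col, total_triggers and column_all_na are only illustrative.

```r
# Rebuild the example matrix: columns hold 1:10, 11:20, ..., 41:50.
a <- matrix(1:50, nrow = 10)
a[c(3, 6, 10), 1] <- NA
a[5, 2] <- NA
a[c(1, 4, 5, 10), 3] <- NA
a[3, 4] <- NA
a[2, 5] <- NA

# Number of missing values in each column.
na_per_col <- colSums(is.na(a))

# The original test compares the TOTAL number of NAs with nrow(a) ...
total_triggers <- sum(is.na(a)) == nrow(a)

# ... whereas the intended test asks whether any single column is all NA.
column_all_na <- any(na_per_col == nrow(a))

print(na_per_col)      # 3 1 4 1 1
print(total_triggers)  # TRUE
print(column_all_na)   # FALSE
```

No column comes close to 10 missing values, but the total happens to be 10, so the original condition fires.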
I've edited it to use the following rather ugly R code, which checks
whether any column contains only NA.
knnimpute <- function (data, k = 10) {
    x <- data$x
    p <- ncol(x) # p is the number of columns of x
    N <- nrow(x) # N is the number of rows of x
    col.nas <- apply(apply(x, 2, is.na), 2, sum) # NA count per column
    if (sum(is.element(col.nas, nrow(x))) > 0)
        stop("A column has all missing values!")
    if (sum(col.nas) / prod(dim(x)) > 0.15)
        stop("Greater than 15% missing values in data") # Extra check
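The same two checks could probably be written a little more compactly with colSums() and mean(). This is only a sketch: check_missing and its max_frac argument are hypothetical names, not part of pamr, and the 0.15 default simply mirrors the extra check above.

```r
# Hypothetical helper (not part of pamr): validate a matrix before imputation.
check_missing <- function(x, max_frac = 0.15) {
    na_mask <- is.na(x)
    # A column is entirely missing when its NA count equals the row count.
    if (any(colSums(na_mask) == nrow(x))) {
        stop("A column has all missing values!")
    }
    # The mean of a logical matrix is the overall fraction of missing entries.
    if (mean(na_mask) > max_frac) {
        stop(sprintf("Greater than %g%% missing values in data",
                     100 * max_frac))
    }
    invisible(TRUE)
}

# One NA out of 12 entries (about 8%): passes both checks.
ok <- matrix(c(1, NA, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), nrow = 4)
check_missing(ok)

# Same matrix with the second column blanked out: should be rejected.
bad <- ok
bad[, 2] <- NA
res <- tryCatch(check_missing(bad), error = function(e) conditionMessage(e))
print(res)
```

Note that the example matrix a above has 10 missing values out of 50 (20%), so it would pass the column check but still be stopped by the 15% check.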
Regards
Aedin