[R] Removing "NA" from matrix

arun smartpink111 at yahoo.com
Fri Jun 14 16:40:14 CEST 2013


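This probably also works. It relies on a shortcut: a constant column sums to nrow(dat) times its first value.
# Caveat: any column whose mean equals its first value is dropped too,
# and an all-zero column gives 0/0 = NaN.
dat2<-dat[,(colSums(dat)/dat[1,])!=nrow(dat)]
cor(dat2)

# Add a second constant column to check that several are removed at once:
dat$NewCol<-5
dat3<-dat[,(colSums(dat)/dat[1,])!=nrow(dat)]
cor(dat3)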
#           ABC         DEF         JKL        MNO
#ABC  1.0000000 -0.75600764  0.55245223 -0.2735585
#DEF -0.7560076  1.00000000 -0.06479082  0.2020781
#JKL  0.5524522 -0.06479082  1.00000000  0.4564568
#MNO -0.2735585  0.20207810  0.45645683  1.0000000
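
The ratio test above depends on the value in the first row, so here is a sketch that tests each column's standard deviation directly instead. It is only an illustration under stated assumptions: tol is an arbitrary threshold I picked for "no variation", and varies and dat4 are names made up for this example.

```r
# Example data from the original question, plus the extra constant column.
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))
dat$NewCol <- 5

# Assumed threshold: columns with sd at or below tol count as constant.
tol <- 1e-12

# TRUE for every column whose standard deviation exceeds the threshold.
varies <- vapply(dat, function(x) sd(x, na.rm = TRUE) > tol, logical(1))

# sd() is NA when a column has fewer than two non-NA values; drop those too.
varies[is.na(varies)] <- FALSE

names(dat)[!varies]                  # the columns that do not vary
dat4 <- dat[, varies, drop = FALSE]  # drop = FALSE keeps a data frame
cor(dat4)
```

With about 300 variables, names(dat)[!varies] also tells you which columns were dropped, which the one-line versions do not report.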
A.K.




----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: Katherine Gobin <katherine_gobin at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Friday, June 14, 2013 10:34 AM
Subject: Re: [R] Removing "NA" from matrix

Hi,
Try:
# Keep only the columns with more than one distinct value
dat1<-dat[sapply(dat,function(x) length(unique(x)))>1]

cor(dat1)
#           ABC         DEF         JKL        MNO
#ABC  1.0000000 -0.75600764  0.55245223 -0.2735585
#DEF -0.7560076  1.00000000 -0.06479082  0.2020781
#JKL  0.5524522 -0.06479082  1.00000000  0.4564568
#MNO -0.2735585  0.20207810  0.45645683  1.0000000
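
If the correlation matrix has already been computed, the NA rows and columns can also be stripped afterwards, which is closer to what was literally asked. A minimal sketch, assuming at least two columns vary; cm, off, keep and cm_clean are names made up for illustration:

```r
# Example data from the original question.
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))

# Full correlation matrix; the zero-sd warning is expected here, so silence it.
cm <- suppressWarnings(cor(dat))

# Ignore the diagonal, then keep variables that have at least one
# non-NA correlation with some other variable.
off <- cm
diag(off) <- NA
keep <- rowSums(!is.na(off)) > 0

# Drop the same variables from rows and columns.
cm_clean <- cm[keep, keep, drop = FALSE]
cm_clean
```

Filtering the data frame first, as above, avoids the warning altogether; this version is mainly useful when only the correlation matrix is at hand.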


A.K.

----- Original Message -----
From: Katherine Gobin <katherine_gobin at yahoo.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Sent: Friday, June 14, 2013 10:03 AM
Subject: [R] Removing "NA" from matrix

Dear R forum,

I have a data frame:


dat = data.frame(
ABC = c(25.28000732,48.33857234,19.8013245,10.68361461),
DEF = c(14.02722251,10.57985168,11.81890316,21.40171514),
GHI = c(1,1,1,1),
JKL = c(45.96423231,44.52986236,16.56514176,32.14545122),
MNO = c(45.38438063,15.54338206,18.78444777,24.29486984))

> dat
       ABC      DEF GHI      JKL      MNO
1 25.28001 14.02722   1 45.96423 45.38438
2 48.33857 10.57985   1 44.52986 15.54338
3 19.80132 11.81890   1 16.56514 18.78445
4 10.68361 21.40172   1 32.14545 24.29487


When I compute the correlation matrix I get NAs, which is expected because one of my columns shows no variation:

dat_cor = cor(dat)


Warning message:
In cor(dat) : the standard deviation is zero
> dat_cor
           ABC         DEF GHI         JKL        MNO
ABC  1.0000000 -0.75600764  NA  0.55245223 -0.2735585
DEF -0.7560076  1.00000000  NA -0.06479082  0.2020781
GHI         NA          NA   1          NA         NA
JKL  0.5524522 -0.06479082  NA  1.00000000  0.4564568
MNO -0.2735585  0.20207810  NA  0.45645683  1.0000000


In reality I am dealing with about 300 variables and don't know in advance which of them don't vary.

My question is how to remove the rows and columns that contain the NAs.

For example, here I need the correlation matrix for ABC, DEF, JKL and MNO only.

Please advise.

Thanks in advance.

Regards

Katherine





