[R] Removing "NA" from matrix
arun
smartpink111 at yahoo.com
Fri Jun 14 16:40:14 CEST 2013
Probably, this also works:
dat2<-dat[,(colSums(dat)/dat[1,])!=nrow(dat)]
cor(dat2)
dat$NewCol<-5
dat3<-dat[,(colSums(dat)/dat[1,])!=nrow(dat)]
cor(dat3)
#           ABC         DEF         JKL        MNO
#ABC  1.0000000 -0.75600764  0.55245223 -0.2735585
#DEF -0.7560076  1.00000000 -0.06479082  0.2020781
#JKL  0.5524522 -0.06479082  1.00000000  0.4564568
#MNO -0.2735585  0.20207810  0.45645683  1.0000000
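A note on the colSums() approach above: it compares each column's sum with nrow(dat) times its first value, so a varying column whose mean happens to equal its first value (for example c(2, 1, 3, 2)) would also be dropped. A minimal sketch of a variance-based filter that avoids this, assuming all columns are numeric; the names `tol`, `sds`, `dat_var` and `cor_var` and the tolerance value are illustrative assumptions, not part of the thread:

```r
# Example data from the original question, plus the extra constant column.
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))
dat$NewCol <- 5

# Keep only the columns whose standard deviation exceeds a small tolerance.
# The tolerance is an assumed value; adjust it to the scale of the data.
tol <- 1e-12
sds <- sapply(dat, sd)
dat_var <- dat[, sds > tol, drop = FALSE]

# Correlation matrix of the remaining (varying) columns.
cor_var <- cor(dat_var)
```

Here `drop = FALSE` keeps the result a data frame even if only one column survives the filter.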
A.K.
----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: Katherine Gobin <katherine_gobin at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Friday, June 14, 2013 10:34 AM
Subject: Re: [R] Removing "NA" from matrix
Hi,
Try:
dat1<-dat[sapply(dat,function(x) length(unique(x)))>1]
cor(dat1)
#           ABC         DEF         JKL        MNO
#ABC  1.0000000 -0.75600764  0.55245223 -0.2735585
#DEF -0.7560076  1.00000000 -0.06479082  0.2020781
#JKL  0.5524522 -0.06479082  1.00000000  0.4564568
#MNO -0.2735585  0.20207810  0.45645683  1.0000000
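With many variables it may also help to see which columns were dropped. A small sketch built on the same length(unique(x)) test; the names `n_unique` and `constant_cols` are illustrative and not from the thread:

```r
# Example data from the original question.
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))

# Count the distinct values in each column.
n_unique <- sapply(dat, function(x) length(unique(x)))

# Names of the columns that take a single value (no variation).
constant_cols <- names(n_unique)[n_unique <= 1]

# Keep the varying columns and compute their correlation matrix.
dat1 <- dat[n_unique > 1]
cor1 <- cor(dat1)
```

Printing `constant_cols` lists the variables that were excluded from `cor1`.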
A.K.
From: Katherine Gobin <katherine_gobin at yahoo.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc:
Sent: Friday, June 14, 2013 10:03 AM
Subject: [R] Removing "NA" from matrix
Dear R forum,
I have a data frame
dat = data.frame(
ABC = c(25.28000732,48.33857234,19.8013245,10.68361461),
DEF = c(14.02722251,10.57985168,11.81890316,21.40171514),
GHI = c(1,1,1,1),
JKL = c(45.96423231,44.52986236,16.56514176,32.14545122),
MNO = c(45.38438063,15.54338206,18.78444777,24.29486984))
> dat
       ABC      DEF GHI      JKL      MNO
1 25.28001 14.02722   1 45.96423 45.38438
2 48.33857 10.57985   1 44.52986 15.54338
3 19.80132 11.81890   1 16.56514 18.78445
4 10.68361 21.40172   1 32.14545 24.29487
When I compute the correlation matrix, I get NAs (which is expected, as one of my columns, GHI, shows no variation):
dat_cor = cor(dat)
Warning message:
In cor(dat) : the standard deviation is zero
> dat_cor
           ABC         DEF GHI         JKL        MNO
ABC  1.0000000 -0.75600764  NA  0.55245223 -0.2735585
DEF -0.7560076  1.00000000  NA -0.06479082  0.2020781
GHI         NA          NA   1          NA         NA
JKL  0.5524522 -0.06479082  NA  1.00000000  0.4564568
MNO -0.2735585  0.20207810  NA  0.45645683  1.0000000
In reality I am dealing with about 300 variables and don't know which variables don't vary.
My query is: how do I remove the columns and rows that contain NAs?
So for example, I need the correlation matrix for ABC, DEF, JKL and MNO only.
Kindly guide.
Thanking in advance.
Regards
Katherine
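For the question as literally asked (dropping the NA rows and columns from the correlation matrix itself, rather than filtering the data first), one possible sketch is below. It relies on the observation that a constant variable has NA in every off-diagonal entry of its row; the names `keep` and `dat_cor_clean` are illustrative and not from the thread:

```r
# Example data from the original question.
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))

# The zero-standard-deviation warning is expected here, so it is suppressed.
dat_cor <- suppressWarnings(cor(dat))

# A row with at most one non-NA entry (its diagonal) belongs to a
# variable that does not vary; keep only rows with more than one.
keep <- rowSums(!is.na(dat_cor)) > 1

# Drop the matching rows and columns together so the matrix stays square.
dat_cor_clean <- dat_cor[keep, keep]
```

This assumes the data themselves contain no missing values; if they do, NAs in `dat_cor` can have other causes and this row-count rule may drop too much or too little.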