[R] correlation matrix only if enough non-NA values

Rui Barradas ruipbarradas at sapo.pt
Tue May 29 15:38:12 CEST 2012


Hello,

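Instead of 'sum', use 'mean': the mean of !is.na(x) is the fraction of
non-missing values in a column, so it can be compared with 0.5 directly.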

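# 'tbl' is assumed to be the data frame called 'table' in the question.
# ok is TRUE for columns in which at least half of the values are non-NA.
ok <- apply(tbl, 2, function(x) mean(!is.na(x)) >= 0.5)
cor(tbl[, ok, drop = FALSE], use = "pairwise.complete.obs")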
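
The same idea can be written out as a self-contained sketch. The names
'tbl' and 'min_frac', and the column ST4_capt1, are illustrative
assumptions and do not come from the original data. The last step, which
blanks out pairs with too few shared observations, is an optional stricter
variant rather than part of the suggestion above.

```r
# Sketch only: 'tbl', 'min_frac' and ST4_capt1 are hypothetical names/data.
set.seed(1)
tbl <- data.frame(
  ST1_capt1 = rnorm(10),
  ST2_capt1 = c(NA, NA, NA, NA, NA, 6:10),
  ST3_capt1 = c(1, NA, NA, 4:10),
  ST4_capt1 = c(1, 2, rep(NA, 8))   # only 2 non-NA values out of 10
)
min_frac <- 0.5                     # required fraction of non-NA values

# Fraction of non-NA values per column, compared with the threshold.
ok <- colMeans(!is.na(tbl)) >= min_frac

# drop = FALSE keeps a data frame even if a single column passes.
kept <- tbl[, ok, drop = FALSE]
cm <- cor(kept, use = "pairwise.complete.obs")

# Optional: count observations shared by each pair of columns and
# set coefficients based on too few shared rows to NA.
n_pair <- crossprod(!is.na(kept))
cm[n_pair < min_frac * nrow(tbl)] <- NA
```

Filtering columns only checks each column on its own; with
"pairwise.complete.obs" two columns that each pass can still overlap in
very few rows, which is what the count in 'n_pair' guards against.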

Hope this helps,

Rui Barradas

On 29-05-2012 10:03, jeff6868 wrote:
> Hi everybody.
>
> I'm trying to compute a correlation matrix from a list of files. Each file
> contains 2 columns: "capt1" and "capt2". For this example, I merged them all
> into one data.frame. My data also contain many missing values. The aim is to
> compute a correlation matrix for the same kind of data, of course (one
> correlation matrix for capt1 and another for capt2).
> For the moment, I have a correlation matrix that works (for capt1 or
> capt2). But its correlation coefficients are calculated regardless of
> the number of missing values per column.
> What I want is exactly the same correlation matrix, but with coefficients
> calculated only when at least half of the values in the column are
> non-missing (in the example, at least 5 non-NA values out of 10).
>
> table <- data.frame(ST1_capt1 = rnorm(10), ST1_capt2 = c(1, 2, 3, 4, NA, NA, 7:9, NA),
>    ST2_capt1 = c(NA, NA, NA, NA, NA, 6:10), ST2_capt2 = c(21, NA, NA, NA, 25:30),
>    ST3_capt1 = c(1, NA, NA, 4:10), ST3_capt2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA))
>
> cormatrix <- cor(table[, c(1, 3, 5)], use = "pairwise.complete.obs")
>
> To solve this problem, I think it would be useful to run a check like this
> before calculating the correlation matrix:
>
> if(sum(!is.na(table[1:10,]))>=5) then calculate the correlation
> coefficient, and else (if fewer than 5 non-NA values) put NA in the
> correlation matrix.
>
> I'm trying to combine all of this but it doesn't work. Could somebody
> help me to do this?
> Many thanks!