[R] NA values trimming
nyk
nick at nyk.ch
Mon Jul 6 00:12:56 CEST 2009
Thanks for your reply! This is what I was looking for!
I'm using
nas1 <- apply(data_matrix, 1, function(x) sum(is.na(x)) / ncol(data_matrix))  # fraction of NAs per row
nas2 <- apply(data_matrix, 2, function(x) sum(is.na(x)) / nrow(data_matrix))  # fraction of NAs per column
The thing about "significantly more" isn't really helpful now that I look at the
data.
I'd better write a function that removes the row or column with the highest
fraction of NAs, which I'll repeat as many times as it takes to get useful
data. For example, I want to do heatmaps and dendrograms, but the data has
too many NA values, so I get "Error in hclustfun(distfun(x)) : NA/NaN/Inf
in foreign function call (arg 11)"
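A minimal sketch of such a function, assuming the data is a numeric matrix. The names trim_na and max_na_frac are hypothetical, chosen only for illustration. Ties between the worst row and the worst column are broken in favour of removing the row, which is an arbitrary choice.

```r
# Hypothetical helper: repeatedly drop the single row or column with the
# highest fraction of NAs until no row or column exceeds max_na_frac.
trim_na <- function(m, max_na_frac = 0) {
  stopifnot(is.matrix(m))
  while (nrow(m) > 0 && ncol(m) > 0) {
    row_frac <- rowMeans(is.na(m))  # fraction of NAs in each row
    col_frac <- colMeans(is.na(m))  # fraction of NAs in each column
    if (max(row_frac, col_frac) <= max_na_frac) break
    if (max(row_frac) >= max(col_frac)) {
      # The worst row is at least as bad as the worst column: drop the row
      m <- m[-which.max(row_frac), , drop = FALSE]
    } else {
      m <- m[, -which.max(col_frac), drop = FALSE]
    }
  }
  m
}

# Small example: row 2 is entirely NA and column 2 has one more NA
m <- matrix(c( 1, NA,  3,
              NA, NA, NA,
               7,  8,  9), nrow = 3, byrow = TRUE)
trimmed <- trim_na(m)
```

With the default max_na_frac = 0 the loop only stops once no NA is left, so the result can be passed to heatmap() or hclust(dist(...)) without triggering the NA error. The price is that it may discard a lot of data, so a larger threshold can be tried if some remaining NAs are acceptable.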
David Winsemius wrote:
>
>
> On Jul 4, 2009, at 9:22 PM, nyk wrote:
>
>>
>> I have a data matrix containing quite a lot of missing values (NA). I know
>> how to remove all columns or rows containing NA values, but is there some
>> standard method for removing not all NA-containing rows/columns, but only
>> those which have significantly more NAs than others?
>
> You have not defined what you mean by "significantly more than the
> others", so perhaps all you want to know is how to count the NAs in a
> vector:
>
> > x=c(1,2,3,NA, 5,6,NA)
> > sum(is.na(x))
> [1] 2
>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
--
View this message in context: http://www.nabble.com/NA-values-trimming-tp24339399p24347436.html
Sent from the R help mailing list archive at Nabble.com.