[R] How to force aggregate to exclude NA ?
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Dec 7 13:43:16 CET 2008
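Try passing na.rm = TRUE through aggregate, which hands extra arguments on to the function. In each pair below, the first line gives the sums with NAs excluded and the second counts the non-NA values: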
aggregate(m[, -(1:2)], m[1], sum, na.rm = TRUE)
aggregate(!is.na(m[, -(1:2)]), m[1], sum, na.rm = TRUE)
# or, with rowsum (the result carries the group labels as row names rather than in a column):
rowsum(m[, -(1:2)], m[,1], na.rm = TRUE)
rowsum(0+!is.na(m[, -(1:2)]), m[,1], na.rm = TRUE)
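The same pass-through of na.rm = TRUE should also cover the other statistics asked about below (var, sd), since aggregate forwards extra arguments to whatever function it is given. A minimal sketch, assuming a small made-up data frame d with hypothetical columns g, x and y (none of these names come from the thread):

```r
# Hypothetical toy data; d, g, x and y are illustrative names only.
d <- data.frame(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(1, NA, 3, 2, 4, NA),
  y = c(NA, 2, 4, 1, 1, 5)
)

# Arguments given after the function are forwarded to it for every group,
# so no NA-excluding wrapper is needed for each statistic.
sd_by_g   <- aggregate(d[, c("x", "y")], d["g"], sd,   na.rm = TRUE)
var_by_g  <- aggregate(d[, c("x", "y")], d["g"], var,  na.rm = TRUE)
mean_by_g <- aggregate(d[, c("x", "y")], d["g"], mean, na.rm = TRUE)

# Number of non-missing values per group: sum a logical "is not NA" matrix.
n_by_g <- aggregate(!is.na(d[, c("x", "y")]), d["g"], sum)
```

Counting with !is.na(x) rather than as.logical(x) also keeps zeros in the count, because as.logical(0) is FALSE.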
On Sun, Dec 7, 2008 at 7:06 AM, Daren Tan <daren76 at hotmail.com> wrote:
>
> The aggregate function does "almost" all that I need to summarize a dataset, except that I can't specify exclusion of NAs without a little bit of hassle.
>
>> set.seed(143)
>> m <- data.frame(A=sample(LETTERS[1:5], 20, T), B=sample(LETTERS[1:10], 20, T), C=sample(c(NA, 1:4), 20, T), D=sample(c(NA,1:4), 20, T))
>> m
>    A B  C  D
> 1  E I  1 NA
> 2  A C NA NA
> 3  D I NA  3
> 4  C I  2  4
> 5  A C  3  2
> 6  E J  1  2
> 7  D J  2  2
> 8  C G  4  1
> 9  C D NA  3
> 10 B G  3 NA
> 11 C B  4  2
> 12 A B NA NA
> 13 E A NA  4
> 14 B B  3  3
> 15 E I  4  1
> 16 E J  3  1
> 17 B J  4  4
> 18 B J  1  3
> 19 D D  4  2
> 20 B B  4  3
>
>> aggregate(m[,-c(1:2)], by=list(m[,1]), sum)
>   Group.1  C  D
> 1       A NA NA
> 2       B 15 NA
> 3       C NA 10
> 4       D NA  7
> 5       E NA NA
>
>> aggregate(m[,-c(1:2)], by=list(m[,1]), length)
>   Group.1 C D
> 1       A 3 3
> 2       B 5 5
> 3       C 4 4
> 4       D 3 3
> 5       E 5 5
>
> My own versions of length and sum, defined to exclude NA:
>
>> mylength <- function(x) { sum(as.logical(x), na.rm=T) }
>> mysum <- function(x) {sum(x, na.rm=T)}
>
>> aggregate(m[,-c(1:2)], by=list(m[,1]), mysum)  # this computes correctly
>   Group.1  C  D
> 1       A  3  2
> 2       B 15 13
> 3       C 10 10
> 4       D  6  7
> 5       E  9  8
>
>> aggregate(m[,-c(1:2)], by=list(m[,1]), mylength)  # this computes correctly
>   Group.1 C D
> 1       A 1 1
> 2       B 5 4
> 3       C 3 4
> 4       D 2 3
> 5       E 4 4
>
> There are other statistics I need to compute, e.g. var and sd, and it is a hassle to write a customized NA-excluding version of each. Are there any alternative approaches?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>