[R] summary stats including NA's into new dataframe

Uwe Ligges ligges at statistik.uni-dortmund.de
Thu Dec 19 08:52:04 CET 2002


Alexander.Herr at csiro.au wrote:
> Thanks Uwe,
> Can't seem to get your formula to work...
> I should have made this clearer. I am after a listing of the number of NAs
> and valid Ns (or total N) for export to CSV, e.g.:
> Variable, mean, Missing Values, Valid N
> test,	6.00000,2,18
> bummer,5.44444,1,19
> 
> from:
> 
> x<-c(1,4,2,6,8,3,5,6,7,8,7,2,4,7,5,1,8,9,8,9)
> labl<-gl(2,2,length=20,labels=c("test","bummer"))
> x[3]<-NA
> x[5]<-NA
> x[6]<-NA
> 
> 
> aggregate(x,by=list(labl),mean, sum(is.na(x)))
> #  Group.1  x
> #1    test NA
> #2  bummer NA
> 
> aggregate(x,by=list(labl),mean, na.rm=T)
> #  Group.1           x
> #1    test 6.000000
> #2  bummer 5.444444
> 
> aggregate(x,by=list(labl),sum(is.na(x)))
> # Error in FUN(X[[1]], ...) : Argument "INDEX" is missing, with no default

You didn't read carefully enough. FUN has to be a function, not the
value of the expression sum(is.na(x)):

aggregate(......., function(x) sum(is.na(x)))
                    ^^^^^^^^^^^^

Or, instead of the anonymous function, you can define a named one:

countna <- function(x) sum(is.na(x))
aggregate(......., countna)
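
For the table asked for above (mean, missing values, valid N per group), a minimal sketch along these lines should work. It reuses the x and labl from the question; the helper countvalid, the column names, and the output file name are assumptions made for illustration only, not anything from the original posts.

```r
# Data from the question
x <- c(1, 4, 2, 6, 8, 3, 5, 6, 7, 8, 7, 2, 4, 7, 5, 1, 8, 9, 8, 9)
labl <- gl(2, 2, length = 20, labels = c("test", "bummer"))
x[c(3, 5, 6)] <- NA

# Named helpers; countvalid is a hypothetical companion to countna
countna    <- function(x) sum(is.na(x))
countvalid <- function(x) sum(!is.na(x))

# One aggregate() call per statistic, all using the same grouping,
# so the rows of each result line up
grp <- list(Variable = labl)
res <- aggregate(x, by = grp, FUN = mean, na.rm = TRUE)
names(res)[2] <- "mean"
res$Missing <- aggregate(x, by = grp, FUN = countna)$x
res$ValidN  <- aggregate(x, by = grp, FUN = countvalid)$x

# Export; the file name is an assumption
out_file <- file.path(tempdir(), "summary_stats.csv")
write.csv(res, out_file, row.names = FALSE)
```

With this data the per-group valid counts are 8 and 9 (each group has 10 values), not the 18 and 19 in the illustrative table quoted above. For a data frame with several columns, stack() returns the columns values and ind, which would take the places of x and labl here.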


> Cheers Herry
> 
> 
> --------------------------------------------
> Alexander Herr - Herry
> Northern Futures
> Davies Laboratory
> PMB, Aitkenvale, QLD 4814
> Phone (07) 4753 8510
> Fax   (07) 4753 8650
> Home: http://batcall.csu.edu.au/~aherr
> CSIRO Sustainable Ecosystems:
> http://www.cse.csiro.au/
> --------------------------------------------
> 
> 
> 
> -----Original Message-----
> From: Uwe Ligges [mailto:ligges at statistik.uni-dortmund.de]
> Sent: Wednesday, 18 December 2002 5:30 PM
> To: Alexander.Herr at csiro.au
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] summary stats including NA's into new dataframe
> 
> 
> Alexander.Herr at csiro.au wrote:
> 
>>List,
>>
>>I am trying to extract summary statistics from a data frame with several
>>variables (and NAs) into a dataframe with the columns: Variablename (ie the
>>colnames of original data), mean, stdev, max, min, Valid N, Missing Values.
>>Extracting the statistics is straightforward using stack and aggregate.
>>However, I haven't succeeded in obtaining the number of Missing Values. I
>>can extract these from describe (Hmisc library), but surely there is a
>>simpler way similar to obtaining the mean using aggregate?
> 
> 
> The similar way is:
> 
> aggregate(......., function(x) sum(is.na(x)))
> 
> Uwe Ligges
> 
> 
>>Suggestions are much appreciated
> 
> 
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-help



