[R] Summary

William Dunlap wdunlap at tibco.com
Tue Sep 29 19:28:30 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Henrique 
> Dallazuanna
> Sent: Tuesday, September 29, 2009 9:57 AM
> To: Ashta
> Cc: R help
> Subject: Re: [R] Summary
> 
> Try this:
> 
> sapply(xc, summary)

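This does not return a matrix when only some columns of xc contain NA's,
because summary() then adds an NA's entry for just those columns and the
results have different lengths.  I think you need a custom summary-like
function for this.  E.g.,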

> df<-data.frame(X1=sqrt(-1:5), X2=-1:5, X3=log(-1:5))
> sapply(df,quantile,na.rm=TRUE)
           X1   X2        X3
0%   0.000000 -1.0      -Inf
25%  1.103553  0.5 0.1732868
50%  1.573132  2.0 0.8958797
75%  1.933013  3.5 1.3143738
100% 2.236068  5.0 1.6094379
> sapply(df,function(x)c(quantile(x,na.rm=TRUE), "NA's"=sum(is.na(x))))
           X1   X2        X3
0%   0.000000 -1.0      -Inf
25%  1.103553  0.5 0.1732868
50%  1.573132  2.0 0.8958797
75%  1.933013  3.5 1.3143738
100% 2.236068  5.0 1.6094379
NA's 1.000000  0.0 1.0000000
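
If you also want the mean and summary()-style row labels, the same idea
can be wrapped in a small helper.  This is only a sketch: the name
numSummary and the example data are made up for illustration and are not
part of base R.

```r
# Hypothetical helper (not in base R): quartiles, mean and NA count,
# always returning 7 named values so sapply() can build a matrix.
numSummary <- function(x) {
  q <- quantile(x, na.rm = TRUE, names = FALSE)  # min, Q1, median, Q3, max
  c("Min."    = q[1],
    "1st Qu." = q[2],
    "Median"  = q[3],
    "Mean"    = mean(x, na.rm = TRUE),
    "3rd Qu." = q[4],
    "Max."    = q[5],
    "NA's"    = sum(is.na(x)))
}

# Illustrative data: only X1 contains an NA
df <- data.frame(X1 = c(1, 2, 3, 4, NA), X2 = 1:5)
res <- sapply(df, numSummary)
print(res)
```

Because the NA count is always included, every column gives a vector of
the same length, whether or not it has missing values.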

The standard summary function for data.frames doesn't do this
because each type of column may have a different sort of summary
(quartiles + mean + possible NA count for numeric columns,
tables for factor columns, etc.). The above only works for
all-numeric data.frames.
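
For a data.frame with mixed column types, one possible workaround is to
pick out the numeric columns first and summarize only those.  A minimal
sketch, with a made-up data.frame called mixed:

```r
# Illustrative mixed data.frame: two numeric columns and one factor
mixed <- data.frame(x = c(1.5, NA, 3.5),
                    g = factor(c("u", "v", "u")),
                    n = 1:3)

# Keep only the numeric columns, then apply the quantile + NA count summary
isNum <- sapply(mixed, is.numeric)
res <- sapply(mixed[isNum],
              function(x) c(quantile(x, na.rm = TRUE), "NA's" = sum(is.na(x))))
print(res)
```

The factor column g is simply dropped here; it would need its own kind
of summary (e.g., table(mixed$g)).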

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

> 
> On Tue, Sep 29, 2009 at 12:42 PM, Ashta <sewashm at gmail.com> wrote:
> > My data is called  xc and has more than 15 variables.
> >
> >
> > When I used summary(xc)   it gave me the detail description of each
> > variable.
> >
> >
> >
> > summary(xc)
> >
> >
> >
> >        Y1                x1                x2                x3 ..
> >  Min.   :0.0000   Min.   : 1.000   Min.   : 1.000   Min.   : 1.000
> >  1st Qu.:0.0000   1st Qu.: 1.000   1st Qu.: 1.000   1st Qu.: 2.000
> >  Median :1.0000   Median : 1.000   Median : 1.000   Median : 3.000
> >  Mean   :0.6505   Mean   : 2.816   Mean   : 3.542   Mean   : 3.433
> >  3rd Qu.:1.0000   3rd Qu.: 4.000   3rd Qu.: 6.000   3rd Qu.: 5.000
> >  Max.   :1.0000   Max.   :10.000   Max.   :10.000   Max.   :10.000
> >
> >
> >
> > But I want the output in the following way.
> >
> >
> >
> >            Y1       x1       x2       x3 ..
> >  Min.   :0.0000    1.000    1.000    1.000
> >  1st Qu.:0.0000    1.000    1.000    2.000
> >  Median :1.0000    1.000    1.000    3.000
> >  Mean   :0.6505    2.816    3.542    3.433
> >  3rd Qu.:1.0000    4.000    6.000    5.000
> >  Max.   :1.0000   10.000   10.000   10.000
> >
> >
> > Is it possible to do it in R?
> >
> >
> > Thanks in advance
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
> 
> -- 
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
> 