[R] Summary
William Dunlap
wdunlap at tibco.com
Tue Sep 29 19:28:30 CEST 2009
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Henrique
> Dallazuanna
> Sent: Tuesday, September 29, 2009 9:57 AM
> To: Ashta
> Cc: R help
> Subject: Re: [R] Summary
>
> Try this:
>
> sapply(xc, summary)
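This fails if there are NA's in only some columns of xc: summary() then
returns vectors of different lengths, so sapply() cannot combine them
into a matrix. I think you need a custom summary-like function for
this. E.g.,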
> df <- data.frame(X1 = sqrt(-1:5), X2 = -1:5, X3 = log(-1:5))
> sapply(df, quantile, na.rm = TRUE)
           X1   X2        X3
0%   0.000000 -1.0      -Inf
25%  1.103553  0.5 0.1732868
50%  1.573132  2.0 0.8958797
75%  1.933013  3.5 1.3143738
100% 2.236068  5.0 1.6094379
> sapply(df, function(x) c(quantile(x, na.rm = TRUE), "NA's" = sum(is.na(x))))
           X1   X2        X3
0%   0.000000 -1.0      -Inf
25%  1.103553  0.5 0.1732868
50%  1.573132  2.0 0.8958797
75%  1.933013  3.5 1.3143738
100% 2.236068  5.0 1.6094379
NA's 1.000000  0.0 1.0000000
The standard summary function for data.frames doesn't do this
because each type of column may have a different sort of summary
(quartiles + mean + possible NA count for numeric columns,
tables for factor columns, etc.). The above only works for
all-numeric data.frames.
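
If the data.frame also has non-numeric columns, one way around that
restriction is to drop them before calling sapply. The following is only
a sketch under that assumption: the names num_summary and summary_matrix
are made up for illustration, and the extra factor column G is added to
the example data just to show the filtering. It also adds the mean, so
the rows match the layout asked for in the original question.

```r
# Hypothetical helper (not from the post above): summary statistics
# for one numeric vector, with an explicit NA count as the last entry.
num_summary <- function(x) {
  # names = FALSE keeps the "0%", "25%", ... labels from being pasted
  # onto the names we assign below.
  q <- quantile(x, na.rm = TRUE, names = FALSE)
  c("Min."    = q[1],
    "1st Qu." = q[2],
    "Median"  = q[3],
    "Mean"    = mean(x, na.rm = TRUE),
    "3rd Qu." = q[4],
    "Max."    = q[5],
    "NA's"    = sum(is.na(x)))
}

# Hypothetical wrapper: keep only the numeric columns, then build a
# matrix with one column per variable and one row per statistic.
summary_matrix <- function(d) {
  is_num <- vapply(d, is.numeric, logical(1))
  sapply(d[is_num], num_summary)
}

# Same data as above plus a factor column G, which is skipped.
# suppressWarnings() hides the "NaNs produced" warnings from
# sqrt(-1) and log(-1).
df <- suppressWarnings(
  data.frame(X1 = sqrt(-1:5), X2 = -1:5, X3 = log(-1:5),
             G = factor(rep(c("a", "b"), length.out = 7)))
)
res <- summary_matrix(df)
print(res)
```

Because every column goes through the same function, each result has
the same length and sapply can always simplify to a matrix, whether or
not a given column contains NA's.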
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
>
> On Tue, Sep 29, 2009 at 12:42 PM, Ashta <sewashm at gmail.com> wrote:
> > My data is called xc and has more than 15 variables.
> >
> >
> > When I used summary(xc) it gave me a detailed description of each
> > variable.
> >
> >
> >
> > summary(xc)
> >
> >
> >
> > Y1 x1 x2 x3 ..
> >
> > Min. :0.0000 Min. : 1.000 Min. : 1.000 Min. : 1.000
> >
> > 1st Qu. :0.0000 1st Qu.: 1.000 1st Qu.: 1.000 1st Qu.: 2.000
> >
> > Median :1.0000 Median : 1.000 Median : 1.000 Median : 3.000
> >
> > Mean :0.6505 Mean : 2.816 Mean : 3.542 Mean : 3.433
> >
> > 3rd Qu. :1.0000 3rd Qu.: 4.000 3rd Qu.: 6.000 3rd Qu.: 5.000
> >
> > Max. :1.0000 Max. :10.000 Max. :10.000 Max. :10.000
> >
> >
> >
> > But I want the output in the following way.
> >
> >
> >
> > Y1 x1 x2 x3 ..
> >
> > Min. :0.0000 1.000 1.000 1.000
> >
> > 1st Qu. :0.0000 1.000 1.000 2.000
> >
> > Median :1.0000 1.000 1.000 3.000
> >
> > Mean :0.6505 2.816 3.542 3.433
> >
> > 3rd Qu. :1.0000 4.000 6.000 5.000
> >
> > Max. :1.0000 10.000 10.000 10.000
> >
> >
> > Is it possible to do it in R?
> >
> >
> > Thanks in advance
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>