[R] Summary
Henrique Dallazuanna
wwwhsd at gmail.com
Tue Sep 29 19:32:07 CEST 2009
An alternative in cases where there are NA's should be:
sapply(sapply(df, summary), '[', 1:7)
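
If you would rather have row labels that do not depend on which column happens to contain NA's, a fixed-length helper is another option. This is only a rough sketch: num_summary is a name I made up for illustration (it is not a base R function), df is the example data frame from Bill's message quoted below, and the helper assumes every column it is given is numeric.

```r
# Example data from the thread: X1 and X3 each contain one NaN, X2 has none.
# sqrt() and log() warn about the NaNs, so the warnings are suppressed here.
df <- suppressWarnings(
  data.frame(X1 = sqrt(-1:5), X2 = -1:5, X3 = log(-1:5))
)

# Hypothetical helper: always returns the same 7 named values,
# so sapply() can simplify the per-column results into a matrix.
num_summary <- function(x) {
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                na.rm = TRUE, names = FALSE)
  c("Min."    = q[1],
    "1st Qu." = q[2],
    "Median"  = q[3],
    "Mean"    = mean(x, na.rm = TRUE),
    "3rd Qu." = q[4],
    "Max."    = q[5],
    "NA's"    = sum(is.na(x)))
}

# Keep only numeric columns; the helper makes no sense for factors.
is_num <- vapply(df, is.numeric, logical(1))
res <- sapply(df[is_num], num_summary)
print(res)
```

Because every column yields a vector of the same length, the result is a plain numeric matrix with one column per variable, which is the layout asked for in the original question.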
On Tue, Sep 29, 2009 at 2:28 PM, William Dunlap <wdunlap at tibco.com> wrote:
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Henrique
>> Dallazuanna
>> Sent: Tuesday, September 29, 2009 9:57 AM
>> To: Ashta
>> Cc: R help
>> Subject: Re: [R] Summary
>>
>> Try this:
>>
>> sapply(xc, summary)
>
> This fails if there are NA's in xc. I think you need a custom summary-like
> function for this. E.g.,
>
>> df<-data.frame(X1=sqrt(-1:5), X2=-1:5, X3=log(-1:5))
>> sapply(df,quantile,na.rm=TRUE)
>            X1   X2        X3
> 0%   0.000000 -1.0      -Inf
> 25%  1.103553  0.5 0.1732868
> 50%  1.573132  2.0 0.8958797
> 75%  1.933013  3.5 1.3143738
> 100% 2.236068  5.0 1.6094379
>> sapply(df,function(x)c(quantile(x,na.rm=TRUE), "NA's"=sum(is.na(x))))
>            X1   X2        X3
> 0%   0.000000 -1.0      -Inf
> 25%  1.103553  0.5 0.1732868
> 50%  1.573132  2.0 0.8958797
> 75%  1.933013  3.5 1.3143738
> 100% 2.236068  5.0 1.6094379
> NA's 1.000000  0.0 1.0000000
>
> The standard summary function for data.frames doesn't do this
> because each type of column may have a different sort of summary
> (quartiles + mean + possible NA count for numeric columns,
> tables for factor columns, etc.). The above only works for
> all-numeric data.frames.
>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
>>
>> On Tue, Sep 29, 2009 at 12:42 PM, Ashta <sewashm at gmail.com> wrote:
>> > My data is called xc and has more than 15 variables.
>> >
>> >
>> > When I used summary(xc) it gave me a detailed description of each
>> > variable.
>> >
>> >
>> >
>> > summary(xc)
>> >
>> >
>> >
>> >       Y1                x1                x2                x3         ..
>> > Min.   :0.0000   Min.   : 1.000   Min.   : 1.000   Min.   : 1.000
>> > 1st Qu.:0.0000   1st Qu.: 1.000   1st Qu.: 1.000   1st Qu.: 2.000
>> > Median :1.0000   Median : 1.000   Median : 1.000   Median : 3.000
>> > Mean   :0.6505   Mean   : 2.816   Mean   : 3.542   Mean   : 3.433
>> > 3rd Qu.:1.0000   3rd Qu.: 4.000   3rd Qu.: 6.000   3rd Qu.: 5.000
>> > Max.   :1.0000   Max.   :10.000   Max.   :10.000   Max.   :10.000
>> >
>> >
>> >
>> > But I want the output in the following way.
>> >
>> >
>> >
>> >          Y1       x1      x2      x3  ..
>> > Min.    :0.0000   1.000   1.000   1.000
>> > 1st Qu. :0.0000   1.000   1.000   2.000
>> > Median  :1.0000   1.000   1.000   3.000
>> > Mean    :0.6505   2.816   3.542   3.433
>> > 3rd Qu. :1.0000   4.000   6.000   5.000
>> > Max.    :1.0000  10.000  10.000  10.000
>> >
>> >
>> > Is it possible to do it in R?
>> >
>> >
>> > Thanks in advance
>> >
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O