[R] Summary statistics for matrix columns
arun
smartpink111 at yahoo.com
Sat Nov 24 18:11:46 CET 2012
Hi,
You are right that, in statistics, the range is usually one value (i.e., the
difference between the largest and the smallest). In R, however, the function
range(x) returns both values, the minimum and the maximum.
The description in ?range is:
"Description:
‘range’ returns a vector containing the minimum and maximum of all
the given arguments.
"
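This is also why the summary vector ends up with two "Range" entries: c() splices in both elements of range(x) and numbers the name. A minimal sketch on a hand-made vector (the names v, two and one are only for illustration):

```r
# Toy data chosen by hand; not taken from the thread
v <- c(4, 9, 1, 7)

# range(v) has length 2, so c() adds two elements and
# appends an index to the supplied name
two <- c(Min = min(v), Range = range(v), Max = max(v))
names(two)
# "Min" "Range1" "Range2" "Max"

# diff() collapses c(min, max) to the single value max - min
one <- c(Min = min(v), Range = diff(range(v)), Max = max(v))
names(one)
# "Min" "Range" "Max"
```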
I looked for a similar function in library(matrixStats) and found colRanges() and rowRanges().
set.seed(125)
x <- matrix(sample(1:80),nrow=8)
colnames(x)<- paste("Col",1:ncol(x),sep="")
apply(x,2,function(x) range(x))
# Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10
#[1,] 10 1 17 3 18 11 13 15 2 6
#[2,] 74 77 76 70 65 63 79 80 71 72
library(matrixStats)
colRanges(x)
# [,1] [,2]
#[1,] 10 74
#[2,] 1 77
#[3,] 17 76
-----------------
You could do this to get the range as a single value per column:
apply(x,2,function(x) diff(range(x)))
#Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10
# 64 76 59 67 47 52 66 65 69 66
#or
diff(t(colRanges(x)))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 64 76 59 67 47 52 66 65 69 66
#or
rowDiffs(colRanges(x))
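Applied to the summary from the original question, the same idea could look like the sketch below. It uses base R only; col_summary is a hypothetical helper name, and the seed is added here just to make the run reproducible.

```r
# col_summary is a made-up name for illustration, not from any package
col_summary <- function(m) {
  apply(m, 2, function(v) c(
    Min      = min(v),
    "1st Qu" = quantile(v, 0.25, names = FALSE),
    Median   = median(v),
    Mean     = mean(v),
    Std      = sd(v),
    "3rd Qu" = quantile(v, 0.75, names = FALSE),
    IQR      = IQR(v),
    Range    = diff(range(v)),  # single value: max - min
    Max      = max(v)
  ))
}

set.seed(125)
M <- matrix(sample(1:8000), nrow = 100)
colnames(M) <- paste("Col", 1:ncol(M), sep = "")
res <- col_summary(M)
dim(res)  # one row per statistic, one column per column of M
```

Each column of M now contributes exactly one "Range" row, equal to Max minus Min.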
A.K.
----- Original Message -----
From: frespider <frespider at hotmail.com>
To: r-help at r-project.org
Cc:
Sent: Saturday, November 24, 2012 7:58 AM
Subject: Re: [R] Summary statistics for matrix columns
Hi A.K.,
I have one more question, if you could answer it please.
M <- matrix(sample(1:8000),nrow=100)
colnames(M)<- paste("Col",1:ncol(M),sep="")
apply(M,2,function(x) c(Min=min(x),"1st Qu" =quantile(x, 0.25,names=FALSE),
Range = range(x),
Median = quantile(x, 0.5, names=FALSE),
Mean= mean(x),Std=sd(x),
"3rd Qu" = quantile(x,0.75,names=FALSE),
IQR=IQR(x),Max = max(x)))
Why do I get two range values? Doesn't range mean the difference between the max and min?
Thanks
Date: Fri, 23 Nov 2012 16:08:12 -0800
From: ml-node+s789695n4650613h54 at n4.nabble.com
To: frespider at hotmail.com
Subject: Re: Summary statistics for matrix columns
Hi,
No problem.
There are a couple of other libraries which deal with summary statistics:
library(pastecs)
?stat.desc
library(matrixStats)
#Using the functions from package: matrixStats
fun1<-function(x){
res<-rbind(colMins(x),colQuantiles(x)[,2],colMedians(x),colMeans(x),colSds(x),colQuantiles(x)[,4],colIQRs(x),colMaxs(x))
row.names(res)<-c("Min.","1st Qu.","Median","Mean","sd","3rd Qu.","IQR","Max.")
res}
set.seed(125)
x <- matrix(sample(1:80),nrow=8)
colnames(x)<- paste("Col",1:ncol(x),sep="")
fun1(x)
# Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8
#Min. 10.00000 1.00000 17.00000 3.00000 18.00000 11.00000 13.00000 15.00000
#1st Qu. 24.75000 29.50000 26.00000 7.75000 40.00000 17.25000 27.50000 34.75000
#Median 34.00000 46.00000 42.50000 35.50000 49.50000 23.50000 51.50000 51.50000
#Mean 42.50000 42.75000 41.75000 35.75000 44.87500 26.87500 44.75000 50.12500
#sd 25.05993 27.77846 19.57221 28.40397 16.39196 16.60841 21.97239 25.51995
#3rd Qu. 67.75000 58.50000 50.00000 63.25000 54.25000 30.25000 56.25000 70.50000
#IQR 43.00000 29.00000 24.00000 55.50000 14.25000 13.00000 28.75000 35.75000
#Max. 74.00000 77.00000 76.00000 70.00000 65.00000 63.00000 79.00000 80.00000
# Col9 Col10
#Min. 2.00000 6.00000
#1st Qu. 24.50000 12.50000
#Median 33.50000 48.00000
#Mean 34.87500 40.75000
#sd 24.39811 28.21727
#3rd Qu. 45.25000 63.00000
#IQR 20.75000 50.50000
#Max. 71.00000 72.00000
I thought this could be faster than the previous methods, but it was the slowest.
set.seed(125)
x1 <- matrix(sample(1:800000),nrow=1000)
colnames(x1)<- paste("Col",1:ncol(x1),sep="")
system.time(fun1(x1))
# user system elapsed
# 0.968 0.000 0.956
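A single system.time() call can be noisy, and timings only mean something if both approaches return the same numbers. The sketch below, in base R only, checks agreement first and then times several repetitions. The helpers by_apply and by_cols are hypothetical and compute only three statistics to keep the example short; no timings are quoted because they depend on the machine.

```r
set.seed(125)
x1 <- matrix(sample(1:800000), nrow = 1000)

# Hypothetical helper: one apply() pass building a small vector per column
by_apply <- function(m) {
  apply(m, 2, function(v) c(Min = min(v), Mean = mean(v), Max = max(v)))
}

# Hypothetical helper: one pass per statistic, stacked with rbind()
by_cols <- function(m) {
  rbind(Min = apply(m, 2, min), Mean = colMeans(m), Max = apply(m, 2, max))
}

# Confirm both give the same result before comparing speed
same <- isTRUE(all.equal(by_apply(x1), by_cols(x1)))

# Time five repetitions of each to smooth out noise
t_apply <- system.time(for (i in 1:5) by_apply(x1))
t_cols  <- system.time(for (i in 1:5) by_cols(x1))
```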
A.K.
________________________________
From: Fares Said <[hidden email]>
To: arun <[hidden email]>
Cc: Pete Brecknock <[hidden email]>; R help <[hidden email]>
Sent: Friday, November 23, 2012 10:23 AM
Subject: Re: [R] Summary statistics for matrix columns
Thank you all
Sent from my iPhone
On 2012-11-23, at 10:19, "arun" <[hidden email]> wrote:
> HI,
> You are right.
> It is slower when compared to Pete's solution:
> set.seed(125)
> x <- matrix(sample(1:800000),nrow=1000)
> colnames(x)<- paste("Col",1:ncol(x),sep="")
>
> system.time({
> res<-sapply(data.frame(x),function(x) c(summary(x),sd=sd(x),IQR=IQR(x)))
> res1<-as.matrix(res)
> res2<-res1[c(1:4,7,5,8,6),] })
> # user system elapsed
> # 0.596 0.000 0.597
>
> system.time({
> res<-apply(x,2,function(x) c(Min=min(x),
> "1st Qu" =quantile(x, 0.25,names=FALSE),
> Median = quantile(x, 0.5, names=FALSE),
> Mean= mean(x),
> Sd=sd(x),
> "3rd Qu" = quantile(x,0.75,names=FALSE),
> IQR=IQR(x),
> Max = max(x))) })
> # user system elapsed
> # 0.384 0.000 0.384
>
>
> A.K.
>
>
>
> ----- Original Message -----
> From: Pete Brecknock <[hidden email]>
> To: [hidden email]
> Cc:
> Sent: Friday, November 23, 2012 8:42 AM
> Subject: Re: [R] Summary statistics for matrix columns
>
> frespider wrote
>> Hi,
>>
>> It is possible, but don't you think converting to a data.frame will
>> slow the code?
>>
>> Thanks
>>
>> Date: Thu, 22 Nov 2012 18:31:35 -0800
>> From: ml-node+s789695n4650500h51 at .nabble
>> To: frespider@
>> Subject: RE: Summary statistics for matrix columns
>>
>>
>>
>> HI,
>>
>> Is it possible to use as.matrix()?
>>
>> res<-sapply(data.frame(x),function(x) c(summary(x),sd=sd(x),IQR=IQR(x)))
>>
>> res1<-as.matrix(res)
>>
>> is.matrix(res1)
>>
>> #[1] TRUE
>>
>> res1[c(1:4,7,5,8,6),]
>>
>> #            Col1     Col2     Col3     Col4     Col5     Col6     Col7     Col8
>> #Min.    10.00000  1.00000 17.00000  3.00000 18.00000 11.00000 13.00000 15.00000
>> #1st Qu. 24.75000 29.50000 26.00000  7.75000 40.00000 17.25000 27.50000 34.75000
>> #Median  34.00000 46.00000 42.50000 35.50000 49.50000 23.50000 51.50000 51.50000
>> #Mean    42.50000 42.75000 41.75000 35.75000 44.88000 26.88000 44.75000 50.12000
>> #sd      25.05993 27.77846 19.57221 28.40397 16.39196 16.60841 21.97239 25.51995
>> #3rd Qu. 67.75000 58.50000 50.00000 63.25000 54.25000 30.25000 56.25000 70.50000
>> #IQR     43.00000 29.00000 24.00000 55.50000 14.25000 13.00000 28.75000 35.75000
>> #Max.    74.00000 77.00000 76.00000 70.00000 65.00000 63.00000 79.00000 80.00000
>> #            Col9    Col10
>> #Min.     2.00000  6.00000
>> #1st Qu. 24.50000 12.50000
>> #Median  33.50000 48.00000
>> #Mean    34.88000 40.75000
>> #sd      24.39811 28.21727
>> #3rd Qu. 45.25000 63.00000
>> #IQR     20.75000 50.50000
>> #Max.    71.00000 72.00000
>>
>>
>> A.K.
>> If you reply to this email, your message will be added to the discussion
>> below:
>>
>> http://r.789695.n4.nabble.com/Summary-statistics-for-matrix-columns-tp4650489p4650500.html
>
> Then maybe ....
>
> x <- matrix(sample(1:8000),nrow=100)
> colnames(x)<- paste("Col",1:ncol(x),sep="")
>
> apply(x,2,function(x) c(Min=min(x),
> "1st Qu" =quantile(x, 0.25,names=FALSE),
> Median = quantile(x, 0.5, names=FALSE),
> Mean= mean(x),
> Sd=sd(x),
> "3rd Qu" = quantile(x,0.75,names=FALSE),
> IQR=IQR(x),
> Max = max(x)))
>
> HTH
>
> Pete
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Summary-statistics-for-matrix-columns-tp4650489p4650547.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>