[R] Percentiles with R for a big data.frame

David Winsemius dwinsemius at comcast.net
Tue Jan 22 15:51:27 CET 2013


On Jan 22, 2013, at 5:58 AM, Simonas Kecorius wrote:

> Hey Duncan,
>
> I can't imagine what formula OpenOffice uses for quantiles either. I
> checked a series of 24 values, calculating a quantile with both
> OpenOffice and R, and the results are identical. The problem arises
> when I try to implement the quantile calculation in this form:
>
> dat2<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))
>
> This code does not generate an error, but I guess it does not give the
> right result either.

You guess? What result and what is "right"?

> So my question would be:
> How can I calculate quantiles for a big data.frame in R (71 columns
> and 288 rows)? I need to take 24 rows, calculate quantiles, then take
> another 24 rows, and so on, for all 71 columns.
>

You have already been told that you are misspelling the name of the R
function: it is `quantile`, not `quantiles`.

The other open question in my mind is whether you were hoping for
something other than a single quantile (in this case the 10th
percentile). Perhaps you wanted the quantiles that would divide your
data into deciles?
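
To make the distinction concrete, here is a minimal sketch on made-up
data. The vector `x` is an assumption standing in for one group of 24
values; it is not your data.

```r
# Hypothetical data: 24 values standing in for one group of rows.
set.seed(1)
x <- rnorm(24)

# A single quantile: the 10th percentile, using the type = 4 definition.
p10 <- quantile(x, probs = 0.1, type = 4)

# The nine cut points that divide the data into deciles.
deciles <- quantile(x, probs = seq(0.1, 0.9, by = 0.1), type = 4)

p10
deciles
```

The first call returns one number per input vector; the second returns
nine, so the shape of any aggregated result depends on which one you ask
for.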

If you want to do the calculation within groups, then the second
argument to `aggregate` (`by`) must specify the grouping. By design,
`aggregate` applies the function to every column of the data frame you
pass it.
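
As a rough sketch of what that could look like for blocks of 24 rows:
the data below are simulated, and the names `dat1`, `newID` and `dat2`
simply mirror the ones in your message, so treat this as an illustration
under those assumptions rather than a fix for your actual data.

```r
# Hypothetical stand-in for the poster's data: 288 rows, 71 numeric columns.
set.seed(42)
dat1 <- as.data.frame(matrix(rnorm(288 * 71), nrow = 288, ncol = 71))

# Grouping variable: rows 1-24 form group 1, rows 25-48 form group 2, etc.
# This relies on nrow(dat1) being a multiple of 24.
newID <- rep(seq_len(nrow(dat1) / 24), each = 24)

# aggregate() applies FUN to every column of its first argument within
# each group; extra arguments (probs, type) are passed on to quantile().
dat2 <- aggregate(dat1, by = list(newID = newID), FUN = quantile,
                  probs = 0.1, type = 4)

dim(dat2)   # 12 groups; one grouping column plus 71 quantile columns
```

If deciles are wanted instead, passing `probs = seq(0.1, 0.9, by = 0.1)`
makes `quantile` return nine values per group, and each column of the
result then holds a matrix rather than a plain vector.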
-- 
David.

> Thanks in advance.
>
> 2013/1/22 Duncan Murdoch <murdoch.duncan at gmail.com>
>
>> On 13-01-21 6:41 PM, Simonas Kecorius wrote:
>>
>>> Dear R users,
>>>
>>> I ran into a problem dealing with percentiles in R.
>>>
>>> From my previous questions: I have a big data.frame with lots of
>>> columns and rows. The following commands enable me to calculate
>>> means for the whole data frame.
>>>
>>> dat1$newID<-rep(1:(nrow(dat1)/12),each=12) #if nrow(dat1)/12 is integer
>>>
>>> dat2<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),mean))
>>>
>>> What I need is to calculate percentiles for each group (there are 12
>>> values in a group). I tried the following:
>>>
>>> duomenai<-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4))
>>>
>>
>> You didn't define quantiles, so that won't work.  Assuming that's a
>> typo, and you meant quantile...
>>
>>
>>>
>>> First, is this syntax right?
>>> Secondly, I tried to calculate percentiles using OpenOffice and there
>>> is disagreement between the values. If I do the calculation for a
>>> single row of numbers, then the R and OpenOffice numbers coincide,
>>> but for a data.frame it seems that something goes wrong.
>>>
>>
>> There are lots of different formulas for empirical quantiles.  The
>> ones available in R are described in the ?quantile help topic.  What
>> formula does OpenOffice use?
>>
>> Duncan Murdoch
>>
>>
>
>
> -- 
> Simonas Kecorius
>

David Winsemius, MD
Alameda, CA, USA


