[R] median by geometric mean -- are we missing what's important?

Bert Gunter gunter.berton at gene.com
Mon Jan 17 18:23:29 CET 2011


Folks:

I know this may be overreaching, but are we missing what's important?
WHY do the zeros occur? Are they values less than a known or unknown
LOD (limit of detection)? -- and/or is there positive mass on zero? In either case, using
logs to calculate a geometric mean may not make sense. Paraphrasing
Greg Snow, what is the scientific question? What is the model?
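
To make the concern about zeros concrete, here is a minimal sketch in R. The data vector is made up for illustration, and it assumes the zeros are exact values rather than censored readings below a LOD. It shows that log(0) is -Inf, so the log-based median collapses to 0 as soon as one of the two middle values is zero.

```r
# Hypothetical data: an even number of points, half of them exact zeros.
x <- c(0, 0, 0, 0, 2, 5, 9, 14)

# log(0) is -Inf in R; no error or warning is raised.
log_zero <- log(0)

# The two middle values of the sorted data are 0 and 2.
middle <- sort(x)[c(4, 5)]

# The ordinary median averages them arithmetically: (0 + 2) / 2 = 1.
arith_median <- median(x)

# The log-based version averages -Inf and log(2), giving -Inf, and exp(-Inf) is 0.
gm_median <- exp(median(log(x)))

print(c(arithmetic = arith_median, geometric = gm_median))
```

Whether 0 is a sensible summary here depends on why the zeros are present, which is the question raised above.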

Cheers,
Bert



On Mon, Jan 17, 2011 at 9:13 AM, Keith Jewell <k.jewell at campden.co.uk> wrote:
> Just in case some of x are negative (the desired median still exists, as
> long as the two middle values are non-negative), how about:
>
> x <- runif(20, -1, 100)
> exp(median(log(pmax(0,x))))
>
> It'll give 0 (the inner median is -Inf) if either of the two middle values
> is negative, when I guess we should get NaN, but I can't see a 1-line way
> to handle that!
>
> Keith J
>
> "Peter Ehlers" <ehlers at ucalgary.ca> wrote in message
> news:4D3468EF.5010601 at ucalgary.ca...
>> I've been reminded by Prof. Brian Ripley that R's
>> log() function will indeed handle zeros appropriately.
>>
>> Apologies to S Ellison and Hadley Wickham.
>>
>> Peter Ehlers
>>
>> On 2011-01-17 06:55, Peter Ehlers wrote:
>>> On 2011-01-17 02:19, S Ellison wrote:
>>>> Will this do?
>>>>
>>>> x <- runif(20, 1, 100)
>>>>
>>>> exp(median(log(x)))
>>>>
>>>> S Ellison
>>>>
>>>>
>>> That's what Hadley proposed, too. It's fine for
>>> your example, but there is potentially a small
>>> problem with this method: the data must be positive.
>>> Since it's not unusual to see data with some zeros,
>>> the log() would fail.
>>>
>>> Depending on what type of data I was going to use
>>> this modification of the median for, I would consider
>>> modifying the (quite short) median.default function,
>>> with appropriate additional data checks.
>>>
>>> Peter Ehlers
>>>
>>>>
>>>> On 15/01/2011 16:26, Skull Crossbones <witch.of.agnessi at gmail.com> wrote:
>>>> Hi All,
>>>>
>>>> I need to calculate the median for an even number of data points. However,
>>>> instead of calculating the arithmetic mean of the two middle values, I
>>>> need to calculate their geometric mean.
>>>>
>>>> I can code this in R, possibly in a few lines, but I am wondering whether
>>>> there is already a built-in function.
>>>>
>>>> Can somebody give a hint?
>>>>
>>>> Thanks in advance
>>>>
>>
>



-- 
Bert Gunter
Genentech Nonclinical Biostatistics
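
For reference, the variant discussed in this thread (the geometric mean of the two middle values when n is even, with NaN when a middle value is negative) can be sketched as a small function. The name geo_median and its NA handling are assumptions made for illustration, not an existing R function, and the sketch does not settle whether a geometric mean is appropriate for data with zeros.

```r
# Hypothetical helper: median that uses the geometric mean of the two
# middle values when the number of observations is even.
geo_median <- function(x, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]
  } else if (any(is.na(x))) {
    return(NA_real_)
  }
  n <- length(x)
  if (n == 0L) return(NA_real_)

  s <- sort(x)
  half <- n %/% 2L

  # Odd n: the single middle value is the median, whatever its sign.
  if (n %% 2L == 1L) return(s[half + 1L])

  # Even n: take the two middle values.
  mid <- s[half + 0:1]

  # A geometric mean is undefined for negative values, so return NaN.
  if (any(mid < 0)) return(NaN)

  # sqrt(a) * sqrt(b) equals sqrt(a * b) for non-negative values
  # and avoids overflow in the product.
  sqrt(mid[1]) * sqrt(mid[2])
}

geo_median(c(1, 4, 9, 16))    # middle values 4 and 9, result 6
geo_median(c(0, 0, 2, 5))     # middle values 0 and 2, result 0
geo_median(c(-5, -1, -1, 3))  # middle values are negative, result NaN
geo_median(c(3, 1, 2))        # odd n, result 2
```

Unlike exp(median(log(pmax(0, x)))), this returns NaN rather than 0 when a middle value is negative, and it still returns 0 when a middle value is exactly zero.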


