[R] boxplot with log="y" and values starting at 0

Allan Engelhardt allane at cybaea.com
Thu Aug 20 15:40:09 CEST 2009


On 20/08/09 14:15, Anne Skoeries wrote:
> Hi,
>
> I'm working with a data.frame containing values between 0 and 22000.
> Most of the values are actually between 0 and 50 and the high ones are
> outliers.
> I want to generate a boxplot and since the outliers are extremely
> high, I need to scale the y scale logarithmically. Otherwise one
> wouldn't really see the boxes of the boxplot.
>
> boxplot(dat, log="y", ylim=c(0, max(dat)))
>
> Trying the above doesn't work, since the y scale has to be positive.
>
> But when I generate the boxplot with
> ylim=c(1, max(dat))
> it doesn't properly generate the whiskers or beginning of the boxes,
> because some of the mins and first quartiles are 0.
>
> Can anybody help and tell me how I can generate a logarithmic y scale
> starting at 0?
>    
I think that is impossible, unless you redefine mathematics and
geometry.  Sadly, R only supports the usual form of mathematics,
where log(0) is -Inf, and its graphics are basically Euclidean, so a
point at minus infinity cannot be drawn on a finite axis.  You could
try filing a bug report...

What is min(dat)?  If that is zero, then you can't use a log scale.  If 
it is small but positive, then you can use that for your ylim.
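
As a rough sketch of that second case (the data below are made up to
stand in for your dat, and `lo` is just a name I picked), you could
check the minimum before choosing the limits:

```r
# Hypothetical stand-in for `dat`: strictly positive, with two large outliers.
dat <- data.frame(a = c(0.5, 1, 2, 3, 5, 8, 13, 21, 34, 50, 800, 22000))

lo <- min(dat)  # smallest value in the data frame

if (lo > 0) {
  # All values are positive, so the smallest one is a valid lower limit.
  boxplot(dat, log = "y", ylim = c(lo, max(dat)))
} else {
  # Zero (or negative) values have no position on a log axis.
  message("min(dat) is not positive; a log y axis cannot show it")
}
```

This assumes every column of dat is numeric, as your own call to
max(dat) already does.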

But your data set spans a very large range and is therefore intrinsically
hard to visualize.  Consider some other way of presenting the data.  What
is the reader supposed to learn from / do with the data you show?
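
One possible alternative, offered only as a sketch: plot log(1 + x)
instead of x, which sends 0 to 0 and so keeps the zeros on the plot.
The data, the name `tdat` and the tick positions below are all my own
assumptions, and whether this transformation suits your readers is a
separate question.

```r
# Hypothetical stand-in for `dat`: zeros, mostly small values, a few huge outliers.
dat <- data.frame(a = c(0, 0, 1, 3, 7, 12, 20, 35, 50, 900, 22000),
                  b = c(0, 2, 4, 5, 9, 15, 22, 40, 48, 3000, 18000))

# log1p(x) computes log(1 + x), so 0 maps to 0 rather than -Inf.
tdat <- log1p(dat)

# Draw the boxes on the transformed scale and suppress the default y axis.
boxplot(tdat, yaxt = "n", ylab = "value (log(1 + x) scale)")

# Label the axis in the original units at hand-picked positions.
ticks <- c(0, 1, 10, 100, 1000, 10000)
axis(2, at = log1p(ticks), labels = ticks)
```

Note that this is not a true log axis: the spacing near zero is
compressed compared with log="y", and the axis label should say so.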

Hope this helps a little.

Allan



