# [R] sciplot question

Frank E Harrell Jr f.harrell at vanderbilt.edu
Tue May 26 15:02:58 CEST 2009

```
Jarle Bjørgeengen wrote:
>
> On May 26, 2009, at 4:37 , Frank E Harrell Jr wrote:
>
>> Manuel Morales wrote:
>>> On Mon, 2009-05-25 at 06:22 -0500, Frank E Harrell Jr wrote:
>>>> Jarle Bjørgeengen wrote:
>>>>> On May 24, 2009, at 4:42 , Frank E Harrell Jr wrote:
>>>>>
>>>>>> Jarle Bjørgeengen wrote:
>>>>>>> On May 24, 2009, at 3:34 , Frank E Harrell Jr wrote:
>>>>>>>> Jarle Bjørgeengen wrote:
>>>>>>>>> Great,
>>>>>>>>> thanks Manuel.
>>>>>>>>> Just out of curiosity, is there any particular reason you chose
>>>>>>>>> standard error, and not confidence interval, as the default
>>>>>>>>> error indication (the naming of the plotting functions is more
>>>>>>>>> closely associated with confidence intervals)?
>>>>>>>>> - Jarle Bjørgeengen
>>>>>>>>> On May 24, 2009, at 3:02 , Manuel Morales wrote:
>>>>>>>>>> You define your own function for the confidence intervals. The
>>>>>>>>>> function
>>>>>>>>>> needs to return the two values representing the upper and
>>>>>>>>>> lower CI
>>>>>>>>>> values. So:
>>>>>>>>>>
>>>>>>>>>> qt.fun <- function(x)
>>>>>>>>>> qt(p=.975,df=length(x)-1)*sd(x)/sqrt(length(x))
>>>>>>>>>> my.ci <- function(x) c(mean(x)-qt.fun(x), mean(x)+qt.fun(x))
>>>>>>>> Minor improvement: mean(x) + qt.fun(x)*c(-1,1) but in general
>>>>>>>> confidence limits should be asymmetric (a la bootstrap).
>>>>>>> Thanks,
>>>>>>> if the data are normally distributed, a symmetric confidence
>>>>>>> interval should be OK, right?
>>>>>> Yes; I do see a normal distribution about once every 10 years.
>>>>> Is it not true that Student's t (qt(...) and so on) confidence
>>>>> intervals are quite robust against non-normality too?
>>>>>
>>>>> A teacher told me that the symmetric Student's t confidence
>>>>> intervals will give an adequate picture of the variability of the
>>>>> data in this particular case.
>>>> Incorrect.  Try running some simulations on highly skewed data.  You
>>>> will find situations where the confidence coverage is not very close
>>>> to the stated level (e.g., 0.95) and more situations where the
>>>> overall coverage is 0.95 because one tail area is near 0 and the
>>>> other is near 0.05.
>>>>
>>>> The larger the sample size, the more skewness has to be present to
>>>> cause this problem.
>>> OK - I'm convinced. It turns out that the first change I made to sciplot
>>> was to allow for asymmetric error bars. Is there an easy way (i.e.,
>>> existing package) to bootstrap confidence intervals in R. If so, I'll
>>> try to incorporate this as an option in sciplot.
>>
>> library(Hmisc)
>> ?smean.cl.boot
>
>
> H(arrel)misc :-)
>
> Thanks for valuable input Frank.
>
> This seems to work fine (slightly more time-consuming, but what do we
> have CPU power for?)
>
> library(Hmisc)
> library(sciplot)
> my.ci <- function(x) c(smean.cl.boot(x)[2],smean.cl.boot(x)[3])

Don't double the execution time by running it twice!  And because each
call draws its own bootstrap resamples, this way you might possibly get
an upper confidence limit that is lower than the lower one.  Do
function(x) smean.cl.boot(x)[-1]

>
> lineplot.CI(V1,V2,data=d,col=c(4),err.col=c(1),err.width=0.02,legend=FALSE,xlab="Timeofday",ylab="IOPS",ci.fun=my.ci,cex=0.5,lwd=0.7)
>
>
> Have I understood you correctly that this is a more accurate way of
> visualizing variability in any dataset than the Student's t confidence
> intervals, because it does not assume normality?

Yes, but instead of variability (which quantiles are good at describing)
we are talking about the precision of the mean.

>
> Can you explain the meaning of B, and how to find a sensible value (if
> the default is not sufficient)?

For most purposes the default is sufficient.  There are great books on
the bootstrap; the simple bootstrap percentile confidence interval is
used here.

Frank

>
> Best regards
> Jarle Bjørgeengen
>
>
>
>

--
Frank E Harrell Jr   Professor and Chair           School of Medicine
Department of Biostatistics   Vanderbilt University

```
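
The simulation suggested in the thread can be sketched in base R as follows. This is only an illustration under stated assumptions: the function names `t_ci`, `boot_ci` and `tail_errors` are hypothetical (they are not part of sciplot or Hmisc), the data are lognormal with n = 20, and `boot_ci` is a hand-rolled percentile bootstrap in the spirit of what the thread describes, not a copy of `smean.cl.boot`.

```r
# Base R only. All function names here are illustrative assumptions.
set.seed(1)

# Symmetric t-based limits for the mean, as written in the thread.
t_ci <- function(x) {
  half <- qt(0.975, df = length(x) - 1) * sd(x) / sqrt(length(x))
  mean(x) + half * c(-1, 1)
}

# Percentile bootstrap limits for the mean: resample with replacement
# B times and take the 2.5% and 97.5% quantiles of the resampled means.
# The limits come from ONE set of resamples, so lower <= upper always.
boot_ci <- function(x, B = 1000) {
  means <- replicate(B, mean(sample(x, replace = TRUE)))
  unname(quantile(means, c(0.025, 0.975)))
}

# Estimate how often an interval misses the true mean on each side
# when the data are right-skewed (lognormal, meanlog = 0, sdlog = 1).
tail_errors <- function(ci_fun, n = 20, nsim = 2000) {
  true_mean <- exp(0.5)                          # exact lognormal(0, 1) mean
  limits <- replicate(nsim, ci_fun(rlnorm(n)))   # 2 x nsim matrix
  c(too_low  = mean(limits[2, ] < true_mean),    # interval entirely below
    too_high = mean(limits[1, ] > true_mean))    # interval entirely above
}

t_err <- tail_errors(t_ci)
print(t_err)

# One skewed sample: the bootstrap limits need not be symmetric
# about the sample mean, unlike the t-based limits.
x <- rlnorm(20)
print(rbind(t = t_ci(x), boot = boot_ci(x)))
```

For a nominal 0.95 interval each error rate should be about 0.025; with skewed data like this one would expect the misses to pile up on one side. A function such as `boot_ci` returns two values (lower, upper), so it has the shape that `ci.fun` in `lineplot.CI` is described as expecting in the thread.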