[R] construct boxplots from data with varying column widths

Rory Campbell-Lange rory at campbell-lange.net
Sat Jul 16 18:15:10 CEST 2011


On 16/07/11, David Winsemius (dwinsemius at comcast.net) wrote:
> 
> On Jul 16, 2011, at 11:19 AM, Rory Campbell-Lange wrote:
> 
> >I'm an R beginner, and I would like to construct a set of boxplots
> >showing database function runtimes.

> >I can easily reformat the base data to provide it to R in a format
> >such as:
> >
> >   function1,12.5
> >   function1,13.11
> >   function1,35.2
> >   ...

> That is definitely to be preferred. Read that into R and show us the
> results of str on your R data object.

Thanks for your suggestion.

    > str(data2)
    'data.frame':   1940170 obs. of  2 variables:
     $ function.: Factor w/ 127 levels "fn_activities01_list",..: 102 102 102 102 102 102 102 102 102 102 ...
     $ runtime  : num  38.1 32.4 41.2 92.9 130.5 ...

    > head(data2)
               function. runtime
    1 fn_slot03_byperson  38.083
    2 fn_slot03_byperson  32.396
    3 fn_slot03_byperson  41.246
    4 fn_slot03_byperson  92.904
    5 fn_slot03_byperson 130.512
    6 fn_slot03_byperson 113.853

    > tmp <- data2[data2$function. == 'fn_slot03_byperson', ]
    > length(tmp$runtime)
    [1] 24004
    > ave(tmp$runtime)[1]
    [1] 41.8108

> > I don't know how to read this data into R so that I can construct the
> > boxplots. I'd be also grateful for advice on how to filter the
> > output of the boxplot to show only the top 20.
> 
> Oh. That is material covered in introductory texts, of which there
> are many, in the contributed documentation at the CRAN website.
> There is also an Import/Export Manual.

Thank you for your note. I was having specific trouble reading data with
different column lengths. I managed to do it with fill=TRUE, but your approach
seems to work better.
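
For reference, a minimal sketch of the fill=TRUE route, followed by a reshape into the long (function, runtime) layout. The input layout (a function name followed by a varying number of runtimes on each line), the sample values, and the names txt, wide, long and dbfunc are all assumptions made for illustration; they are not taken from the real data file.

```r
# Hypothetical ragged input: one function per line, followed by a varying
# number of runtimes (this layout is an assumption, not the real file).
txt <- "function1,12.5,13.11,35.2
function2,8.1
function3,20.4,22.9"

# Count the fields on every line so that read.table allocates enough
# columns; fill = TRUE then pads the shorter rows with NA.
con <- textConnection(txt)
ncols <- max(count.fields(con, sep = ","))
close(con)

wide <- read.table(text = txt, sep = ",", fill = TRUE,
                   col.names = paste0("V", seq_len(ncols)),
                   stringsAsFactors = FALSE)

# Reshape to one (function, runtime) pair per row and drop the NA padding.
# as.vector() walks the matrix column by column, so the function names are
# repeated once per runtime column to stay aligned.
runtimes <- as.matrix(wide[, -1, drop = FALSE])
long <- data.frame(dbfunc  = rep(wide$V1, times = ncol(runtimes)),
                   runtime = as.vector(runtimes))
long <- long[!is.na(long$runtime), ]
```

Supplying col.names from count.fields matters here because read.table otherwise guesses the number of columns from the first few lines only, which can split longer rows that appear later in the file.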

However, you are right, I need to find out how to reveal runtime by function,
and I need to do more research on that.
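
One possible way to get runtime by function and restrict the boxplot to 20 functions is sketched below. It runs on simulated data rather than data2, the column names dbfunc and runtime are placeholders, and ranking the functions by mean runtime is only one assumed reading of "top 20" (median or call count would work the same way).

```r
# Simulated stand-in for the real data (names and values are made up).
set.seed(1)
d <- data.frame(
  dbfunc  = sample(paste0("fn_", 1:30), 3000, replace = TRUE),
  runtime = rexp(3000, rate = 1 / 40)
)

# Mean runtime per function: one value per distinct function name.
means <- tapply(d$runtime, d$dbfunc, mean)

# Names of the 20 functions with the highest mean runtime.
top20 <- head(names(sort(means, decreasing = TRUE)), 20)

# Keep only those functions. Re-creating the factor with levels in ranked
# order drops the unused levels and draws the boxes from slowest to fastest.
sub <- d[d$dbfunc %in% top20, ]
sub$dbfunc <- factor(sub$dbfunc, levels = top20)

# One box per function; las = 2 turns the axis labels so long names fit.
boxplot(runtime ~ dbfunc, data = sub, las = 2,
        xlab = "", ylab = "runtime")
```

tapply computes the summary for every function in one call, which avoids subsetting the data frame once per function name.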

Many thanks
Rory


