[R] construct boxplots from data with varying column widths

David Winsemius dwinsemius at comcast.net
Sat Jul 16 17:47:22 CEST 2011


On Jul 16, 2011, at 11:19 AM, Rory Campbell-Lange wrote:

> I'm an R beginner, and I would like to construct a set of boxplots
> showing database function runtimes.
>
> The data I have is currently in the following format:
>
>    function1,12.5,13.11,35.2,11.1.....n
>    function2,21.5,42.22,17.3,14.2....................n
>    ...
>
> this is the function name followed by somewhere between 1 and 10,000
> runtimes for each function. The runtimes are in milliseconds.
>
> I can easily reformat the base data to provide it to R in a format
> such as:
>
>    function1,12.5
>    function1,13.11
>    function1,35.2
>    ...

That is definitely to be preferred. Read that into R and show us the
results of str() on your R data object.
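
A minimal sketch of that step, assuming the long format has no header
row. The sample values come from the post; the names csv_text, runtimes,
fn and ms are made up for illustration, and with a real file you would
pass its path to read.csv() instead of text = ...:

```r
# Hypothetical sample in the long "name,runtime" format, one pair per line.
csv_text <- "function1,12.5
function1,13.11
function1,35.2
function2,21.5
function2,42.22"

# header = FALSE because the data has no column names of its own;
# col.names supplies illustrative ones (fn, ms).
runtimes <- read.csv(text = csv_text, header = FALSE,
                     col.names = c("fn", "ms"),
                     stringsAsFactors = TRUE)

# Compact summary of the object: this is the output to post to the list.
str(runtimes)
```

If this works as intended, str() should report a data frame with a
factor column and a numeric column.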

>
> There are about 120 individual functions. I wish to show the top 20
> functions by average runtime (ideally sorted by average runtime
> descending). Using a boxplot will help show the variation in runtime
> for each function.
>
> I don't know how to read this data into R so that I can construct the
> boxplots. I'd also be grateful for advice on how to filter the
> output of the boxplot to show only the top 20.

Oh. That is material covered in introductory texts, of which there are
many in the contributed documentation at the CRAN website. There is
also an "R Data Import/Export" manual.

After it's in an R workspace, you may want to look at the ave or  
aggregate functions to compute mean runtime by group.
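
One way this could look, as a sketch only: the data below is simulated
(30 hypothetical functions with differing numbers of runtimes), and the
names runtimes, fn, ms, means, top and top_data are illustrative, not
anything from the original data:

```r
set.seed(1)

# Simulated stand-in for the real data: 30 functions, 5 to 50 runtimes each.
n_per_fn <- sample(5:50, 30, replace = TRUE)
runtimes <- data.frame(
  fn = factor(rep(sprintf("function%02d", 1:30), times = n_per_fn)),
  ms = rexp(sum(n_per_fn), rate = 1 / 20)
)

# Mean runtime per function: one row per level of fn.
means <- aggregate(ms ~ fn, data = runtimes, FUN = mean)

# The 20 functions with the largest mean, sorted descending.
top <- head(means[order(means$ms, decreasing = TRUE), ], 20)

# Keep only those functions, and reorder the factor levels so that
# boxplot() draws the boxes in order of descending mean.
top_data <- runtimes[runtimes$fn %in% top$fn, ]
top_data$fn <- factor(top_data$fn, levels = as.character(top$fn))

# las = 2 rotates the axis labels so the function names stay readable.
boxplot(ms ~ fn, data = top_data, las = 2,
        xlab = "", ylab = "Runtime (ms)")
```

ave(runtimes$ms, runtimes$fn) is the alternative: it returns the group
mean repeated for every row, which is handy if you would rather add the
mean as a column than build a separate summary table.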

R-help is not set up as a tutorial service. The sequence laid out in the
Posting Guide is: User (reads pages and pages of documentation),
User (makes effort, encounters difficulty with R code), User (constructs
detailed posting with code, data, and verbatim error messages).

-- 

David Winsemius, MD
West Hartford, CT


