[R] Using R for processing computer performance data

Peter Gallanis peter at gallanis.com
Sun Apr 29 17:20:56 CEST 2001


I'm relatively new to R. I have looked through the documentation and the FAQ, but have not been able to find out how to accomplish the task described below.  If someone can point me in the right direction, I would greatly appreciate it.

I am doing system performance tests, and end up with large volumes of data that consist of transaction timings coupled with system timings.  The analysis process is iterative, and I've been using Excel (too many limitations, but the pivot tables have been extremely useful) to do some of the drilldown processing, such as identifying bottlenecks.

To generalize the problem with an example, I have a series of data for which I would like to be able to produce a box plot.  However, much of the data varies from run to run or from system to system.  For example, the UNIX sar utility can produce a snapshot of disk activity for each disk in the system.  Each snapshot lists a number of statistics for each disk, and after some cleanup with some utilities, you end up with something like:

Time      Device  Busy  Queue  AvServ
10:00:00  d1       0.0    0.0     8.3
10:00:00  d2      35.5    5.6    37.8
10:00:00  d3      10.5    0.8    16.0
10:00:30  d1       0.8    0.0    10.2
10:00:30  d2      42.1    5.9    42.5
10:00:30  d3       3.2    0.1    12.0
........
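
As a minimal sketch of getting such a listing into R, assuming the cleaned-up output is whitespace-delimited with a header row as shown: the names `sar_text` and `sar` are made up for illustration, and a real run would presumably point read.table() at a file rather than an inline string.

```r
# Sample rows copied from the listing above (illustrative only).
sar_text <- "Time Device Busy Queue AvServ
10:00:00 d1 0.0 0.0 8.3
10:00:00 d2 35.5 5.6 37.8
10:00:00 d3 10.5 0.8 16.0
10:00:30 d1 0.8 0.0 10.2
10:00:30 d2 42.1 5.9 42.5
10:00:30 d3 3.2 0.1 12.0"

# read.table() splits on whitespace by default; header = TRUE takes the
# column names from the first line.
sar <- read.table(text = sar_text, header = TRUE, stringsAsFactors = FALSE)

# Treat the device name as a grouping factor; its levels are discovered
# from the data, so no disk names need to be known in advance.
sar$Device <- factor(sar$Device)

str(sar)
levels(sar$Device)
```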

The set of statistics for each disk (d1-d3) is repeated for each time snapshot. I'd like to produce a box plot of any one of the statistics, grouped by disk: for example, a box plot of the percent busy for each disk, or of the average service time for each disk. The basic problem is how to do this in an automated fashion, without knowing the names of the disks in advance.  I built a data.frame using read.csv (is this even the correct terminology?), and tried using unique() to identify the names of the disks, but then I got caught up in trying to build a vector of data for each disk from the specified column.  And even if I had accomplished that, I couldn't figure out how to pass a variable number of vectors to boxplot.
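
One possible direction, offered as a sketch rather than a definitive answer: boxplot() has a formula interface that groups a column by a factor, and it also accepts a list of vectors such as the one split() returns, so the disk names never have to be spelled out. The data frame below just re-enters the sample rows, and `plot_stat` is a hypothetical helper name, not an existing R function.

```r
# Sample rows from the listing, entered by hand for illustration.
sar <- data.frame(
  Time   = rep(c("10:00:00", "10:00:30"), each = 3),
  Device = rep(c("d1", "d2", "d3"), times = 2),
  Busy   = c(0.0, 35.5, 10.5, 0.8, 42.1, 3.2),
  Queue  = c(0.0, 5.6, 0.8, 0.0, 5.9, 0.1),
  AvServ = c(8.3, 37.8, 16.0, 10.2, 42.5, 12.0)
)

# Formula interface: one box per distinct value of Device.
boxplot(Busy ~ Device, data = sar, ylab = "Busy")

# Same idea via split(): a named list with one numeric vector per disk,
# which boxplot() accepts directly, however many disks there are.
by_disk <- split(sar$AvServ, sar$Device)
boxplot(by_disk, ylab = "AvServ")

# Hypothetical helper: pick the statistic by column name at run time.
plot_stat <- function(df, stat) {
  boxplot(split(df[[stat]], df$Device),
          ylab = stat, main = paste(stat, "by device"))
}

for (s in c("Busy", "Queue", "AvServ")) plot_stat(sar, s)
```

The split() form is probably the easier one to reuse for other tasks, since the list it returns can also be handed to functions such as lapply() or sapply() for per-disk summaries.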

If someone can point me in the right direction, I can apply the concepts to other tasks I would like to accomplish.

Thank you


