<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 5.50.4611.1300" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>I'm relatively new to R. I have looked through
the documentation and FAQ, but have not been able to find out how to
accomplish the task described below. If someone can point me in the right
direction, I would greatly appreciate it.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>I am doing system performance tests, and end up
with large volumes of data that consist of transaction timings, coupled with
system timings. The analysis process is iterative, and I've been using
Excel (too many limitations, but the pivot tables have been extremely useful) to
do some of the drilldown processing, such as identifying
bottlenecks.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>To generalize the problem with an example, I have a
series of data for which I would like to be able to produce a box
plot. However, much of the data varies from run to run or from
system to system. For example, the UNIX sar utility can produce a snapshot
of disk activity for each disk in the system. Each snapshot lists a number
of statistics for each disk, and after some cleanup with some utilities, you end
up with something like:</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Time Device Busy Queue AvServ</FONT></DIV>
<DIV><FONT face=Arial size=2>10:00:00 d1 0.0 0.0 8.3</FONT></DIV>
<DIV><FONT face=Arial size=2>10:00:00 d2 35.5 5.6 37.8</FONT></DIV>
<DIV><FONT face=Arial size=2>10:00:00 d3 10.5 0.8 16.0</FONT></DIV>
<DIV><FONT face=Arial size=2>10:00:30 d1 0.8 0.0 10.2</FONT></DIV>
<DIV><FONT face=Arial size=2>10:00:30 d2 42.1 5.9 42.5</FONT></DIV>
<DIV><FONT face=Arial size=2>10:00:30 d3 3.2 0.1 12.0</FONT></DIV>
<DIV><FONT face=Arial size=2>........</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>The set of statistics for each disk (d1-d3) is
repeated for each time snapshot. I'd like to be able to produce a box plot of
any one of the statistics, grouped by disk: for example, a box plot of the
percent busy for each disk, or of the average service time for each disk.
The basic problem is how to do this in an automated fashion, without knowing
the names of the disks in advance. I built a data.frame using read.csv (is
this even the correct terminology?), and tried using unique() to identify the
names of the disks, but then I got caught up in trying to build vectors of data
for each disk from the specified column. And even if I had accomplished this,
I couldn't figure out how to pass a variable number of vectors to
boxplot.</FONT></DIV>
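<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>One possible direction, sketched under
assumptions rather than offered as a definitive answer: boxplot() accepts a
formula, so the grouping can come from the Device column itself and the disk
names never have to be listed by hand. The column names follow the table above;
the inline sample rows stand in for the result of read.csv, and the variable
names sar and by_disk are hypothetical.</FONT></DIV>
<PRE>
```r
# Sample rows copied from the table above. A real run would call read.csv()
# on the cleaned-up sar output instead (file name and layout not shown here).
sar <- read.table(header = TRUE, text = "
Time     Device Busy Queue AvServ
10:00:00 d1      0.0   0.0    8.3
10:00:00 d2     35.5   5.6   37.8
10:00:00 d3     10.5   0.8   16.0
10:00:30 d1      0.8   0.0   10.2
10:00:30 d2     42.1   5.9   42.5
10:00:30 d3      3.2   0.1   12.0
")

# Formula interface: one box per distinct value of Device,
# with the disk names discovered from the data.
boxplot(Busy ~ Device, data = sar, xlab = "Device", ylab = "Busy")

# Alternative: split() builds a named list with one vector per disk,
# and boxplot() accepts such a list, however many disks there are.
by_disk <- split(sar$AvServ, sar$Device)
boxplot(by_disk, xlab = "Device", ylab = "AvServ")

# The same plot for every statistic, one after another.
for (stat in c("Busy", "Queue", "AvServ")) {
  boxplot(sar[[stat]] ~ sar$Device, xlab = "Device", ylab = stat)
}
```
</PRE>
<DIV><FONT face=Arial size=2>The split() form corresponds to the "vector per
disk" idea described above; the formula form avoids building those vectors at
all.</FONT></DIV>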
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>If someone can point me in the right
direction, I can apply the concepts to other tasks I would like to
accomplish.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Thank you</FONT></DIV>
</BODY></HTML>