[R] Advice on data format

Trent Yarwood trentyarwood at gmail.com
Wed Aug 5 08:30:13 CEST 2015


Hi all,

I'm responsible for collating data on antibiotic use at my local group of
hospitals.  I have data for five different hospitals, about 40 different
antibiotics and monthly data going back to 2006.

At the moment, I have this stored in 5 datafiles, one for each hospital,
formatted as follows:

date, antibiotic1, antibiotic2, antibiotic3....
1-mmm-yy, ab11, ab21, ab31....
1-mmm-yy, ab12, ab22, ab32...

This works most of the time for me, because the most common thing I need to
do is to track a particular hospital's antibiotic use over time (sum of
columns, as a time series by row).
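
A minimal sketch of that per-hospital workflow in base R. The data frame
hosp1 and its numbers are made-up stand-ins for one hospital's file, and the
three antibiotic columns are placeholders for the real forty or so:

```r
# Hypothetical stand-in for one hospital's file; values are made up.
hosp1 <- data.frame(
  date = as.Date(c("2006-01-01", "2006-02-01", "2006-03-01")),
  antibiotic1 = c(10, 12, 11),
  antibiotic2 = c(5, 4, 6),
  antibiotic3 = c(0, 1, 2)
)

# Total use per month: sum across the antibiotic columns of each row.
total_use <- rowSums(hosp1[, -1])

# Treat the totals as a monthly time series starting in January 2006.
total_ts <- ts(total_use, start = c(2006, 1), frequency = 12)
```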

What I would like to do is amalgamate the data so that, instead of analysing
an individual hospital (i.e. a datasheet in the current format), I can look
at a particular antibiotic across the five hospitals.

The best way I can visualise this is having the data in a data cube, with
each hospital as a single plane. Currently, my hospitals are (x,y,1),
(x,y,2) etc. What I'd like to do is look at (2,y,z) - for example, the sum
of antibiotic1 in all hospitals.
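
The cube idea can be sketched as a 3-D array in base R. The dimension order
used here (date, antibiotic, hospital), the dimension names and the numbers
are all assumptions made for illustration, not the real data:

```r
# Made-up numbers: 3 months x 2 antibiotics x 2 hospitals.
use <- array(
  1:12,
  dim = c(3, 2, 2),
  dimnames = list(
    date = c("2006-01-01", "2006-02-01", "2006-03-01"),
    antibiotic = c("antibiotic1", "antibiotic2"),
    hospital = c("hospital1", "hospital2")
  )
)

# One hospital is a single plane of the cube (the current layout).
plane1 <- use[, , "hospital1"]

# One antibiotic across all hospitals: a date x hospital slice.
ab1 <- use[, "antibiotic1", ]

# Sum of antibiotic1 over all hospitals, month by month.
ab1_total <- rowSums(ab1)
```

An array like this needs every hospital to have the same dates and the same
antibiotics, which may not hold for real data.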

I imagine one way of doing this is having a hospital column in the data:

date, hospital, antibiotic1, antibiotic2, antibiotic3...
1-mmm-yy, hospital1, a11, a21, a31...
1-mmm-yy, hospital2, a11, a21, a31... etc
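
One possible way to get to this layout in base R, shown as a sketch only.
The data frames hosp1 and hosp2, the helper add_hospital and all the values
are hypothetical; in practice each data frame would come from reading one
of the five files:

```r
# Hypothetical stand-ins for two of the per-hospital files.
hosp1 <- data.frame(
  date = as.Date(c("2006-01-01", "2006-02-01")),
  antibiotic1 = c(10, 12),
  antibiotic2 = c(5, 4)
)
hosp2 <- data.frame(
  date = as.Date(c("2006-01-01", "2006-02-01")),
  antibiotic1 = c(20, 19),
  antibiotic2 = c(7, 8)
)
hospitals <- list(hospital1 = hosp1, hospital2 = hosp2)

# Helper (hypothetical name): put a hospital column after the date column.
add_hospital <- function(d, h) {
  data.frame(date = d$date, hospital = h, d[, -1, drop = FALSE])
}

# Tag each data frame with its hospital name, then stack them.
stacked <- do.call(rbind, Map(add_hospital, hospitals, names(hospitals)))
rownames(stacked) <- NULL

# Sum of antibiotic1 over all hospitals for each month.
ab1_by_month <- aggregate(antibiotic1 ~ date, data = stacked, FUN = sum)
```

Subsetting stacked on the hospital column recovers the current
per-hospital view, so nothing is lost by combining the files.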

Two questions:

1) Is there a better way of storing the data than this?
2) Is there an easy way to turn what I have into what I want?

I know that once I have the data sorted, I'll be able to use dplyr to split
it into the categories I currently use - it's the getting from here to there
I need help with, please.

Cheers,

Trent.






-- 
Trent Yarwood
trentyarwood at gmail.com
