[R] Advice on data format
Trent Yarwood
trentyarwood at gmail.com
Wed Aug 5 08:30:13 CEST 2015
Hi all,
I'm responsible for collating data on antibiotic use at my local group of
hospitals. I have data for five different hospitals, about 40 different
antibiotics and monthly data going back to 2006.
At the moment, I have this stored in 5 datafiles, one for each hospital,
formatted as follows:
date, antibiotic1, antibiotic2, antibiotic3....
1-mmm-yy, ab11, ab21, ab31....
1-mmm-yy, ab12, ab22, ab32...
This works most of the time for me, because the most common thing I need to
do is to track a particular hospital's antibiotic use over time (sum of
columns, as a time series by row).
What I would like to do is amalgamate the data so that, instead of analysing
an individual hospital (i.e. a datasheet in the current format), I can look
at a particular antibiotic across the five hospitals.
The best way I can visualise this is having the data in a data cube, with
each hospital as a single plane. Currently, my hospitals are (x,y,1),
(x,y,2) etc. What I'd like to do is look at (2,y,z) - for example, the sum
of antibiotic1 in all hospitals.
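
A minimal sketch of that cube as a base R array, to make the idea concrete.
The numbers, the dimension names and the axis order (date, antibiotic,
hospital) are all made up for illustration and are not my real data:

```r
# Made-up usage figures: 2 months x 2 antibiotics x 2 hospitals.
# Arrays fill with the first dimension (date) varying fastest.
use <- array(
  c(10, 11, 20, 21,   # hospital1: antibiotic1 (Jan, Feb), antibiotic2 (Jan, Feb)
    30, 31, 40, 41),  # hospital2: same layout
  dim = c(2, 2, 2),
  dimnames = list(date       = c("1-Jan-06", "1-Feb-06"),
                  antibiotic = c("antibiotic1", "antibiotic2"),
                  hospital   = c("hospital1", "hospital2"))
)

# One hospital as a single plane (the current one-file-per-hospital view).
hosp1 <- use[, , "hospital1"]

# One antibiotic across all hospitals: a date x hospital slice.
ab1 <- use[, "antibiotic1", ]

# Sum of antibiotic1 over all hospitals, one value per month.
ab1_total <- rowSums(ab1)
```

Here `ab1_total` is a named vector with one total per month, which is the
kind of cross-hospital series I am after.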
I imagine one way of doing this is having a hospital column in the data:
date, hospital, antibiotic1, antibiotic2, antibiotic3...
1-mmm-yy, hospital1, a11, a21, a31...
1-mmm-yy, hospital2, a11, a21, a31... etc
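
A rough sketch of how that combined layout might be built in base R. The two
inline CSV strings are hypothetical stand-ins for my real files (which would
be read with read.csv() on a file path instead), and the hospital and column
names are invented:

```r
# Hypothetical stand-ins for two of the per-hospital files.
csv_text <- list(
  hospital1 = "date,antibiotic1,antibiotic2\n1-Jan-06,10,20\n1-Feb-06,11,21",
  hospital2 = "date,antibiotic1,antibiotic2\n1-Jan-06,30,40\n1-Feb-06,31,41"
)

# Read each file and add a hospital column right after the date column.
per_hospital <- lapply(names(csv_text), function(h) {
  d <- read.csv(text = csv_text[[h]], stringsAsFactors = FALSE)
  data.frame(date = d$date, hospital = h, d[-1], stringsAsFactors = FALSE)
})

# Stack the per-hospital tables into one table.
combined <- do.call(rbind, per_hospital)

# Total antibiotic1 use across hospitals for each date.
# Dates are left as text here, so the result is ordered alphabetically.
totals <- aggregate(antibiotic1 ~ date, data = combined, FUN = sum)
```

The same pattern should extend to five files and 40 antibiotic columns, as
long as every file uses the same column names.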
Two questions:
1) Is there a better way of storing the data than this?
2) Is there an easy way to turn what I have into what I want?
I know that once I have the data sorted, I'll be able to use dplyr to get it
into the categories I currently use - it's the getting from here to there I
need help with, please.
Cheers,
Trent.