[R] Advice on data format
Michael Dewey
lists at dewey.myzen.co.uk
Wed Aug 5 14:50:17 CEST 2015
Dear Trent
If you want them side-by-side in one data frame then you could use merge,
making sure it merges only by date. Before merging I would use sub to make
the antibiotic names unique to each hospital by adding "h1", "h2" and so on.
Then you can sum an antibiotic over hospitals by using grep to select all
the columns containing antibiotic1. The side-by-side solution has some advantages
over stacking them vertically and some disadvantages. You may need to do
both for different purposes.
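
As a rough sketch of both layouts, assuming made-up data for two hospitals (the data frames h1 and h2, the helper tag(), and the "_h1"/"_h2" suffixes are illustrative names, not anything from Trent's files):

```r
# Hypothetical stand-in data: two hospitals, three months, two antibiotics.
dates <- as.Date(c("2015-01-01", "2015-02-01", "2015-03-01"))
h1 <- data.frame(date = dates, antibiotic1 = c(10, 12, 11), antibiotic2 = c(5, 6, 7))
h2 <- data.frame(date = dates, antibiotic1 = c(20, 22, 21), antibiotic2 = c(1, 2, 3))

# Side-by-side: use sub() to append a hospital suffix to every column
# except date, so the names stay unique after merging.
tag <- function(df, suffix) {
  is_ab <- names(df) != "date"
  names(df)[is_ab] <- sub("^(.*)$", paste0("\\1_", suffix), names(df)[is_ab])
  df
}
tagged <- list(tag(h1, "h1"), tag(h2, "h2"))

# Merge all hospitals pairwise, matching on date only.
wide <- Reduce(function(a, b) merge(a, b, by = "date"), tagged)

# Use grep() to pick every antibiotic1 column, then sum across hospitals.
ab1_cols <- grep("^antibiotic1_", names(wide))
wide$antibiotic1_total <- rowSums(wide[ab1_cols])

# Stacked: add a hospital column and bind the rows instead.
long <- rbind(transform(h1, hospital = "h1"), transform(h2, hospital = "h2"))
ab1_by_date <- aggregate(antibiotic1 ~ date, data = long, FUN = sum)
```

With five hospitals the same Reduce() call should work on a list of five tagged data frames, provided every file uses the same date column.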
To get the best out of sub and grep you would need to learn about
regular expressions, if they are not already familiar to you.
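
A small illustration of why the pattern matters, assuming hypothetical column names of the form antibiotic1_h1 (with about 40 antibiotics, a loose pattern for antibiotic1 could also pick up antibiotic10 and so on):

```r
# Hypothetical column names after tagging each hospital's columns.
cols <- c("date", "antibiotic1_h1", "antibiotic10_h1", "antibiotic1_h2")

# Unanchored pattern: matches anything containing "antibiotic1",
# including antibiotic10_h1.
loose <- grep("antibiotic1", cols, value = TRUE)

# Anchored pattern: "^" ties the match to the start of the name and the
# trailing "_" stops it from matching antibiotic10.
strict <- grep("^antibiotic1_", cols, value = TRUE)

# sub() with a capture group: keep the whole name and append a suffix.
renamed <- sub("^(.*)$", "\\1_h1", c("antibiotic1", "antibiotic2"))
```

Here loose contains three names, while strict contains only antibiotic1_h1 and antibiotic1_h2.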
On 05/08/2015 07:30, Trent Yarwood wrote:
> Hi all,
>
> I'm responsible for collating data on antibiotic use at my local group of
> hospitals. I have data for five different hospitals, about 40 different
> antibiotics and monthly data going back to 2006.
>
> At the moment, I have this stored in 5 datafiles, one for each hospital,
> formatted as follows:
>
> date, antibiotic1, antibiotic2, antibiotic3....
> 1-mmm-yy, ab11, ab21, ab31....
> 1-mmm-yy, ab12, ab22, ab32...
>
> This works most of the time for me, because the most common thing I need to
> do is to track a particular hospital's antibiotic use over time (sum of
> columns, as a time series by row).
>
> What I would like to do is to amalgamate the data so that, instead of
> analysing an individual hospital (i.e. a datasheet in the current format),
> I can look at a particular antibiotic across the five hospitals.
>
> The best way I can visualise this is having the data in a data cube, with
> each hospital as a single plane. Currently, my hospitals are (x,y,1),
> (x,y,2) etc. What I'd like to do is look at (2,y,z) - for example, the sum
> of antibiotic1 in all hospitals.
>
> I imagine one way of doing this is having a hospital column in the data:
>
> date, hospital, antibiotic1, antibiotic2, antibiotic3...
> 1-mmm-yy, hospital1, a11, a21, a31...
> 1-mmm-yy, hospital2, a11, a21, a31... etc
>
> Two questions:
>
> 1) Is there a better way of storing the data than this?
> 2) Is there an easy way to turn what I have into what I want?
>
> I know that once I have the data sorted, I'll be able to dplyr it into the
> categories I currently use - it's the getting from here to there I need
> help with, please.
>
> Cheers,
>
> Trent.
--
Michael
http://www.dewey.myzen.co.uk/home.html