[R] Problem with aggregating data across time points

Chris Beeley chris.beeley at gmail.com
Fri Jul 2 17:21:57 CEST 2010


Hello-

I have a dataset which basically looks like this:

Location  Sex  Date      Time  Verbal  Self harm  Violence_objects  Violence
A         1    1-4-2007  1800  3       0          1                 3
A         1    1-4-2007  1230  2       1          2                 4
D         2    2-4-2007  1100  0       4          0                 0
...

I've put a dput of the first section of the data at the end of this
email. I have several records for each date, so 2 or more on 1-4-2007,
2 or more on 2-4-2007, and so on until 31-12-2009. The last four
variables, which you can see at the end of the email, are my dependent
variables: different types of violent and self-harming behaviour shown
by patients in a psychiatric hospital.

What I want to do is:

A) sum each of the dependent variables for each of the dates (so e.g.
in the example above for 1-4-2007 it would be 3+2=5, 0+1=1, 1+2=3, and
3+4=7 for each of the variables)

B) do the same sum, but separately for each location (location is the
first variable): the sum for 1-4-2007 in location A, the sum for
1-4-2007 in location B, and so on. Because the data are split across
locations, some dates will have no records for a given location and
should return sums of 0. Crucially, I still want these dates to
appear, so e.g. 21-5-2008 would appear as 0 0 0 0, then 22-5-2008
might have 1 2 0 0, then 23-5-2008 0 0 0 0 again, etc.
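
To make A) and B) concrete, here is a minimal sketch in base R of the
two results being asked for. It is one possible approach, not a
definitive one. The toy data frame `incidents` only mirrors the three
example rows above; the names `incidents`, `by_date`, `by_loc_date`,
`grid`, `full` and `outcome_cols` are made up for illustration, and the
three-day date range stands in for the real 2007 to 2009 period.

```r
# Toy data mirroring the three example rows (column names follow the dput).
incidents <- data.frame(
  Location         = c("A", "A", "D"),
  Date             = as.Date(c("1-4-2007", "1-4-2007", "2-4-2007"),
                             format = "%d-%m-%Y"),
  verbal           = c(3, 2, 0),
  self.harm        = c(0, 1, 4),
  violence_objects = c(1, 2, 0),
  violence         = c(3, 4, 0),
  stringsAsFactors = FALSE
)
outcome_cols <- c("verbal", "self.harm", "violence_objects", "violence")

# A) one row per date, summing each dependent variable.
by_date <- aggregate(
  cbind(verbal, self.harm, violence_objects, violence) ~ Date,
  data = incidents, FUN = sum
)

# B) sums per location and date; only observed combinations appear here.
by_loc_date <- aggregate(
  cbind(verbal, self.harm, violence_objects, violence) ~ Location + Date,
  data = incidents, FUN = sum
)

# Build every Location x Date combination (assumed, shortened date range),
# left-join the observed sums onto it, and turn the gaps into zeros.
all_days <- seq(as.Date("2007-04-01"), as.Date("2007-04-03"), by = "day")
grid <- expand.grid(Location = unique(incidents$Location),
                    Date = all_days, stringsAsFactors = FALSE)
full <- merge(grid, by_loc_date, by = c("Location", "Date"), all.x = TRUE)
full[outcome_cols][is.na(full[outcome_cols])] <- 0
full <- full[order(full$Location, full$Date), ]
```

With the toy rows, `by_date` holds 5, 1, 3 and 7 for 1-4-2007, matching
the sums worked out by hand in A), and `full` has one row for each of
the 2 locations on each of the 3 days, with zeros wherever nothing was
recorded. Note that the formula interface of `aggregate` drops any row
with an NA in the variables used, which matters if individual scores
can be missing.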

I've had several abortive attempts and done some Googling but have got
nowhere. I'd greatly appreciate any advice.

Many thanks,
Chris Beeley
(Institute of Mental Health, UK)


structure(list(Location = structure(c(1L, 2L, 2L, 1L, 3L, 5L,
5L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 1L, 5L, 5L, 5L, 5L, 6L,
1L, 2L, 3L, 5L, 6L, 6L, 6L, 7L, 7L, 5L, 5L, 4L, 4L, 4L, 3L, 3L,
3L, 2L, 2L, 2L, 2L, 7L, 7L, 7L, 6L, 5L, 4L, 4L, 6L, 5L, 2L, 2L,
3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 5L, 5L, 3L, 3L, 4L,
4L, 4L, 4L), .Label = c("", "A", "B", "C", "D", "E", "F"), class = "factor"),
    Sex = c(NA, 1L, NA, NA, NA, 1L, 2L, NA, NA, 2L, 2L, NA, 2L,
    2L, 1L, 1L, NA, 2L, 2L, 2L, 1L, NA, NA, 1L, 1L, 1L, 1L, 2L,
    1L, 2L, NA, 1L, 1L, NA, 1L, NA, NA, 2L, 1L, 1L, 2L, 2L, 2L,
    2L, 1L, 2L, 2L, 2L, 2L, NA, 1L, 2L, NA, 1L, 1L, NA, 1L, NA,
    1L, 2L, NA, 1L, 1L, NA, 1L, 1L, 1L, NA, 2L, 2L, 1L, 2L, 1L
    ), Date = structure(c(1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L,
    2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L,
    2L, 2L, 2L, 2L, 2L, 2L, 1L, 3L, 3L, 1L, 3L, 1L, 1L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 4L, 1L, 4L,
    4L, 1L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, 4L, 1L, 4L,
    4L, 4L, 4L, 4L), .Label = c("", "01/04/07", "02/04/07", "03/04/07"
    ), class = "factor"), Time = structure(c(1L, 28L, 1L, 1L,
    1L, 1L, 20L, 1L, 1L, 37L, 37L, 2L, 13L, 31L, 1L, 17L, 1L,
    34L, 38L, 39L, 23L, 1L, 1L, 24L, 14L, 16L, 1L, 33L, 30L,
    10L, 1L, 6L, 8L, 1L, 26L, 1L, 1L, 13L, 3L, 4L, 1L, 1L, 35L,
    36L, 25L, 9L, 11L, 5L, 22L, 1L, 10L, 30L, 1L, 19L, 15L, 1L,
    29L, 1L, 27L, 10L, 2L, 21L, 18L, 1L, 23L, 32L, 36L, 1L, 30L,
    7L, 12L, 1L, 15L), .Label = c("", " ", "02:24:00", "03:44:00",
    "04:30:00", "07:00:00", "08:35:00", "09:20:00", "09:30:00",
    "10:00:00", "10:15:00", "10:45:00", "11:00:00", "11:20:00",
    "11:30:00", "11:35:00", "11:50:00", "12:00:00", "12:25:00",
    "12:30:00", "12:45:00", "15:00:00", "15:15:00", "15:30:00",
    "15:35:00", "17:15:00", "17:50:00", "18:00:00", "19:00:00",
    "19:30:00", "19:50:00", "20:00:00", "20:30:00", "20:55:00",
    "22:15:00", "22:30:00", "22:35:00", "22:40:00", "23:10:00"
    ), class = "factor"), verbal = c(NA, 3L, NA, NA, NA, 3L,
    0L, NA, NA, 0L, 0L, NA, 0L, 0L, 0L, 4L, NA, 0L, 0L, 0L, 4L,
    NA, NA, 4L, 3L, 0L, 4L, 0L, 0L, 0L, NA, 0L, 0L, NA, 0L, NA,
    NA, 4L, 0L, 4L, 0L, 0L, 4L, 1L, 4L, 3L, 0L, 0L, 0L, NA, 4L,
    0L, NA, 0L, 3L, NA, 1L, NA, 0L, 3L, NA, 1L, 4L, NA, 4L, 0L,
    0L, NA, 0L, 0L, 0L, 0L, 1L), self.harm = c(NA, 0L, NA, NA,
    NA, 0L, 0L, NA, NA, 0L, 1L, NA, 2L, 0L, 0L, 2L, NA, 2L, 0L,
    2L, 0L, NA, NA, 0L, 0L, 2L, 0L, 1L, 2L, 1L, NA, 0L, 0L, NA,
    0L, NA, NA, 0L, 2L, 0L, 1L, 1L, 0L, 2L, 0L, 0L, 0L, 0L, 0L,
    NA, 0L, 2L, NA, 0L, 0L, NA, 0L, NA, 4L, 0L, NA, 1L, 0L, NA,
    1L, 3L, 1L, NA, 0L, 0L, 0L, 1L, 0L), violence_objects = c(NA,
    0L, NA, NA, NA, 0L, 0L, NA, NA, 0L, 0L, NA, 0L, 0L, 0L, 3L,
    NA, 0L, 0L, 0L, 0L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, NA,
    0L, 0L, NA, 0L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
    4L, 0L, 4L, NA, 0L, 0L, NA, 0L, 0L, NA, 0L, NA, 0L, 0L, NA,
    0L, 0L, NA, 0L, 0L, 0L, NA, 0L, 0L, 0L, 0L, 0L), violence = c(NA,
    0L, NA, NA, NA, 0L, 1L, NA, NA, 3L, 0L, NA, 0L, 1L, 1L, 1L,
    NA, 1L, 1L, 0L, 0L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, NA,
    3L, 3L, NA, 2L, NA, NA, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L,
    0L, 3L, 0L, NA, 0L, 0L, NA, 2L, 0L, NA, 0L, NA, 0L, 0L, NA,
    0L, 0L, NA, 0L, 0L, 0L, NA, 3L, 3L, 2L, 0L, 0L)), .Names = c("Location",
"Sex", "Date", "Time", "verbal", "self.harm", "violence_objects",
"violence"), class = "data.frame", row.names = c(NA, -73L))
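
The dput above contains blank spacer rows (empty Date, NA scores) and
stores Date as a factor, which would get in the way of summing by day.
A minimal sketch of one way to tidy that first, assuming the dates are
day/month/two-digit-year as the 1-4-2007 examples suggest. The data
frame `raw` is a small hypothetical stand-in with the same quirks, not
the real data, and `clean` is just an illustrative name.

```r
# Small stand-in with the same quirks as the dput: blank spacer rows
# and dates stored as a factor in dd/mm/yy form.
raw <- data.frame(
  Location = factor(c("", "A", "A", "A", "D")),
  Date     = factor(c("", "", "01/04/07", "01/04/07", "02/04/07")),
  verbal   = c(NA, NA, 3L, 0L, 3L)
)

# Keep only rows that carry a date, then drop factor levels that are
# no longer used (including the empty-string level).
clean <- raw[raw$Date != "", ]
clean$Location <- droplevels(clean$Location)

# Convert the factor to real Date objects (assumed format: dd/mm/yy).
clean$Date <- as.Date(as.character(clean$Date), format = "%d/%m/%y")
```

Once Date is a proper Date vector, a full sequence of days can be
generated with `seq(..., by = "day")` and used to fill in the dates
that have no records.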


