[R] max & min values within dataframe
Joshua Wiley
jwiley.psych at gmail.com
Mon Nov 14 17:32:52 CET 2011
Hi Laura,
You were close. Just use range() instead of min/max:
## your data (read in and then pasted the output of dput() to make it easy)
dat <- structure(list(Patient = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 6L, 6L), Region = structure(c(1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("X",
"Y"), class = "factor"), Score = c(19L, 20L, 22L, 25L, 12L, 12L,
25L, 26L, 6L, 6L, 21L, 22L, 23L, 24L, 21L, 22L, 23L, 24L, 25L,
6L, 22L, 23L, 24L, 23L, 24L, 23L, 24L, 25L, 26L, 27L, 24L, 32L
), Time = c(28L, 126L, 100L, 191L, 1L, 2L, 4L, 7L, 1L, 4L, 31L,
68L, 31L, 38L, 15L, 24L, 15L, 243L, 77L, 5L, 28L, 75L, 19L, 3L,
1L, 33L, 13L, 42L, 21L, 4L, 4L, 8L)), .Names = c("Patient", "Region",
"Score", "Time"), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25",
"26", "27", "28", "29", "30", "31", "32"))
tmp <- with(dat, aggregate(Score, list(Patient), range))
tmpreg <- with(dat, Region[!duplicated(Patient)])
results <- data.frame(tmp$Group.1, tmpreg, tmp$x)
colnames(results) <- c("Patient", "Region", "Min", "Max")
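One assumption in the code above: Region[!duplicated(Patient)] lines up with the rows of tmp only because the data are already sorted by Patient, so the first occurrences appear in the same order as the sorted groups that aggregate() returns. If that might not hold, a minimal sketch of an order-independent lookup with match() is below. The toy data and the names toy, agg, reg and res are made up for illustration and are not Laura's data.

```r
# Hypothetical toy data, deliberately NOT sorted by Patient
toy <- data.frame(
  Patient = c(2L, 1L, 2L, 1L, 3L),
  Region  = factor(c("Y", "X", "Y", "X", "X")),
  Score   = c(12L, 19L, 26L, 25L, 6L)
)

# aggregate() returns one row per patient, with groups in sorted order
agg <- with(toy, aggregate(Score, list(Patient), range))

# Look up each group's Region by matching on Patient, not by position
reg <- toy$Region[match(agg$Group.1, toy$Patient)]

res <- data.frame(Patient = agg$Group.1, Region = reg,
                  Min = agg$x[, 1], Max = agg$x[, 2])
res
```

With sorted input, as in the data above, both approaches should give the same Region column; the match() version just does not depend on row order.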
Note that getting the results into a data frame is a little tricky,
because tmp is a slightly odd data frame. Because of the way aggregate()
works, the first column is a regular vector, but the second column
actually contains a two-column matrix. To get everything into regular
form, I extracted the two pieces separately when creating 'results'.
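To see the matrix column for yourself, and as one possible alternative, here is a sketch that uses the formula interface of aggregate() so that Region is carried along as a grouping variable. It assumes Region really is constant within each patient, and it rebuilds the Patient, Region and Score columns compactly with rep() rather than repeating the dput() output. The names agg and results2 are just illustrative.

```r
# Same Patient/Region/Score values as above, rebuilt compactly
n <- c(4, 4, 11, 4, 7, 2)  # number of rows per patient
dat <- data.frame(
  Patient = rep(1:6, times = n),
  Region  = factor(rep(c("X", "Y", "X", "Y", "Y", "Y"), times = n)),
  Score   = c(19, 20, 22, 25,  12, 12, 25, 26,
              6, 6, 21, 22, 23, 24, 21, 22, 23, 24, 25,
              6, 22, 23, 24,  23, 24, 23, 24, 25, 26, 27,  24, 32)
)

tmp <- with(dat, aggregate(Score, list(Patient), range))
ncol(tmp)         # 2: Group.1 plus the single matrix column x
is.matrix(tmp$x)  # TRUE

# Formula interface: group by Patient and Region together
agg <- aggregate(Score ~ Patient + Region, data = dat, FUN = range)

# agg$Score is again a two-column matrix, so pull the columns out by hand
results2 <- data.frame(Patient = agg$Patient, Region = agg$Region,
                       Min = agg$Score[, 1], Max = agg$Score[, 2])

# Groups come back ordered by Region, so re-sort by Patient
results2 <- results2[order(results2$Patient), ]
rownames(results2) <- NULL
results2
```

The Min and Max columns should match the values you already got from your separate min and max calls.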
Cheers,
Josh
On Mon, Nov 14, 2011 at 8:10 AM, B Laura <gm.spam2011 at gmail.com> wrote:
> dear R-team
>
> I need to find the min, max values for each patient from dataset and keep
> the output of it as a dataframe with the following columns
> - Patient nr
> - Region (remains same per patient)
> - Min score
> - Max score
>
>
>    Patient Region Score Time
> 1        1      X    19   28
> 2        1      X    20  126
> 3        1      X    22  100
> 4        1      X    25  191
> 5        2      Y    12    1
> 6        2      Y    12    2
> 7        2      Y    25    4
> 8        2      Y    26    7
> 9        3      X     6    1
> 10       3      X     6    4
> 11       3      X    21   31
> 12       3      X    22   68
> 13       3      X    23   31
> 14       3      X    24   38
> 15       3      X    21   15
> 16       3      X    22   24
> 17       3      X    23   15
> 18       3      X    24  243
> 19       3      X    25   77
> 20       4      Y     6    5
> 21       4      Y    22   28
> 22       4      Y    23   75
> 23       4      Y    24   19
> 24       5      Y    23    3
> 25       5      Y    24    1
> 26       5      Y    23   33
> 27       5      Y    24   13
> 28       5      Y    25   42
> 29       5      Y    26   21
> 30       5      Y    27    4
> 31       6      Y    24    4
> 32       6      Y    32    8
>
> So far I could find the min and max values for each patient, but the output
> of it is not (yet) what I need.
>
>> Patient.nr = unique(Patient)
>> aggregate(Score, list(Patient), max)
>   Group.1  x
> 1       1 25
> 2       2 26
> 3       3 25
> 4       4 24
> 5       5 27
> 6       6 32
>
>> aggregate(Score, list(Patient), min)
>   Group.1  x
> 1       1 19
> 2       2 12
> 3       3  6
> 4       4  6
> 5       5 23
> 6       6 24
> I would like to do same but writing this new information (min, max values)
> in a dataframe with following columns
> - Patient nr
> - Region (remains same per patient)
> - Min score
> - Max score
>
> Can anybody help me with this?
>
> Thanks
> Laura
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/