[R] max & min values within dataframe

Joshua Wiley jwiley.psych at gmail.com
Mon Nov 14 17:32:52 CET 2011


Hi Laura,

You were close.  Just use range() instead of min/max:

## your data (read in and then pasted the output of dput() to make it easy)
dat <- structure(list(Patient = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 6L, 6L), Region = structure(c(1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("X",
"Y"), class = "factor"), Score = c(19L, 20L, 22L, 25L, 12L, 12L,
25L, 26L, 6L, 6L, 21L, 22L, 23L, 24L, 21L, 22L, 23L, 24L, 25L,
6L, 22L, 23L, 24L, 23L, 24L, 23L, 24L, 25L, 26L, 27L, 24L, 32L
), Time = c(28L, 126L, 100L, 191L, 1L, 2L, 4L, 7L, 1L, 4L, 31L,
68L, 31L, 38L, 15L, 24L, 15L, 243L, 77L, 5L, 28L, 75L, 19L, 3L,
1L, 33L, 13L, 42L, 21L, 4L, 4L, 8L)), .Names = c("Patient", "Region",
"Score", "Time"), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25",
"26", "27", "28", "29", "30", "31", "32"))

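## per-patient range; tmp$x is a two-column matrix (min, max)
tmp <- with(dat, aggregate(Score, list(Patient), range))
## region from each patient's first row (assumes patients first appear
## in increasing order, as they do here, so the order matches tmp)
tmpreg <- with(dat, Region[!duplicated(Patient)])

## data.frame() expands the matrix tmp$x into two ordinary columns
results <- data.frame(tmp$Group.1, tmpreg, tmp$x)
colnames(results) <- c("Patient", "Region", "Min", "Max")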
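
The Region lookup above lines up with the aggregate() output only when
patients first appear in increasing order, which holds for this data.
As a hedged sketch of an alternative that does not depend on row order,
the formula interface of aggregate() can group by Patient and Region
together.  The data frame 'toy' below is a hand-made, deliberately
shuffled illustration, not the full data set, and the names 'res' and
'out' are mine:

```r
## hypothetical toy data, rows deliberately out of order
toy <- data.frame(
  Patient = c(2L, 1L, 2L, 1L, 3L, 3L),
  Region  = factor(c("Y", "X", "Y", "X", "X", "X")),
  Score   = c(12L, 19L, 26L, 25L, 6L, 24L)
)

## group by both columns; res$Score is a two-column matrix (min, max)
res <- aggregate(Score ~ Patient + Region, data = toy, FUN = range)

## pull the matrix columns out by position and name them
out <- data.frame(res[c("Patient", "Region")],
                  Min = res$Score[, 1],
                  Max = res$Score[, 2])

## aggregate() orders rows by the last grouping variable first,
## so sort by Patient for display
out <- out[order(out$Patient), ]
out
```

This only makes sense when Region really is constant within each
patient; otherwise a patient would get one row per region.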

Note that it is a little tricky to get the results into a data frame,
because tmp is a bit of an odd data frame.  Since range() returns two
values per group, aggregate() keeps them together: the first column of
tmp is a regular vector, but the second column actually contains a
two-column matrix.  To get it into regular form, I passed the two
columns to data.frame() separately when creating 'results'.
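
To see the matrix column for yourself, here is a small sketch on a
hand-typed subset (patients 1 and 2 only).  The names 'toy', 'agg' and
'flat' are just for illustration, and do.call(data.frame, ...) is an
alternative way to flatten the result, not what I used above:

```r
## toy subset of the data: patients 1 and 2 only
toy <- data.frame(
  Patient = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  Region  = factor(c("X", "X", "X", "X", "Y", "Y", "Y", "Y")),
  Score   = c(19L, 20L, 22L, 25L, 12L, 12L, 25L, 26L)
)

agg <- with(toy, aggregate(Score, list(Patient), range))

ncol(agg)          # only 2 columns: Group.1 and x
is.matrix(agg$x)   # x holds min and max side by side in a matrix
dim(agg$x)         # one row per patient, two columns

## do.call() hands each column of agg to data.frame() separately,
## which splits the matrix into ordinary columns
flat <- do.call(data.frame, agg)
colnames(flat) <- c("Patient", "Min", "Max")
flat
```

Running str(agg) is another quick way to spot a matrix hiding inside
a data frame.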

Cheers,

Josh

On Mon, Nov 14, 2011 at 8:10 AM, B Laura <gm.spam2011 at gmail.com> wrote:
> dear R-team
>
> I need to find the min, max values for each patient from dataset and keep
> the output of it as a dataframe with the following columns
>  - Patient nr
>  - Region (remains same per patient)
>  - Min score
>  - Max score
>
>
>    Patient Region Score Time
> 1        1      X    19   28
> 2        1      X    20  126
> 3        1      X    22  100
> 4        1      X    25  191
> 5        2      Y    12    1
> 6        2      Y    12    2
> 7        2      Y    25    4
> 8        2      Y    26    7
> 9        3      X     6    1
> 10       3      X     6    4
> 11       3      X    21   31
> 12       3      X    22   68
> 13       3      X    23   31
> 14       3      X    24   38
> 15       3      X    21   15
> 16       3      X    22   24
> 17       3      X    23   15
> 18       3      X    24  243
> 19       3      X    25   77
> 20       4      Y     6    5
> 21       4      Y    22   28
> 22       4      Y    23   75
> 23       4      Y    24   19
> 24       5      Y    23    3
> 25       5      Y    24    1
> 26       5      Y    23   33
> 27       5      Y    24   13
> 28       5      Y    25   42
> 29       5      Y    26   21
> 30       5      Y    27    4
> 31       6      Y    24    4
> 32       6      Y    32    8
>
> So far I could find the min and max values for each patient, but the output
> of it is not (yet) what I need.
>
>> Patient.nr = unique(Patient)
>> aggregate(Score, list(Patient), max)
>  Group.1  x
> 1       1 25
> 2       2 26
> 3       3 25
> 4       4 24
> 5       5 27
> 6       6 32
>
>> aggregate(Score, list(Patient), min)
>  Group.1  x
> 1       1 19
> 2       2 12
> 3       3  6
> 4       4  6
> 5       5 23
> 6       6 24
>
> I would like to do the same, but write this new information (min, max values)
> to a dataframe with the following columns
>  - Patient nr
> - Region (remains same per patient)
> - Min score
> - Max score
>
> Can anybody help me with this?
>
> Thanks
> Laura
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


