[R] max & min values within dataframe
R. Michael Weylandt
michael.weylandt at gmail.com
Mon Nov 14 17:36:29 CET 2011
I took a stab at this using ddply() from the plyr package. How's this
look to you?
x<- textConnection("Col Patient Region Score Time
1 1 X 19 28
2 1 X 20 126
3 1 X 22 100
4 1 X 25 191
5 2 Y 12 1
6 2 Y 12 2
7 2 Y 25 4
8 2 Y 26 7
9 3 X 6 1
10 3 X 6 4
11 3 X 21 31
12 3 X 22 68
13 3 X 23 31
14 3 X 24 38
15 3 X 21 15
16 3 X 22 24
17 3 X 23 15
18 3 X 24 243
19 3 X 25 77
20 4 Y 6 5
21 4 Y 22 28
22 4 Y 23 75
23 4 Y 24 19
24 5 Y 23 3
25 5 Y 24 1
26 5 Y 23 33
27 5 Y 24 13
28 5 Y 25 42
29 5 Y 26 21
30 5 Y 27 4
31 6 Y 24 4
32 6 Y 32 8")
V <- read.table(x, header = TRUE)[, -1]  # drop the row-index column "Col"
close(x)
rm(x)
# Everything above just reads the data in.
library(plyr)
R <- ddply(V, c("Patient", "Region"), function(d) {
  c(max = max(d$Score), min = min(d$Score))
})
R
  Patient Region max min
1       1      X  25  19
2       2      Y  26  12
3       3      X  25   6
4       4      Y  24   6
5       5      Y  27  23
6       6      Y  32  24
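
If plyr is not available, roughly the same result can be had in base R. The sketch below is only one way to do it: it assumes the data sit in a data frame called dat (as in the quoted reply below), rebuilt here with rep() so the snippet is self-contained, and the name res is just a placeholder of mine. It groups by both Patient and Region through the formula interface of aggregate(), then pulls the two columns out of the matrix that range() leaves behind, which is the quirk described in the quoted reply below.

```r
# Rebuild the example data compactly (Time is omitted; it is not needed here).
n_per_patient <- c(4, 4, 11, 4, 7, 2)
dat <- data.frame(
  Patient = rep(1:6, times = n_per_patient),
  Region  = rep(c("X", "Y", "X", "Y", "Y", "Y"), times = n_per_patient),
  Score   = c(19, 20, 22, 25,
              12, 12, 25, 26,
              6, 6, 21, 22, 23, 24, 21, 22, 23, 24, 25,
              6, 22, 23, 24,
              23, 24, 23, 24, 25, 26, 27,
              24, 32)
)

# Group by Patient and Region; range() returns c(min, max) per group,
# so tmp$Score ends up as a two-column matrix inside the data frame.
tmp <- aggregate(Score ~ Patient + Region, data = dat, FUN = range)

# Pull the matrix columns out into ordinary vectors.
res <- data.frame(Patient = tmp$Patient,
                  Region  = tmp$Region,
                  Min     = tmp$Score[, 1],
                  Max     = tmp$Score[, 2])

# aggregate() sorts by Region first, so restore Patient order.
res <- res[order(res$Patient), ]
rownames(res) <- NULL
res
```

Grouping on Region as well as Patient carries Region through without a separate lookup, which only works because Region is constant within each patient.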
Michael
On Mon, Nov 14, 2011 at 11:32 AM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> Hi Laura,
>
> You were close. Just use range() instead of min/max:
>
> ## your data (read in and then pasted the output of dput() to make it easy)
> dat <- structure(list(Patient = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L,
> 5L, 5L, 5L, 5L, 5L, 6L, 6L), Region = structure(c(1L, 1L, 1L,
> 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("X",
> "Y"), class = "factor"), Score = c(19L, 20L, 22L, 25L, 12L, 12L,
> 25L, 26L, 6L, 6L, 21L, 22L, 23L, 24L, 21L, 22L, 23L, 24L, 25L,
> 6L, 22L, 23L, 24L, 23L, 24L, 23L, 24L, 25L, 26L, 27L, 24L, 32L
> ), Time = c(28L, 126L, 100L, 191L, 1L, 2L, 4L, 7L, 1L, 4L, 31L,
> 68L, 31L, 38L, 15L, 24L, 15L, 243L, 77L, 5L, 28L, 75L, 19L, 3L,
> 1L, 33L, 13L, 42L, 21L, 4L, 4L, 8L)), .Names = c("Patient", "Region",
> "Score", "Time"), class = "data.frame", row.names = c("1", "2",
> "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
> "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25",
> "26", "27", "28", "29", "30", "31", "32"))
>
> tmp <- with(dat, aggregate(Score, list(Patient), range))
> tmpreg <- with(dat, Region[!duplicated(Patient)])
>
> results <- data.frame(tmp$Group.1, tmpreg, tmp$x)
> colnames(results) <- c("Patient", "Region", "Min", "Max")
>
> Note it is a little tricky to get the results into a data frame, because
> tmp is a slightly odd data frame. Due to the way aggregate() works,
> the first column of tmp is a regular vector, but the second column
> actually contains a two-column matrix. To get it into regular form,
> I extracted the pieces separately when creating 'results'.
>
> Cheers,
>
> Josh
>
> On Mon, Nov 14, 2011 at 8:10 AM, B Laura <gm.spam2011 at gmail.com> wrote:
>> dear R-team
>>
>> I need to find the min, max values for each patient from dataset and keep
>> the output of it as a dataframe with the following columns
>> - Patient nr
>> - Region (remains same per patient)
>> - Min score
>> - Max score
>>
>>
>> Patient Region Score Time
>> 1 1 X 19 28
>> 2 1 X 20 126
>> 3 1 X 22 100
>> 4 1 X 25 191
>> 5 2 Y 12 1
>> 6 2 Y 12 2
>> 7 2 Y 25 4
>> 8 2 Y 26 7
>> 9 3 X 6 1
>> 10 3 X 6 4
>> 11 3 X 21 31
>> 12 3 X 22 68
>> 13 3 X 23 31
>> 14 3 X 24 38
>> 15 3 X 21 15
>> 16 3 X 22 24
>> 17 3 X 23 15
>> 18 3 X 24 243
>> 19 3 X 25 77
>> 20 4 Y 6 5
>> 21 4 Y 22 28
>> 22 4 Y 23 75
>> 23 4 Y 24 19
>> 24 5 Y 23 3
>> 25 5 Y 24 1
>> 26 5 Y 23 33
>> 27 5 Y 24 13
>> 28 5 Y 25 42
>> 29 5 Y 26 21
>> 30 5 Y 27 4
>> 31 6 Y 24 4
>> 32 6 Y 32 8
>>
>> So far I could find the min and max values for each patient, but the output
>> of it is not (yet) what I need.
>>
>>> Patient.nr = unique(Patient)
>>> aggregate(Score, list(Patient), max)
>> Group.1 x
>> 1 1 25
>> 2 2 26
>> 3 3 25
>> 4 4 24
>> 5 5 27
>> 6 6 32
>>
>>> aggregate(Score, list(Patient), min)
>> Group.1 x
>> 1 1 19
>> 2 2 12
>> 3 3 6
>> 4 4 6
>> 5 5 23
>> 6 6 24
>> I would like to do same but writing this new information (min, max values)
>> in a dataframe with following columns
>> - Patient nr
>> - Region (remains same per patient)
>> - Min score
>> - Max score
>>
>> Can anybody help me with this?
>>
>> Thanks
>> Laura
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, ATS Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
>