[R] Normalizing grouped data in a data frame

Greg Snow Greg.Snow at intermountainmail.org
Fri Nov 9 17:22:58 CET 2007


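Here is another approach, using transform() and ave(), which I think is
a little simpler than the others suggested. ave() returns a vector the
same length as its input, with each value replaced by FUN applied to
that value's group, so dividing by it normalizes to the group maximum: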

> new.data <- transform( iris, 
+   normSW = Sepal.Width / ave(Sepal.Width, Species, FUN=max),
+   normSL = Sepal.Length / ave(Sepal.Length, Species, FUN=max)
+  )
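
For the example data in the original question, the same idea might look
like the sketch below. The data frame name `dat` is only an
illustration, and the rows typed as "A, 3. 4.1" and "C, 7. 3.2" are
assumed to mean Image 3 and Image 7 with a comma intended after the
image number.

```r
# Example data from the question. Assumption: the "3." and "7." in the
# posted rows are typos for "3," and "7,".
dat <- data.frame(
  Base    = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"),
  Image   = c(1, 2, 3, 3, 1, 3, 4, 1, 4, 5, 6, 7),
  LVEF    = c(4.32, 4.98, 3.7, 4.1, 7.4, 7.2, 7.8, 5.6, 5.2, 5.9, 6.1, 3.2),
  ES_Time = c(0.89, 0.67, 0.5, 0.8, 0.7, 0.8, 0.6, 1.1, 1.3, 1.2, 1.2, 1.1)
)

# Divide each value by the maximum within its Base group.
# ave() keeps the original row order and length, so the result
# can be added directly as new columns.
dat <- transform(dat,
  Norm_LVEF    = LVEF    / ave(LVEF,    Base, FUN = max),
  Norm_ES_Time = ES_Time / ave(ES_Time, Base, FUN = max)
)

# Show the Base == "B" rows, as in the question
print(dat[dat$Base == "B", ])
```

Replacing Base with Image in the ave() calls should give the
normalization grouped by image number instead.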

You can adjust it for your data.  Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
 
 

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Sandy Small
> Sent: Friday, November 09, 2007 3:57 AM
> To: r-help at r-project.org
> Subject: [R] Normalizing grouped data in a data frame
> 
> Hi
> I am a newbie to R but have tried a number of ways in R to do 
> this and can't find a good solution. (I could do it outside R 
> in Perl or awk but would like to know how to do this in R.)
> 
> I have a large data frame (49 variables and 7000 observations); 
> however, for simplicity I can express it as the following data frame:
> 
> Base, Image, LVEF, ES_Time
> A, 1, 4.32, 0.89
> A, 2, 4.98, 0.67
> A, 3, 3.7, 0.5
> A, 3, 4.1, 0.8
> B, 1, 7.4, 0.7
> B, 3, 7.2, 0.8
> B, 4, 7.8, 0.6
> C, 1, 5.6, 1.1
> C, 4, 5.2, 1.3
> C, 5, 5.9, 1.2
> C, 6, 6.1, 1.2
> C, 7, 3.2, 1.1
> 
> For each value of LVEF and ES_Time I would like to normalise 
> the value to the maximum for that factor grouped by Base or 
> Image number, adding an extra column to the data frame with 
> the normalised value in it.
> 
> So for the Base = B group in the data frame (the data frame 
> should keep the same length; I'm just showing the B part) I 
> would get a modified data frame as follows.
> 
> Base, Image, LVEF, ES_Time, Norm_LVEF, Norm_ES_Time ...
> B, 1, 7.4, 0.7, 7.4/7.8, 0.7/0.8
> B, 3, 7.2, 0.8, 7.2/7.8, 0.8/0.8
> B, 4, 7.8, 0.6, 7.8/7.8, 0.6/0.8
> ...
> 
> Where the results of the division would replace the division 
> shown here.
> I hope this makes sense.
> If anyone can help I would be very grateful.
> 
> Sandy Small
> NHS Glasgow, UK