[R] Aggregating data (with more than one function)

Robin Schroeder Robin.Schroeder at asu.edu
Tue Mar 29 23:57:39 CEST 2005


Dear list & Andy, 

I am hopelessly stumped: how would one add the department names as a variable?

Robin

> Robin Tori Schroeder
> International Institute for Sustainability 
> P.O. Box 873211
> Arizona State University
> Tempe, Arizona 85287-3211
> Phone: (480) 727-7290
> 
> 


-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Liaw, Andy
Sent: Monday, March 28, 2005 6:45 PM
To: 'Sivakumaran Raman'; r-help at stat.math.ethz.ch
Subject: RE: [R] Aggregating data (with more than one function)


Here's one possible way, using the data you supplied:

> dat <- read.table("clipboard", header=TRUE, row.names=1)
> do.call("rbind", by(dat$Salary, dat$Department,
+                     function(x) c(mean=mean(x), total=sum(x))))
            mean  total
Finance 83925.67 251777
HR      63333.33 190000
IT      59928.67 179786
Sales   62481.67 187445

If you need the department names as a variable, you can add that easily.
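
A minimal sketch of one way to do that, not necessarily the approach intended above: the department names end up as the row names of the rbind-ed matrix, so they can be copied into an ordinary column. The data frame is rebuilt here from the table in the original post so the example does not depend on the clipboard; the names `res` and `out` are made up for illustration.

```r
# Rebuild the example data from the original post.
dat <- data.frame(
  LastName   = c("Johnson", "James", "Howe", "Jones", "Norwood", "Benson",
                 "Smith", "Baker", "Dempsey", "Nolan", "Garth", "Jameson"),
  Department = c("IT", "HR", "Finance", "Finance", "IT", "Sales",
                 "Sales", "HR", "HR", "Sales", "Finance", "IT"),
  Salary     = c(56000, 54223, 80000, 82000, 67000, 76000,
                 65778, 56778, 78999, 45667, 89777, 56786)
)

# Same summary as before: a matrix with one row per department,
# where the row names are the department names.
res <- do.call("rbind", by(dat$Salary, dat$Department,
                           function(x) c(mean = mean(x), total = sum(x))))

# Copy the row names into a regular column; row.names = NULL replaces
# the department row names with the default 1..n sequence.
out <- data.frame(Department = rownames(res), res, row.names = NULL)
out
```

After this, `out$Department` can be used like any other variable, for example in merge() or subset().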

HTH,
Andy

> From: Sivakumaran Raman
> 
> I have data similar to the following in a data frame:
>     LastName   Department  Salary
> 1   Johnson    IT          56000
> 2   James      HR          54223
> 3   Howe       Finance     80000
> 4   Jones      Finance     82000
> 5   Norwood    IT          67000
> 6   Benson     Sales       76000
> 7   Smith      Sales       65778
> 8   Baker      HR          56778
> 9   Dempsey    HR          78999
> 10  Nolan      Sales       45667
> 11  Garth      Finance     89777
> 12  Jameson    IT          56786
> 
> I want to calculate both the mean salary broken down by Department and
> also the total amount paid out per department i.e. I want both
> sum(Salary) and mean(Salary) for each Department. Right now, I am using
> aggregate.data.frame twice, creating two data frames, and then combining
> them using data.frame. However, this seems to be very memory and
> processor intensive and is taking a very long time on my data set. Is
> there a quicker way to do this?
> 
> Thanks in advance,
> Siv Raman
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 
>
