[R] Data manipulation

jim holtman jholtman at gmail.com
Sat Feb 12 19:34:47 CET 2011


Will this do it for you?

> x <- read.table(textConnection(" A  B  C
+ 1 1  a  1999
+ 2 1  b  1999
+ 3 1  c  1999
+ 4 2  c  2001
+ 5 2  d  2001
+ 6 3  a  2004
+ 7 3  b  2004"), header = TRUE)
> closeAllConnections()
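> # add a tenure column: years since each individual's (B) first
> # project year (C), so the first project always has tenure 0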
> x$tenure <- ave(x$C, x$B, FUN = function(yr) yr - min(yr))
> x
  A B    C tenure
1 1 a 1999      0
2 1 b 1999      0
3 1 c 1999      0
4 2 c 2001      2
5 2 d 2001      0
6 3 a 2004      5
7 3 b 2004      5
> # compute average tenure per project
> aggregate(x$tenure, list(project = x$A), mean)
  project x
1       1 0
2       2 1
3       3 5
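
If you would rather have it as a script than a console session, here is a rough sketch of the same two steps. It assumes an R version whose read.table() accepts the text= argument (R >= 2.14.0), which avoids the textConnection/closeAllConnections pair. The names avg_tenure and D are just my labels, chosen to match the table you asked for.

```r
# Sketch only: same example data as above, read via text= instead of
# a textConnection (assumes R >= 2.14.0).
x <- read.table(text = " A  B  C
1 1  a  1999
2 1  b  1999
3 1  c  1999
4 2  c  2001
5 2  d  2001
6 3  a  2004
7 3  b  2004", header = TRUE)

# Tenure: years since each individual's (B) earliest project year (C).
x$tenure <- ave(x$C, x$B, FUN = function(yr) yr - min(yr))

# Average tenure per project (A), using the formula interface so the
# grouping column keeps its name; rename the result column to D.
avg_tenure <- aggregate(tenure ~ A, data = x, FUN = mean)
names(avg_tenure)[2] <- "D"
avg_tenure
```

This should give the same per-project averages as the aggregate() call above (0, 1 and 5 for projects 1, 2 and 3). Note that min(yr) treats the earliest year in the data as the individual's first project, so anyone with earlier projects not in the data set will have their tenure understated.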


On Sat, Feb 12, 2011 at 9:00 AM, mathijsdevaan <mathijsdevaan at gmail.com> wrote:
>
> Hi,
>
> I have a dataset with info on individuals (B) that have been involved in
> projects (A) during multiple years (C). The dataset contains three columns:
> A, B, C. Example:
>   A  B  C
> 1 1  a  1999
> 2 1  b  1999
> 3 1  c  1999
> 4 2  c  2001
> 5 2  d  2001
> 6 3  a  2004
> 7 3  b  2004
>
> I am interested in the average tenure of all individuals for each project
> (assuming that the tenure of an individual = 0 in the first project this
> individual is involved in). So based on the data above:
>  A  D
> 1 1  0
> 2 2  1
> 3 3  5
>
> where D = average project tenure. How do I do this?
>
> Your help is very much appreciated. Thanks!



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


