[R] recode Variable in dependence of values of two other variables

Fri Aug 12 21:49:22 CEST 2011

Hi:

Here are several equivalent ways to produce your desired output, assuming
your data frame is called df and has columns id and x:

# Base package: transform()

df <- transform(df, mean = ave(x, id, FUN = mean))
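
As a minimal, self-contained sketch of the base approach, the data below are
typed in from the example in your post, with the columns renamed to id and x
(an assumption on my part; your real data frame may use ID and X). The new
column is called meanX here rather than mean. That name is my own choice:
once a column named mean exists, re-running the line with FUN = mean would
pick up that column instead of the function.

```r
# Example data copied from the question; column names id/x are assumed
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  x  = c(2, 3, 1, 3, 5, 2, 3, 4, 1, 2, 3, 4, 5)
)

# ave() returns a vector as long as x, holding each row's group mean,
# so no row numbers have to be specified
df <- transform(df, meanX = ave(x, id, FUN = mean))

print(df)
```

Because ave() keeps the original row order, the result lines up with the
existing rows whether or not the data are sorted by id.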

# plyr package (returns a new data frame; assign it to keep the result)
library('plyr')
ddply(df, .(id), transform, mean = mean(x))

# data.table package
library('data.table')
dt <- data.table(df, key = 'id')
dt[, list(x, mean = mean(x)), by = 'id']

# doBy package
library('doBy')
transformBy(~ id, data = df, mean = mean(x))
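
If it helps to see what these calls do under the hood, here is a rough
two-step sketch in base R: compute one mean per id, then look up each row's
mean by its id. The data are again typed in from your example, and the names
group_means and meanX are only illustrative.

```r
# Example data copied from the question; column names id/x are assumed
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  x  = c(2, 3, 1, 3, 5, 2, 3, 4, 1, 2, 3, 4, 5)
)

# Step 1: one mean per subject, named by the id value
group_means <- tapply(df$x, df$id, mean)

# Step 2: look up each row's group mean by its id (names are character)
df$meanX <- as.vector(group_means[as.character(df$id)])

# Sanity check: this should agree with the one-line ave() version
stopifnot(isTRUE(all.equal(df$meanX, ave(df$x, df$id))))
```

The one-liners above are more convenient in practice; this version only
spells out the grouping and the lookup as separate steps.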

HTH,
Dennis

On Fri, Aug 12, 2011 at 8:10 AM, Julia Moeller
<julia.moeller at uni-erfurt.de> wrote:
> Hi,
>
> as an R-beginner, I have a recoding problem and hope you can help me:
>
> I am working on a SPSS dataset, which I loaded into R (load("C:/...)
>
> I have 2 existing variables: "ID" and "X",
> and one variable to be computed: meanX.dependID (= mean of X for all rows in
> which ID has the same value)
>
> ID = subject ID.  Since it is a longitudinal dataset, there are repeated
> measurement points for each subject, each of which appears in a new row. So,
> each ID value appears in many rows. (e.g. ID ==1 in row 1:5; ID ==2 in rows
> 6:8 etc).
>
>
> Now: For all rows in which ID has a certain value, meanX.dependID shall be
> the mean of X for these rows. How can I automate that, without having
> to specify the row numbers each time?
>
> e.g.
>
>
> ID    X    meanX.dependID
> 1    2    2.25
> 1    3    2.25
> 1    1    2.25
> 1    3    2.25
> 2    5    3.3
> 2    2    3.3
> 2    3    3.3
> 3    4    3.17
> 3    1    3.17
> 3    2    3.17
> 3    3    3.17
> 3    4    3.17
> 3    5    3.17
>
>
> Thanks a lot! Hope this is the right place to post, if not, please tell me!
> best,
> Julia