[R] Averaging column scores when participants vary in number of observations

Thu Feb 19 22:58:55 CET 2015

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Chad Danyluck
> Sent: Thursday, February 19, 2015 12:33 PM
> To: r-help at r-project.org
> Subject: Re: [R] Averaging column scores when participants vary in number of observations
> 
> I have a data set that includes the identity of a number of video coders
> who scored participants' behaviors in a video. Every participant was
> scored once, but some participants were randomly assigned to have their
> data scored twice so I could calculate inter-rater reliabilities. I have
> completed the reliability analyses and want to use the average score for
> participants who had their behavior coded twice. I'd like to create a
> 'for loop' or function that allows me to calculate these column means
> iteratively because the number of observations is quite large (N = 168).
> Given the organization of the data, with some participants on multiple
> rows, I am not sure how to proceed.
> 
> The original data looks something like this:
> 
>                 Participant ID   Video Coder   Score
> Observation A                1   Donald            4
> Observation B                1   Tracy             5
> Observation C                2   Donald            6
> Observation D                3   Sam               2
> Observation E                3   Tracy             3
> Observation F                4   Donald            2
> Observation G                4   Tracy             1
> Observation H                5   Sam               8
> 
> When the data processing is completed, I would like the new data set to
> look like this:
> 
> Participant ID   Score
>              1     4.5
>              2     6
>              3     2.5
>              4     1.5
>              5     8
> 
> Any tips or suggestions would be appreciated.
> 
> Kind regards,
> 
> Chad

How about something like this, assuming your data frame is called rating and the ID column is named Participant_ID:

aggregate(Score ~ Participant_ID, data=rating, mean)
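To make that concrete, here is a minimal sketch run against the example rows from the original message. The data frame name rating and the column names Participant_ID and Video_Coder are assumptions on my part (the post shows the headers with spaces), so adjust them to match your actual data.

```r
# Rebuild the example data from the original post.
# NOTE: the object name and the underscore column names are assumed,
# not taken from the poster's real data set.
rating <- data.frame(
  Participant_ID = c(1, 1, 2, 3, 3, 4, 4, 5),
  Video_Coder    = c("Donald", "Tracy", "Donald", "Sam",
                     "Tracy", "Donald", "Tracy", "Sam"),
  Score          = c(4, 5, 6, 2, 3, 2, 1, 8)
)

# One row per participant: the mean of however many scores that
# participant has (one or two in this example).
avg <- aggregate(Score ~ Participant_ID, data = rating, FUN = mean)
print(avg)
```

Participants who were scored only once keep their single score, since the mean of one value is that value, so no loop or special-casing should be needed. Note that the formula interface drops rows with a missing Score by default, which may or may not be what you want.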

Hope this is helpful,

Dan

Daniel J. Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services