[R] Need to calculate standard deviation by groups

Gerrit Eichner Gerrit.Eichner at math.uni-giessen.de
Fri Dec 9 10:17:51 CET 2011


Hello, Zsuzsa,

doesn't ave() with FUN = sd do what you want? Something like

with( Dataset, ave( x = B, C, D, FUN = sd))

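should do it: ave() returns a vector of the same length as B, with each
group's standard deviation repeated for every case in that group, so the
result can be assigned directly to Dataset$E.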
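
A minimal sketch of how that plays out, assuming a small made-up Dataset
(the values below are invented for illustration and are not the poster's
data). Note that a group containing a single case gets NA, because sd() of
one value is NA.

```r
# Toy stand-in for the poster's Dataset; all values are invented.
Dataset <- data.frame(
  A = 1:7,
  B = c(1, 3, 2, 6, 10, 14, 5),
  C = factor(c("a", "a", "b", "b", "a", "a", "b")),
  D = factor(c("x", "x", "x", "x", "y", "y", "y"))
)

# ave() splits B by the combinations of C and D, applies sd() to each
# piece, and puts each group's result back at the positions of its cases.
Dataset$E <- with(Dataset, ave(x = B, C, D, FUN = sd))

print(Dataset)
```

Cases 1 and 2 share the group (a, x) and therefore the same value of E;
case 7 is alone in group (b, y) and gets NA.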

Hth  --  Gerrit


On Fri, 9 Dec 2011, Zsuzsanna Papp wrote:

> Hello,
>
> please help me with this basic question. I have already spent two days
> searching the internet and textbooks trying to come up with an answer...
> I will simplify my question to an example, rather than base it on the
> original variable names.
> I have a Dataset with 4 variables and 20000 cases. Variable A is an ID.
> Variable B is a continuous numerical variable, unique to each A.
> Variable C is a categorical factor with 6 possible levels. Variable D is
> also a categorical factor, with 300 different levels.
>
> I would like to create a new variable=E, which is the standard deviation
> of B around the group means of B, groups defined by C and D.
>
> I had no problem creating such a column of group means (with the ave()
> function), but I cannot find a solution for another function like sd that
> would assign the proper group value to each case.
>
> I tried
>
> Dataset$E <- with(Dataset, tapply(B, list(C,D),FUN=sd))
>
> but it is wrong, as it takes the 1800 different SD values, puts them in
> column E, then puts the same array of numbers there below it, repeats as
> many times as possible until the column is filled. The SD values are not
> corresponding to the proper groups.
>
> How can I match these data (1800 different SD values) to their
> corresponding cases in my original data?
> Is there a shortcut to do this all in one line, as for the means with
> the ave() function?
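
If you do want to keep the tapply() result, one possible way to match its
values back to the cases is to index the resulting table by each case's own
(C, D) pair. This is only a sketch on made-up data; the names sd_table and
idx are hypothetical helpers, not anything from your script. tapply()
returns one cell per combination of levels, ordered by level and not by
case, which is why assigning it to a column directly does not line up.

```r
# Toy stand-in for the poster's Dataset; all values are invented.
Dataset <- data.frame(
  A = 1:8,
  B = c(1, 3, 2, 6, 10, 14, 5, 5),
  C = factor(c("a", "a", "b", "b", "a", "a", "b", "b")),
  D = factor(c("x", "x", "x", "x", "y", "y", "y", "y"))
)

# One sd per (C, D) combination: a matrix with levels of C as row names
# and levels of D as column names.
sd_table <- with(Dataset, tapply(B, list(C, D), FUN = sd))

# Two-column character matrix: row i holds the (C, D) labels of case i.
idx <- with(Dataset, cbind(as.character(C), as.character(D)))

# Indexing a matrix by a two-column character matrix looks up one cell
# per row of idx via the dimnames, giving one value per case.
Dataset$E <- sd_table[idx]

print(Dataset)
```

On this toy data the result agrees with what ave(B, C, D, FUN = sd) gives,
so the one-line ave() version is the shorter route to the same column.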
>
> I also tried ddply but I am doing something wrong (my R is on Linux and
> I do not yet know how to get error messages, so I do not know what is
> wrong with my lines).
>
> Thank you for any help! Please give me as detailed a script as possible.
>
> Zsuzsa
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner


