[R] Merge partially duplicated rows
David Winsemius
dwinsemius at comcast.net
Tue Aug 4 02:02:36 CEST 2009
On Aug 3, 2009, at 7:12 PM, David Winsemius wrote:
>
> On Aug 3, 2009, at 9:24 AM, Rnewbie wrote:
>
>>
>> Dear all,
>>
>> I have a dataset, and I wanted to merge the rows with duplicated IDs by
>> calculating the means or medians from the duplicate rows. I tried using the
>> command duplicated(x), but it only tells where the duplicated rows are.
>
> You might want to look at the ave function. It will calculate a
> function within IDs and you can assign that as another row in the
> data frame before you exclude the duplicates.
Err... I meant to say another column, not another row.
> tst <- data.frame(ID = sample(c("1234", "4567", "2346"), 10,
+                   replace = TRUE), val = rnorm(10))
> tst
     ID         val
1  2346  0.22659389
2  2346  0.46835154
3  2346 -0.53702251
4  2346 -1.00187606
5  1234  0.90843566
6  2346 -0.59654370
7  4567 -0.04355647
8  1234  0.65332120
9  4567 -2.22517105
10 1234 -0.26911187
> tst$IDmn <- ave(tst$val, tst$ID)  # the default function for ave is mean, but others can be used
> tst
     ID         val       IDmn
1  2346  0.22659389 -0.2880994
2  2346  0.46835154 -0.2880994
3  2346 -0.53702251 -0.2880994
4  2346 -1.00187606 -0.2880994
5  1234  0.90843566  0.4308817
6  2346 -0.59654370 -0.2880994
7  4567 -0.04355647 -1.1343638
8  1234  0.65332120  0.4308817
9  4567 -2.22517105 -1.1343638
10 1234 -0.26911187  0.4308817
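
To finish the job the original poster asked about, something along these lines might work. It is only a sketch: it rebuilds the data from the values printed above so the result is reproducible, adds a per-ID median alongside the mean by passing FUN to ave, and then keeps the first row of each ID with duplicated(). The names IDmed and merged are made up here for illustration.

```r
# Rebuild the example data from the values printed above (reproducible, no sampling)
tst <- data.frame(
  ID  = c("2346", "2346", "2346", "2346", "1234",
          "2346", "4567", "1234", "4567", "1234"),
  val = c(0.22659389, 0.46835154, -0.53702251, -1.00187606, 0.90843566,
          -0.59654370, -0.04355647, 0.65332120, -2.22517105, -0.26911187),
  stringsAsFactors = FALSE
)

# Per-ID mean (ave's default) and per-ID median (FUN must be passed by name)
tst$IDmn  <- ave(tst$val, tst$ID)
tst$IDmed <- ave(tst$val, tst$ID, FUN = median)

# Keep only the first row seen for each ID, and drop the raw val column
# (IDmed and merged are hypothetical names, not from the original post)
merged <- tst[!duplicated(tst$ID), c("ID", "IDmn", "IDmed")]
merged
```

The result has one row per ID, in order of first appearance. If the other columns of the first duplicate should be kept as well, leave out the column selection.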
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
David Winsemius, MD
Heritage Laboratories
West Hartford, CT