[R] summing up column values for unique IDs when multiple IDs exist in data frame
Seth Falcon
sfalcon at fhcrc.org
Tue May 29 23:47:38 CEST 2007
"Young Cho" <young.stat at gmail.com> writes:
> I have data.frames with IDs and multiple columns. Because some of the IDs
> show up more than once, I need to sum up column values to create a new
> data frame with unique IDs.
>
> I hope there are some cheaper ways of doing it... Because the
> dataframe is huge, it takes almost an hour to do the task. Thanks
> so much in advance!
Does this do what you want in a faster way?
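    sum_dup <- function(df) {
        ## Row indices grouped by ID
        idIdx <- split(seq_len(nrow(df)), as.character(df$ID))
        ## Every column except ID gets summed
        whID <- match("ID", names(df))
        colNms <- names(df)[-whID]
        ## For each column, sum its values within each ID group
        ans <- lapply(colNms, function(cn) {
            unlist(lapply(idIdx,
                          function(x) sum(df[[cn]][x])),
                   use.names=FALSE)
        })
        ## Assemble the list of columns into a data.frame, one row per ID
        attributes(ans) <- list(names=colNms,
                                row.names=names(idIdx),
                                class="data.frame")
        ans
    }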
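
For comparison, base R's rowsum() also computes per-group column sums and may be worth timing on the real data. The following is only a minimal sketch on a made-up data frame; the toy values and the names df, num_cols and res are illustrative assumptions, not taken from the original data:

```r
## Hypothetical toy data: the ID column repeats, x and y are numeric.
df <- data.frame(ID = c("a", "b", "a", "c", "b"),
                 x  = c(1, 2, 3, 4, 5),
                 y  = c(10, 20, 30, 40, 50))

## Sum every non-ID column within each ID group.
## rowsum() returns one row per group, with the group values as row names.
num_cols <- setdiff(names(df), "ID")
res <- rowsum(df[num_cols], group = df$ID)
print(res)
```

As with the function above, the IDs end up in the row names rather than in a column, and all non-ID columns are assumed to be numeric.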
--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org