[R] how to make row.names based on column1 with duplicated values

William Dunlap wdunlap at tibco.com
Thu Mar 1 17:01:34 CET 2018


You can do this with ave():

gene <- c("a","b","c","d","c","d","c","f")
ave(gene, gene,
    FUN = function(x) if (length(x) > 1) paste(x, seq_along(x), sep = "-") else x)
# [1] "a"   "b"   "c-1" "d-1" "c-2" "d-2" "c-3" "f"
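
To connect this to the original question, here is a minimal sketch that uses the ave() result as row names of the example data frame. The names df and labels are only for illustration, and stringsAsFactors = FALSE is an assumption added so that 'gene' stays a character vector on R versions before 4.0.

```r
# Example data from the question; keep 'gene' as character, not factor
df <- data.frame(gene  = c("a", "b", "c", "d", "c", "d", "c", "f"),
                 value = c(20, 300, 48, 55, 9, 2, 100, 200),
                 stringsAsFactors = FALSE)

# Append "-N" only to genes that occur more than once
labels <- ave(df$gene, df$gene,
              FUN = function(x) if (length(x) > 1) paste(x, seq_along(x), sep = "-") else x)

rownames(df) <- labels   # row names must be unique, which 'labels' now are
df$gene <- NULL          # optional: drop the now-redundant column
```

Dropping the 'gene' column at the end is a matter of taste; keep it if you still need the original, unsuffixed names.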

You can probably speed it up a bit by pulling the paste() out of FUN
and doing it later, in one vectorized call.  It would be simpler if you
put the '-N' after all genes, not just the ones that were repeated.
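
One way to read the suggestion above, as a sketch rather than a benchmarked solution: let ave() compute only the within-gene index and the group size, then call paste() once on the whole vector. The names idx, n, labels and labels_all are made up for this example.

```r
gene <- c("a", "b", "c", "d", "c", "d", "c", "f")

# Running index of each element within its gene group: 1, 2, 3, ...
idx <- ave(seq_along(gene), gene, FUN = seq_along)
# Size of the group each element belongs to
n   <- ave(seq_along(gene), gene, FUN = length)

# A single vectorized paste() instead of one paste() call per group
labels <- ifelse(n > 1, paste(gene, idx, sep = "-"), gene)
# [1] "a"   "b"   "c-1" "d-1" "c-2" "d-2" "c-3" "f"

# Simpler variant: put "-N" after every gene, repeated or not
labels_all <- paste(gene, idx, sep = "-")
# [1] "a-1" "b-1" "c-1" "d-1" "c-2" "d-2" "c-3" "f-1"
```

Whether this is noticeably faster will depend on how many distinct genes there are, so it is worth timing on the real data.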



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Feb 28, 2018 at 10:18 PM, Stephen HonKit Wong <stephen66 at gmail.com>
wrote:

> Dear All,
> Suppose I have a data frame like this with many thousands of rows, all with
> different names:
> data.frame(gene=c("a","b","c","d","c","d","c","f"),
>            value=c(20,300,48,55,9,2,100,200))
>
> I want to set column "gene" as row.names, but there are duplicates (c, d),
> which I want to transform into this as row names: a, b, c-1, d-1, c-2, d-2,
> c-3, f
>
> Many thanks!
>
> Stephen
