[R] splitting a dataframe in R based on multiple gene names in a specific column
Bogdan Tanasa
tanasa at gmail.com
Wed Aug 23 01:57:19 CEST 2017
I would appreciate a suggestion on how to do the following.
I am working with a data frame in R in which one column (Gene.refGene)
can contain multiple comma-separated gene names, e.g.:
> df.sample.gene[15:20,2:8]
     Chr     Start       End Ref Alt Func.refGene           Gene.refGene
284 chr2  16080996  16080996   C   T ncRNA_exonic                 GACAT3
448 chr2 113979920 113979920   C   T ncRNA_exonic LINC01191,LOC100499194
465 chr2 131279347 131279347   C   G ncRNA_exonic              LOC440910
525 chr2 223777758 223777758   T   A       exonic                  AP1S3
626 chr3  99794575  99794575   G   A       exonic                 COL8A1
643 chr3 132601066 132601066   A   G       exonic                  ACKR4
How can I obtain a data frame in which each row that has multiple gene names
(in the field Gene.refGene) is replicated once per gene, with only one gene
name in each copy? For example, for the second row:
448 chr2 113979920 113979920 C T ncRNA_exonic LINC01191,LOC100499194
the final output (which contains all the rows) should have:
448 chr2 113979920 113979920 C T ncRNA_exonic LINC01191
448 chr2 113979920 113979920 C T ncRNA_exonic LOC100499194
Thanks a lot!
-- bogdan
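
For illustration, one possible way to get the output described above, sketched in base R. This is a minimal sketch, not a tested solution for the real data: the toy data frame `df` and the result name `df.split` are hypothetical, and the values are copied from the sample rows in the post. It assumes the gene names are separated by a plain comma with no surrounding spaces.

```r
# Toy data frame (hypothetical name `df`) with values copied from the post.
df <- data.frame(
  Chr          = c("chr2", "chr2", "chr2"),
  Start        = c(16080996, 113979920, 131279347),
  End          = c(16080996, 113979920, 131279347),
  Ref          = c("C", "C", "C"),
  Alt          = c("T", "T", "G"),
  Func.refGene = "ncRNA_exonic",
  Gene.refGene = c("GACAT3", "LINC01191,LOC100499194", "LOC440910"),
  row.names    = c("284", "448", "465"),
  stringsAsFactors = FALSE
)

# Split each entry on commas: a list with one character vector per row.
# as.character() guards against the column being stored as a factor.
genes <- strsplit(as.character(df$Gene.refGene), ",", fixed = TRUE)

# Repeat each row index once per gene found in that row.
idx <- rep(seq_len(nrow(df)), lengths(genes))

# Replicate the rows, then overwrite the gene column with single gene names.
df.split <- df[idx, ]
df.split$Gene.refGene <- unlist(genes)

print(df.split)
```

Rows with a single gene pass through unchanged. Note that an empty string in Gene.refGene splits to a zero-length vector, so such a row would be dropped by this sketch. If the tidyr package is an option, `separate_rows(df, Gene.refGene, sep = ",")` is an alternative that performs the same kind of split.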