[R] Filtering a dataset's columns by another dataset's column names
David Winsemius
dwinsemius at comcast.net
Fri Feb 27 18:41:49 CET 2009
So you want the data from Dataset 1, restricted to the columns whose
names also appear in Dataset 2?
How about:
subset(DS1, select = names(DS1) %in% names(DS2) )
> DS1 <- read.table(textConnection("Individual SNP1 SNP2 SNP3 SNP4 SNP5
+ 1 A G T C A
+ 2 T C A G T
+ 3 A C T C A"),header=TRUE)
> DS2 <- read.table(textConnection("Individual SNP1 SNP3 SNP5 SNP6 SNP7
+ 4 A T T G C
+ 5 T A A G G
+ 6 A A T C G"),header=TRUE)
> subset(DS1, select= names(DS1) %in% names(DS2) )
Individual SNP1 SNP3 SNP5
1 1 A T A
2 2 T A T
3 3 A T A
Tested!
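
For what it's worth, here is a rough sketch of an equivalent way to get the same result with intersect() and ordinary bracket indexing. This is only an alternative under my own assumptions, not something the question requires: the names "mask", "common" and "DS3" are made up for illustration, and the text= argument to read.table() assumes a reasonably recent R (textConnection() as in the transcript works as well).

```r
# Toy data from the question, read from inline text.
DS1 <- read.table(text = "Individual SNP1 SNP2 SNP3 SNP4 SNP5
1 A G T C A
2 T C A G T
3 A C T C A", header = TRUE)

DS2 <- read.table(text = "Individual SNP1 SNP3 SNP5 SNP6 SNP7
4 A T T G C
5 T A A G G
6 A A T C G", header = TRUE)

# The logical vector that subset() receives in the one-liner:
# one TRUE/FALSE per column of DS1.
mask <- names(DS1) %in% names(DS2)
# mask is TRUE TRUE FALSE TRUE FALSE TRUE

# Alternative: compute the shared names once, then index by name.
# intersect() returns the common names in the order they occur in DS1.
common <- intersect(names(DS1), names(DS2))

# drop = FALSE keeps the result a data frame even if only one column matches.
DS3 <- DS1[, common, drop = FALSE]

names(DS3)
# "Individual" "SNP1" "SNP3" "SNP5"
```

Indexing by name instead of by logical vector may be handy if you later want DS2 cut down to the same columns, since DS2[, common, drop = FALSE] reuses the same vector.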
--
David Winsemius
Heritage Labs
On Feb 27, 2009, at 12:27 PM, Josh B wrote:
> Hello all,
>
> I hope some of you can come to my rescue, yet again.
>
> I have two genetic datasets, and I want one of the datasets to have
> only the columns that are in common with the other dataset.
> Here is a toy example (my real datasets have hundreds of columns):
>
> Dataset 1:
>
> Individual SNP1 SNP2 SNP3 SNP4 SNP5
> 1 A G T C A
> 2 T C A G T
> 3 A C T C A
>
> Dataset 2:
>
> Individual SNP1 SNP3 SNP5 SNP6 SNP7
> 4 A T T G C
> 5 T A A G G
> 6 A A T C G
>
> I want Dataset1 to have only columns that are also represented in
> Dataset 2, i.e., I want to generate a new Dataset 3 that looks like
> this:
>
> Individual SNP1 SNP3 SNP5
> 1 A T A
> 2 T A T
> 3 A T A
>
> Does anyone know how I could do this? Keep in mind that this is not
> a simple merge, as in the "merge" function.
>
> Thanks very much for your help everyone.
> Josh B.