[BioC] Problems selecting rows from dataframe (exprs) of GNF Atlas data....

Bas Jansen bjhjansen at gmail.com
Tue Jan 3 10:46:19 CET 2012


Dear fellow Bioconductor users:

Happy New Year!
At the moment I am analyzing the GNF Atlas data. I retrieved the data
from the Gene Expression Omnibus using the package GEOquery, converted
it to an ExpressionSet and extracted the expression values. So now I
have a data frame from which I would like to extract the expression
values of more than 100 probe IDs for 79 tissues. The thing is, if I
use a single probe ID, things go fine. However, whenever I use a
vector of several probe IDs, things go awry.

See below:

***
> exprs[c("gnf1h00499_at"),]
              GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
(abbreviated for reasons of clarity)
***

As stated above, whenever I use a vector of several probe IDs (say,
two probe IDs), things go awry:

***
> exprs[c("gnf1h00499_at","gnf1h500_at"),]
              GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
gnf1h00499_at 5.770829 7.708739 5.161888 7.459432 6.332708 6.902074 4.472488
NA                  NA       NA       NA       NA       NA       NA       NA
etc.
***

The gnf1h00500 probe is reported as NA, and I'm pretty sure it has
real expression values associated with it.

Selecting rows by numeric index, on the other hand, works fine:

***
> exprs[c(1:20,30:70),]
            GSM18768 GSM18769 GSM18756 GSM18757 GSM18780 GSM18781 GSM18774
200000_s_at        0        0        0        0        0        0        0
200001_at          0        0        0        0        0        0        0
200002_at          0        0        0        0        0        0        0
200003_s_at        0        0        0        0        0        0        0
etc.
***

So, how do I select rows on the basis of probe IDs? Or better yet:
what am I overlooking?
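
One check that may help narrow this down, sketched on a toy data frame
rather than the real GNF Atlas data: indexing a data frame by a row name
that does not exist returns a row of NAs instead of an error, so comparing
the requested IDs against rownames() shows which ones are not matching.
The names expr_df and wanted and all values below are hypothetical
stand-ins, and it is only an assumption that a non-matching ID is what
produces the NA row above.

```r
# Toy stand-in for the real expression data frame (values are made up).
expr_df <- data.frame(
  GSM_A = c(5.77, 6.10, 0),
  GSM_B = c(7.71, 6.45, 0),
  row.names = c("gnf1h00499_at", "gnf1h00500_at", "200000_s_at")
)

# Requested probe IDs; the second one deliberately does not match any row name.
wanted <- c("gnf1h00499_at", "gnf1h500_at")

# Indexing a data frame by an unknown row name yields a row of NAs, not an error.
res <- expr_df[wanted, ]

# List the requested IDs that are absent from the row names.
missing_ids <- setdiff(wanted, rownames(expr_df))

# Select only the IDs that are actually present; drop = FALSE keeps a data frame.
present <- expr_df[intersect(wanted, rownames(expr_df)), , drop = FALSE]
```

If missing_ids comes back empty on the real data, the IDs themselves match
and the cause lies elsewhere.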

Thanks & kind regards,
Bas


