[BioC] GEOquery

Sean Davis seandavi at gmail.com
Thu Apr 1 16:07:51 CEST 2010


On Thu, Apr 1, 2010 at 9:51 AM, Zhu, Julie <Julie.Zhu at umassmed.edu> wrote:
> Hi Sean,
>
> Thanks for the quick response!
>
> Looks like the expressionSet has been log2 transformed and summarized. Is
> that correct? Did you use the rma method to summarize the probe-level data?
> I am wondering whether the expression values in the expressionSet in
> gds[[1]] are also background corrected and normalized. If yes, what methods
> have been applied? Thanks!

The data are downloaded from GEO.  No numerical processing is applied.
GEO simply stores the data as supplied by the submitter, and GEOquery
just parses these data into Bioconductor objects.  There are no .CEL
files involved at all.
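
To find out what the submitter did, one option is to look at the values
themselves and at the sample annotation that comes with the series
matrix.  A minimal sketch, assuming the submitter filled in the data
processing field and that it shows up as a pData column named
"data_processing" (check colnames(pData(eset)) if it does not):

```r
library(GEOquery)
library(Biobase)

# getGEO() on a GSE returns a list of ExpressionSets,
# one per series matrix file
gse <- getGEO("GSE6547", GSEMatrix = TRUE)
eset <- gse[[1]]

# Range of the values exactly as submitted; a narrow range (roughly
# 0 to 16) hints at a log2 scale, but this is only a heuristic
summary(as.vector(exprs(eset)))

# Submitter-supplied description of the processing, if present.
# The column name "data_processing" is an assumption about this record.
if ("data_processing" %in% colnames(pData(eset))) {
  print(unique(as.character(pData(eset)$data_processing)))
}
```

Whatever that field says is the submitter's own description, so it is
only as detailed as they chose to make it.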

> With CEL files at hand, I usually do the following,
>
> require(affy)
> Data = ReadAffy(celfile.path="thepath")
> eset<-rma(Data)
> library("simpleaffy")
> Data.qc <- qc(Data)
> avbg(Data.qc)
> percent.present(Data.qc)
> ......
>
> I am wondering what would be the equivalent steps to apply to the
> expressionSet obtained from getGEO method? Thanks so much for your help!

You would need to start with the supplemental files:

 getGEOSuppFiles("GSE6547")

This will create a directory in your working directory called
"GSE6547".  Into that directory, all the available supplemental files
for that GSE record will be downloaded.  In this case, there happens
to be a .tar file with .CEL files in it.  You can use those .CEL files
in the usual way and take the phenoData from a call to getGEO() to add
annotation.
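
Put together, the steps above might look something like the sketch
below.  The "GSE6547/CEL" subdirectory is just a name I picked, and the
code assumes each CEL file name starts with its GSM accession, which is
typical for GEO but worth checking with sampleNames(Data):

```r
library(GEOquery)
library(affy)
library(Biobase)

# Download all supplemental files into ./GSE6547
getGEOSuppFiles("GSE6547")

# Unpack every .tar archive found there (no archive name is assumed)
tarfiles <- list.files("GSE6547", pattern = "\\.tar$", full.names = TRUE)
for (f in tarfiles) untar(f, exdir = "GSE6547/CEL")

# Read the CEL files and summarize with RMA as usual.  Compressed
# .CEL.gz files are usually read directly; gunzip them first if not.
Data <- ReadAffy(celfile.path = "GSE6547/CEL")
eset <- rma(Data)

# Borrow the sample annotation from the series matrix
pd <- pData(getGEO("GSE6547")[[1]])

# Match on GSM accession (assumes file names begin with the GSM id)
gsm <- sub("^(GSM[0-9]+).*$", "\\1", sampleNames(eset))
stopifnot(all(gsm %in% rownames(pd)))

# Rename the samples to their GSM ids, then attach the annotation
sampleNames(eset) <- gsm
pData(eset) <- pd[gsm, , drop = FALSE]
```

The simpleaffy qc() calls can then be run on Data, since it is an
ordinary AffyBatch at that point.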

Hope that helps.

Sean

> Best regards,
>
> Julie
>
>
> On 3/31/10 6:38 PM, "Sean Davis" <seandavi at gmail.com> wrote:
>
> On Wed, Mar 31, 2010 at 6:07 PM, Zhu, Julie <Julie.Zhu at umassmed.edu> wrote:
>> Hi,
>>
>> First of all I would like to thank the developers for developing such a
>> useful package!
>>
>> I downloaded a dataset successfully using the GEOquery package, as follows.
>> However, I could not convert it to an eSet or get metadata out of it. Could
>> you please let me know what I did wrong and how to proceed? Thanks so much
>> for your help!
>>
>> Best regards,
>>
>> Julie
>>
>>  > gds <- getGEO('GSE6547')
>
> Hi, Julie.  Thanks for the kind words.
>
> gds is a list:
>
> class(gds)
> names(gds)
>
> gds[[1]]
>
> You asked for a GSE record, and GSEMatrix=TRUE is the default, so getGEO()
> returns a list of ExpressionSets; see the help for getGEO().  The list
> is due to some oddities of the GSE Matrix format that limit each GSE
> Matrix file to 255 samples.
>
> Hope that answers your question.
>
> Sean
>
>> Found 1 file(s)
>> GSE6547_series_matrix.txt.gz
>> trying URL
>> 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE6547/GSE6547_series_matrix.txt.gz'
>> ftp data connection made, file length 1739964 bytes
>> opened URL
>> ==================================================
>> downloaded 1.7 Mb
>>
>> File stored at:
>> /tmp/RtmpaOAhXo/GPL200.soft
>>
>>>eset <- GDS2eSet(gds, do.log2 = TRUE)
>> Error in function (classes, fdef, mtable)  :
>>  unable to find an inherited method for function "Meta", for signature
>> "list"
>>
>>> Meta(gds)
>> Error in function (classes, fdef, mtable)  :
>>  unable to find an inherited method for function "Meta", for signature
>> "list"
>>> Table(gds)
>> Error in function (classes, fdef, mtable)  :
>>  unable to find an inherited method for function "Table", for signature
>> "list"
>>> Columns(gds)
>> Error in function (classes, fdef, mtable)  :
>>  unable to find an inherited method for function "Columns", for signature
>> "list"
>>
>
>
>
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-apple-darwin8.11.1
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>  [1] GEOmetadb_1.6.0                     GEOquery_2.11.3
>  [3] RCurl_1.3-1                         bitops_1.0-4.1
>  [5] ChIPpeakAnno_1.2.13                 limma_3.2.3
>  [7] org.Hs.eg.db_2.3.6                  GO.db_2.3.5
>  [9] RSQLite_0.8-2                       DBI_0.2-5
> [11] AnnotationDbi_1.8.2                 BSgenome.Ecoli.NCBI.20080805_1.3.16
> [13] BSgenome_1.14.2                     Biostrings_2.14.12
> [15] IRanges_1.4.16                      multtest_2.2.0
> [17] Biobase_2.6.1                       biomaRt_2.2.0
>
>


