[BioC] GEOquery: getGEO() doesn't work (error "invalid 'nlines' argument")

ecsi at gmx.net ecsi at gmx.net
Tue May 29 17:45:20 CEST 2012

> So when you use system.file() you are specifically telling GEOquery to 
> look for a file that is in your GEOquery library directory, rather 
> than telling GEOquery the actual directory.

Thank you for explaining system.file(); I didn't know that it refers to 
the package's library directory. I thought it was needed in order to 
access the downloaded files, but now I understand what's 

> > mypath <- "C:/Users/bioinf_admin/Desktop/"
> > GSE19711 <- getGEO('GSE19711',destdir=mypath)
> This will result in a list of ExpressionSets

The problem is that I am working with methylation data here, so I have 
to create MethyLumiSets instead of ExpressionSets.

My idea was to create phenodata.txt files using the data I get from 

 > GSE19711 <- getGEO(filename=paste0(mypath, "GSE19711_family.soft.gz"))

Btw, I always get warnings when doing this, but it seems to work anyway:
 > warnings()
Warning messages:
1: In readLines(con, n = chunksize) :
   seek on a gzfile connection returned an internal error

I would then access the information with code like this, for example:

 > Meta(GSMList(GSE19711)[[1]])$characteristics_ch1[3]

[1] "ageatrecruitment: 68"

Then I would extract the relevant substrings and build a data.frame with 
all the information I need (age, sex, treatment, etc.), doing all this in 
an apply function over every GSE or something similar. I would also get 
the data matrices from the soft files and finally create MethyLumiSets 
out of the data matrices and the phenodata.txt files I 
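
A minimal sketch of the extraction step described above, in base R only. It assumes every characteristics entry has the form "key: value", as in "ageatrecruitment: 68". The sample names (GSM_a, GSM_b), the "sample type" key and all values other than "ageatrecruitment: 68" are made up for illustration, and parse_characteristics() is a hypothetical helper, not part of GEOquery.

```r
# Made-up input: one character vector of "key: value" strings per sample,
# in the shape returned by Meta(gsm)$characteristics_ch1.
chars_by_sample <- list(
  GSM_a = c("sample type: control", "ageatrecruitment: 68"),
  GSM_b = c("ageatrecruitment: 54", "sample type: case")
)

# Hypothetical helper: split each string at the first colon and return
# the values as a character vector named by their keys.
parse_characteristics <- function(x) {
  keys   <- trimws(sub(":.*$", "", x))
  values <- trimws(sub("^[^:]*:", "", x))
  setNames(values, keys)
}

parsed <- lapply(chars_by_sample, parse_characteristics)

# Use the union of all keys so every sample gets the same columns;
# a key that is missing for a sample becomes NA.
all_keys <- unique(unlist(lapply(parsed, names)))
pheno <- as.data.frame(
  do.call(rbind, lapply(parsed, function(p) unname(p[all_keys]))),
  stringsAsFactors = FALSE
)
colnames(pheno) <- all_keys

# Values arrive as text; convert the numeric ones explicitly.
pheno$ageatrecruitment <- as.numeric(pheno$ageatrecruitment)
```

With real data, the input list could presumably be built from the calls already shown above, e.g. lapply(GSMList(GSE19711), function(g) Meta(g)$characteristics_ch1), and the result written out with write.table() as a phenodata.txt file.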

Maybe it would be better to first create ExpressionSets and convert them 
into MethyLumiSets somehow, but I would have to manipulate the objects 
anyway, because I can't use the phenodata information as it comes from 
GEO in these cases. I need the phenodata to be the same style for all 
the GEO sets I have to analyze, so in any case I'll have to do the work 
to extract (only) the information I need for the different GEO sets.
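
One possible way to get the same phenodata style for every GEO set is a small per-series lookup that maps each raw key to a standard column name. The sketch below is an assumption about how this could be organised, not an established recipe: the standard columns ("age", "sex"), the second series name and its keys, the sample names and standardise_pheno() are all hypothetical. Only the key "ageatrecruitment" comes from GSE19711.

```r
# Standard columns wanted for every series (hypothetical choice).
standard_cols <- c("age", "sex")

# Hypothetical lookup: for each series, which raw key feeds which column.
key_map <- list(
  GSE19711     = c(age = "ageatrecruitment"),
  other_series = c(age = "age (years)", sex = "gender")
)

# Build a data.frame with exactly the standard columns; columns that a
# series does not provide stay NA.
standardise_pheno <- function(pheno, map, cols = standard_cols) {
  out <- as.data.frame(
    matrix(NA_character_, nrow = nrow(pheno), ncol = length(cols),
           dimnames = list(rownames(pheno), cols)),
    stringsAsFactors = FALSE
  )
  for (col in names(map)) {
    out[[col]] <- as.character(pheno[[map[[col]]]])
  }
  out
}

# Made-up raw phenodata for two samples of GSE19711.
raw <- data.frame(ageatrecruitment = c("68", "54"),
                  row.names = c("GSM_a", "GSM_b"),
                  stringsAsFactors = FALSE)
std <- standardise_pheno(raw, key_map$GSE19711)
```

Keeping the mapping in one list means that adding another GEO set only requires a new entry in key_map, while the extraction code stays the same.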

But I'm still not quite sure about the best way to create the 
MethyLumiSets efficiently ...

