[BioC] a problem in reading in cel files
James F. Reid
james.reid at ifom-ieo-campus.it
Fri Feb 10 14:44:32 CET 2012
Hi Manuela,
it looks like GSM310016.CEL starts with a blank line before the [CEL]
header; I have no idea why.
Removing this first empty line solves the issue. It may be worth checking
the other CEL files too.
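A quick way to check all the files at once is sketched below. This is only a rough sketch: it assumes the CEL files are in the text format (the one that starts with a [CEL] header), and the helper names has_leading_blank and strip_leading_blank are made up for illustration; they are not part of affy or affyio.

```r
# Hypothetical helper: TRUE if the first line of a text-format CEL file
# is empty or contains only whitespace.
has_leading_blank <- function(path) {
  first <- readLines(path, n = 1L, warn = FALSE)
  length(first) == 1L && grepl("^[[:space:]]*$", first, useBytes = TRUE)
}

# Hypothetical helper: write a copy of the file to 'out' with any leading
# blank lines dropped; every line from the first non-blank one is kept.
strip_leading_blank <- function(path, out) {
  lines <- readLines(path, warn = FALSE)
  blank <- grepl("^[[:space:]]*$", lines, useBytes = TRUE)
  keep <- cumsum(!blank) > 0
  writeLines(lines[keep], out)
  invisible(out)
}

# List the CEL files in the working directory and flag the affected ones.
cel_files <- list.files(pattern = "\\.CEL$", ignore.case = TRUE)
bad <- cel_files[vapply(cel_files, has_leading_blank, logical(1))]
print(bad)
```

For any file listed in bad, something like strip_leading_blank("GSM310016.CEL", "GSM310016_fixed.CEL") writes a cleaned copy; the FileName column in target.txt would then need to point at the new name (or replace the original after backing it up). Note that writeLines uses the platform's line endings, so the copy may differ from the original in that respect as well.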
J.
On 10/02/12 10:40, Manuela Di Russo wrote:
> Dear all,
> I am learning to analyse Affymetrix microarray data but I have a problem reading in .cel files.
> I downloaded the raw data provided as supplementary files (GSE12345_RAW.tar) from GEO, then extracted the CEL files into a directory that I set as my working directory.
> Here is the R code I used:
>
>> setwd("C:/BACKUP/Dati/Progetti/Landi/meta-analisi MPM/GSE12345_RAW")
>> library(affy)
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'browseVignettes()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation("pkgname")'.
>
>> dir()
> [1] "data analysis.txt" "E-GEOD-12345.sdrf.txt" "E-GEOD-12345.sdrf.xls"
> [4] "GSM309986.CEL" "GSM309987.CEL" "GSM309988.CEL"
> [7] "GSM309989.CEL" "GSM309990.CEL" "GSM309991.CEL"
> [10] "GSM310012.CEL" "GSM310013.CEL" "GSM310014.CEL"
> [13] "GSM310015.CEL" "GSM310016.CEL" "GSM310068.CEL"
> [16] "GSM310070.CEL" "target.txt" "target.xls"
>> pd<- read.AnnotatedDataFrame("target.txt",header=TRUE,row.names=1,as.is=TRUE)
>> pData(pd)
>          FileName              Target
> N1  GSM309986.CEL      pleural tissue
> N2  GSM309987.CEL      pleural tissue
> N3  GSM309988.CEL      pleural tissue
> N4  GSM309989.CEL      pleural tissue
> MM1 GSM309990.CEL mesothelioma tissue
> MM2 GSM309991.CEL mesothelioma tissue
> MM3 GSM310012.CEL mesothelioma tissue
> MM4 GSM310013.CEL mesothelioma tissue
> MM5 GSM310014.CEL mesothelioma tissue
> MM6 GSM310015.CEL mesothelioma tissue
> MM7 GSM310016.CEL mesothelioma tissue
> MM8 GSM310068.CEL mesothelioma tissue
> MM9 GSM310070.CEL mesothelioma tissue
>> rawData<- read.affybatch(filenames=pData(pd)$FileName,phenoData=pd)
> Error in try(.Call("ReadHeaderDetailed", filename, PACKAGE = "affyio")) :
> Is GSM310016.CEL really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats.
>
> Error in read.celfile.header(filenames[i], info = "full") :
> Failed to get full header information for GSM310016.CEL
>> rawData1<-ReadAffy()
> Error in try(.Call("ReadHeaderDetailed", filename, PACKAGE = "affyio")) :
> Is C:/BACKUP/Dati/Progetti/Landi/meta-analisi MPM/GSE12345_RAW/GSM310016.CEL really a CEL file? tried reading as text, gzipped text, binary, gzipped binary, command console and gzipped command console formats.
>
> Error in read.celfile.header(filenames[i], info = "full") :
> Failed to get full header information for C:/BACKUP/Dati/Progetti/Landi/meta-analisi MPM/GSE12345_RAW/GSM310016.CEL
>> sessionInfo()
> R version 2.14.1 (2011-12-22)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252
> [3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
> [5] LC_TIME=Italian_Italy.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] affy_1.32.1 Biobase_2.14.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.22.0 BiocInstaller_1.2.1 preprocessCore_1.16.0
> [4] zlibbioc_1.0.0
>> traceback()
> 7: stop("Failed to get full header information for ", filename)
> 6: read.celfile.header(filenames[i], info = "full")
> 5: FUN(1:13[[11L]], ...)
> 4: lapply(X = X, FUN = FUN, ...)
> 3: sapply(seq_len(length(filenames)), function(i) {
> sdate<- read.celfile.header(filenames[i], info = "full")[["ScanDate"]]
> if (is.null(sdate) || length(sdate) == 0)
> NA_character_
> else sdate
> })
> 2: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
> description = l$description, notes = notes, compress = compress,
> rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra = rm.extra,
> verbose = verbose, sd = sd, cdfname = cdfname)
> 1: ReadAffy()
>
> Maybe there is a problem reading the CEL file header, so I opened one of the CEL files in a text editor, but it seems correct.
> Can anyone help me?
> Thank you very much!
> Manuela
>
> ----------------------------------------------------------------------------------------
> Manuela Di Russo, Ph.D. Student
> Department of Experimental Pathology, MBIE
> University of Pisa
> Pisa, Italy
> e-mail: manuela.dirusso at for.unipi.it
> tel: +39050993538