[BioC] Error in lm.fit(design, t(M)) from GEO dataset

James F. Reid james.reid at ifom-ieo-campus.it
Fri May 28 11:25:23 CEST 2010


Hi Jeremiah,

my guess is that you are taking logs of negative values at some stage in 
your pre-processing.

James.

On 05/28/2010 03:39 AM, Jeremiah H. Savage wrote:
> Hello,
>
> I am using limma 3.4.0, and received an error with a GEO series (
> http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12113 ) that
> doesn't occur with other series data:
>
> Error in lm.fit(design, t(M)) :
>    NA/NaN/Inf in foreign function call (arg 4)
>
> If I do a summary on the eset, I get some -Inf values. Does this mean
> there is some invalid data in the series that lmFit doesn't handle?
>> summary(exprs(eset12113))
>     GSM305488        GSM305489        GSM305490        GSM305491
>   Min.   :-2.322   Min.   :-3.322   Min.   :-3.322   Min.   :  -Inf
>   1st Qu.: 3.263   1st Qu.: 2.982   1st Qu.: 4.585   1st Qu.: 3.868
>   Median : 5.202   Median : 4.916   Median : 6.200   Median : 5.638
>   Mean   : 5.140   Mean   : 4.934   Mean   : 6.054   Mean   :  -Inf
>   3rd Qu.: 6.785   3rd Qu.: 6.637   3rd Qu.: 7.485   3rd Qu.: 7.323
>   Max.   :15.620   Max.   :15.180   Max.   :14.749   Max.   :14.290
>
>
> The code I'm using is:
>> source("sampleEset.R")
>>   fit<- getLMFit()
>
>
> sampleEset.R :
> -------------------8<-----------------------------
>
> getLMFit<- function()
>    {
>      eset12113<- getEset("12113")
>
>      design12113<- model.matrix(~ 0+factor(c(1,1,2,3,3,4,5,5,6,7,7,8,9,9,10)))
>      colnames(design12113)<-
> c("untreated1","treated1","untreated2","treated2","untreated3","treated3","untreated4","treated4","untreated5","treated5")
>      contrast.matrix<- makeContrasts(untreated1-treated1,levels=design12113)
>      fit<- lmFit(eset12113,design12113)
> }
>
> getEset<- function(value)
>    {
>      DATADIR = "/home/jeremiah/R/i486-pc-linux-gnu-library/2.11/GEOquery/extdata"
>      data_file = paste("GSE",value,".soft",sep="")
>      data_dir_file= paste(DATADIR,data_file,sep="/")
>
>      library(GEOquery)
>      library(Biobase)
>      library(limma)
>
>      if (file.exists(data_dir_file))
>        {
>          GSE_parsed<- getGEO(filename=data_dir_file)
>        }
>      else
>        {
>          GSE_value<- paste("GSE",value,sep="")
>          GSE_file<- getGEOfile(GSE_value,destdir=DATADIR)
>          GSE_parsed<- getGEO(filename=data_dir_file)
>        }
>      GSM_platform<- lapply(GSMList(GSE_parsed), function(x) {
>        Meta(x)$platform
>      })
>      GSM_platform<- GSM_platform[[1]]
>
>      probe_set<- Table(GPLList(GSE_parsed)[[1]])$ID
>
>      data.matrix<- do.call("cbind", lapply(GSMList(GSE_parsed), function(x) {
>        tab<- Table(x)
>        mymatch<- match(probe_set, tab$ID_REF)
>        return(tab$VALUE[mymatch])
>      }))
>
>      data.matrix<- apply(data.matrix, 2, function(x) {
>        as.numeric(as.character(x))
>      })
>
>      dataLog2.matrix<- log2(data.matrix)
>
>      rownames(dataLog2.matrix)<- probe_set
>      colnames(dataLog2.matrix)<- names(GSMList(GSE_parsed))
>      p_data<- data.frame(samples = names(GSMList(GSE_parsed)))
>      rownames(p_data)<- names(GSMList(GSE_parsed))
>      pheno<- as(p_data, "AnnotatedDataFrame")
>      eset<- new("ExpressionSet", exprs = dataLog2.matrix, phenoData = pheno)
>      return(eset)
>    }
>
> -------------------8<-----------------------------
>
> Thanks,
> Jeremiah
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list