[BioC] crlmm problem

Matthias Arnold matthias.arnold at helmholtz-muenchen.de
Thu Mar 27 12:28:03 CET 2014


Matt Ritchie <mritchie at ...> writes:

> 
> Dear Agusti,
> 
> This sounds like a problem with importing the idats.  If you can send
> through (offline) some Red and Green idat files for a few samples I can
> take a closer look.
> 
> Best wishes,
> 
> Matt
> 
> ----- Original Message -----
> From: "Aaerp" <al3n70rn <at> gmail.com>
> To: "Benilton Carvalho" <beniltoncarvalho <at> gmail.com>
> Cc: bioconductor <at> r-project.org
> Sent: Saturday, 1 February, 2014 9:39:20 PM
> Subject: Re: [BioC] crlmm problem
> 
> In English will be better...
> 
> Best regards,
> Agusti
> 
> Instantiate CNSet container.
> path arg not set.  Assuming files are in local directory, or that 
> complete path is provided
> Initializing container for genotyping and copy number estimation
> reading /Volumes/My Passport/SNP_LGG/6182351100/R01C01_Grn.idat
> Error in G$RunInfo[1, 1] : subscript out of bounds
> 
> On 01/02/2014 11:27, Aaerp wrote:
> > Hi all,
> >
> > I am trying to analyze some Illumina beadarray SNP data 
> > (humanomniexpress12v1b) using crlmm. Do you have any thoughts on what 
> > might be happening here?
> >
> > Kind regards,
> > Agusti
> >
> > library(crlmm)
> > library(ff)
> >
> > ocProbesets(150e3)
> >
> > datadir <- "/Volumes/My Passport/SNP_LGG/6182351100"
> > samplesheet = read.csv(file.path(datadir, 
> > "SampleSheet_PJ1111109_11S46v2.csv"), header=TRUE, as.is=TRUE,sep="\t")
> >
> > arrayNames <- file.path(datadir, unique(samplesheet[, 
> > "SentrixPosition_A"]))
> >
> > all(file.exists(paste(arrayNames, "_Grn.idat", sep="")))
> > #[1] TRUE
> > all(file.exists(paste(arrayNames, "_Red.idat", sep="")))
> > #[1] TRUE
> > cdfName <- "humanomniexpress12v1b"
> > batch <- rep("1", nrow(samplesheet))
> > arrayInfo <- list(barcode=NULL, position="SentrixPosition_A")
> >
> > cnSet <- genotype.Illumina(sampleSheet=samplesheet,
> > arrayNames=arrayNames,
> > arrayInfoColNames=arrayInfo,
> > cdfName="humanomniexpress12v1b",
> > batch=batch)
> >
> > Instantiate CNSet container.
> > path arg not set.  Assuming files are in local directory, or that 
> > complete path is provided
> > Initializing container for genotyping and copy number estimation
> > Loading required package: humanomniexpress12v1bCrlmm
> > Welcome to humanomniexpress12v1bCrlmm version 1.0.1
> > reading /Volumes/My Passport/SNP_LGG/6182351100/R01C01_Grn.idat
> > Erreur dans G$RunInfo[1, 1] : indice hors limites
> >
> > sessionInfo()
> > R version 3.0.2 (2013-09-25)
> > Platform: x86_64-apple-darwin10.8.0 (64-bit)
> >
> > locale:
> > [1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
> >
> > attached base packages:
> > [1] tools     parallel  stats     graphics  grDevices utils datasets  
> > methods   base
> >
> > other attached packages:
> > [1] ff_2.2-12 bit_1.1-10 humanomniexpress12v1bCrlmm_1.0.1
> > [4] crlmm_1.20.1 preprocessCore_1.22.0 oligoClasses_1.22.0
> > [7] BiocGenerics_0.6.0
> >
> > loaded via a namespace (and not attached):
> >  [1] affyio_1.28.0        Biobase_2.20.1 BiocInstaller_1.10.4 
> > Biostrings_2.28.0    codetools_0.2-8
> >  [6] ellipse_0.3-8        foreach_1.4.1 GenomicRanges_1.12.5 
> > grid_3.0.2           illuminaio_0.2.0
> > [11] IRanges_1.18.4       iterators_1.0.6      lattice_0.20-23 
> > Matrix_1.1-1.1       matrixStats_0.8.12
> > [16] mvtnorm_0.9-9996     R.methodsS3_1.5.2    RcppEigen_0.3.2.0 
> > stats4_3.0.2         VGAM_0.9-3
> > [21] zlibbioc_1.6.0
> >
> >
> >
> >
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor <at> r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 


Hi,

I don't know whether you have already solved this problem.

I ran into the same issue. The error seems to occur when the scan date and
decode date are missing from the idat file or cannot be read from it.

For my analysis, I worked around the problem by overriding the
readIdatFiles function and inserting the following code before the line

"dates$decode[i] = G$RunInfo[1, 1]"

# pad RunInfo with dummy rows when the decode/scan entries are missing
if(dim(G$RunInfo)[1]<2){
    dummy<-matrix(c("1/1/2014 12:00:00 PM",NA,NA,NA,NA),ncol=5)
    G$RunInfo<-rbind(G$RunInfo,dummy)
    G$RunInfo<-rbind(G$RunInfo,dummy)
}

"1/1/2014 12:00:00 PM" is a dummy date to make the function work. If you
know the actual scan and decode dates, you can enter the correct values.
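
To see what the padding does in isolation, here is a minimal sketch on a
mock RunInfo table, using base R only. The helper name pad_run_info and the
column names of the mock matrix are assumptions made for illustration; they
are not taken from crlmm, and the real RunInfo layout may differ.

```r
# Hypothetical helper (not part of crlmm): make sure a RunInfo-like
# matrix has at least two rows, so that [1, 1] and [2, 1] can be indexed.
pad_run_info <- function(run_info, dummy_date = "1/1/2014 12:00:00 PM") {
  if (nrow(run_info) < 2) {
    filler <- matrix(c(dummy_date, NA, NA, NA, NA), ncol = 5)
    # Same effect as the two rbind() calls in the workaround above
    run_info <- rbind(run_info, filler, filler)
  }
  run_info
}

# Mock of an idat RunInfo table with no entries (column names are assumed)
empty_info <- matrix(character(0), nrow = 0, ncol = 5,
                     dimnames = list(NULL, c("RunTime", "BlockType",
                                             "BlockPars", "BlockCode",
                                             "CodeVersion")))

padded <- pad_run_info(empty_info)

# These two lookups are the ones that fail with "subscript out of bounds"
# when the table is empty
decode_date <- padded[1, 1]
scan_date   <- padded[2, 1]
```

Indexing empty_info[1, 1] directly reproduces the "subscript out of bounds"
error, while the padded copy returns the dummy date for both lookups.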

You can then use the modified readIdatFiles function to obtain an
NChannelSet object with the red/green intensities, and convert these to XY
with the function RGtoXY. For that to work, you'll have to copy all
functions and variables used by RGtoXY to the global environment (use
getAnywhere() to retrieve the code). For the crlmm environment variable,
just create an empty environment with new.env().
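
As a rough illustration of the getAnywhere() step, the sketch below
retrieves a non-exported function and copies it into the global
environment. It uses format.perc from the stats package purely as a
stand-in so that it runs without crlmm installed; the same pattern should
apply to RGtoXY and its helpers, but that is an assumption I have not
checked against a particular crlmm version.

```r
# Look up a function that is not exported from its package.
# format.perc (internal to stats) is only a stand-in for RGtoXY here.
found <- utils::getAnywhere("format.perc")

# 'where' lists the places the name was found, 'objs' holds the objects
print(found$where)
local_copy <- found$objs[[1]]

# Copy it into the global environment under a new (hypothetical) name
assign("my_format_perc", local_copy, envir = globalenv())

# An unmodified copy still carries its original namespace as its
# environment, so it can keep calling other internals of that package.
copy_env <- environmentName(environment(local_copy))

# Empty environment, as suggested for the crlmm environment variable
scratch_env <- new.env()
```

One consequence: only a function whose source was pasted and edited by
hand loses its link to the package namespace, which is why its helpers then
have to be copied over as well.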

After that, you can call the genotype.Illumina function with the additional
argument XY=XY (where XY is the object returned by RGtoXY).

I know this is quite a dirty hack, but it worked for me.

Best,
M


