[BioC] AgiMicroRna problems

Mark Cowley m.cowley at garvan.org.au
Fri Oct 1 04:01:41 CEST 2010


Hi,
You're right Corrinne,

According to the Agilent 10.5 Feature Extraction Manual:
Chapter 3 - Text File Parameters and Results, page 113
"You have the option in the Project Properties sheet of selecting to generate either the FULL set of parameters, statistics and feature information, or a COMPACT output package (default). The COMPACT output package contains only those columns that are required by GeneSpring and DNA Analytics software."

Since COMPACT is the default setting and there have been a number of similar error reports on BioC, can I suggest that the AgiMicroRna code either checks for this condition and gives an informative error or, better still, is there a more permanent solution that works with COMPACT TXT files?
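
For what it's worth, here is a minimal sketch of the sort of up-front check I have in mind. It is not AgiMicroRna code: checkAgilentColumns is a hypothetical helper, and it assumes that the column headings sit on the first line of the FE TXT file that starts with FEATURES. The small file written below is made up purely to exercise the function.

```r
# Hypothetical helper, not part of AgiMicroRna: report which expected
# columns are absent from an Agilent Feature Extraction TXT file.
checkAgilentColumns <- function(file,
                                required = c("gMeanSignal", "gBGUsed", "chr_coord"),
                                max.lines = 100) {
  # Assumption: the feature table header is the first line whose first
  # tab-separated field is "FEATURES".
  lines <- readLines(file, n = max.lines, warn = FALSE)
  hdr <- grep("^FEATURES\t", lines, value = TRUE)
  if (length(hdr) == 0) {
    stop("no FEATURES header line found in the first ", max.lines,
         " lines of ", file)
  }
  present <- strsplit(hdr[1], "\t", fixed = TRUE)[[1]]
  # Return the required column names that the header does not contain
  setdiff(required, present)
}

# Made-up example file (not real data) mimicking a COMPACT-style header
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "TYPE\ttext\ttext",
  "FEPARAMS\tProtocol_Name\tScan_ScannerName",
  "DATA\texample_protocol\texample_scanner",
  "*",
  paste("FEATURES", "FeatureNum", "Row", "Col", "ControlType", "ProbeName",
        "GeneName", "gTotalProbeSignal", "gTotalGeneSignal",
        "gProcessedSignal", "gBGMedianSignal", sep = "\t"),
  paste("DATA", "1", "1", "1", "1", "exampleProbe", "exampleGene",
        "100.5", "100.5", "98.2", "20", sep = "\t")
), tmp)

absent <- checkAgilentColumns(tmp)
if (length(absent) > 0) {
  message("Missing columns (COMPACT output?): ",
          paste(absent, collapse = ", "))
}
```

Calling something like this at the top of readMicroRnaAFE would at least replace the "Specified column headings not found in file" error with a message that names the missing columns.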

I'm happy to provide some data files for testing purposes.

In the meantime, I'll work with the suggestions that have been made.

cheers,
Mark

On 01/10/2010, at 12:46 AM, Mark Cowley wrote:

> Hi,
> I wonder if our core facility even knows about that option!
> Thanks for bringing this up
> 
> cheers,
> mark
> 
> On 30/09/2010, at 11:57 PM, Segal, Corrinne wrote:
> 
>> Hi,
>> 
>> If the data is extracted with the FE report set to 'Full' rather than 'Compact', then it reports the gMeanSignal and gBGUsed (but not chr_coord).  You can then follow the package using the amendments Pedro posted to get around not having the chr_coord column.
>> 
>> Cheers,
>> 
>> Corrinne
>> 
>> -----Original Message-----
>> From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of David Ruau
>> Sent: 29 September 2010 19:02
>> To: Mark Cowley
>> Cc: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] AgiMicroRna problems
>> 
>> Hi Mark,
>> 
>> I just received a data set of Human miRNA V3 and AgiMicroRna does not work.
>> Basically the columns gMeanSignal, gBGUsed and chr_coord are not present.
>> I modified readMicroRnaAFE according to Pedro's post (see the code after this text).
>> 
>> My question is why those columns are not present in the txt file.
>> In the folder I received from our facility there is an XML file containing the settings of the FeatureExtraction software. One flag is TextOutPkgType="Compact"
>> Is there a way to test whether this option can be changed, and what the effect is on the txt output?
>> 
>> readMicroRnaAFE <- function (targets, verbose = FALSE)
>> {
>>   if (!is(targets, "data.frame")) {
>>       stop("'targets' must be a data.frame")
>>   }
>>   ddaux=read.maimages(files=targets$FileName,source="agilent",
>>     other.columns=list(IsGeneDetected="gIsGeneDetected",
>>     IsSaturated ="gIsSaturated", IsFeatNonUnifOF ="gIsFeatNonUnifOL",
>>     IsFeatPopnOL ="gIsFeatPopnOL", BGKmd ="gBGMedianSignal"),
>>     columns=list(R="gTotalGeneSignal", G="gTotalProbeSignal",
>>     Rb="gTotalGeneSignal", Gb="gProcessedSignal"),
>>     verbose=TRUE,sep="\t",quote=""
>>   )
>>   #return(ddaux)
>>   dd = new("RGList")
>>   dd$R = ddaux$R
>>   dd$G = ddaux$G
>>   dd$Rb = ddaux$Rb
>>   dd$Gb = ddaux$Gb
>>   dd$targets = ddaux$targets
>>   ## drop column 6, which I guess should have contained chr_pos
>>   dd$genes = ddaux$genes[, c(4, 5)]
>>   dd$other = ddaux$other
>>   rm(ddaux)
>>   if (verbose) {
>>       cat("", "\n")
>>       cat("  RGList:", "\n")
>>       cat("\tdd$R:\t\t'gTotalGeneSignal' ", "\n")
>>       cat("\tdd$G:\t\t'gTotalProbeSignal' ", "\n")
>>       cat("\tdd$Rb:\t\t'gTotalGeneSignal' ", "\n")
>>       cat("\tdd$Gb:\t\t'gProcessedSignal' ", "\n")
>>       cat("", "\n")
>>   }
>>   return(dd)
>> }
>> 
>> 
>> David
>> 
>> On Sep 28, 2010, at 11:52 PM, Mark Cowley wrote:
>> 
>>> Has anyone had success using AgiMicroRna recently? What array types were you using?
>>> cheers,
>>> Mark
>>> 
>>> On 21/09/2010, at 9:44 PM, Mark Cowley wrote:
>>> 
>>>> Dear Pedro and BioCers,
>>>> Similar to these 2 posts, I'm having problems running AgiMicroRna,
>>>> because my Agilent TXT files are missing these three columns:
>>>> gMeanSignal, gBGUsed, chr_coord.
>>>> https://www.stat.math.ethz.ch/pipermail/bioconductor/2010-August/035136.html
>>>> http://comments.gmane.org/gmane.science.biology.informatics.conductor/28101
>>>> 
>>>> Here was my first attempt
>>>>> library("AgiMicroRna")
>>>>> targets.micro=readTargets(infile="/Volumes/****/projects/****/targets.txt")  # (sorry - paranoid collaborator)
>>>>> dd.micro=readMicroRnaAFE(targets.micro,verbose=TRUE)
>>>> Error in readGenericHeader(fullname, columns = columns, sep = sep) :
>>>> Specified column headings not found in file
>>>> 
>>>> I then tried to recreate my own readMicroRnaAFE which constructed
>>>> dummy chr_coord, BGKus objects, but then I wasn't able to run the
>>>> cvArray function:
>>>>> library("AgiMicroRna")
>>>>> targets.micro=readTargets(infile="/Volumes/external/projects/LW/targets.txt")
>>>>> dd.micro=readMicroRnaAFE(targets.micro,verbose=TRUE)
>>>> # QC plots ran OK
>>>>> cvArray(dd.micro,"MeanSignal",targets.micro,verbose=TRUE)
>>>> Foreground: MeanSignal
>>>> 
>>>> 	FILTERING BY ControlType FLAG
>>>> 
>>>> RAW DATA: 			 15739
>>>> Error in object$other[[k]][i, , drop = FALSE] :
>>>> incorrect number of dimensions
>>>>> cvArray(dd.micro,"ProcessedSignal",targets.micro,verbose=TRUE)
>>>> Foreground: ProcessedSignal
>>>> 
>>>> 	FILTERING BY ControlType FLAG
>>>> 
>>>> RAW DATA: 			 15739
>>>> Error in object$other[[k]][i, , drop = FALSE] :
>>>> incorrect number of dimensions
>>>> 
>>>> I gave up on this approach, and instead I followed Pedro's advice in
>>>> the first URL that I mentioned, and used gTotalSignal instead of
>>>> gMeanSignal, and removed instances of chr_coord and gBGUsed, but then
>>>> I can't get TGS, or RMA normalization to work
>>>> 
>>>>> library("AgiMicroRna")
>>>>> targets.micro=readTargets(infile="/Volumes/****/projects/****/targets.txt")  # (sorry - paranoid collaborator)
>>>>> ddaux=read.maimages(files=targets.micro$FileName,source="agilent",
>>>> +     other.columns=list(IsGeneDetected="gIsGeneDetected",
>>>> +                        IsSaturated="gIsSaturated",
>>>> +                        IsFeatNonUnifOF="gIsFeatNonUnifOL",
>>>> +                        IsFeatPopnOL="gIsFeatPopnOL",
>>>> +                        BGKmd="gBGMedianSignal"),
>>>> +     columns=list(Rf="gTotalGeneSignal",
>>>> +                  Gf="gTotalProbeSignal",
>>>> +                  Rb="gTotalGeneSignal",
>>>> +                  Gb="gProcessedSignal"),
>>>> +     verbose=TRUE,sep="\t",quote="")
>>>>> ddNORM = tgsNormalization(ddTGS, "quantile", makePLOTpre = T,
>>>> makePLOTpost = T, targets.micro, verbose = TRUE)
>>>> Error in density.default(object[, n], na.rm = TRUE) :
>>>> need at least 2 points to select a bandwidth automatically
>>>>> 
>>>>> ddNORM = tgsNormalization(ddTGS, "quantile", makePLOTpre = F,
>>>> makePLOTpost = F, targets.micro, verbose = TRUE)
>>>> Error in xy.coords(x, y) : 'x' and 'y' lengths differ
>>>>> 
>>>>> 
>>>>> ddTGS.rma = rmaMicroRna(ddaux, normalize = TRUE, background = TRUE)
>>>> Error in split.default(0:(length(pNList) - 1), pNList) :
>>>> Group length is 0 but data length > 0
>>>> # this takes quite a few minutes to process, then gives this error
>>>> 
>>>> I've seen quite a bit of Agilent microRNA data through our centre, and
>>>> can't recall ever seeing a chr_coord column, so is this to do with
>>>> different versions of Agilent Feature Extraction, or different
>>>> defaults set by the array facility?
>>>> 
>>>> I'd really like to RMA normalize these data, so any help would be
>>>> really appreciated
>>>> 
>>>> cheers,
>>>> Mark
>>>> 
>>>> 
>>>> sessionInfo()
>>>> R version 2.11.1 (2010-05-31)
>>>> i386-apple-darwin9.8.0
>>>> 
>>>> locale:
>>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>>> 
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>> 
>>>> other attached packages:
>>>> [1] AgiMicroRna_1.2.0     preprocessCore_1.10.0 affy_1.26.1
>>>> limma_3.4.3           Biobase_2.8.0
>>>> 
>>>> loaded via a namespace (and not attached):
>>>> [1] affyio_1.16.0 tools_2.11.1
>>>>> 
>>>> 
>>>> 
>>>> -----------------------------------------------------
>>>> Mark Cowley, PhD
>>>> 
>>>> Peter Wills Bioinformatics Centre
>>>> Garvan Institute of Medical Research, Sydney, Australia
>>>> -----------------------------------------------------
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> 


