[BioC] Unable to Generate QC Report for mogene10stv1
Rick Frausto
ricardo.frausto at sydney.edu.au
Tue Jan 11 20:57:45 CET 2011
Thanks for all your help Jim!
On 11/01/11 6:58 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
> Hi Rick,
>
> On 1/10/2011 4:57 PM, Rick Frausto wrote:
>> Hi Jim,
>>
>> You're right...
>>
>>> any(duplicated(unlist(indexProbes(mydata, "both"))))
>> [1] TRUE
>>>
>>
>> Figured it would be something simple, almost always is. Guess since the MM
>> values are only really necessary for calculating a "real" PM value I should
>> generally still be ok with using R Bioconductor packages for downstream
>> analysis of these chips?? For example, using eset<-rma() to normalize my
>> data should still be ok.
>
> Yep. RMA only uses PM values, so this will be fine. You only get into
> trouble when trying to use mas5 based methods.
>
>>
>> By the way, the documentation on the AffyQCReport function regarding
>> signalDist() states that "The first is a boxplot plot of the all pm
>> intensities and the second plot consists of kernel density estimates of
>> these intensities." From this it would seem to a novice like me that it only
>> uses PM values, clearly I'm not correct. I guess these are PM values
>> adjusted for the MM signal.
>
> Nope, they aren't adjusted for MM, they just include the MM values as
> well. Here is a little primer on how to see what is going on.
>
> If you load the affyQCReport package and then type signalDist at the R
> prompt, you will get this:
>
>> signalDist
> function (object)
> {
>     par(mfrow = c(2, 1))
>     ArrayIndex = as.character(1:length(sampleNames(object)))
>     boxplot(object, names = ArrayIndex, ylab = "Log2(Intensity)",
>         xlab = "Array Index")
>     hist(x = object, lt = 1:length(ArrayIndex), col = 1:length(ArrayIndex),
>         which = "both")
>     temppar <- par()
>     legend(((temppar$xaxp[2] - temppar$xaxp[1])/temppar$xaxp[3]) *
>         (temppar$xaxp[3] - 1) + temppar$xaxp[1], temppar$yaxp[2],
>         as.character(ArrayIndex), lt = 1:length(ArrayIndex),
>         col = 1:length(ArrayIndex), cex = 0.5)
> }
> <environment: namespace:affyQCReport>
>
> So you can see that we are calling boxplot() as well as hist() on the
> 'object', which is an AffyBatch. Let's see what boxplot() and hist() do.
>
>> boxplot
> standardGeneric for "boxplot" defined from package "graphics"
>
> function (x, ...)
> standardGeneric("boxplot")
> <environment: 0x184ea378>
> Methods may be defined for arguments: x
> Use showMethods("boxplot") for currently available ones.
>
> So this is an S4 method, and the methods are slightly harder to get to,
> but let's follow the prescription on the last line.
>
>> showMethods(boxplot, class = "AffyBatch", includeDefs = TRUE)
> Function: boxplot (package graphics)
> x="AffyBatch"
> function (x, ...)
> {
>     .local <- function (x, which = "both", range = 0, main, ...)
>     {
>         tmp <- description(x)
>         if (missing(main) && (is(tmp, "MIAME")))
>             main <- tmp@title
>         tmp <- unlist(indexProbes(x, which))
>         tmp <- tmp[seq(1, length(tmp), len = 5000)]
>         boxplot(data.frame(log2(intensity(x)[tmp, ])), main = main,
>             range = range, ...)
>     }
>     .local(x, ...)
> }
>
> Note two things here. I added in class = "AffyBatch", because there may
> be other boxplot methods for other objects, and we really don't care
> about them. Additionally, I included includeDefs = TRUE, which will
> cause the function to be output.
>
> The .local function has a default of which = 'both', and you see that
> argument is used for the call to indexProbes (also note that there is a
> '...' argument to .local, that could be used to pass in a which = "pm"
> in signalDist() to override the default, but it is not, so the help page
> is incorrect). If you look at ?indexProbes, you will see this in the
> methods section:
>
> indexProbes 'signature(object = "AffyBatch", which =
> "character")': returns a list with locations of the probes in
> each probe set. The affyID corresponding to the probe set to
> retrieve can be specified in an optional parameter
> 'genenames'. By default, all the affyIDs are retrieved. The
> names of the elements in the list returned are the affyIDs.
> 'which' can be "pm", "mm", or "both". If "both" then perfect
> match locations are given followed by mismatch locations.
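
A minimal sketch of the override mentioned above, assuming the affy package is attached and `mydata` is the AffyBatch from this thread. The name `signalDistPM` is hypothetical and not part of affyQCReport. It only passes which = "pm" to the AffyBatch methods for boxplot() and hist(), instead of relying on the default of "both". It has not been checked against a mogene10stv1 data set.

```r
# Hypothetical helper (not part of affyQCReport): a PM-only variant of the
# two plots that signalDist() draws. Assumes library(affy) is attached so
# that the AffyBatch methods for boxplot() and hist() are dispatched.
signalDistPM <- function(object) {
  old_par <- par(mfrow = c(2, 1))   # two panels, as in signalDist()
  on.exit(par(old_par))             # restore the previous layout on exit
  boxplot(object, which = "pm", ylab = "Log2(Intensity)")
  hist(object, which = "pm")
  invisible(TRUE)
}

# Usage, with the AffyBatch from this thread:
# signalDistPM(mydata)
```

Restricting both calls to the PM probes should keep indexProbes() from returning the missing MM locations, which is where the duplicated indices come from.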
>
> The warning you get comes from here:
>
> tmp <- unlist(indexProbes(x, which))
> tmp <- tmp[seq(1, length(tmp), len = 5000)]
> boxplot(data.frame(log2(intensity(x)[tmp, ])), main = main,
> range = range, ...)
>
> Which is basically getting a subset of 5000 probes to create the
> boxplot. Since half of your indices from indexProbes() will be NA, a
> bunch of the tmp variable will be NAs as well. We can re-create the
> warning you get below with a little example:
>
>> x <- matrix(rnorm(100), ncol = 10)
>> row.names(x) <- letters[1:10]
>> z <- data.frame(x[c(1,2,3,NA,4,5,NA),])
> Warning message:
> In data.row.names(row.names, rowsi, i) :
> some row.names duplicated: 7 --> row.names NOT used
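
The same toy example can be taken one step further to show where the duplicates come from and that they go away once the NA indices are dropped. This is only a base R sketch; `idx` is a made-up index vector standing in for the output of indexProbes() on a PM-only chip.

```r
# Toy stand-in for the probe intensities: 10 named rows, 10 columns.
set.seed(1)
x <- matrix(rnorm(100), ncol = 10)
row.names(x) <- letters[1:10]

# Made-up index vector with NAs, standing in for the missing MM locations.
idx <- c(1, 2, 3, NA, 4, 5, NA)

# Subsetting by NA gives NA row names, and the second NA duplicates the
# first, so position 7 is flagged (the number reported in the warning).
rn <- row.names(x[idx, ])
dup_pos <- which(duplicated(rn))

# Dropping the NA indices first leaves unique row names, so data.frame()
# can keep them.
idx_ok <- idx[!is.na(idx)]
z <- data.frame(x[idx_ok, ])
```

On a real AffyBatch the equivalent step would be to remove the NA entries from the unlisted probe indices before subsetting intensity().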
>
> Best,
>
> Jim
>
>
>>
>> Thanks for figuring this out for me. Let me know if these and other related
>> questions would be better served as standalone e-mails.
>>
>> Cheers,
>> Rick
>>
>>
>>
>> On 10/01/11 7:04 AM, "James W. MacDonald"<jmacdon at med.umich.edu> wrote:
>>
>>> Hi Rick,
>>>
>>> After all that, the reason is really simple. You are trying to use
>>> affyQCReport on a PM-only chip, which isn't going to work out so well. I
>>> don't have any mogene data around to play with (and don't have the time
>>> to go searching), so I will have to make some educated guesses.
>>>
>>> Internally in signalDist() you are calling boxplot() and hist() on your
>>> AffyBatch. And the default for both functions is to use both PM and MM
>>> probes. I'm betting that
>>>
>>> any(duplicated(unlist(indexProbes(mydata, "both"))))
>>>
>>> returns TRUE, indicating that indexProbes doesn't work correctly on a
>>> PM-only chip, which is fair enough, as it was never designed to do so.
>>>
>>> And plot(qc(mydata)) will never work, as it relies on computing a
>>> Wilcoxon signed-rank test between the PM and MM probes, and since you don't
>>> have MM probes, well you get the picture...
>>>
>>> Best,
>>>
>>> Jim
>>>
>>>
>>>
>>> On 1/7/2011 6:56 PM, Rick Frausto wrote:
>>>> Hi Jim,
>>>>
>>>> Ok, so after doing a bit of reading and re-reading I was eventually able to
>>>> generate each page in a quartz window that the "QCReport" function should
>>>> also generate. I found which ones give me the errors. So, there should be 6
>>>> pages in total. Page 2 gives me the duplication error and page 3 gives me
>>>> the error in evaluating the argument x. The other pages are ok and are
>>>> generated as expected.
>>>>
>>>> In brief, page 2 is supposed to be generated with the "signalDist(mydata)"
>>>> command. Page 3 is supposed to be generated with the "plot(qc(mydata))"
>>>> command.
>>>>
>>>> So, I guess there must be particular requirements for these commands that
>>>> I'm missing. I've included the session below along with traceback() and
>>>> sessionInfo().
>>>>
>>>>
>>>> R version 2.12.0 (2010-10-15)
>>>> Copyright (C) 2010 The R Foundation for Statistical Computing
>>>> ISBN 3-900051-07-0
>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>
>>>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>>>> You are welcome to redistribute it under certain conditions.
>>>> Type 'license()' or 'licence()' for distribution details.
>>>>
>>>> Natural language support but running in an English locale
>>>>
>>>> R is a collaborative project with many contributors.
>>>> Type 'contributors()' for more information and
>>>> 'citation()' on how to cite R or R packages in publications.
>>>>
>>>> Type 'demo()' for some demos, 'help()' for on-line help, or
>>>> 'help.start()' for an HTML browser interface to help.
>>>> Type 'q()' to quit R.
>>>>
>>>> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0]
>>>>
>>>> [Workspace restored from /Users/rickfrausto/.RData]
>>>> [History restored from /Users/rickfrausto/.Rapp.history]
>>>>
>>>>> library(simpleaffy)
>>>> Loading required package: affy
>>>> Loading required package: Biobase
>>>>
>>>> Welcome to Bioconductor
>>>>
>>>> Vignettes contain introductory material. To view, type
>>>> 'openVignette()'. To cite Bioconductor, see
>>>> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>>>>
>>>> Loading required package: genefilter
>>>> Loading required package: gcrma
>>>>
>>>> Attaching package: 'simpleaffy'
>>>>
>>>> The following object(s) are masked _by_ '.GlobalEnv':
>>>>
>>>> getBioC
>>>>
>>>>> library(affy)
>>>>> mydata<- ReadAffy()
>>>>> eset<- rma(mydata)
>>>> Background correcting
>>>> Normalizing
>>>> Calculating Expression
>>>>> library(affycoretools); affystart(plot=T, express="rma")
>>>> Loading required package: GO.db
>>>> Loading required package: AnnotationDbi
>>>> Loading required package: DBI
>>>> Loading required package: KEGG.db
>>>> Background correcting
>>>> Normalizing
>>>> Calculating Expression
>>>> Please give the x-coordinate for a legend.30
>>>> Please give the y-coordinate for a legend.80
>>>> ExpressionSet (storageMode: lockedEnvironment)
>>>> assayData: 34760 features, 35 samples
>>>> element names: exprs
>>>> protocolData
>>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ...
>>>> ZI_ST1KO_HIL6_12hr.CEL (35 total)
>>>> varLabels: ScanDate
>>>> varMetadata: labelDescription
>>>> phenoData
>>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ...
>>>> ZI_ST1KO_HIL6_12hr.CEL (35 total)
>>>> varLabels: sample
>>>> varMetadata: labelDescription
>>>> featureData: none
>>>> experimentData: use 'experimentData(object)'
>>>> Annotation: mogene10stv1
>>>>> write.exprs(eset, file="mydata.txt")
>>>>> x<- data.frame(exprs(eset), exprs(eset_PMA), assayDataElement(eset_PMA,
>>>> "se.exprs")); x<- x[,sort(names(x))]; write.table(x, file="mydata_PMA.xls",
>>>> quote=F, col.names = NA, sep="\t")
>>>> Error in exprs(eset_PMA) :
>>>> error in evaluating the argument 'object' in selecting a method for
>>>> function 'exprs'
>>>>> mypm<- pm(mydata)
>>>>> mymm<- mm(mydata)
>>>>> myaffyids<- probeNames(mydata)
>>>>> result<- data.frame(myaffyids, mypm, mymm)
>>>>> eset; pData(eset)
>>>> ExpressionSet (storageMode: lockedEnvironment)
>>>> assayData: 34760 features, 35 samples
>>>> element names: exprs
>>>> protocolData
>>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ...
>>>> ZI_ST1KO_HIL6_12hr.CEL (35 total)
>>>> varLabels: ScanDate
>>>> varMetadata: labelDescription
>>>> phenoData
>>>> sampleNames: A_WT1_NT_2hr.CEL B_WT1_NT_2hr.CEL ...
>>>> ZI_ST1KO_HIL6_12hr.CEL (35 total)
>>>> varLabels: sample
>>>> varMetadata: labelDescription
>>>> featureData: none
>>>> experimentData: use 'experimentData(object)'
>>>> Annotation: mogene10stv1
>>>> sample
>>>> A_WT1_NT_2hr.CEL 1
>>>> B_WT1_NT_2hr.CEL 2
>>>> C_WT1_NT_12hr.CEL 3
>>>> D_WT1_NT_12hr.CEL 4
>>>> E_WT1_HIL6_2hr.CEL 5
>>>> F_WT1_HIL6_2hr.CEL 6
>>>> G_WT1_HIL6_12hr.CEL 7
>>>> H_WT1_HIL6_12hr.CEL 8
>>>> I_FF_NT_2hr.CEL 9
>>>> J_FF_NT_2hr.CEL 10
>>>> K_FF_NT_12hr.CEL 11
>>>> L_FF_NT_12hr.CEL 12
>>>> M_FF_HIL6_2hr.CEL 13
>>>> N_FF_HIL6_2hr.CEL 14
>>>> O_FF_HIL6_12hr.CEL 15
>>>> P_FF_HIL6_12hr.CEL 16
>>>> Q_WT2_NT_2hr.CEL 17
>>>> R_WT2_NT_2hr.CEL 18
>>>> S_WT2_NT_12hr.CEL 19
>>>> T_WT2_NT_12hr.CEL 20
>>>> U_WT2_HIL6_2hr.CEL 21
>>>> V_WT2_HIL6_2hr.CEL 22
>>>> W_WT2_HIL6_12hr.CEL 23
>>>> X_WT2_HIL6_12hr.CEL 24
>>>> Y_DD_NT_2hr.CEL 25
>>>> Z_DD_NT_2hr.CEL 26
>>>> ZA_DD_NT_12hr.CEL 27
>>>> ZB_DD_NT_12hr.CEL 28
>>>> ZC_DD_HIL6_2hr.CEL 29
>>>> ZD_DD_HIL6_2hr.CEL 30
>>>> ZE_DD_HIL6_12hr.CEL 31
>>>> ZF_DD_HIL6_12hr.CEL 32
>>>> ZG_ST1KO_NT_2hr.CEL 33
>>>> ZH_ST1KO_HIL6_2hr.CEL 34
>>>> ZI_ST1KO_HIL6_12hr.CEL 35
>>>>> data.frame(eset)
>>>> X10338001 X10338003 X10338004 X10338017 X10338025
>>>> A_WT1_NT_2hr.CEL 11.71717 10.183620 9.440631 12.79412 8.823529
>>>> B_WT1_NT_2hr.CEL 11.78778 10.027760 9.489226 12.98544 8.843002
>>>> X10338026 X10338029 X10338035 X10338036 X10338037
>>>> A_WT1_NT_2hr.CEL 13.22585 9.405038 8.853564 9.379031 3.661987
>>>> B_WT1_NT_2hr.CEL 13.29043 9.575309 8.772872 9.513050 3.514885
>>>> X10338041 X10338042 X10338044 X10338047 X10338056
>>>> A_WT1_NT_2hr.CEL 10.94638 10.116516 11.88296 8.872839 3.133222
>>>> B_WT1_NT_2hr.CEL 11.23276 10.134084 12.03381 7.568584 3.088548
>>>> X10338059 X10338060 X10338063 X10338064 X10338065
>>>>
>>>> JIM, I TRUNCATED THIS LIST, BUT THOUGHT IT MIGHT BE USEFUL IN DIAGNOSING
>>>> THE PROBLEMS I'M HAVING. SESSION IS CONTINUED BELOW.
>>>>
>>>>> library(affyQCReport)
>>>> Loading required package: lattice
>>>>> titlePage(mydata)
>>>> [1] TRUE
>>>>> signalDist(mydata)
>>>> Warning message:
>>>> In data.row.names(row.names, rowsi, i) :
>>>> some row.names duplicated:
>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,54,
>>>> 58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,103,
>>>> 104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,147,
>>>> 148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,173,
>>>> 175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,210,
>>>> 219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,252,
>>>> 253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,296,
>>>> 297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,338,
>>>> 339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,382,
>>>> 383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,407,
>>>> 409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,449,
>>>> 450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,495,
>>>> 496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... truncated]
>>>>> plot(qc(mydata))
>>>> Error in plot(qc(mydata)) :
>>>> error in evaluating the argument 'x' in selecting a method for function
>>>> 'plot'
>>>>> borderQC1(mydata)
>>>> [1] TRUE
>>>>> borderQC2(mydata)
>>>> [1] TRUE
>>>>> correlationPlot(mydata)
>>>> [1] TRUE
>>>>> titlePage(mydata)
>>>> [1] TRUE
>>>>> titlePage(mydata)
>>>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) :
>>>> plot.new has not been called yet
>>>>> correlationPlot(mydata)
>>>> [1] TRUE
>>>>> titlePage(mydata)
>>>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) :
>>>> plot.new has not been called yet
>>>> In addition: Warning message:
>>>> Display list redraw incomplete
>>>>> borderQC1(mydata)
>>>> [1] TRUE
>>>>> titlePage(mydata)
>>>> [1] TRUE
>>>>> titlePage(mydata)
>>>> Error in polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05)) :
>>>> plot.new has not been called yet
>>>>> traceback()
>>>> 2: polygon(c(0, 0, 0.9, 0.9, 0), c(0.05, 0.95, 0.95, 0.05, 0.05))
>>>> 1: titlePage(mydata)
>>>>> sessionInfo()
>>>> R version 2.12.0 (2010-10-15)
>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>
>>>> locale:
>>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>> other attached packages:
>>>> [1] affyQCReport_1.28.1 lattice_0.19-13 affycoretools_1.22.0
>>>> [4] KEGG.db_2.4.5 GO.db_2.4.5 RSQLite_0.9-4
>>>> [7] DBI_0.2-5 AnnotationDbi_1.12.0 mogene10stv1cdf_2.7.0
>>>> [10] simpleaffy_2.26.1 gcrma_2.22.0 genefilter_1.32.0
>>>> [13] affy_1.28.0 Biobase_2.10.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] affyio_1.18.0 affyPLM_1.26.0 annaffy_1.22.0
>>>> [4] annotate_1.28.0 biomaRt_2.6.0 Biostrings_2.18.2
>>>> [7] Category_2.16.0 GOstats_2.16.0 graph_1.28.0
>>>> [10] grid_2.12.0 GSEABase_1.12.2 IRanges_1.8.7
>>>> [13] limma_3.6.9 preprocessCore_1.12.0 RBGL_1.26.0
>>>> [16] RColorBrewer_1.0-2 RCurl_1.4-3 splines_2.12.0
>>>> [19] survival_2.36-2 tools_2.12.0 XML_3.2-0
>>>> [22] xtable_1.5-6
>>>>>
>>>>
>>>> On 7/01/11 12:47 PM, "James W. MacDonald"<jmacdon at med.umich.edu> wrote:
>>>>
>>>>> Hi Rick,
>>>>>
>>>>> What happens if you load the simpleaffy package first?
>>>>>
>>>>> Best,
>>>>>
>>>>> Jim
>>>>>
>>>>> On 1/7/2011 2:14 PM, Rick Frausto wrote:
>>>>>> Hi James,
>>>>>>
>>>>>> Below is the information that you requested - traceback() and
>>>>>> sessionInfo().
>>>>>> Doesn't seem like much to me, but perhaps you can help. As you answer
>>>>>> a lot of e-mails, thought I'd remind you that this is in regards to the
>>>>>> "some row.names duplicated" error.
>>>>>>
>>>>>> Hope your holidays were good!
>>>>>>
>>>>>> -Rick
>>>>>>
>>>>>> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0]
>>>>>>
>>>>>> [Workspace restored from /Users/rickfrausto/.RData]
>>>>>> [History restored from /Users/rickfrausto/.Rapp.history]
>>>>>>
>>>>>>> library(affy)
>>>>>> Loading required package: Biobase
>>>>>>
>>>>>> Welcome to Bioconductor
>>>>>>
>>>>>> Vignettes contain introductory material. To view, type
>>>>>> 'openVignette()'. To cite Bioconductor, see
>>>>>> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>>>>>>
>>>>>>> mydata<- ReadAffy()
>>>>>>> eset<- rma(mydata)
>>>>>> Background correcting
>>>>>> Normalizing
>>>>>> Calculating Expression
>>>>>>> write.exprs(eset, file="mydata.txt")
>>>>>>> mypm<- pm(mydata)
>>>>>>> mymm<- mm(mydata)
>>>>>>> myaffyids<- probeNames(mydata)
>>>>>>> result<- data.frame(myaffyids, mypm, mymm)
>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
>>>>>> Loading required package: lattice
>>>>>> Warning message:
>>>>>> In data.row.names(row.names, rowsi, i) :
>>>>>> some row.names duplicated:
>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,54,
>>>>>> 58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,103,
>>>>>> 104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,147,
>>>>>> 148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,173,
>>>>>> 175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,210,
>>>>>> 219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,252,
>>>>>> 253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,296,
>>>>>> 297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,338,
>>>>>> 339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,382,
>>>>>> 383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,407,
>>>>>> 409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,449,
>>>>>> 450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,495,
>>>>>> 496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... truncated]
>>>>>> Error in plot(qc(object)) :
>>>>>> error in evaluating the argument 'x' in selecting a method for
>>>>>> function 'plot'
>>>>>>> traceback()
>>>>>> 2: plot(qc(object))
>>>>>> 1: QCReport(mydata, file = "ExampleQC.pdf")
>>>>>>> sessionInfo()
>>>>>> R version 2.12.0 (2010-10-15)
>>>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>>>
>>>>>> locale:
>>>>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>>
>>>>>> other attached packages:
>>>>>> [1] affyQCReport_1.28.1 lattice_0.19-13 mogene10stv1cdf_2.7.0
>>>>>> [4] affy_1.28.0 Biobase_2.10.0
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>> [1] affyio_1.18.0 affyPLM_1.26.0 annotate_1.28.0
>>>>>> [4] AnnotationDbi_1.12.0 Biostrings_2.18.2 DBI_0.2-5
>>>>>> [7] gcrma_2.22.0 genefilter_1.32.0 grid_2.12.0
>>>>>> [10] IRanges_1.8.7 preprocessCore_1.12.0 RColorBrewer_1.0-2
>>>>>> [13] RSQLite_0.9-4 simpleaffy_2.26.1 splines_2.12.0
>>>>>> [16] survival_2.36-2 tools_2.12.0 xtable_1.5-6
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 20/12/10 6:33 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>>>
>>>>>>> Hi Rick,
>>>>>>>
>>>>>>> On 12/17/2010 9:24 PM, Rick Frausto wrote:
>>>>>>>> Hey Jim,
>>>>>>>>
>>>>>>>> Ok, I will give that a go. The only problem is an ExpressionSet
>>>>>>>> contains all of the necessary information for further analysis (e.g.
>>>>>>>> phenoData, featureData and annotation, etc - including treatment type,
>>>>>>>> cell type, time points, replicates). I am still learning how to include
>>>>>>>> all of these for a complete ExpressionSet. As a starting point I've
>>>>>>>> loaded a txt file containing some of this information (gene abbrev,
>>>>>>>> ontology, probeset ID) which I created using Affymetrix's Expression
>>>>>>>> Console software, without replicate, time point and cell type info.
>>>>>>>> Doing this I've gotten as far as creating a minimal ExpressionSet,
>>>>>>>> which I guess the functions you mention below do just that but with the
>>>>>>>> information contained in the CEL file only.
>>>>>>>>
>>>>>>>> In any case, since as you say, the functions in the online manual
>>>>>>>> create a proper ExpressionSet why would I get the issue of duplication?
>>>>>>>
>>>>>>> Oh yeah, the original question ;-D. Try running QCReport() again, and
>>>>>>> when it errors out run traceback() and send the output. Also include the
>>>>>>> output of sessionInfo().
>>>>>>>
>>>>>>> Jim
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> In regards to the 64-bit discussion. It may have very well made enough
>>>>>>>> of a difference as it did not come up with the memory error the last
>>>>>>>> time I tried it. Going to upgrade to 8GB RAM anyways, can't hurt.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Rick
>>>>>>>>
>>>>>>>>
>>>>>>>> On 17/12/10 7:20 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>>>>>
>>>>>>>>> Hi Rick,
>>>>>>>>>
>>>>>>>>> On 12/16/2010 4:13 PM, Rick Frausto wrote:
>>>>>>>>>> Hi Jim,
>>>>>>>>>>
>>>>>>>>>> How do I run an RMA analysis without a proper ExpressionSet? Honest
>>>>>>>>>> answer, I don't know. I just put in a command line from a manual I
>>>>>>>>>> found online and it spit out some result - see #3 Affy packages in
>>>>>>>>>> the following link
>>>>>>>>>> ( http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#biocon_intro ).
>>>>>>>>>
>>>>>>>>> You are mistaken. All of the functions mentioned there result in a
>>>>>>>>> proper ExpressionSet. And if you just do
>>>>>>>>>
>>>>>>>>> abatch<- ReadAffy()
>>>>>>>>> eset<- rma(abatch)
>>>>>>>>>
>>>>>>>>> Then you will 100% surely get an ExpressionSet.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Perhaps you don't need an ExpressionSet until after the
>>>>>>>>>> preprocessing, at least that is what I get from the "An Introduction
>>>>>>>>>> to Bioconductor's ExpressionSet Class" written by Seth Falcon, Martin
>>>>>>>>>> Morgan and Robert Gentleman. Everything seemed to be going smoothly
>>>>>>>>>> until I tried to get a QC Report.
>>>>>>>>>>
>>>>>>>>>> Now, the answer for why I would want to do such a thing is easy.
>>>>>>>>>> Simply that I don't know any better :) Just started working with R a
>>>>>>>>>> few days ago, but I'm learning.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Apparently Snow Leopard running on 32bit can only utilize about 3.2GB
>>>>>>>>>> of RAM, whereas 64bit can make use of all 4GB. I'll switch to the 64
>>>>>>>>>> bit OS and see if it makes a difference.
>>>>>>>>>
>>>>>>>>> Well, it won't be much different. The reason a 32-bit OS can only use
>>>>>>>>> about 3.2 Gb of RAM is that the OS needs some to run. The 64-bit OS
>>>>>>>>> also needs to use some RAM, so you won't get all 4 Gb there either.
>>>>>>>>> The issue is how much RAM can be allocated to a single process, and
>>>>>>>>> on a 64-bit OS that gets bumped up significantly.
>>>>>>>>>
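
As a rough illustration of the numbers in this thread, the arithmetic below converts the failed mmap() size from the original error (458665984 bytes) into the units R reports, and works out the address-space ceiling of a 32-bit process. It is only a back-of-the-envelope sketch; the variable names are made up for the example.

```r
# Size of the failed allocation, taken from the malloc message in this thread.
failed_bytes <- 458665984
failed_mb <- failed_bytes / 1024^2    # about 437.4, in line with R's error

# Upper bound on what a 32-bit process can address, in GiB.
addr_32bit_gib <- 2^32 / 1024^3

# Rough count of doubles (8 bytes each) that fit in a block of that size.
n_doubles <- failed_bytes / 8
```

R needs each vector as one contiguous block, so a single request of this size can fail even when the total free memory looks sufficient.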
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Jim
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks for your insight!
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Rick
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 16/12/10 11:31 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Rick,
>>>>>>>>>>>
>>>>>>>>>>> On 12/16/2010 12:57 PM, Rick Frausto wrote:
>>>>>>>>>>>> Thanks Jim! How much memory would I need? I currently have 4GB, but
>>>>>>>>>>>> have quite a few other programs running in the background... I'll
>>>>>>>>>>>> see if closing them helps. Perhaps setting up an "ExpressionSet"
>>>>>>>>>>>> would solve the problem. I just started reading up on how to set
>>>>>>>>>>>> one of these up yesterday. Will do this and see if the duplicates
>>>>>>>>>>>> will go away.
>>>>>>>>>>>>
>>>>>>>>>>>> The "mydata" originates from CEL files and then I run the RMA
>>>>>>>>>>>> analysis on it, but I didn't actually set up a proper ExpressionSet.
>>>>>>>>>>>> I'm guessing that doing this might reduce the QCReport PDF file size
>>>>>>>>>>>> quite considerably since I won't have any duplication and will make
>>>>>>>>>>>> further analysis easier.
>>>>>>>>>>>
>>>>>>>>>>> How do you run an RMA analysis without setting up a proper
>>>>>>>>>>> ExpressionSet? The default behavior is to create one. In addition,
>>>>>>>>>>> why would you want to do such a thing? The ExpressionSet class is
>>>>>>>>>>> specifically designed to contain these sorts of data.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I'm running Snow Leopard OSX which can be set up as 64bit. Would
>>>>>>>>>>>> running as 64bit still necessitate more RAM?
>>>>>>>>>>>
>>>>>>>>>>> Probably. The difference isn't efficiency, but the ability to
>>>>>>>>>>> address more RAM. A 32-bit OS can still address all the available
>>>>>>>>>>> memory that you will have with just 4 Gb RAM, so you need to bump
>>>>>>>>>>> that up if you want to do all the chips together. As for how much, I
>>>>>>>>>>> don't know. Since RAM isn't that expensive these days, you might look
>>>>>>>>>>> at maxing your box out.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Jim
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>> Rick
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Rick,
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote:
>>>>>>>>>>>> Dear All,
>>>>>>>>>>>>
>>>>>>>>>>>> I have recently entered the world of R. Through some trial and
>>>>>>>>>>>> error I'm becoming more familiar with R and the relevant
>>>>>>>>>>>> Bioconductor Affy packages. I'm a molecular and cell biologist with
>>>>>>>>>>>> rudimentary statistical knowledge and even less knowledge with
>>>>>>>>>>>> respect to R.
>>>>>>>>>>>>
>>>>>>>>>>>> When I enter the following:
>>>>>>>>>>>>
>>>>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
>>>>>>>>>>>>
>>>>>>>>>>>> I get some errors in return.
>>>>>>>>>>>>
>>>>>>>>>>>> Loading required package: lattice
>>>>>>>>>>>> Error: cannot allocate vector of size 437.4 Mb
>>>>>>>>>>>>
>>>>>>>>>>>> This indicates that you need more RAM, as you are running out of
>>>>>>>>>>>> memory.
>>>>>>>>>>>>
>>>>>>>>>>>> In addition: Warning message:
>>>>>>>>>>>> In data.row.names(row.names, rowsi, i) :
>>>>>>>>>>>> some row.names duplicated:
>>>>>>>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,54,
>>>>>>>>>>>> 58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,103,
>>>>>>>>>>>> 104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,147,
>>>>>>>>>>>> 148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,173,
>>>>>>>>>>>> 175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,210,
>>>>>>>>>>>> 219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,252,
>>>>>>>>>>>> 253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,296,
>>>>>>>>>>>> 297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,338,
>>>>>>>>>>>> 339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,382,
>>>>>>>>>>>> 383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,407,
>>>>>>>>>>>> 409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,449,
>>>>>>>>>>>> 450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,495,
>>>>>>>>>>>> 496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [... truncated]
>>>>>>>>>>>>
>>>>>>>>>>>> What exactly is 'mydata', and how did you generate it? The above
>>>>>>>>>>>> error indicates that you have duplicate row names, which IIRC isn't
>>>>>>>>>>>> possible to do with an ExpressionSet.
>>>>>>>>>>>>
>>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>>>>>> *** error: can't allocate region
>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>>>>>> *** error: can't allocate region
>>>>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>>>>>
>>>>>>>>>>>> More lack of memory errors.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) :
>>>>>>>>>>>> unused argument(s) (htmlhelp = TRUE)
>>>>>>>>>>>> In addition: Warning messages:
>>>>>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>>>>> datasets have been moved from package 'base' to package 'datasets'
>>>>>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>>>>> datasets have been moved from package 'stats' to package 'datasets'
>>>>>>>>>>>> starting httpd help server ... done
>>>>>>>>>>>>
>>>>>>>>>>>> Would someone be able to diagnose the problem and suggest a
>>>>>>>>>>>> solution?
>>>>>>>>>>>>
>>>>>>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit
>>>>>>>>>>>> OS. Depending on your hardware, you might be able to just install a
>>>>>>>>>>>> 64-bit version of R.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>>
>>>>>>>>>>>> Jim
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> If it is useful, I am using the following R software: R for Mac OS
>>>>>>>>>>>> X GUI 1.35-dev Leopard build 32-bit. If there is any other info that
>>>>>>>>>>>> would be useful please let me know.
>>>>>>>>>>>>
>>>>>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the
>>>>>>>>>>>> following line: QCReport(ReadAffy(widget=TRUE)). Then I tried
>>>>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") again.
>>>>>>>>>>>> It now seems to be doing something, in other words it doesn't go to
>>>>>>>>>>>> the error, yet, but it's been processing for about 10 minutes. I am
>>>>>>>>>>>> analyzing 35 chips.
>>>>>>>>>>>>
>>>>>>>>>>>> Perhaps it would work if I tried to generate each QCReport page
>>>>>>>>>>>> separately rather than as a whole.
>>>>>>>>>>>>
>>>>>>>>>>>> Cordially,
>>>>>>>>>>>> Rick
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Bioconductor mailing list
>>>>>>>>>>>> Bioconductor at r-project.org
>>>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>>>>>> Search the archives:
>>>>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>
--
Rick Frausto
PhD Candidate
The University of Sydney
School of Molecular Bioscience G08
Camperdown, NSW 2006 AUSTRALIA
ricardo.frausto at sydney.edu.au
Phone: 61 2 9036 5354
Lab of Iain L. Campbell