[BioC] QCReport: specifying alt CDF (MoGene-1_0-st-v1)?
Harry Mangalam
harry.mangalam at uci.edu
Tue Sep 21 21:55:29 CEST 2010
Hi Jim,
Thanks for the rapid reply, info and pointers.
I tried to take your advice, this time on a larger machine (the first
one gave malloc errors; the new sessionInfo() is below). I can get a
bit further, but still can't convince arrayQualityMetrics() to accept
or recognize the appropriate CDF env.
While I include the entire session below, the main problem seems to be
that R will not complete the installation of the CDF package you
referenced:
biocLite("mogene10stv1cdf")
whether it is installed separately or pulled in as an
arrayQualityMetrics() dependency. The results were identical on the
machine I used before (w/ R 2.11.1) and on the larger 64-bit machine
(w/ R 2.11.0).
The entire session follows.
(From a clean start on the machine whose sessionInfo() is included at
beginning and end of the session.)
$ module load R/2.11.0 # we use modules to keep things separate
$ R
> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] graphics grDevices datasets stats utils methods base
other attached packages:
[1] Rmpi_0.5-8
> library(affy)
> # deleted all standard output below; only errors and warnings are kept.
> # create an AffyBatch object from the CEL files.
> ab <- ReadAffy(widget=TRUE) # select all 8 wt cels (sal vs coc)
> library("arrayQualityMetrics")
> # and run the code on all the wt cels
> arrayQualityMetrics(expressionset = ab, outdir = "wt_sal_v_coc", force = TRUE, do.logtransform = TRUE)
Loading required package: affyPLM
Loading required package: gcrma
Loading required package: preprocessCore
Attaching package: 'affyPLM'
The following object(s) are masked from 'package:stats':
resid, residuals, weights
> arrayQualityMetrics(expressionset = ab, outdir = "wt_sal_v_coc", force = TRUE, do.logtransform = TRUE)
The report will be written in directory 'wt_sal_v_coc'.
trying URL
'http://bioconductor.org/packages/2.6/data/annotation/src/contrib/mogene10stv1cdf_2.6.2.tar.gz'
Content type 'application/x-gzip' length 3126174 bytes (3.0 Mb)
opened URL
==================================================
downloaded 3.0 Mb
* installing *source* package ‘mogene10stv1cdf’ ...
** R
** data
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
object 'annoStartupMessages' not found
ERROR: loading failed
* removing ‘/apps/R/2.11.0/lib64/R/library/mogene10stv1cdf’
The downloaded packages are in
‘/tmp/Rtmpq2sQrq/downloaded_packages’
Updating HTML index of packages in '.Library'
Error in getCdfInfo(object) :
Could not obtain CDF environment, problems encountered:
Specified environment does not contain MoGene-1_0-st-v1
Library - package mogene10stv1cdf not installed
Library - package mogene10stv1cdf not installed
In addition: Warning message:
In install.packages(cdfname, lib = lib, repos =
Biobase:::biocReposList(), :
installation of package 'mogene10stv1cdf' had non-zero exit status
<<the above stanza repeated 2 more times, downloading and then failing
to install the same pkg>>
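The 'annoStartupMessages' error above comes from a failed lookup of an object inside a package namespace (that is what get(name, envir = asNamespace(pkg), inherits = FALSE) does). A minimal sketch for repeating such a lookup by hand follows. The helper name namespace_has_object is made up for illustration, and the guess that the missing object is expected in AnnotationDbi is an assumption, not something the log states.

```r
# Hypothetical helper: does an installed package's namespace contain `name`?
# Returns FALSE when the package itself is not installed.
namespace_has_object <- function(pkg, name) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  exists(name, envir = asNamespace(pkg), inherits = FALSE)
}

# Assumption: the CDF package looks for this object in AnnotationDbi.
namespace_has_object("AnnotationDbi", "annoStartupMessages")

# If the package is installed, print its version so machines can be compared.
if (requireNamespace("AnnotationDbi", quietly = TRUE)) {
  print(packageVersion("AnnotationDbi"))
}
```

A FALSE result while the package is installed would suggest that the installed version lacks something the CDF package expects at load time. For what it is worth, the two sessionInfo() listings in this thread show AnnotationDbi 1.10.0 on one machine and 1.10.2 on the other.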
Is this a problem with matching case and intervening characters
(mogene10stv1 vs MoGene-1_0-st-v1), or something more fundamental?
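On the naming question: affy derives the CDF package name from the chip name stored in the CEL files (see cleancdfname() in affy). The sketch below is a base-R approximation of that rule, lower-casing the name and dropping '-', '_' and spaces before appending 'cdf'. It is not affy's actual code, it ignores any special-case chip names affy may handle, and the function name sketch_clean_cdfname is hypothetical.

```r
# Approximate the chip-name -> CDF-package-name rule in base R.
# Assumption: only case, '-', '_' and spaces need normalising.
sketch_clean_cdfname <- function(chip_name, add_cdf = TRUE) {
  cleaned <- gsub("[-_ ]", "", tolower(chip_name))
  if (add_cdf) paste0(cleaned, "cdf") else cleaned
}

sketch_clean_cdfname("MoGene-1_0-st-v1")
```

Under that rule 'MoGene-1_0-st-v1' maps to 'mogene10stv1cdf', the same package name that appears in the install log, so a case or punctuation mismatch looks unlikely to be the cause.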
I tried this as a user and as root, to see if it was a permissions
problem. The results were identical.
I also tried installing the CDF that came with the CEL files
(MoGene-1_0-st-v1.r3.cdf); while this apparently ran to completion
(as previously noted), it did not change anything.
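To see whether a locally built CDF package actually landed in a library that the current R session searches, one could check along these lines. This is a sketch only: package_location is a made-up helper, and it says nothing about whether the package loads cleanly.

```r
# Hypothetical helper: report where (if anywhere) a package is installed.
# system.file() returns "" when the package is not found on .libPaths().
package_location <- function(pkg) {
  path <- system.file(package = pkg)
  if (nzchar(path)) path else NA_character_
}

# Libraries this session searches, in order.
print(.libPaths())

# NA means the package is not visible to this session.
package_location("mogene10stv1cdf")
```

If the result is NA even after a seemingly successful install, the package may have gone into a library that is not on .libPaths() for this R version (for example, one belonging to a different R module).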
# at end of session, here is the sessionInfo()
> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] tools tcltk graphics grDevices datasets stats utils
[8] methods base
other attached packages:
[1] arrayQualityMetrics_2.6.0 affyPLM_1.24.1
[3] preprocessCore_1.10.0 gcrma_2.20.0
[5] tkWidgets_1.26.0 DynDoc_1.26.0
[7] widgetTools_1.26.0 affy_1.26.1
[9] Biobase_2.8.0 Rmpi_0.5-8
loaded via a namespace (and not attached):
[1] affyio_1.16.0 annotate_1.26.1 AnnotationDbi_1.10.0
[4] beadarray_1.16.0 Biostrings_2.16.9 DBI_0.2-5
[7] genefilter_1.30.0 grid_2.11.0 hwriter_1.2
[10] IRanges_1.6.17 lattice_0.18-5 latticeExtra_0.6-11
[13] limma_3.4.5 marray_1.26.0 RColorBrewer_1.0-2
[16] RSQLite_0.8-4 simpleaffy_2.24.0 splines_2.11.0
[19] stats4_2.11.0 survival_2.35-8 vsn_3.16.0
[22] xtable_1.5-6
Thanks for your consideration.
harry
On Tuesday 21 September 2010 06:49:38 James W. MacDonald wrote:
> Hi Harry,
>
> On 9/20/2010 6:20 PM, Harry Mangalam wrote:
> > Hi BioC
> >
> > (sessionInfo() at bottom)
> >
> > I'm trying to help a group here do some QC on their affy datasets
> > derived from the mogene10stv1 array set. This array is not in
> > mainstream BioC support but I've created and installed the CDF
> > environment for that array:
>
> This is not correct.
>
> biocLite("mogene10stv1cdf")
>
> Will get you the package you create below.
>
> >> make.cdf.package("MoGene-1_0-st-v1.r3.cdf", species =
> >> "Mus_mus")
> >
> > (completes, and I've installed the generated CDF env)
> >
> > but when I try to run QCReport on this dataset (even explicitly
> > specifying the mogene10stv1 CDF env), I get the errors:
>
> In future, please mention the package you are using. I happen to
> know that QCReport() is part of the affyQCReport package, but by
> neglecting to include this bit of information you seriously
> degrade your chances of an answer.
>
> Now on to the answer. ;-D
>
> You are not going to be very satisfied with affyQCReport for this
> chip, as it uses the simpleaffy package for much of the quality
> control output, a good portion of which is based on MAS5 calls.
> Since the MoGene chip is a PM-only chip, you won't be able to
> compute MAS5 calls, as they rely on the matching MM probes, which
> don't exist. Hence the NA values below.
>
> I believe you will be better off using the arrayQualityMetrics
> package, which is more general.
>
> Best,
>
> Jim
>
> >> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1cdf"))
> >
> > # or
> >
> >> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1"))
> >
> > # (get same error)
> >
> > Error: NAs in foreign function call (arg 1)
> > In addition: Warning messages:
> >
> > 1: In data.row.names(row.names, rowsi, i) :
> > some row.names duplicated:
> > 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50
> > ,51,52,53,54,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,9
> > 4,95,96,97,98,99,102,103,104,108,109,110,111,114,119,120,121,122,
> > 127,134,136,137,138,139,141,142,147,148,149,152,153,156,157,158,1
> > 59,162,163,164,165,166,167,168,169,170,171,173,175,176,179,180,18
> > 3,184,185,186,191,192,195,197,198,199,200,202,206,207,210,219,220
> > ,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,
> > 252,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,2
> > 90,291,292,296,297,298,302,304,305,306,310,311,312,313,317,318,31
> > 9,321,322,324,334,337,338,339,340,341,345,346,350,351,356,359,362
> > ,364,366,367,370,371,373,376,378,382,383,384,385,386,387,388,389,
> > 391,394,395,397,398,399,400,402,403,405,406,407,409,410,411,415,4
> > 16,418,419,425,431,432,433,434,435,440,441,443,445,447,449,450,45
> > 2,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494
> > ,495,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51
> > [... truncated]
> >
> > 2: In qc.affy(unnormalised, ...) :
> > CDF Environment name ' hgu95av2cdf ' does not match cdfname ' mogene10stv1cdf '
> >
> > Error in plot(qc(object)) :
> > error in evaluating the argument 'x' in selecting a method for function 'plot'
> >
> >
> > This: /Error: NAs in foreign function call (arg 1)/
> >
> > seems to imply that:
> > - there's an error in the '(arg 1)' but which (arg 1)?
> >
> > If this refers to the arg
> >
> > /ReadAffy(widget=TRUE,cdfname="mogene10stv1cdf")/
> >
> > then that part of the command seems to complete fine and
> > returns an AffyBatch object as it should
> >
> >> str(rawdata)
> >
> > Formal class 'AffyBatch' [package "affy"] with 10 slots
> >
> > ..@ cdfName : chr "mogene10stv1cdf"
> > ..@ nrow : int 1050
> > ..@ ncol : int 1050
> >
> > /etc/
> >
> >
> > - or I have NAs in the data, but doesn't point to where or
> > whether I should address them.
> > If this is the critical error, I'm guessing I have to choose a
> > transform that removes or floor-shifts the NAs into a
> > computational form?
> >
> > - the Warnings:
> >
> > 1: In data.row.names(row.names, rowsi, i) :
> > some row.names duplicated:
> > 4,8,9,13,14,15,16,24,25,26,27,28,29, <almost every
> > intervening # omitted>
> > ,513,515,516,51 [... truncated]
> >
> > Would this be related to warning 2 below?
> >
> > 2: In qc.affy(unnormalised, ...) :
> > CDF Environment name ' hgu95av2cdf ' does not match cdfname ' mogene10stv1cdf '
> >
> > but if so, what is the proper way to tell QCReport that I'm using
> > a non-default CDF?
> > the help section for QCReport doesn't describe any params for
> > telling it that the CDF env is not 'hgu95av2cdf' and I've tried
> > including that info in the ReadAffy() fn as noted:
> >
> > ie:
> >> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1"))
> >
> >> sessionInfo()
> >
> > R version 2.11.1 (2010-05-31)
> > i486-pc-linux-gnu
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> > [9] LC_ADDRESS=C LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] tools tcltk stats graphics grDevices utils datasets
> > [8] methods base
> >
> > other attached packages:
> > [1] makecdfenv_1.26.0 tkWidgets_1.26.0 DynDoc_1.26.0
> > [4] widgetTools_1.26.0 hgu95av2cdf_2.6.0 affydata_1.11.10
> > [7] affyQCReport_1.26.0 lattice_0.19-11 RColorBrewer_1.0-2
> > [10] affyPLM_1.24.1 preprocessCore_1.10.0 xtable_1.5-6
> > [13] simpleaffy_2.24.0 gcrma_2.20.0 genefilter_1.30.0
> > [16] mogene10stv1cdf_2.6.2 affy_1.26.1 Biobase_2.8.0
> >
> > loaded via a namespace (and not attached):
> > [1] affyio_1.16.0 annotate_1.26.1 AnnotationDbi_1.10.2
> > [4] Biostrings_2.16.9 DBI_0.2-5 grid_2.11.1
> > [7] IRanges_1.6.17 RSQLite_0.9-2 splines_2.11.1
> > [10] survival_2.35-8
> >
> >
> > Thanks for your consideration.
--
Harry Mangalam - Research Computing, NACS, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697 949 824-0084(o), 949 285-4487(c)
MSTB=Bldg 415 (G-5 on <http://today.uci.edu/pdf/UCI_09_map_campus.pdf>
---
Vision: <http://goo.gl/WWdy>