[BioC] QCReport: specifying alt CDF (MoGene-1_0-st-v1)?

Harry Mangalam harry.mangalam at uci.edu
Tue Sep 21 21:55:29 CEST 2010


Hi Jim,

Thanks for the rapid reply, info and pointers.

I took your advice and moved to a larger machine (the first one gave 
malloc errors; the new sessionInfo() is below). I can now get a bit 
further, but I still can't convince arrayQualityMetrics() to accept or 
recognize the appropriate CDF env.


I include the entire session below, but the main problem seems to be 
that R cannot complete the installation of the CDF package you 
referenced:

biocLite("mogene10stv1cdf")

whether I install it separately or let arrayQualityMetrics() pull it 
in as a dependency.  The results were identical on the machine I used 
before (w/ R 2.11.1) and on the larger 64-bit machine (w/ R 2.11.0).

The entire session follows.
(From a clean start on the machine whose sessionInfo() is included at 
beginning and end of the session.)

$ module load R/2.11.0 # we use modules to keep things separate
$ R
> sessionInfo()
R version 2.11.0 (2010-04-22) 
x86_64-unknown-linux-gnu 

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] graphics  grDevices datasets  stats     utils     methods   base     

other attached packages:
[1] Rmpi_0.5-8

> library(affy)
> # deleted all 'std' output; only errors and warnings are included.

# create an AffyBatch object from the CEL files.
> ab <- ReadAffy(widget=TRUE)  # select all 8 wt cels (sal vs coc)

> library("arrayQualityMetrics")
# and run the code on all the wt cels
> arrayQualityMetrics(expressionset = ab,outdir = "wt_sal_v_coc",force 
= TRUE,do.logtransform = TRUE)
Loading required package: affyPLM
Loading required package: gcrma
Loading required package: preprocessCore

Attaching package: 'affyPLM'

The following object(s) are masked from 'package:stats':

    resid, residuals, weights

>arrayQualityMetrics(expressionset = ab,outdir = "wt_sal_v_coc",force 
= TRUE,do.logtransform = TRUE)
The report will be written in directory 'wt_sal_v_coc'. 
trying URL 
'http://bioconductor.org/packages/2.6/data/annotation/src/contrib/mogene10stv1cdf_2.6.2.tar.gz'
Content type 'application/x-gzip' length 3126174 bytes (3.0 Mb)
opened URL
==================================================
downloaded 3.0 Mb

* installing *source* package ‘mogene10stv1cdf’ ...
** R
** data
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) : 
  object 'annoStartupMessages' not found
ERROR: loading failed
* removing ‘/apps/R/2.11.0/lib64/R/library/mogene10stv1cdf’

The downloaded packages are in
        ‘/tmp/Rtmpq2sQrq/downloaded_packages’
Updating HTML index of packages in '.Library'
Error in getCdfInfo(object) : 
  Could not obtain CDF environment, problems encountered:
Specified environment does not contain MoGene-1_0-st-v1
Library - package mogene10stv1cdf not installed
Library - package mogene10stv1cdf not installed
In addition: Warning message:
In install.packages(cdfname, lib = lib, repos = 
Biobase:::biocReposList(),  :
  installation of package 'mogene10stv1cdf' had non-zero exit status

<<the above stanza repeated 2 more times, downloading and then failing 
to install the same pkg>>

Is this a problem with matching case and intervening characters 
(mogene10stv1 vs MoGene-1_0-st-v1), or something more fundamental?
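
To make the naming question concrete, here is a minimal base-R sketch of 
how a chip name could map onto a package name. It assumes the rule is 
"drop hyphens, underscores and spaces, lowercase, then append cdf", which 
I believe is roughly what cleancdfname() in affy does for chips without a 
special-case entry. The helper normalize_cdf_name() is made up for 
illustration and is not part of any package.

```r
# Hypothetical helper (not part of affy): approximates the ASSUMED
# chip-name -> CDF-package-name rule described above.
normalize_cdf_name <- function(chip_name, add_cdf = TRUE) {
  # Drop hyphens, underscores and spaces, then lowercase.
  cleaned <- tolower(gsub("[-_ ]", "", chip_name))
  # Optionally append the "cdf" suffix used by CDF packages.
  if (add_cdf) paste(cleaned, "cdf", sep = "") else cleaned
}

normalize_cdf_name("MoGene-1_0-st-v1")
```

If the result matches the package that install.packages() tried to fetch 
in the log above, the name mapping is probably not the issue and the 
failed install is the more likely culprit.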

I tried this as a user and as root, to see if it was a permissions 
problem.  The results were identical.
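
Since permissions do not seem to be the cause, package versions may be 
worth a look. The load error mentions 'annoStartupMessages', and one 
possibility (a guess on my part, not confirmed) is that the installed 
AnnotationDbi is older than what mogene10stv1cdf_2.6.2 expects. The 
sketch below uses only base R; is_older() is a hypothetical helper, and 
1.10.2 is simply the AnnotationDbi version listed in the sessionInfo() of 
the first machine, not a documented requirement.

```r
# Hypothetical helper: TRUE if version string `installed` is older
# than version string `wanted`.
is_older <- function(installed, wanted) {
  # compareVersion() returns -1, 0 or 1.
  utils::compareVersion(installed, wanted) < 0
}

# Version strings taken from the two sessionInfo() listings in this thread.
is_older("1.10.0", "1.10.2")

# On a live system the installed version could be looked up with, e.g.:
#   packageDescription("AnnotationDbi")$Version
```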

I also tried installing the CDF that came with the CEL files 
(MoGene-1_0-st-v1.r3.cdf). Although this apparently ran to 
completion (as previously noted), it did not change anything.
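
One way to confirm whether a hand-built CDF package is actually visible 
to the R being run (we have several R installs under modules, so it could 
have landed in a different library) is sketched below in base R. 
is_installed() is a hypothetical helper, and the idea that the package 
went to another library is only an assumption.

```r
# Hypothetical helper: is `pkg` visible in any library on .libPaths()?
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

.libPaths()                      # libraries this R session searches
is_installed("mogene10stv1cdf")  # TRUE only if the CDF package is visible here
```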

# at end of session, here is the sessionInfo()
> sessionInfo()
R version 2.11.0 (2010-04-22) 
x86_64-unknown-linux-gnu 

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] tools     tcltk     graphics  grDevices datasets  stats     utils    
[8] methods   base     

other attached packages:
 [1] arrayQualityMetrics_2.6.0 affyPLM_1.24.1           
 [3] preprocessCore_1.10.0     gcrma_2.20.0             
 [5] tkWidgets_1.26.0          DynDoc_1.26.0            
 [7] widgetTools_1.26.0        affy_1.26.1              
 [9] Biobase_2.8.0             Rmpi_0.5-8               

loaded via a namespace (and not attached):
 [1] affyio_1.16.0        annotate_1.26.1      AnnotationDbi_1.10.0
 [4] beadarray_1.16.0     Biostrings_2.16.9    DBI_0.2-5           
 [7] genefilter_1.30.0    grid_2.11.0          hwriter_1.2         
[10] IRanges_1.6.17       lattice_0.18-5       latticeExtra_0.6-11 
[13] limma_3.4.5          marray_1.26.0        RColorBrewer_1.0-2  
[16] RSQLite_0.8-4        simpleaffy_2.24.0    splines_2.11.0      
[19] stats4_2.11.0        survival_2.35-8      vsn_3.16.0          
[22] xtable_1.5-6        

Thanks for your consideration.

harry


On Tuesday 21 September 2010 06:49:38 James W. MacDonald wrote:
> Hi Harry,
> 
> On 9/20/2010 6:20 PM, Harry Mangalam wrote:
> > Hi BioC
> > 
> > (sessionInfo() at bottom)
> > 
> > I'm trying to help a group here do some QC on their affy datasets
> > derived from the mogene10stv1 array set.  This array is not in
> > mainstream BioC support but I've created and installed the CDF
> > environment for that array:
> 
> This is not correct.
> 
> biocLite("mogene10stv1cdf")
> 
> Will get you the package you create below.
> 
> >>   make.cdf.package("MoGene-1_0-st-v1.r3.cdf", species =
> >>   "Mus_mus")
> > 
> > (completes, and I've installed the generated CDF env)
> > 
> > but when I try to run  QCReport on this dataset (even explicitly
> > specifying the mogene10stv1 CDF env), I get the errors:
> 
> In future, please mention the package you are using. I happen to
> know that QCReport() is part of the affyQCReport package, but by
> neglecting to include this bit of information you seriously
> degrade your chances of an answer.
> 
> Now on to the answer. ;-D
> 
> You are not going to be very satisfied with affyQCReport for this
> chip, as it uses the simpleaffy package for much of the quality
> control output, a good portion of which is based on MAS5 calls.
> Since the MoGene chip is a PM-only chip, you won't be able to
> compute MAS5 calls, as they rely on the matching MM probes, which
> don't exist. Hence the NA values below.
> 
> I believe you will be better off using the arrayQualityMetrics
> package, which is more general.
> 
> Best,
> 
> Jim
> 
> >> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1cdf"))
> > 
> > #   or
> > 
> >> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1"))
> > 
> > #   (get same error)
> > 
> > Error: NAs in foreign function call (arg 1)
> > In addition: Warning messages:
> > 
> > 1: In data.row.names(row.names, rowsi, i) :
> >    some row.names duplicated:
> > 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50
> > ,51,52,53,54,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,9
> > 4,95,96,97,98,99,102,103,104,108,109,110,111,114,119,120,121,122,
> > 127,134,136,137,138,139,141,142,147,148,149,152,153,156,157,158,1
> > 59,162,163,164,165,166,167,168,169,170,171,173,175,176,179,180,18
> > 3,184,185,186,191,192,195,197,198,199,200,202,206,207,210,219,220
> > ,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,
> > 252,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,2
> > 90,291,292,296,297,298,302,304,305,306,310,311,312,313,317,318,31
> > 9,321,322,324,334,337,338,339,340,341,345,346,350,351,356,359,362
> > ,364,366,367,370,371,373,376,378,382,383,384,385,386,387,388,389,
> > 391,394,395,397,398,399,400,402,403,405,406,407,409,410,411,415,4
> > 16,418,419,425,431,432,433,434,435,440,441,443,445,447,449,450,45
> > 2,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494
> > ,495,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51
> > [... truncated]
> > 
> > 2: In qc.affy(unnormalised, ...) :
> >    CDF Environment name ' hgu95av2cdf ' does not match cdfname '
> >    mogene10stv1cdf '
> > 
> > Error in plot(qc(object)) :
> >    error in evaluating the argument 'x' in selecting a method for
> >    function 'plot'
> > 
> > 
> > This: /Error: NAs in foreign function call (arg 1)/
> > 
> >   seems to imply that:
> > - there's an error in the '(arg 1)'  but which (arg 1)?
> > 
> >    If this refers to the arg
> > 
> > /ReadAffy(widget=TRUE,cdfname="mogene10stv1cdf")/
> > 
> >    then that part of the command seems to complete fine and
> >    returns an AffyBatch object as it should
> > 
> >> str(rawdata)
> > 
> > Formal class 'AffyBatch' [package "affy"] with 10 slots
> > 
> >    ..@ cdfName          : chr "mogene10stv1cdf"
> >    ..@ nrow             : int 1050
> >    ..@ ncol             : int 1050
> > 
> > /etc/
> > 
> > 
> > - or I have NAs in the data, but doesn't point to where or
> > whether I should address them.
> > If this is the critical error, I'm guessing I have to choose a
> > transform that removes or floor-shifts the NAs into a
> > computational form?
> > 
> > - the Warnings:
> > 
> > 1: In data.row.names(row.names, rowsi, i) :
> >    some row.names duplicated:
> >    4,8,9,13,14,15,16,24,25,26,27,28,29, <almost every
> >    intervening # omitted>
> >    ,513,515,516,51 [... truncated]
> > 
> > Would this be related to warning 2 below?
> > 
> > 2: In qc.affy(unnormalised, ...) :
> >    CDF Environment name ' hgu95av2cdf ' does not match cdfname '
> >    mogene10stv1cdf '
> > 
> > but if so, what is the proper way to tell QCReport that I'm using
> > a non-default CDF?
> > the help section for QCReport doesn't describe any params for
> > telling it that the CDF env is not 'hgu95av2cdf' and I've tried
> > including that info in the ReadAffy() fn as noted:
> > 
> > ie:
> >> QCReport(ReadAffy(widget=TRUE,cdfname="mogene10stv1"))
> >
> >> sessionInfo()
> > 
> > R version 2.11.1 (2010-05-31)
> > i486-pc-linux-gnu
> > 
> > locale:
> >   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
> >   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> >  [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> > 
> > attached base packages:
> > [1] tools     tcltk     stats     graphics  grDevices utils     datasets
> > [8] methods   base
> > 
> > other attached packages:
> >   [1] makecdfenv_1.26.0     tkWidgets_1.26.0      DynDoc_1.26.0
> >   [4] widgetTools_1.26.0    hgu95av2cdf_2.6.0     affydata_1.11.10
> >   [7] affyQCReport_1.26.0   lattice_0.19-11       RColorBrewer_1.0-2
> >  [10] affyPLM_1.24.1        preprocessCore_1.10.0 xtable_1.5-6
> >  [13] simpleaffy_2.24.0     gcrma_2.20.0          genefilter_1.30.0
> >  [16] mogene10stv1cdf_2.6.2 affy_1.26.1           Biobase_2.8.0
> > 
> > loaded via a namespace (and not attached):
> >   [1] affyio_1.16.0        annotate_1.26.1      AnnotationDbi_1.10.2
> >   [4] Biostrings_2.16.9    DBI_0.2-5            grid_2.11.1
> >   [7] IRanges_1.6.17       RSQLite_0.9-2        splines_2.11.1
> >  [10] survival_2.35-8
> > 
> > 
> > Thanks for your consideration.

-- 
Harry Mangalam - Research Computing, NACS, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697  949 824-0084(o), 949 285-4487(c)
MSTB=Bldg 415 (G-5 on <http://today.uci.edu/pdf/UCI_09_map_campus.pdf>)
---
Vision: <http://goo.gl/WWdy>


