[BioC] Using custom CDF with 'make.cdf.env'

Wed Aug 27 17:56:28 CEST 2014

Hi James,

Thanks for the quick response. If you open the new CDF in a text editor such as Notepad++ and use the find-and-count function, you will find only 12,380 matches for the text string “[Unit”. This string matches both the “Unit###” and the “Unit###_Block#” section headers of each probe set, so dividing by 2 gives 6190 probe sets. Doing the same with the original CDF gives 15,630 matches, i.e. 15630/2 = 7815 probe sets.
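The same count can be reproduced in R instead of a text editor. What follows is only a minimal sketch, assuming an ASCII-format CDF: the helper name count_cdf_sections and the toy file contents are made up for illustration and are not taken from the real CDFs. It counts the unit and block headers separately, so the "divide by 2" step (which assumes exactly one block per unit) can be checked rather than assumed.

```r
# Toy stand-in for an ASCII CDF (hypothetical contents; a real file has
# many more fields per section).
toy_cdf <- c(
  "[CDF]", "Version=GC3.0",
  "[Chip]", "Name=toy",
  "[Unit1]", "Name=NONE", "NumberBlocks=1",
  "[Unit1_Block1]", "Name=probeset_a",
  "[Unit2]", "Name=NONE", "NumberBlocks=1",
  "[Unit2_Block1]", "Name=probeset_b"
)
cdf_path <- tempfile(fileext = ".cdf")
writeLines(toy_cdf, cdf_path)

# Hypothetical helper: count section headers in an ASCII CDF.
count_cdf_sections <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- sub("\\s+$", "", lines)  # drop trailing whitespace / carriage returns
  c(
    # same thing the text-editor search for "[Unit" counts
    all_unit_headers = sum(grepl("[Unit", lines, fixed = TRUE)),
    # headers of the form [Unit123]
    units = sum(grepl("^\\[Unit[0-9]+\\]$", lines)),
    # headers of the form [Unit123_Block1]
    blocks = sum(grepl("^\\[Unit[0-9]+_Block[0-9]+\\]$", lines))
  )
}

counts <- count_cdf_sections(cdf_path)
print(counts)
```

To try it on a real file, pass its path, e.g. count_cdf_sections("newmir1.cdf"), and compare the "units" entry with length(ls(...)) of the environment returned by make.cdf.env.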

I read through the Affymetrix CDF format documentation pretty thoroughly and checked how it corresponded to the original CDF file but couldn’t see anything I had done wrong.

Thanks,

Scott

From: James W. MacDonald [mailto:jmacdon at uw.edu]
Sent: 27 August 2014 16:20
To: Scott Robinson
Cc: bioconductor at r-project.org
Subject: Re: Using custom CDF with 'make.cdf.env'

Hi Scott,

As far as I can tell, you haven't made any changes to the cdf at all:

> z <- make.cdf.env("newmir1.cdf")
Reading CDF file.
Creating CDF environment
Wait for about 78 dots.........................................................................
> z
<environment: 0x00000000113d5c08>
> length(ls(z))
[1] 7815
> zz <- as.list(z)
> table(sapply(zz, nrow))

   4    8    9   10   11   20   25   40   50   67   73   88   89   90   91   92   94
6703    8   14   32  959    9    1    1    2    1    1    1    2    1    1    1   78
> y <- make.cdf.env("miRNA-1_0.CDF")
Reading CDF file.
Creating CDF environment
Wait for about 78 dots..........................................................................
> yy <- as.list(y)
> length(yy)
[1] 7815
> table(sapply(yy, nrow))

   4    8    9   10   11   20   25   40   50   67   73   88   89   90   91   92   94
6703    8   14   32  959    9    1    1    2    1    1    1    2    1    1    1   78
> all.equal(names(zz), names(yy))
[1] TRUE
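
The checks above compare only the probe set names and the number of rows per probe set. A stricter check would compare the contents of each probe set as well. This is a minimal sketch using toy environments in place of the objects returned by make.cdf.env; the helper name differing_sets and the toy pm/mm values are assumptions for illustration only.

```r
# Toy environments standing in for two CDF environments (hypothetical data).
old_env <- new.env()
new_env <- new.env()
assign("set_a", cbind(pm = c(1, 2), mm = c(3, 4)), envir = old_env)
assign("set_b", cbind(pm = c(5, 6), mm = c(7, 8)), envir = old_env)
assign("set_a", cbind(pm = c(1, 2), mm = c(3, 4)), envir = new_env)
assign("set_b", cbind(pm = c(5, 9), mm = c(7, 8)), envir = new_env)  # one index differs

# Hypothetical helper: names of shared probe sets whose matrices differ.
differing_sets <- function(env1, env2) {
  shared <- sort(intersect(ls(env1), ls(env2)))
  same <- vapply(shared, function(nm) {
    identical(get(nm, envir = env1), get(nm, envir = env2))
  }, logical(1))
  shared[!same]
}

changed <- differing_sets(old_env, new_env)
print(changed)
```

With the real objects this would be differing_sets(z, y); an empty result would mean the shared probe sets are identical in content, not just in name and size.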

Best,

Jim

On Wed, Aug 27, 2014 at 10:31 AM, Scott Robinson <Scott.Robinson at glasgow.ac.uk> wrote:
Dear All,

Since it exceeds 1MB, here is a link to the old ("miRNA-1_0.CDF") and new ("newmir1.cdf") CDFs, test script and example CEL file:

http://www.files.com/set/53fdeb0aa2176

Thanks,

Scott
________________________________________
From: Scott Robinson [guest] [guest at bioconductor.org]
Sent: 27 August 2014 13:11
To: bioconductor at r-project.org; Scott Robinson
Cc: makecdfenv Maintainer
Subject: Using custom CDF with 'make.cdf.env'

Dear List,

I made a custom CDF by modifying the original Affymetrix miRNA v1 file. As there is a great deal of redundancy on this chip, I have condensed the original 7815 probe sets into 6190 probe sets (by 'moving' probes from one set to another). However, when I make and attach my new CDF environment I still seem to have 7815 probe sets, so presumably I have done something wrong.

I have read the vignette and many posts similar to mine, but still cannot work out what I am doing wrong. Perhaps the problem is with the CDF itself? I have a short script testing the functionality, the output of which I have copied in below. I will gladly attach the script, CDFs and example CEL file if there is nothing obviously wrong with the code; I would do this now, but there doesn't appear to be an option on the web form.

Many thanks,

Scott

> folder <- "C:/Work/COPD-ASTHMA/microRNA files/newCDF/test/"
>
> setwd(paste0(folder,"CEL"))
> options(stringsAsFactors=FALSE)
> library(affy)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

    xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, as.data.frame, cbind, colnames, duplicated, eval,
    Filter, Find, get, intersect, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unlist

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> library(makecdfenv)
Loading required package: affyio
>
> cleancdfname("newmir1.cdf")
[1] "newmir1.cdf"
> newmir1 = make.cdf.env("newmir1.cdf")
Reading CDF file.
Creating CDF environment
Wait for about 78 dots.......................................................................
> Data <- ReadAffy()
> Data@cdfName <- "newmir1"
>
> Data
AffyBatch object
size of arrays=230x230 features (17 kb)
cdf=newmir1 (7815 affyids)
number of samples=1
number of genes=7815
annotation=mirna102xgain
notes=
>
> dim(exprs(rma(Data)))
Background correcting
Normalizing
Calculating Expression
[1] 7815    1

 -- output of sessionInfo():

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] makecdfenv_1.36.0  affyio_1.28.0      affy_1.38.1        Biobase_2.20.1
[5] BiocGenerics_0.6.0

loaded via a namespace (and not attached):
[1] BiocInstaller_1.10.4  preprocessCore_1.22.0 tools_3.0.2
[4] zlibbioc_1.6.0

--
Sent via the guest posting facility at bioconductor.org.

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
