[BioC] Using custom CDF with 'make.cdf.env'
Scott Robinson
Scott.Robinson at glasgow.ac.uk
Wed Aug 27 17:56:28 CEST 2014
Hi James,
Thanks for the quick response. If you open the new CDF in a text editor such as Notepad++ and count the matches for the text string “[Unit”, you will find only 12,380. This string matches both the “Unit###” and the “Unit###_Block#” section headers for each probe set, so dividing by 2 gives 6,190 probe sets. Doing the same with the original CDF gives 15,630/2 = 7,815 probe sets.
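The same count can be scripted rather than done by hand in an editor. The following is only a minimal base-R sketch: `count_cdf_units` is a hypothetical helper (not part of makecdfenv or affy), it assumes a text-format CDF with one section header per line, and the toy file stands in for the real CDFs. It also reads the `NumberOfUnits` field that the text CDF format declares in its `[Chip]` section, so the declared and actual counts can be compared; whether any mismatch there is related to the behaviour seen in this thread is not established.

```r
# Hypothetical helper: count section headers in a text-format (ASCII) CDF.
count_cdf_units <- function(path) {
  # Strip a trailing carriage return in case of Windows line endings.
  lines <- sub("\r$", "", readLines(path, warn = FALSE))
  # First NumberOfUnits= line, or NA if the field is absent.
  declared <- grep("^NumberOfUnits=", lines, value = TRUE)[1]
  c(
    declared_units = as.integer(sub("^NumberOfUnits=", "", declared)),
    unit_headers   = sum(grepl("^\\[Unit[0-9]+\\]$", lines)),
    block_headers  = sum(grepl("^\\[Unit[0-9]+_Block[0-9]+\\]$", lines)),
    # Lines containing "[Unit" anywhere, i.e. the editor-style count.
    raw_matches    = sum(grepl("[Unit", lines, fixed = TRUE))
  )
}

# Toy file for illustration only; it is not a complete, valid CDF.
toy <- tempfile(fileext = ".cdf")
writeLines(c(
  "[Chip]", "NumberOfUnits=2",
  "[Unit1]", "[Unit1_Block1]",
  "[Unit2]", "[Unit2_Block1]"
), toy)

counts <- count_cdf_units(toy)
print(counts)
```

Counting `unit_headers` directly avoids the divide-by-2 step, which only holds when every unit has exactly one block.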
I read through the Affymetrix CDF format documentation pretty thoroughly and checked how it corresponded to the original CDF file but couldn’t see anything I had done wrong.
Thanks,
Scott
From: James W. MacDonald [mailto:jmacdon at uw.edu]
Sent: 27 August 2014 16:20
To: Scott Robinson
Cc: bioconductor at r-project.org
Subject: Re: Using custom CDF with 'make.cdf.env'
Hi Scott,
As far as I can tell, you haven't made any changes to the cdf at all:
> z <- make.cdf.env("newmir1.cdf")
Reading CDF file.
Creating CDF environment
Wait for about 78 dots.........................................................................
> z
<environment: 0x00000000113d5c08>
> length(ls(z))
[1] 7815
> zz <- as.list(z)
> table(sapply(zz, nrow))
   4    8    9   10   11   20   25   40   50   67   73   88   89   90   91   92   94
6703    8   14   32  959    9    1    1    2    1    1    1    2    1    1    1   78
> y <- make.cdf.env("miRNA-1_0.CDF")
Reading CDF file.
Creating CDF environment
Wait for about 78 dots..........................................................................
> yy <- as.list(y)
> length(yy)
[1] 7815
> table(sapply(yy, nrow))
   4    8    9   10   11   20   25   40   50   67   73   88   89   90   91   92   94
6703    8   14   32  959    9    1    1    2    1    1    1    2    1    1    1   78
> all.equal(names(zz), names(yy))
[1] TRUE
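The checks above compare the number of probe sets, the distribution of their sizes, and their names. A stricter comparison would also look at the contents of each probe set. The following is a minimal base-R sketch of that idea: `compare_probesets` is a hypothetical helper, and the toy lists stand in for `as.list()` of two CDF environments; it assumes each element is a matrix, as in the `zz` and `yy` lists.

```r
# Hypothetical helper: compare two named lists of probe-set matrices.
compare_probesets <- function(a, b) {
  common <- intersect(names(a), names(b))
  # TRUE where a probe set is exactly the same in both lists.
  same <- vapply(common, function(n) identical(a[[n]], b[[n]]), logical(1))
  list(
    only_in_a = setdiff(names(a), names(b)),
    only_in_b = setdiff(names(b), names(a)),
    differing = common[!same]
  )
}

# Toy data for illustration: probe indices in "pm", no mismatch probes.
mk <- function(pm) cbind(pm = pm, mm = NA_integer_)
old <- list(setA = mk(1:4), setB = mk(5:8), setC = mk(9:12))
# In the toy "new" list, the probes of setC have been moved into setA.
new <- list(setA = mk(c(1:4, 9:12)), setB = mk(5:8))

res <- compare_probesets(old, new)
print(res)
```

With the real objects this would be called as `compare_probesets(yy, zz)`; three empty results would mean the two environments are identical in content as well as in names.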
Best,
Jim
On Wed, Aug 27, 2014 at 10:31 AM, Scott Robinson <Scott.Robinson at glasgow.ac.uk> wrote:
Dear All,
Since it exceeds 1MB, here is a link to the old ("miRNA-1_0.CDF") and new ("newmir1.cdf") CDFs, test script and example CEL file:
http://www.files.com/set/53fdeb0aa2176
Thanks,
Scott
________________________________________
From: Scott Robinson [guest] [guest at bioconductor.org]
Sent: 27 August 2014 13:11
To: bioconductor at r-project.org; Scott Robinson
Cc: makecdfenv Maintainer
Subject: Using custom CDF with 'make.cdf.env'
Dear List,
I made a custom CDF by modifying the original Affymetrix miRNA v1 file. As there is a great deal of redundancy on this chip, I have condensed the original 7815 probe sets into 6190 probe sets (by 'moving' probes from one set to another). However, when I make and attach my new CDF environment I still seem to have 7815 probe sets, so presumably I have done something wrong.
I have read the vignette and many posts similar to mine, but still cannot work out what I am doing wrong. Perhaps the problem is with the CDF itself? I have a short script testing the functionality, the output of which I have copied in below. I will gladly attach the script, CDFs and example CEL file if there is nothing obviously wrong with the code. I would do this now, but there doesn't appear to be an option on the webform.
Many thanks,
Scott
> folder <- "C:/Work/COPD-ASTHMA/microRNA files/newCDF/test/"
>
> setwd(paste0(folder,"CEL"))
> options(stringsAsFactors=FALSE)
> library(affy)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following object is masked from ‘package:stats’:
xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, as.data.frame, cbind, colnames, duplicated, eval,
Filter, Find, get, intersect, lapply, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table,
tapply, union, unique, unlist
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
> library(makecdfenv)
Loading required package: affyio
>
> cleancdfname("newmir1.cdf")
[1] "newmir1.cdf"
> newmir1 = make.cdf.env("newmir1.cdf")
Reading CDF file.
Creating CDF environment
Wait for about 78 dots.......................................................................
> Data <- ReadAffy()
> Data@cdfName <- "newmir1"
>
> Data
AffyBatch object
size of arrays=230x230 features (17 kb)
cdf=newmir1 (7815 affyids)
number of samples=1
number of genes=7815
annotation=mirna102xgain
notes=
>
> dim(exprs(rma(Data)))
Background correcting
Normalizing
Calculating Expression
[1] 7815 1
-- output of sessionInfo():
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] makecdfenv_1.36.0 affyio_1.28.0 affy_1.38.1 Biobase_2.20.1
[5] BiocGenerics_0.6.0
loaded via a namespace (and not attached):
[1] BiocInstaller_1.10.4 preprocessCore_1.22.0 tools_3.0.2
[4] zlibbioc_1.6.0
--
Sent via the guest posting facility at bioconductor.org.
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099