[BioC] Working flow_own CEL file reading problem
James W. MacDonald
jmacdon at med.umich.edu
Fri May 20 21:11:56 CEST 2005
Junshi Yazaki wrote:
> Hi Jim, Seth, Reddy, Paul,
>
> Thank you very much for your suggestions. I may have made the cdf
> environment. Could you please tell me how to confirm whether the env is
> OK or not? Next I tried reading and normalizing cel files from our
> custom affy array. If my workflow would be useful for affy beginners
> like me, could you please help me?
>
> First, I typed the following...
>
>> source("http://www.bioconductor.org/getBioC.R")
>> getBioC("all")
>> library(makecdfenv)
>> library(affy)
>> make.cdf.package ("arabidopsistlgF.cdf")
>
>
> Then I moved to the Terminal on my Mac,
>
>> R CMD INSTALL arabidopsistlgFcdf
It should actually be arabidopsistlgfcdf. Note that the F is lower case.
>
> Then I returned to R,
>
>> arabidopsistlgF = make.cdf.env("arabidopsistlgF.cdf")
>
This is an unnecessary step - you already made and installed the package.
>
> And I shut down my Mac. Are these steps correct for making the cdf
> environment? Then I started again.
>
>> source("http://www.bioconductor.org/getBioC.R")
>> getBioC()
>> library(affy)
>> Data <- ReadAffy()
At this point, try

cleancdfname(cdfName(Data))

If the result is not arabidopsistlgfcdf, then you need to make your
cdfenv again, using the name that cleancdfname() returns as the package
name. I am betting it will be arabidopsistlgf4xcdf (that is the package
your error message below reports as not installed), so you will need to do

make.cdf.package("arabidopsistlgF.cdf", packagename="arabidopsistlgf4xcdf")

And then install it from the Terminal using

R CMD INSTALL arabidopsistlgf4xcdf
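
As a rough illustration of why the package name matters, the sketch below approximates in base R how a chip name such as arabidopsis_tlgF_4x (taken from the error message in this thread) maps to the package name arabidopsistlgf4xcdf. The helper approx_cdf_pkg_name() is hypothetical and only mimics the convention visible here (lower case, separators stripped, "cdf" appended). The real cleancdfname() in affy is authoritative and may handle more cases.

```r
# Hypothetical helper (not part of affy): approximates the package name that
# affy looks for, given the chip name stored in a CEL file.
approx_cdf_pkg_name <- function(chip_name) {
  stopifnot(is.character(chip_name), length(chip_name) == 1, nzchar(chip_name))
  cleaned <- tolower(chip_name)           # package names are lower case
  cleaned <- gsub("[_ -]", "", cleaned)   # drop underscores, spaces, dashes
  paste0(cleaned, "cdf")                  # cdf package names end in "cdf"
}

approx_cdf_pkg_name("arabidopsis_tlgF_4x")   # "arabidopsistlgf4xcdf"
```

If the name printed here differs from the directory that make.cdf.package() created, the packagename argument is the place to correct it.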
>> eset <- rma(Data)
>
>
> I got the error below,
> ***********
> Note: You did not specify a download type. Using a default value of:
> Source
> This will be fine for almost all users
>
> Error in getCdfInfo(object) : Could not obtain CDF environment, problems
> encountered:
> Specified environment specified did not contain arabidopsis_tlgF_4x
> Library - package arabidopsistlgf4xcdf not installed
> Data for package affy did not contain arabidopsis_tlgF_4x
> Bioconductor - arabidopsistlgf4xcdf not available
> *********
> Q1. I have a question. Do I need to type the commands below every time
> after a restart? If I need to make the cdf env every time, this step
> will take a lot of time (the cdf file is big).
No. If you install correctly, it should be there for you every time you
run R.
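
One way to confirm this after a restart is to ask R whether the package is visible in its library. This is a minimal sketch in base R; cdf_pkg_installed() is a hypothetical helper, and the package name is the one from the error message above.

```r
# Hypothetical helper: TRUE if a package of this name is installed in any
# library listed by .libPaths().
cdf_pkg_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

pkg <- "arabidopsistlgf4xcdf"   # name taken from the error message above
if (cdf_pkg_installed(pkg)) {
  message(pkg, " is installed; there should be no need to rebuild it")
} else {
  message(pkg, " is not installed; run R CMD INSTALL on the package directory")
}
```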
> **********
>
>> source("http://www.bioconductor.org/getBioC.R")
>> getBioC("all")
>> library(makecdfenv)
>> library(affy)
>> make.cdf.package ("arabidopsistlgF.cdf")
>
> **********
> Next, I tried makecdfenv again, as below,
>
>> env = make.cdf.env("arabidopsistlgF.cdf")
>> library(makecdfenv)
>> env = make.cdf.env("arabidopsistlgF.cdf")
>> cel.files=list.files(pattern=".CEL$")
>> data=ReadAffy(filenames=cel.files)
>> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
>> temp=rma(data)
>
>
> I got the error below,
> ******
> Note: You did not specify a download type. Using a default value of:
> Source
> This will be fine for almost all users
>
> Error in getCdfInfo(object) : Could not obtain CDF environment, problems
> encountered:
> Specified environment specified did not contain arabidopsis_tlgF_4x
> Library - package arabidopsistlgf4xcdf not installed
> Data for package affy did not contain arabidopsis_tlgF_4x
> Bioconductor - arabidopsistlgf4xcdf not available
> *********
> So I made a copy of "arabidopsistlgF.cdf" and changed its name to
> "arabidopsistlgF4x". Then I continued,
>
>> env = make.cdf.env("arabidopsistlgF4x.cdf")
>> cel.files=list.files(pattern=".CEL$")
>> data=ReadAffy(filenames=cel.files)
>>
>> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
>
>
> I got an error again,
> ********
> Error in whatcdf("J_HpaII_Wt_10uM.CEL") : Could not open file
> J_HpaII_Wt_10uM.CEL
> ********
>
> I thought I might need to normalize the cel files, as below,
>
>> library(gcrma)
>
> Loading required package: matchprobes
>
>> Data <- ReadAffy()
>> eset <- gcrma(Data)
>
>
> I got an error again,
> ********
> Computing affinities[1] "Checking to see if your internet connection
> works..."
> Warning message:
> unable to connect to 'www.bioconductor.org' on port 80.
> Note: http://www.bioconductor.org/repository/devel/package/Source does
> not seem to have a valid repository, skipping
> Warning messages:
> 1: Failed to read replisting at
> http://www.bioconductor.org/repository/devel/package/Source in:
> getReplisting(repURL, repFile, method = method)
> 2: unable to connect to 'www.bioconductor.org' on port 80.
> Note: http://www.bioconductor.org/repository/devel/package/Win32 does
> not seem to have a valid repository, skipping
> Note: You did not specify a download type. Using a default value of:
> Source
> This will be fine for almost all users
>
> Error in getCDF(cdfpackagename) : Environment arabidopsistlgf4xcdf was
> not found in the Bioconductor repository.
> In addition: Warning message:
> Failed to read replisting at
> http://www.bioconductor.org/repository/devel/package/Win32 in:
> getReplisting(repURL, repFile, method = method)
> ********
> Q2. I cannot read my cel files now. Our cdf file name is
> "arabidopsistlgF.cdf", but the cif file name is "arabidopsistlgF_4x.cif".
> Do I need to use the same name for the cif and cdf, because the cel file
> includes the cif file name? And how can I start to read the cel files?
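Once you have the cdfenv installed correctly, you can read CEL files
using ReadAffy(). The cdf and cif file names do not have to match; what
matters is that the installed package has the name that
cleancdfname(cdfName(Data)) reports, since that name comes from the CEL
files themselves.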
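
The reading and normalization step might then look like the sketch below. It assumes the affy package and a correctly named cdf package are installed; find_cel_files() and read_and_normalize() are hypothetical helper names, not part of affy. Returning full paths may also help avoid the "Could not open file" error seen above when a CEL file is not in the current working directory.

```r
# Hypothetical helper: list CEL files in a directory with full paths.
# The dot is escaped so that only a literal ".CEL" extension matches.
find_cel_files <- function(dir = ".") {
  list.files(dir, pattern = "\\.CEL$", ignore.case = TRUE, full.names = TRUE)
}

# Sketch of the read-and-normalize step. Needs the affy package and an
# installed cdf package matching the chip name stored in the CEL files.
read_and_normalize <- function(dir = ".") {
  if (!requireNamespace("affy", quietly = TRUE)) {
    stop("the affy package is not installed")
  }
  cel_files <- find_cel_files(dir)
  if (length(cel_files) == 0) {
    stop("no CEL files found in ", normalizePath(dir))
  }
  Data <- affy::ReadAffy(filenames = cel_files)
  affy::rma(Data)   # RMA expression values for all arrays
}
```

Calling eset <- read_and_normalize() from the directory holding the CEL files would then replace the separate ReadAffy() and rma() calls.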
>
> Q3. I would also like to read and normalize a lot of cel files. Could
> you please suggest which package is better for reading and normalizing
> a custom affy array? And which is better, rma (Robust Multi-Array
> Average expression measure) or gcrma (Background adjustment using
> sequence information)?
Which are better, apples or oranges? I guess it all depends on who you ask.
>
> Q4. If our array has over 3 million data points, how long does reading
> and normalizing one array take (does it depend on machine power?)? Do
> you have an estimate of the computation time? Reading the cdf file
> takes me about 20 min.
If it takes that long to read in the cdf I am betting you are using
virtual memory. In that case, you really need to get more RAM or things
will be crushingly slow.
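
For a rough sense of scale, the arithmetic below estimates only the raw intensity matrix (one 8-byte double per probe per array). It is a lower bound under that assumption: the cdf environment and the working copies made during normalization come on top, so real usage will be higher. The helper name and the batch size of 50 are hypothetical.

```r
# Rough size of a probes x arrays matrix of doubles (8 bytes each), in MiB.
intensity_matrix_mib <- function(n_probes, n_arrays) {
  n_probes * n_arrays * 8 / 2^20
}

intensity_matrix_mib(3e6, 1)    # one array: about 23 MiB
intensity_matrix_mib(3e6, 50)   # 50 arrays: about 1144 MiB
```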
HTH,
Jim
>
> Thank you very much,
> Junshi
--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623