[BioC] Working flow_own CEL file reading problem
Junshi Yazaki
jyazaki at salk.edu
Fri May 20 20:51:42 CEST 2005
Hi Jim, Seth, Reddy, Paul,
Thank you very much for your suggestions. I think I have now made the cdf
environment. Could you please tell me how to confirm whether the env is OK
or not? Next, I tried reading and normalizing cel files from our custom affy
array. If my workflow would be useful for affy beginners like me, could
you please help me with it?
First, I typed the following:
>source("http://www.bioconductor.org/getBioC.R")
>getBioC("all")
>library(makecdfenv)
>library(affy)
>make.cdf.package("arabidopsistlgF.cdf")
Then I moved to the Terminal on my Mac,
>R CMD INSTALL arabidopsistlgFcdf
and returned to R,
>arabidopsistlgF = make.cdf.env("arabidopsistlgF.cdf")
Then I shut down my Mac. Are these steps correct for making the cdf
environment? After that, I started again.
> source("http://www.bioconductor.org/getBioC.R")
> getBioC()
> library(affy)
> Data <- ReadAffy()
> eset <- rma(Data)
I got the error below:
***********
Note: You did not specify a download type. Using a default value of: Source
This will be fine for almost all users
Error in getCdfInfo(object) : Could not obtain CDF environment,
problems encountered:
Specified environment specified did not contain arabidopsis_tlgF_4x
Library - package arabidopsistlgf4xcdf not installed
Data for package affy did not contain arabidopsis_tlgF_4x
Bioconductor - arabidopsistlgf4xcdf not available
*********
Q1. I have a question. Do I need to type the commands below every time
after a restart? If I have to rebuild the cdf env every time, this step
will take a lot of time (the cdf file is big).
**********
>source("http://www.bioconductor.org/getBioC.R")
>getBioC("all")
>library(makecdfenv)
>library(affy)
>make.cdf.package("arabidopsistlgF.cdf")
**********
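Note on Q1: a package installed with R CMD INSTALL normally stays in the R library across restarts, so the build step should only be needed once. The following is a minimal sketch, using only base R, of checking whether a cdf package is already installed before rebuilding it. The helper name `cdf_pkg_installed` is hypothetical, and the package name is an assumption taken from the error message above.

```r
# Hypothetical helper: TRUE if a package with this name is already installed.
# system.file() returns "" when the package cannot be found.
cdf_pkg_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

# Assumption: this is the package name affy asked for in the error message.
pkg <- "arabidopsistlgf4xcdf"

if (cdf_pkg_installed(pkg)) {
  # Already installed: loading it is enough, no rebuild needed.
  library(pkg, character.only = TRUE)
} else {
  message("Package ", pkg, " is not installed; build it once with ",
          "make.cdf.package() and R CMD INSTALL.")
}
```

If the check returns TRUE, a new session should only need library(affy) and the CEL files.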
Next, I tried makecdfenv again, as below:
> env = make.cdf.env("arabidopsistlgF.cdf")
> library(makecdfenv)
> env = make.cdf.env("arabidopsistlgF.cdf")
> cel.files=list.files(pattern="\\.CEL$")
> data=ReadAffy(filenames=cel.files)
> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
> temp=rma(data)
I got the error below:
******
Note: You did not specify a download type. Using a default value of: Source
This will be fine for almost all users
Error in getCdfInfo(object) : Could not obtain CDF environment,
problems encountered:
Specified environment specified did not contain arabidopsis_tlgF_4x
Library - package arabidopsistlgf4xcdf not installed
Data for package affy did not contain arabidopsis_tlgF_4x
Bioconductor - arabidopsistlgf4xcdf not available
*********
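Note: the package name in the error above (arabidopsistlgf4xcdf) looks like it is derived from the chip name stored in the CEL file (arabidopsis_tlgF_4x), not from the cdf file name. The sketch below approximates that mapping in base R. It is an assumption about what affy's cleancdfname() does for chips without a built-in mapping (lower-case, drop underscores, hyphens and spaces, append "cdf"), and the helper name `expected_cdf_package` is hypothetical.

```r
# Hypothetical helper approximating the chip-name -> package-name mapping:
# lower-case the name, drop "_", "-" and spaces, then append "cdf".
expected_cdf_package <- function(chip_name) {
  name <- tolower(chip_name)
  name <- gsub("[_ -]", "", name)
  paste0(name, "cdf")
}

expected_cdf_package("arabidopsis_tlgF_4x")
# "arabidopsistlgf4xcdf"
```

If this holds, building the package under that name, for example with make.cdf.package("arabidopsistlgF.cdf", packagename = "arabidopsistlgf4xcdf") followed by R CMD INSTALL, may give it the name affy is looking for.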
So I made a copy of "arabidopsistlgF.cdf", renamed it
"arabidopsistlgF4x.cdf", and continued:
> env = make.cdf.env("arabidopsistlgF4x.cdf")
> cel.files=list.files(pattern="\\.CEL$")
> data=ReadAffy(filenames=cel.files)
>
> pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
I got an error again:
********
Error in whatcdf("J_HpaII_Wt_10uM.CEL") : Could not open file
J_HpaII_Wt_10uM.CEL
********
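Note: the "Could not open file" message suggests that R did not find J_HpaII_Wt_10uM.CEL in the current working directory. The following is a minimal sketch for checking this with base R only. The helper name `find_cel_files` is hypothetical, and the file name is taken from the error message.

```r
# Hypothetical helper: list CEL files in a directory, matching the
# extension case-insensitively (".CEL" or ".cel").
find_cel_files <- function(dir = getwd()) {
  list.files(dir, pattern = "\\.cel$", ignore.case = TRUE)
}

target <- "J_HpaII_Wt_10uM.CEL"   # file name taken from the error message

if (target %in% find_cel_files()) {
  message(target, " is in ", getwd())
} else {
  message(target, " was not found in ", getwd(),
          "; check the file name or change directory with setwd()")
}
```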
I thought I might need to normalize the cel files, so I tried the following:
> library(gcrma)
Loading required package: matchprobes
> Data <- ReadAffy()
> eset <- gcrma(Data)
I got an error again:
********
Computing affinities[1] "Checking to see if your internet connection works..."
Warning message:
unable to connect to 'www.bioconductor.org' on port 80.
Note: http://www.bioconductor.org/repository/devel/package/Source
does not seem to have a valid repository, skipping
Warning messages:
1: Failed to read replisting at
http://www.bioconductor.org/repository/devel/package/Source in:
getReplisting(repURL, repFile, method = method)
2: unable to connect to 'www.bioconductor.org' on port 80.
Note: http://www.bioconductor.org/repository/devel/package/Win32 does
not seem to have a valid repository, skipping
Note: You did not specify a download type. Using a default value of: Source
This will be fine for almost all users
Error in getCDF(cdfpackagename) : Environment arabidopsistlgf4xcdf
was not found in the Bioconductor repository.
In addition: Warning message:
Failed to read replisting at
http://www.bioconductor.org/repository/devel/package/Win32 in:
getReplisting(repURL, repFile, method = method)
********
Q2. I cannot read my cel files now. Our cdf file name is
"arabidopsistlgF.cdf", but the cif file name is
"arabidopsistlgF_4x.cif". Do I need to use the same name for the cif and
cdf files, since the cel file includes the cif file name? And how can I
start reading the cel files?
Q3. I would also like to read and normalize a large number of cel
files. Could you please suggest which package is better for reading and
normalizing affy custom arrays? And which is better: rma (Robust
Multi-Array Average expression measure) or gcrma (background adjustment
using sequence information)?
Q4. If our array has over 3 million data points, how long will reading
and normalizing one data file take (does it depend on machine power)? Do
you have any estimate of the computation time? Reading the cdf file
alone takes me about 20 minutes.
Thank you very much,
Junshi
--
***********************************************************
***********************************************************
Junshi Yazaki
The Salk Institute for Biological Studies