[BioC] agilent data format

Weiwei Shi helprhelp at gmail.com
Thu Jun 14 21:56:52 CEST 2007


The original paper is not very clear about the data they provided. I
am contacting the authors about it.

If LogRatio is treatment vs. control, why are there two files with similar
names that differ only in the dye name? Both files have a LogRatio column,
but with different values for the same probe.

I have heard that JMP can handle data preprocessing, normalization, gene
selection, multidimensional scaling, PCA, clustering analysis, and annotation
analysis. I am wondering whether Bioconductor provides a similar pipeline
for that?

This work was left over from others, and it is giving me a real headache...

On 6/14/07, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> Weiwei Shi wrote:
> > Dear Listers:
> >
> > This is my first time looking at Agilent data, for a research project
> > on cross-platform issues, so this is very much a newbie question:
> >
> > One example is this pair of files:
> > cy3_50mg_6h_Rat_3125.txt and cy5_50my_6h_Rat_3125.txt.
> > I believe one of them must be control but not sure which one (is there
> > a tradition to use cy5 as control?).
>
> You will almost certainly need to communicate with the biologist that
> did the arrays to determine the experimental design.
>
> > The format of one of the datasets looks
> > like this:
> >
> > ProbeName     GeneName         LogRatio   PValueLogRatio
> > (+)Pro25G-03  Pro25G           -4.92E-01  9.04E-16
> > (-)3xSLv1     NegativeControl  0.00E+00   1.00E+00
> > A_43_P21252   CB546590         1.64E-02   7.96E-01
> > A_42_P534203  272585_Rn        -8.69E-03  9.16E-01
> > A_43_P22195   CB547437         4.77E-02   4.90E-01
> > A_43_P16421   AA964066         -9.25E-04  9.87E-01
> > (+)Pro25G-02  Pro25G           -1.58E+00  1.10E-33
> > A_43_P13118   NM_130424        4.12E-02   4.68E-01
> > A_43_P19445   CB605581         -1.74E-03  9.93E-01
> > A_43_P11302   BQ206007         -9.17E-02  1.56E-01
> > A_43_P22361   AA964019         -1.31E-01  5.83E-01
> > A_43_P10152   BF420136         6.54E-02   4.11E-01
> > A_42_P573643  234509_Rn        5.87E-02   2.43E-01
> >
> > I assume the LogRatio is the signal against background? But what is
> > (+)Pro25G-03 or (-)3xSLv1?
>
> I think the LogRatio is probably the ratio between Red and Green, but
> there is no way to tell from this alone.  Have these files been manipulated
> in Excel or something?  Agilent files typically have a much larger number
> of columns and have a 9-line header.  The (+)Pro25G-03 and (-)3xSLv1 are
> controls.
>
> > btw, are there some packages to read this type of data format?
>
> If you have files like the one above, read.table will read them just
> fine.  If you have actual Agilent files (which these are not, I don't
> think), then the limma package will read them.
>
> Sean
>
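
For files shaped like the sample quoted above, the read.table route might
look roughly like the sketch below. It is only an illustration under stated
assumptions: that the real files are tab-delimited with a single header line,
and that control rows can be recognised by the "(+)" / "(-)" prefix in
ProbeName. The variable names (txt, dat, is_control, probes) are made up for
this example, and a few rows are pasted inline so the snippet runs without
the actual files.

```r
# A few rows copied from the sample above. Tab-delimited input is an
# assumption; the real files may use a different separator.
txt <- paste(
  "ProbeName\tGeneName\tLogRatio\tPValueLogRatio",
  "(+)Pro25G-03\tPro25G\t-4.92E-01\t9.04E-16",
  "(-)3xSLv1\tNegativeControl\t0.00E+00\t1.00E+00",
  "A_43_P21252\tCB546590\t1.64E-02\t7.96E-01",
  "A_42_P534203\t272585_Rn\t-8.69E-03\t9.16E-01",
  sep = "\n"
)

# For a real file, replace `text = txt` with the file name,
# e.g. "cy3_50mg_6h_Rat_3125.txt".
dat <- read.table(text = txt, header = TRUE, sep = "\t",
                  quote = "", comment.char = "",
                  stringsAsFactors = FALSE)

# Assumption: control spots are the rows whose ProbeName starts
# with "(+)" or "(-)".
is_control <- grepl("^\\([+-]\\)", dat$ProbeName)

# Keep only the non-control probes for downstream analysis.
probes <- dat[!is_control, ]

str(dat)
```

If the raw Agilent Feature Extraction output turns up, limma's
read.maimages(files, source = "agilent") is the usual entry point instead;
it expects the full multi-column files, not a trimmed table like this one.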


-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III


