[BioC] agilent data format
Weiwei Shi
helprhelp at gmail.com
Thu Jun 14 21:56:52 CEST 2007
The original paper is not very clear about the data the authors provided,
so I am contacting them about it.
If LogRatio is treatment vs. control, why are there two files with similar
names that differ only in the dye name? Both files have a LogRatio column,
but with different values for the same probe.
I have heard that JMP can handle data preprocessing, normalization, gene
selection, multidimensional scaling, PCA, clustering analysis, and
annotation analysis. Does Bioconductor provide a similar pipeline?
This is work inherited from others, and it is really giving me a headache...
On 6/14/07, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> Weiwei Shi wrote:
> > Dear Listers:
> >
> > This is my first time looking at Agilent data, for a research project
> > on cross-platform issues, so this is very much a newbie question.
> >
> > One example is this pair of files:
> > cy3_50mg_6h_Rat_3125.txt and cy5_50my_6h_Rat_3125.txt.
> > I believe one of them must be the control, but I am not sure which one
> > (is there a convention of using cy5 as the control?).
>
> You will almost certainly need to communicate with the biologist that
> did the arrays to determine the experimental design.
>
> > The format of one of the datasets looks like this:
> >
> > ProbeName     GeneName         LogRatio   PValueLogRatio
> > (+)Pro25G-03  Pro25G           -4.92E-01  9.04E-16
> > (-)3xSLv1     NegativeControl  0.00E+00   1.00E+00
> > A_43_P21252   CB546590         1.64E-02   7.96E-01
> > A_42_P534203  272585_Rn        -8.69E-03  9.16E-01
> > A_43_P22195   CB547437         4.77E-02   4.90E-01
> > A_43_P16421   AA964066         -9.25E-04  9.87E-01
> > (+)Pro25G-02  Pro25G           -1.58E+00  1.10E-33
> > A_43_P13118   NM_130424        4.12E-02   4.68E-01
> > A_43_P19445   CB605581         -1.74E-03  9.93E-01
> > A_43_P11302   BQ206007         -9.17E-02  1.56E-01
> > A_43_P22361   AA964019         -1.31E-01  5.83E-01
> > A_43_P10152   BF420136         6.54E-02   4.11E-01
> > A_42_P573643  234509_Rn        5.87E-02   2.43E-01
> >
> > I assume the LogRatio is the signal relative to background? And what
> > are (+)Pro25G-03 and (-)3xSLv1?
>
> I think the LogRatio is probably the ratio between Red and Green, but
> there is no way to tell from what is shown here. Have these files been
> manipulated in Excel or something? Agilent files typically have a much
> larger number of columns and a 9-line header. The (+)Pro25G-03 and
> (-)3xSLv1 entries are controls.
>
> > By the way, are there packages that read this data format?
>
> If you have files like the one above, read.table will read them just
> fine. If you have actual Agilent files (which these are not, I don't
> think), then the limma package will read them.
>
> Sean
>
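
For anyone handling the same flat four-column layout outside R, here is a
minimal Python sketch of what "just read the table" amounts to. It is not
the read.table or limma route suggested above. It assumes one header line,
whitespace-delimited fields with no spaces inside a field, and that control
probes are the ones whose names start with "(+)" or "(-)". The function
names are made up for illustration, and the sample rows are copied from the
excerpt quoted above.

```python
import io

# Sample rows copied from the quoted excerpt. A real file is assumed to have
# the same four columns, one header line, and no spaces inside a field.
SAMPLE = """ProbeName GeneName LogRatio PValueLogRatio
(+)Pro25G-03 Pro25G -4.92E-01 9.04E-16
(-)3xSLv1 NegativeControl 0.00E+00 1.00E+00
A_43_P21252 CB546590 1.64E-02 7.96E-01
A_42_P534203 272585_Rn -8.69E-03 9.16E-01
"""


def read_logratio_table(handle):
    """Parse the flat table into a list of dicts keyed by column name."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        row = dict(zip(header, fields))
        # Convert the two numeric columns from scientific notation.
        row["LogRatio"] = float(row["LogRatio"])
        row["PValueLogRatio"] = float(row["PValueLogRatio"])
        rows.append(row)
    return rows


def is_control(row):
    """Assumption: control probe names begin with "(+)" or "(-)"."""
    return row["ProbeName"].startswith(("(+)", "(-)"))


rows = read_logratio_table(io.StringIO(SAMPLE))
probes = [r for r in rows if not is_control(r)]
print(len(rows), "rows read,", len(probes), "non-control probes")
```

To use it on a file, pass an open file handle instead of the StringIO
object. Dropping the control rows first makes it easier to compare the
LogRatio values for the same probe across the cy3 and cy5 files.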
--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.
"Did you always know?"
"No, I did not. But I believed..."
---Matrix III