[BioC] Question regarding MADE4 graphs
aedin culhane
aedin at jimmy.harvard.edu
Fri May 4 18:22:41 CEST 2012
Hi Barbara
Yes, F1 and F2 refer to the factors or axes, where F1 is the first axis, F2
is the second most highly ranked, etc. The % they state is the amount of
variance captured by each axis. These plots were drawn in ade4, and
made4 is an extension to ade4.
In COA the values refer to the % of total inertia captured by each
principal axis, where the sum of the inertia is equal to the
total chi-sq of the matrix divided by its grand total. COA transforms the
data into a matrix of chi-square distances before decomposing the matrix
using SVD. To calculate the chi-sq (it's really simple), your observed
values are the real data values in the matrix and your expected values are
the product of the row and column weights (scaled by the grand total), and
these are plugged into the Pearson ChiSq statistic.
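A minimal sketch in base R of the calculation described above. The 3 x 3
table of counts is made up for illustration and the variable names are my
own; this is not how ade4 or made4 implement it internally, only the same
arithmetic written out by hand.

```r
# Hypothetical table of counts (illustration only, not from any dataset)
obs <- matrix(c(20, 10,  5,
                10, 20, 10,
                 5, 10, 30),
              nrow = 3, byrow = TRUE)

n        <- sum(obs)            # grand total of the matrix
row_mass <- rowSums(obs) / n    # row weights
col_mass <- colSums(obs) / n    # column weights

# Expected values: product of row and column weights, scaled by the grand total
expected <- n * outer(row_mass, col_mass)

# Pearson chi-square statistic
chisq <- sum((obs - expected)^2 / expected)

# Total inertia is the chi-square statistic divided by the grand total
total_inertia <- chisq / n

# Decompose the standardized residuals with SVD; the squared singular
# values are the inertias (eigenvalues) of the principal axes
P   <- obs / n
S   <- (P - outer(row_mass, col_mass)) / sqrt(outer(row_mass, col_mass))
eig <- svd(S)$d^2

# % of total inertia captured by each axis
pct_inertia <- 100 * eig / sum(eig)
```

Here sum(eig) matches total_inertia, which is why the percentages on the
axes can be read as shares of the total chi-sq.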
There are numerous nice tutorials on COA online. I would recommend the
slides from Michael Greenacre, available at
http://www.econ.upf.edu/~michael/vienna/CARME1_BW.pdf. A worked example
is also given in
http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf
To calculate the % inertia on each factor or principal axis in made4:
dudi$ord$eig*100/sum(dudi$ord$eig)
The cumulative variance is given by
cumsum(dudi$ord$eig*100/sum(dudi$ord$eig))
where dudi is the result of running an ord analysis
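To make the two expressions above concrete without needing any data, here
is a small sketch that uses a stand-in list with the same $ord$eig shape.
The eigenvalues are invented for illustration, and the axis-label format is
only one possible choice.

```r
# Stand-in for the result of an ord analysis; the eigenvalues are hypothetical
dudi <- list(ord = list(eig = c(0.4, 0.3, 0.2, 0.1)))

# % inertia on each factor or principal axis
pct <- dudi$ord$eig * 100 / sum(dudi$ord$eig)

# Cumulative % across axes
cum_pct <- cumsum(pct)

# Axis labels in the style "F1 40.0%"
axis_labels <- sprintf("F%d %.1f%%", seq_along(pct), pct)
```

The first two elements of axis_labels can then be used as the x and y axis
labels of the plot.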
If you are using ade4, run dudi.coa and drop the $ord subset above, i.e.
use dudi$eig. I have explained this in more detail in a tutorial I wrote:
http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/EDA.pdf. R code
and datasets to run this tutorial are available at
http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/
Best wishes
Aedin
On 5/3/2012 7:59 AM, Barbara Shih wrote:
> Hi Aedin,
> Thank you very much for your explanation. I understand the d value now. However, I still have some questions regarding the axes for the correspondence analysis.
> For instance, http://www.sciencedirect.com/science/article/pii/S1567134810000730
> In Figure 1, they put down on the axes "F2 14.2%" and "F1 15.9%". Does the % usually apply to CA plots? Or is that a study-specific situation?
> Also, are F1 and F2 for correspondence analysis?
>
>
> P.S. Should I reply in the Bioconductor list?
>
> Thank you very much for your help.
>
> Regards
> Barbara
>
>
>
> ________________________________________
> From: aedin culhane [aedin at jimmy.harvard.edu]
> Sent: 02 May 2012 16:02
> To: Barbara Shih
> Subject: Re: Question regarding MADE4 graphs
>
> Hi Barbara
> I replied on the Bioc list about the d; it indicates the scale. In
> principal components analysis or correspondence analysis, we don't tend
> to include numbers on the axes, as these are relative and not really
> meaningful. Also the axis orientation is arbitrary, i.e. whether samples
> are at the negative or positive end of the axis. We are interested in
> the relative direction and distance from the origin (the greater the
> distance the higher the weight or score that sample has in that axis).
> Samples projected in the same direction share the same trend
> (upregulation of the same genes)
>
> If you want to include an axis label, I would simply put PC1, PC2 etc.
> or F1, F2 for principal component or factor respectively. The first
> component (axes1 in made4) is the horizontal (x-axis) and the second
> (axes2) is the vertical (y-axis)
>
> Hope this helps, please let me know if you have more questions
> Aedin
>
>
>
> On 5/1/2012 4:55 AM, Barbara Shih wrote:
>> Dear Dr Culhane,
>> I am currently using MADE4 for my data analysis. I am wondering how to
>> label the axis, and what the number in the top right corner means, for
>> the correspondence analysis (COA) plot generated by the package
>> (figure 3 in the link)
>> http://www.bioconductor.org/packages/2.9/bioc/vignettes/made4/inst/doc/introduction.pdf
>>
>> Thank you very much for your help in advance.
>>
>> Barbara
> --
> Aedin Culhane
> Computational Biology and Functional Genomics Laboratory
> Harvard School of Public Health,
> Dana-Farber Cancer Institute
>
> web: http://www.hsph.harvard.edu/research/aedin-culhane/
> email: aedin at jimmy.harvard.edu
> phone: +1 617 632 2468
> Fax: +1 617 582 7760
>
>
> Mailing Address:
> Attn: Aedin Culhane, SM822C
> 450 Brookline Ave.
> Boston, MA 02215
>