[BioC] Question regarding MADE4 graphs

aedin culhane aedin at jimmy.harvard.edu
Fri May 4 18:22:41 CEST 2012


Hi Barbara
Yes, F1 and F2 refer to the factors or axes, where 1 is the first axis, 2 
is the second most highly ranked, etc.  The % they state is the amount of 
variance captured by each axis.  These plots were drawn in ade4, and 
made4 is an extension to ade4.

In COA the values refer to the % of total inertia captured by each 
principal axis, where the total inertia is equal to the total chi-sq of 
the matrix divided by its grand total.  COA transforms the data into a 
matrix of chi-square distances before decomposing the matrix using SVD.  
To calculate the chi-sq (it's really simple), your observed values are the 
real data values in the matrix and your expected values are the product of 
the row and column weights, and these are plugged into the Pearson chi-sq 
statistic.
There are numerous nice tutorials on COA online.  I would recommend the 
slides from Michael Greenacre, available at 
http://www.econ.upf.edu/~michael/vienna/CARME1_BW.pdf.  A worked example 
is also given in 
http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf
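
As a rough illustration of the chi-sq calculation described above, here is a minimal sketch in base R.  The table N is made up for the example, and the variable names are my own rather than anything from made4 or ade4.  It assumes the expected counts are the grand total times the product of the row and column weights, and it checks that the squared singular values of the standardised residuals sum to the chi-sq divided by the grand total.

```r
# Small made-up contingency table (illustrative values only)
N <- matrix(c(20, 10,  5,
              10, 15, 10,
               5, 10, 25),
            nrow = 3, byrow = TRUE)

n  <- sum(N)              # grand total
rw <- rowSums(N) / n      # row weights (masses)
cw <- colSums(N) / n      # column weights (masses)

# Expected counts under independence:
# grand total * row weight * column weight
E <- n * outer(rw, cw)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
chisq <- sum((N - E)^2 / E)

# Total inertia is the chi-square statistic divided by the grand total
total_inertia <- chisq / n

# The same inertia is recovered from the SVD of the standardised residuals
P   <- N / n
S   <- (P - outer(rw, cw)) / sqrt(outer(rw, cw))
eig <- svd(S)$d^2         # squared singular values (the last one is ~0)

print(c(chisq = chisq, total_inertia = total_inertia, sum_eig = sum(eig)))
```

The three printed values let you compare the chi-sq, the total inertia and the sum of the eigenvalues side by side; the last two should agree up to rounding.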

To calculate the % inertia on each factor or principal axis in made4:

dudi$ord$eig*100/sum(dudi$ord$eig)

The cumulative variance is given by
cumsum(dudi$ord$eig*100/sum(dudi$ord$eig))

where dudi is the result of running an ord analysis

If you are using ade4, run dudi.coa and drop the $ord subset above, i.e. 
use dudi$eig.  I have explained this in more detail in a tutorial I wrote: 
http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/EDA.pdf.  R code 
and datasets to run this tutorial are available at 
http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/
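
For the ade4 route, a minimal sketch might look like the following.  It assumes ade4 is installed; the table tab and the names coa, pct, cum and axis_labels are made up for the example.  The last step builds axis labels in the "F1 15.9%" style from the percentages.

```r
library(ade4)  # assumes the ade4 package is installed

# Made-up table of non-negative counts (illustrative values only)
tab <- data.frame(a = c(20, 10, 5),
                  b = c(10, 15, 10),
                  c = c(5, 10, 25),
                  row.names = c("r1", "r2", "r3"))

# scannf = FALSE skips the interactive scree plot; nf = 2 keeps two axes
coa <- dudi.coa(tab, scannf = FALSE, nf = 2)

pct <- coa$eig * 100 / sum(coa$eig)   # % inertia on each axis
cum <- cumsum(pct)                    # cumulative % inertia

# Axis labels in the "F1 15.9%" style
axis_labels <- sprintf("F%d %.1f%%", seq_along(pct), pct)
print(axis_labels)
```

With a made4 result the same lines should apply after swapping coa$eig for the $ord$eig element, as in the expressions above.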

Best wishes
Aedin

On 5/3/2012 7:59 AM, Barbara Shih wrote:
> Hi Aedin,
> Thank you very much for your explanation. I understand the d value now. However, I still have some questions regarding the axes for the correspondence analysis.
> For instance, http://www.sciencedirect.com/science/article/pii/S1567134810000730
> In Figure 1, they put "F2 14.2%" and "F1 15.9%" on the axes. Does the % usually apply to CA plots? Or is that a study-specific situation?
> Also, are F1 and F2 for correspondence analysis?
>
>
> P.S. Should I reply in the Bioconductor list?
>
> Thank you very much for your help.
>
> Regards
> Barbara
>
>
>
> ________________________________________
> From: aedin culhane [aedin at jimmy.harvard.edu]
> Sent: 02 May 2012 16:02
> To: Barbara Shih
> Subject: Re: Question regarding MADE4 graphs
>
> Hi Barbara
> I replied on the Bioc list about the d; it indicates the scale. In
> principal components analysis or correspondence analysis, we don't tend
> to include numbers on the axes, as these are relative and not really
> meaningful.  Also the axis orientation is arbitrary, i.e. whether samples
> are at the negative or positive end of the axis. We are interested in
> the relative direction and distance from the origin (the greater the
> distance the higher the weight or score that sample has in that axis).
> Samples projected in the same direction share the same trend
> (upregulation of the same genes)
>
> If you want to include an axis label, I would simply put PC1, PC2 etc
> or F1, F2 for principal component or factor respectively.   The first
> component (axes1 in made4) is the horizontal (x-axis) and the second
> (axes2) is the vertical (y-axis)
>
> Hope this helps, please let me know if you have more questions
> Aedin
>
>
>
> On 5/1/2012 4:55 AM, Barbara Shih wrote:
>> Dear Dr Culhane,
>> I am currently using MADE4 for my data analysis. I am wondering how to
>> label the axis, and what the number in the top right corner means, for
>> the correspondence analysis (COA) plot generated by the package
>> (figure 3 in the link)
>> http://www.bioconductor.org/packages/2.9/bioc/vignettes/made4/inst/doc/introduction.pdf
>>
>> Thank you very much for your help in advance.
>>
>> Barbara
> --
> Aedin Culhane
> Computational Biology and Functional Genomics Laboratory
> Harvard School of Public Health,
> Dana-Farber Cancer Institute
>
> web: http://www.hsph.harvard.edu/research/aedin-culhane/
> email: aedin at jimmy.harvard.edu
> phone: +1 617 632 2468
> Fax: +1 617 582 7760
>
>
> Mailing Address:
> Attn: Aedin Culhane, SM822C
> 450 Brookline Ave.
> Boston, MA 02215
>



