# [BioC] Question regarding MADE4 graphs

aedin culhane aedin at jimmy.harvard.edu
Fri May 4 18:22:41 CEST 2012

```Hi Barbara
Yes, F1 and F2 refer to the factors or axes, where 1 is the first axis, 2
is the second most highly ranked, etc.  The % they state is the amount of
variance captured by each axis.  These plots were drawn in ade4, and

In COA the values refer to the % of total inertia captured by each
principal axis, where the sum of the inertia is equal to the
total chi-sq of the matrix divided by its grand total.  COA transforms
the data into a matrix of chi-square distances before decomposing the
matrix using SVD.  To calculate the chi-sq (it's really simple), your
observed values are the real data values in the matrix and your expected
values are the product of the row and column weights, and these are
plugged into the Pearson ChiSq statistic.
There are numerous nice tutorials on COA online; I would recommend
slides from Michael Greenacre available online at
http://www.econ.upf.edu/~michael/vienna/CARME1_BW.pdf.  A worked example
is also given in
http://marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf
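
# A minimal base-R sketch of the chi-square and inertia calculation described
# above. It is an illustration only, not code from made4 or ade4; the 3 x 3
# matrix `N` and all variable names are made up for the example.
N <- matrix(c(20, 10,  5,
              10, 20, 10,
               5, 10, 30), nrow = 3, byrow = TRUE)
n     <- sum(N)                   # grand total of the matrix
row_w <- rowSums(N) / n           # row weights (masses)
col_w <- colSums(N) / n           # column weights (masses)
E     <- n * outer(row_w, col_w)  # expected values under independence
chisq <- sum((N - E)^2 / E)       # Pearson chi-square statistic
total_inertia <- chisq / n        # total inertia = chi-square / grand total

# SVD of the standardised residuals; the squared singular values are the
# inertia (eigenvalue) carried by each principal axis.
S   <- (N / n - outer(row_w, col_w)) / sqrt(outer(row_w, col_w))
eig <- svd(S)$d^2
eig <- eig[seq_len(min(dim(N)) - 1)]  # drop the trailing, numerically zero axis
eig * 100 / sum(eig)                  # % of total inertia per axis (F1, F2, ...)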

To calculate the % inertia on each factor or principal axis in made4:

dudi$ord$eig*100/sum(dudi$ord$eig)

The cumulative variance is given by
cumsum(dudi$ord$eig*100/sum(dudi$ord$eig))

where dudi is the result of running an ord analysis

If you are using ade4, run dudi.coa and drop the $ord subset above, i.e.
use dudi$eig.  I have explained this in more detail in a tutorial I wrote:
http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/EDA.pdf. R code
and datasets to run this tutorial are available at
http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/
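
# A short sketch of the ade4 route, assuming the ade4 package is installed.
# The toy table `tab` is made up for illustration; with made4 the same two
# calculations would use the $ord$eig element of the ord result instead of $eig.
library(ade4)
tab <- data.frame(a = c(20, 10, 5), b = c(10, 20, 10), c = c(5, 10, 30))
coa <- dudi.coa(tab, scannf = FALSE, nf = 2)  # scannf = FALSE skips the interactive scree plot
pct <- coa$eig * 100 / sum(coa$eig)           # % inertia on each axis
cumsum(pct)                                   # cumulative % across axes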

Best wishes
Aedin

On 5/3/2012 7:59 AM, Barbara Shih wrote:
> Hi Aedin,
> Thank you very much for your explanation. I understand the d value now. However, I still have some questions regarding the axis for the correspondence analysis.
> For instance, http://www.sciencedirect.com/science/article/pii/S1567134810000730
> In Figure 1, they put down on the axis "F2 14.2%" and "F1 15.9%". Does the % usually apply to CA plots? Or is that a study-specific situation?
> Also, are F1 and F2 used for correspondence analysis?
>
>
> P.S. Should I reply in the Bioconductor list?
>
> Thank you very much for your help.
>
> Regards
> Barbara
>
>
>
> ________________________________________
> From: aedin culhane [aedin at jimmy.harvard.edu]
> Sent: 02 May 2012 16:02
> To: Barbara Shih
> Subject: Re: Question regarding MADE4 graphs
>
> Hi Barbara
> I replied on the Bioc list about the d; it indicates the scale. In
> principal components analysis or correspondence analysis, we don't tend
> to include numbers on the axes, as these are relative and not really
> meaningful.  Also the axis orientation is arbitrary, ie whether samples
> are at the negative or positive end of the axis. We are interested in
> the relative direction and distance from the origin (the greater the
> distance the higher the weight or score that sample has in that axis).
> Samples projected in the same direction share the same trend
> (upregulation of the same genes)
>
> If you want to include an axis label, I would simply put PC1, PC2 etc
> or F1, F2 for principal component or factor respectively.   The first
> component (axes1 in made4) is the horizontal (x-axis) and the second
> (axes2) is the vertical (y-axis)
>
> Hope this helps, please let me know if you have more questions
> Aedin
>
>
>
> On 5/1/2012 4:55 AM, Barbara Shih wrote:
>> Dear Dr Culhane,
>> I am currently using MADE4 for my data analysis. I am wondering how to
>> label the axis, and what the number in the top right corner means, for
>> the correspondence analysis (COA) plot generated by the package
>> (figure 3 in the link)
>>
>>
>> Barbara
