[R] The RV coinertia coefficient to interpret multivariate analysis plots

Tue Oct 15 16:15:26 CEST 2024

On Sun, 13 Oct 2024 13:19:09 +0200
David Bars <el.segarrenc using gmail.com> wrote:

> Through different microbiota datasets, I have plotted PCoA, db-RDA and
> sPLS-DA using 3 different normalization methods (total sum scaling,
> cumulative sum scaling and rarefaction). For each dataset
> and multivariate analysis (PCoA, db-RDA or sPLS-DA), in order to easily
> see whether the different normalization strategies give different or
> equivalent results (PCoA ordinations, for example), I have calculated
> the Procrustes sum of squares and the RV coefficient of co-inertia.
> However, for the RV coefficient of co-inertia, I have obtained the
> value 1 (perfect equivalence) for the PCoA comparisons amongst the 3
> methods of normalization, and also for the db-RDA. For the sPLS-DA I
> have not obtained 1 for all the comparisons.

> Why could I obtain 1 for PCoA and db-RDA and not for sPLS-DA?

It's hard to give a precise answer without seeing the details. For
example, how exactly was sPLS-DA calculated? Isn't it a supervised
_classification_ method, unlike the distance-based methods PCoA
(unsupervised dimensionality reduction) and db-RDA (correlation
analysis between two blocks of data)?

Perhaps the reason you couldn't get an RV coefficient of exactly 1 for
sPLS-DA with different scalings is that the method is sparse, i.e.,
it sacrifices some of its ability to explain the data in order to
obtain loadings that are exactly 0 for "unimportant" variables.
Different scalings may have resulted in different subsets of related
variables passing the "importance" threshold, so the resulting sample
scores would no longer be the same configuration up to rotation and
scale.
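
To make this concrete, here is a minimal sketch of the RV coefficient
in plain Python (I am using Python only to keep the example
self-contained). The numbers are toy values, not your data, and the
helper names (center, gram, rv_coefficient) are made up for this
illustration. It only shows two general properties: RV stays at 1 when
one set of sample scores is a rotated and uniformly rescaled copy of
the other, and drops below 1 when the scores are built from different
subsets of variables. Whether that is what happens in your sPLS-DA
fits is an assumption you would need to check against the selected
variables.

```python
import math


def center(X):
    """Subtract the column means from a samples-by-axes matrix."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def gram(X):
    """Sample-by-sample cross-product matrix X X'."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in X] for r1 in X]


def inner(A, B):
    """trace(A B) for symmetric A and B, i.e. the elementwise inner product."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def rv_coefficient(X, Y):
    """RV = tr(Sx Sy) / sqrt(tr(Sx Sx) * tr(Sy Sy)), with S = X X' after centering."""
    Sx, Sy = gram(center(X)), gram(center(Y))
    return inner(Sx, Sy) / math.sqrt(inner(Sx, Sx) * inner(Sy, Sy))


# Toy table: 5 samples, 3 variables (made-up values).
data = [
    [1.0, 2.0, 0.5],
    [2.0, 1.0, 3.0],
    [3.0, 4.0, 1.0],
    [4.0, 3.0, 2.5],
    [5.0, 6.0, 0.0],
]

# "Scores" built from variables 1 and 2.
scores_a = [[row[0], row[1]] for row in data]

# The same configuration, rotated by 0.7 rad and scaled by 3.
theta, scale = 0.7, 3.0
c, s = math.cos(theta), math.sin(theta)
scores_rot = [[scale * (x * c - y * s), scale * (x * s + y * c)]
              for x, y in scores_a]

# "Scores" built from a different subset: variables 2 and 3.
scores_b = [[row[1], row[2]] for row in data]

rv_same = rv_coefficient(scores_a, scores_rot)
rv_diff = rv_coefficient(scores_a, scores_b)

print("rotated/rescaled copy:", rv_same)
print("different variable subset:", rv_diff)
```

The first value is 1 up to floating-point error, the second is
strictly smaller.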

-- 
Best regards,
Ivan