[R-pkgs] MANOVA for collinear responses with rotation testing (ffmanova) + Synthetic data (RegSDC)

Wed Jan 23 23:10:32 CET 2019

Hi,

Package ffmanova, originally released on CRAN in 2006, has now been updated to Version 1.0. It has been a stable workhorse, and the changes are cosmetic.

Package RegSDC (Version: 0.2.0) is a new package on CRAN. The two packages are theoretically related and both make use of conditioned multivariate normal simulations - rotation testing in ffmanova and synthetic data generation in RegSDC.

Package ffmanova is mainly meant as a package for multivariate responses, but it also involves a general contribution to ANOVA testing in linear models. The approach to sums of squares (Type II*) is invariant to scale changes of continuous variables, which avoids pitfalls such as the scale dependence of Type III and ordinary Type II tests. Try running the code below.

ffmanova: Fifty-Fifty MANOVA

General linear modeling with multiple responses (MANCOVA). An overall p-value for each model term is calculated by the 50-50 MANOVA method by Langsrud (2002) <https://doi.org/10.1111/1467-9884.00320>, which handles collinear responses. Rotation testing, described by Langsrud (2005) <https://doi.org/10.1007/s11222-005-4789-5>, is used to compute adjusted single response p-values according to familywise error rates and false discovery rates (FDR). The approach to FDR is described in the appendix of Moen et al. (2005) <https://doi.org/10.1128/AEM.71.4.2086-2094.2005>. Unbalanced designs are handled by Type II sums of squares as argued in Langsrud (2003) <https://doi.org/10.1023/A:1023260610025>. Furthermore, the Type II philosophy is extended to continuous design variables as described in Langsrud et al. (2007) <https://doi.org/10.1080/02664760701594246>. This means that the method is invariant to scale changes and that common pitfalls are avoided.

RegSDC: Information Preserving Regression-Based Tools for Statistical Disclosure Control

Information Preserving Regression-Based Tools for Statistical Disclosure Control
Implementation of the methods described in the paper with the above title: Langsrud, Ø. (2019) <https://doi.org/10.1007/s11222-018-9848-9>. Open view-only version at <https://rdcu.be/bfeWQ>. The package can be used to generate synthetic or hybrid continuous microdata, and the relationship to the original data can be controlled in several ways.

Best,
Øyvind Langsrud

set.seed(123)
z <- 1:9
x <- c(0, 0, 0, 10, 10, 10, 1, 1, 1)
y <- rnorm(9)/10 + x  # y depends strongly on x
z100 <- z + 100  # change of scale (origin)
x100 <- x + 100  # change of scale (origin)
library(car)  # Anova with type II as default
library(ffmanova)

# Type III depends on scale
Anova(lm(y ~ factor(x) * z), type = 3)
Anova(lm(y ~ factor(x) * z100), type = 3)

# Even Type II depends on scale
Anova(lm(y ~ x + I(x^2)), type = 2)
Anova(lm(y ~ x100 + I(x100^2)), type = 2)

# Type II* within ffmanova is invariant
ffmanova(y ~ x100 + I(x100^2))
ffmanova(y ~ z * (x100 + I(x100^2)))

# Type I for comparison
anova(lm(y ~ x100 + I(x100^2)))  # same as ffmanova
anova(lm(y ~ z * (x100 + I(x100^2))))  # but here z is significant
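
The conditioned multivariate normal simulation mentioned above as the common ground of the two packages can be illustrated in base R. What follows is only a minimal sketch of the general idea behind information preserving synthetic data, assuming a made-up design matrix and two made-up responses; it does not call RegSDC and should not be read as the package's own implementation. All object names (x, y, e_sim, e_syn, y_syn) are chosen for illustration.

```r
# Sketch only: base R, hypothetical data, not the RegSDC code.
set.seed(1)
n <- 20
x <- cbind(1, rnorm(n))            # design matrix with intercept (made-up)
y <- cbind(2 + x[, 2] + rnorm(n),  # two made-up continuous responses
           1 - x[, 2] + rnorm(n))

fit  <- lm.fit(x, y)
yhat <- fit$fitted.values          # part of y explained by x
e    <- fit$residuals              # original residuals

# Simulate normal noise and make it orthogonal to x
z     <- matrix(rnorm(n * ncol(y)), n, ncol(y))
e_sim <- lm.fit(x, z)$residuals

# Condition on the original residual cross-products:
# orthonormalise the simulated residuals, then rescale them with the
# triangular factor of the original residuals, so that
# crossprod(e_syn) equals crossprod(e)
e_syn <- qr.Q(qr(e_sim)) %*% qr.R(qr(e))
y_syn <- yhat + e_syn              # synthetic responses

# Regression coefficients and residual cross-products are preserved
fit_syn <- lm.fit(x, y_syn)
all.equal(fit_syn$coefficients, fit$coefficients)
all.equal(crossprod(fit_syn$residuals), crossprod(e))
```

The synthetic responses differ from the original ones row by row, yet any regression of y_syn on x reproduces the original estimates and residual covariance. The same conditioning step, applied repeatedly under a null hypothesis, is the kind of simulation that rotation testing relies on.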
