[R] t.test with Welch correction is ambiguous

Mon Nov 27 13:42:25 CET 2023

Dear R Team!

There was an ongoing debate on Research Gate about the “Welch” option in
your base R t.test command. A user noticed that the correction of the
degrees of freedom labeled as “Welch Two Sample t-test”, if you choose
var.equal
= TRUE in the R t.test command, differs from the output of the Stata
analysis, which is also labeled as “Welch's degrees of freedom”.  Confusingly
enough, the R output coincided with the Stata result labeled as
“Satterthwaite's
degrees of freedom”. Unfortunately, the R documentation wasn’t clear
either, since it lacks any specific reference and the formulation is
ambiguous: “If TRUE then the pooled variance is used to estimate the
variance otherwise the Welch (or Satterthwaite) approximation to the
degrees of freedom is used." It rather sounds as if both options are
available and not that both authors proposed the same correction separately.

After doing some research and looking into the R code, we found a solution
and would like to suggest an update to the R documentation, to make it more
clear (you can find the similar proposal to the Stata list here:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1734987-unequal-vs-welch-options-for-ttest-why-no-mention-of-welch-1938-in-the-documentation
)

What is called “Welch Two Sample t-test” in the t.test command refers to
two publications (see links below) with the same correction, namely Welch
(1938) and Satterthwaite (1946). Hence, you also find "Welch–Satterthwaite"
correction as a description in the literature for this (which is the
aforementioned “Satterthwaite's degrees of freedom” correction in Stata).
But there is also another correction proposed by Welch (1947), which has
slightly different denominators (see code below), which is called “Welch's
degrees of freedom” in Stata. This option is not available in R so far.

Therefore, we suggest a) to cite the appropriate references in the
documentation (at least Welch (1938) and Satterthwaite (1946)), b) adapt
the output to something like “Welch-Satterthwaite adjusted Two Sample
t-test” and maybe c) to incorporate the third option for the Welch (1947)
adjustment, where the Welch-Satterthwaite correction should be the default
option (Aspin & Welch, 1949). Code proposal below for the df correction.

Best wishes,
Rainer Düsing

   1. ·  https://www.jstor.org/stable/2332010
   <https://www.researchgate.net/deref/https%3A%2F%2Fwww.jstor.org%2Fstable%2F2332010?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>
   (Welch, 1938)
   2. ·  https://www.jstor.org/stable/3002019
   <https://www.researchgate.net/deref/https%3A%2F%2Fwww.jstor.org%2Fstable%2F3002019?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>
   (Satterthwaite, 1946)
   3. ·  https://www.jstor.org/stable/2332510
   <https://www.researchgate.net/deref/https%3A%2F%2Fwww.jstor.org%2Fstable%2F2332510?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>
   (Welch, 1947)
   4. ·  Aspin, Alice A., and B. L. Welch. “Tables for Use in Comparisons
   Whose Accuracy Involves Two Variances, Separately Estimated.”
   *Biometrika* 36, no. 3/4 (1949): 290–96. https://doi.org/10.2307/2332668
   <https://www.researchgate.net/deref/https%3A%2F%2Fdoi.org%2F10.2307%2F2332668?_tp=eyJjb250ZXh0Ijp7ImZpcnN0UGFnZSI6InF1ZXN0aW9uIiwicGFnZSI6InF1ZXN0aW9uIiwicG9zaXRpb24iOiJwYWdlQ29udGVudCJ9fQ>.
   [see point 4 in the Appendix by Welch]

var.equal = "yes"

var.equal = "Welch"

var.equal = "W-S"

vx <- var(x)

nx <- length(x)

vy <- var(y)

ny <- length(y)

if (var.equal == "yes") {

  df <- nx + ny - 2

  v <- 0

  if (nx > 1)

    v <- v + (nx - 1) * vx

  if (ny > 1)

    v <- v + (ny - 1) * vy

  v <- v/df

  stderr <- sqrt(v * (1/nx + 1/ny))

} else if (var.equal == "Welch") {

  stderrx <- sqrt(vx/nx)

  stderry <- sqrt(vy/ny)

  stderr <- sqrt(stderrx^2 + stderry^2)

  df <- -2+(stderr^4/(stderrx^4/(nx + 1) + stderry^4/(ny +1)))

} else {

  stderrx <- sqrt(vx/nx)

  stderry <- sqrt(vy/ny)

  stderr <- sqrt(stderrx^2 + stderry^2)

  df <- stderr^4/(stderrx^4/(nx - 1) + stderry^4/(ny -1))

}

-- 
*Dr. rer. nat. Rainer Düsing, Dipl.-Psych. *
Universität Osnabrück
Institut für Psychologie
Fachgebiet Forschungsmethodik, Diagnostik und Evaluation
Lise-Meitner-Str. 3
49076 Osnabrück

Raum 75/222
Tel: +49-541 969 7734
Email: raduesing using uos.de <rduesing using uos.de>

	[[alternative HTML version deleted]]