[RsR] Case where options in mad() seemingly have no effect

John Buggeln jhb @end|ng |rom ude|@edu
Wed Jul 2 18:14:12 CEST 2025


This was clarifying, thank you for your response.

A suggestion for the help file would be to specify which median the low/high options effect. E.g., 

"if TRUE, compute the ‘lo-median’ of the absolute deviations, i.e., for even sample size, do not average the two middle values, but take the smaller one.”

Best,

John Buggeln, MS
PhD Candidate
University of Delaware


> 
> On Jul 2, 2025, at 6:51 AM, Martin Maechler <maechler using stat.math.ethz.ch> wrote:
> 
>>>>>> John Buggeln 
>>>>>>    on Tue, 1 Jul 2025 11:39:34 -0400 writes:
> 
>> Hello!
>> I have what seems to be a quick question. Below I have pasted the code for the mad() function, from “package::stats”. 
> 
>> function (x, center = median(x), constant = 1.4826, na.rm = FALSE, 
>>         low = FALSE, high = FALSE) 
>> {
>>  if (na.rm) 
>>     x <- x[!is.na(x)]
>>  n <- length(x)
>>  constant * if ((low || high) && n%%2 == 0) {
>>     if (low && high) 
>>       stop("'low' and 'high' cannot be both TRUE")
>>     n2 <- n%/%2 + as.integer(high)
>>     sort(abs(x - center), partial = n2)[n2]
>>  }
>>  else median(abs(x - center))
>> }
> 
>> The documentation states that the options “low” and “high”
>> update the way the median is being calculated.
> 
> Yes, but *not* the "inner" median(x) which is the default for `center`, 
> but rather the "outer" median( |x - center| ).
> 
>> However, in the code there is no update to “center" from the value of low or high. What am I missing here? There must be something fundamental I don’t understand about R implementing this function. 
> 
> Yes.  I think you just need to read the help file more carefully.
> 
> If you think the help file was not clear, can you propose an
> improvement there ?
> 
>> I also have a small example where the options seemingly have to effect on the MAD calculation.
> 
> Yes, in that case there is no effect (of course, see below)
> 
>> ###################VERSION INFO#############################
> 
>> R 4.5.0 on MacOSX 14.5 (M2 Apple Silicon)
>> Sys.getlocale() is "en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8”
> 
>> Information on package ‘stats’
> 
>> Description:
> 
> ..............
> ..............
> 
> 
> ... the R version  4.5.0   is all that counts;  'stats' is part of R !
> 
>> ###################EXAMPLE#############################
> 
>>> var <-scan(text="3.2 3.6 4.3 4.3 4.4 4.7 5.1 5.3 5.5 5.8 6.7 8.9")
>> Read 12 items
>>> mad(var) 
>> [1] 0.88956
>>> mad(var, high=TRUE)
>> [1] 0.88956
>>> mad(var, low=TRUE)
>> [1] 0.88956
> 
> Indeed, in this case, the outer lo-median, median, and hi-median
> are identical (using 'x' instead of 'var' (which is after all
> the var() function)) :
> 
> 
> ##--------------o<------------------o<------------------o<-----------
> x <- c(3.2, 3.6, 4.3, 4.3, 4.4, 4.7, 5.1, 5.3, 5.5, 5.8, 6.7, 8.9)
> mad(x)            # 0.88956
> mad(x, high=TRUE) # 0.88956
> mad(x, low =TRUE) # 0.88956
> 
> ## MM:  the same, yes, because in this case,
> sort(abs(x - median(x)))
> ## [1] 0.2 0.2 0.4 0.5 0.6 0.6 0.6 0.9 1.3 1.7 1.8 4.0
> ##      1   2   3   4   5   6   7   8   ..          12
> ## the median(), lo-median() and hi-median  {w/o the constant factor}
> ## are all == 0.6
> 
> ## MM: slightly more interesting (after all this is about robustness !
> x <- round(10*x)
> x[1:3] <- 1000 + 1:3
> dput(x) # c(1001, 1002, 1003, 43, 44, 47, 51, 53, 55, 58, 67, 89)
> ## and now,
> cbind(mads <- c(lomad = mad(x, low=TRUE), mad = mad(x), himad = mad(x, high = TRUE))) / 1.4826
> ## give
> 
> ## lomad 10.5
> ## mad   11.5
> ## himad 12.5
> ##--------------o<------------------o<------------------o<-----------
> 
> 
> With best regards,
> Martin
> 
> 
> 
> 
>> This is confusing because when calculated manually the MAD is different:
> 
> This is not "the MAD" that we have defined; rather you use
> different *center*, i.e. lo_median, median, or hi_median for the __center__ 
> 
>>> # Manually
>>> median(abs(var-median(var)))*1.4826
>> [1] 0.88956
>>> median(abs(var-4.7))*1.4826 # LOW for median
>> [1] 1.03782
>>> median(abs(var-5.1))*1.4826 # HIGH for median
>> [1] 1.11195
> 
> 
>> Best,
> 
>> John Buggeln, MS
>> PhD Candidate
>> University of Delaware
>> _______________________________________________
>> R-SIG-Robust using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-robust



More information about the R-SIG-Robust mailing list