[R] OT: a weighted rank-based, non-paired test statistic ?
Torsten Hothorn
Torsten.Hothorn at stat.uni-muenchen.de
Tue Jun 9 09:27:03 CEST 2009
> Date: Fri, 5 Jun 2009 16:09:42 -0700 (PDT)
> From: Thomas Lumley <tlumley at u.washington.edu>
> To: dylan.beaudette at gmail.com
> Cc: "'r-help at stat.math.ethz.ch'" <r-help at stat.math.ethz.ch>
> Subject: Re: [R] OT: a weighted rank-based, non-paired test statistic ?
>
> On Fri, 5 Jun 2009, Dylan Beaudette wrote:
>> Is anyone aware of a rank-based, non-paired test, such as the
>> Kruskal-Wallis test, that can accommodate weights?
>
> You don't say what sort of weights, but basically, no.
>
> Whether you have precision weights or sampling weights, the test will no
> longer be distribution-free.
>
>> Alternatively, would it make sense to simulate a dataset by duplicating
>> observations in proportion to their weight, and then using the
>> Kruskal-Wallis test?
>
> No.
>
Well, if you have case weights, i.e., w[i] == 5 means that there are five
observations identical to observation i, then there are several ways to
do it:
> library("coin")
>
> set.seed(29)
> x <- gl(3, 10)
> y <- rnorm(length(x), mean = c(0, 0, 1)[x])
> d <- data.frame(y = y, x = x)
> w <- rep(2, nrow(d)) ### double each obs
>
> ### all the same
> kruskal_test(y ~ x, data = rbind(d, d))
Asymptotic Kruskal-Wallis Test
data: y by x (1, 2, 3)
chi-squared = 12.1176, df = 2, p-value = 0.002337
>
> kruskal_test(y ~ x, data = d[rep(1:nrow(d), w),])
Asymptotic Kruskal-Wallis Test
data: y by x (1, 2, 3)
chi-squared = 12.1176, df = 2, p-value = 0.002337
>
> kruskal_test(y ~ x, data = d, weights = ~ w)
Asymptotic Kruskal-Wallis Test
data: y by x (1, 2, 3)
chi-squared = 12.1176, df = 2, p-value = 0.002337
The first two calls work by duplicating the data; the last one is more
memory efficient because it computes the weighted statistics (and their
distribution) directly from the case weights. However, as Thomas pointed
out, other forms of weights are more difficult to deal with.
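
To make the case-weight idea concrete, here is a minimal base-R sketch of
how a Kruskal-Wallis statistic can be computed from integer case weights
without expanding the data. The helper weighted_kw() is hypothetical and
written only for illustration; it is not how coin implements the
weights argument. It assumes non-negative integer weights and checks its
result against kruskal.test() run on the explicitly duplicated data.

```r
# Hypothetical helper (an assumption for illustration, not part of coin):
# Kruskal-Wallis test with integer case weights, without expanding the data.
weighted_kw <- function(y, g, w) {
  g <- droplevels(factor(g))
  N <- sum(w)                               # size of the expanded sample

  # Total weight carried by each distinct value of y (handles ties in y).
  uy  <- sort(unique(y))
  idx <- match(y, uy)
  tw  <- as.vector(tapply(w, idx, sum))

  # Midrank each observation would receive in the expanded sample.
  below <- cumsum(tw) - tw                  # expanded obs with smaller y
  r <- below[idx] + (tw[idx] + 1) / 2

  # Weighted rank sums and weighted group sizes.
  R <- tapply(w * r, g, sum)
  n <- tapply(w, g, sum)

  H <- 12 / (N * (N + 1)) * sum(R^2 / n) - 3 * (N + 1)
  H <- H / (1 - sum(tw^3 - tw) / (N^3 - N)) # tie correction
  df <- nlevels(g) - 1
  c(statistic = H, df = df, p.value = pchisq(H, df, lower.tail = FALSE))
}

set.seed(29)
x <- gl(3, 10)
y <- rnorm(length(x), mean = c(0, 0, 1)[x])
w <- rep(2, length(y))                      # double each observation

res <- weighted_kw(y, x, w)

# Reference: base R's kruskal.test() on the explicitly duplicated data.
i <- rep(seq_along(y), w)
ref <- kruskal.test(y[i], x[i])
stopifnot(isTRUE(all.equal(unname(res["statistic"]), unname(ref$statistic))))
```

The only place the weights enter is in the midranks and in the rank sums,
which is why no copy of the data is needed. This shortcut is valid for
case weights only; it says nothing about precision or sampling weights.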
Best wishes,
Torsten
> -thomas
>
> Thomas Lumley Assoc. Professor, Biostatistics
> tlumley at u.washington.edu University of Washington, Seattle
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>