[R] prop.trend.test
peter dalgaard
pd@|gd @end|ng |rom gm@||@com
Tue Sep 12 14:21:24 CEST 2023
Argh, yes, drats, thanks.
There will be a matter of an estimated residual error.
So
> coef(summary(ht$lmfit))["score","t value"]*sigma(ht$lmfit)
[1] -2.867913
matches the signes square root of the Chi-square.
Or, likely better (avoid 0 df cases), switch to a Gaussian glm fit and use the z stat
> coef(summary(ht$glmfit, dispersion = 1))
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.02187141 0.03199737 31.936102 8.425506e-224
score -0.03341563 0.01165155 -2.867913 4.131897e-03
The Estimate (-0.0334) should still make sense as the LPM estimate of the regression slope.
- Peter D.
> On 8 Sep 2023, at 12:53 , Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
> Às 10:06 de 08/09/2023, peter dalgaard escreveu:
>> Yes, this was written a bit bone-headed (as I am allowed to say...)
>> If you look at the code, you will see inside:
>> a <- anova(lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)),
>> weights = w))
>> and the lm() inside should give you the direction via the sign of the regression coefficient on "score".
>> So, at least for now, you could just doctor a copy of the code for your own purposes, as in
>> fit <- lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)),
>> weights = w)
>> a <- anova(fit)
>> and arrange to return coef(fit)["score"] at the end. Something like structure(... estimate=c(lpm.slope=coef(fit)["score"]) ....)
>> (I expect that you might also extract the t-statistic from coef(summary(fit)) and find that it is the signed square root of the Chi-square, but I won't have time to test that just now.)
>> -pd
>>> On 8 Sep 2023, at 07:22 , Thomas Subia via R-help <r-help using r-project.org> wrote:
>>>
>>> Colleagues,
>>>
>>> Thanks all for the responses.
>>>
>>> I am monitoring the daily total number of defects per sample unit.
>>> I need to know whether this daily defect proportion is trending upward (a bad thing for a manufacturing process).
>>>
>>> My first thought was to use either a u or a u' control chart for this.
>>> As far as I know, u or u' charts are poor to detect drifts.
>>>
>>> This is why I chose to use prop.trend.test to detect trends in proportions.
>>>
>>> While prop.trend.test can confirm the existence of a trend, as far as I know, it is left to the user
>>> to determine what direction that trend is.
>>>
>>> One way to illustrate trending is of course to plot the data and use geom_smooth and method lm
>>> For the non-statisticians in my group, I've found that using this method along with the p-value of prop.trend.test, makes it easier for the users to determine the existence of trending and its direction.
>>>
>>> If there are any other ways to do this, please let me know.
>>>
>>> Thomas Subia
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thursday, September 7, 2023 at 10:31:27 AM PDT, Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>
>>>
>>>
>>>
>>>
>>> Às 14:23 de 07/09/2023, Thomas Subia via R-help escreveu:
>>>>
>>>> Colleagues
>>>>
>>>> Consider
>>>> smokers <- c( 83, 90, 129, 70 )
>>>> patients <- c( 86, 93, 136, 82 )
>>>>
>>>> prop.trend.test(smokers, patients)
>>>>
>>>> Output:
>>>>
>>>> Chi-squared Test for Trend inProportions
>>>>
>>>> data: smokers out of patients ,
>>>>
>>>> using scores: 1 2 3 4
>>>>
>>>> X-squared = 8.2249, df = 1, p-value = 0.004132
>>>>
>>>> # trend test for proportions indicates proportions aretrending.
>>>>
>>>> How does one identify the direction of trending?
>>>> # prop.test indicates that the proportions are unequal but doeslittle to indicate trend direction.
>>>> All the best,
>>>> Thomas Subia
>>>>
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>> Hello,
>>>
>>> By visual inspection it seems that there is a decreasing trend.
>>> Note that the sample estimates of prop.test and smokers/patients are equal.
>>>
>>>
>>> smokers <- c( 83, 90, 129, 70 )
>>> patients <- c( 86, 93, 136, 82 )
>>>
>>> prop.test(smokers, patients)$estimate
>>> #> prop 1 prop 2 prop 3 prop 4
>>> #> 0.9651163 0.9677419 0.9485294 0.8536585
>>>
>>> smokers/patients
>>>
>>> #> [1] 0.9651163 0.9677419 0.9485294 0.8536585
>>>
>>> plot(smokers/patients, type = "b")
>>>
>>>
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> Hello,
>
> Actually, the t-statistic is not the signed square root of the X-squared test statistic. I have edited the function, assigned the lm fit and returned it as is. (print.htest won't print this new list member so the output is not cluttered with irrelevant noise.)
>
>
> smokers <- c( 83, 90, 129, 70 )
> patients <- c( 86, 93, 136, 82 )
>
> edit(prop.trend.test, file = "ptt.R")
> source("ptt.R")
>
> # stats::prop.trend.test edited to include the results
> # of the lm fit and saved under a new name
> ptt <- function (x, n, score = seq_along(x))
> {
> method <- "Chi-squared Test for Trend in Proportions"
> dname <- paste(deparse1(substitute(x)), "out of", deparse1(substitute(n)),
> ",\n using scores:", paste(score, collapse = " "))
> x <- as.vector(x)
> n <- as.vector(n)
> p <- sum(x)/sum(n)
> w <- n/p/(1 - p)
> a <- anova(fit <- lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)),
> weights = w))
> chisq <- c(`X-squared` = a["score", "Sum Sq"])
> structure(list(statistic = chisq, parameter = c(df = 1),
> p.value = pchisq(as.numeric(chisq), 1, lower.tail = FALSE),
> method = method, data.name = dname
> , lmfit = fit
> ), class = "htest")
> }
>
>
> ht <- ptt(smokers, patients)
> ht$statistic |> sqrt()
> #> X-squared
> #> 2.867913
> ht$lmfit |> summary() |> coef()
> #> Estimate Std. Error t value Pr(>|t|)
> #> (Intercept) 1.02187141 0.04732740 21.591539 0.00213815
> #> score -0.03341563 0.01723384 -1.938954 0.19207036
>
>
> Hope this helps,
>
> Rui Barradas
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com
More information about the R-help
mailing list