# [R] prop.trend.test

peter dalgaard pd@|gd @end|ng |rom gm@||@com
Tue Sep 12 14:21:24 CEST 2023

```Argh, yes, drats, thanks.

There will be a matter of an estimated residual error.

So

> coef(summary(ht\$lmfit))["score","t value"]*sigma(ht\$lmfit)
[1] -2.867913

matches the signes square root of the Chi-square.

Or, likely better (avoid 0 df cases), switch to a Gaussian glm fit and use the z stat

> coef(summary(ht\$glmfit, dispersion = 1))
Estimate Std. Error   z value      Pr(>|z|)
(Intercept)  1.02187141 0.03199737 31.936102 8.425506e-224
score       -0.03341563 0.01165155 -2.867913  4.131897e-03

The Estimate (-0.0334) should still make sense as the LPM estimate of the regression slope.

- Peter D.

> On 8 Sep 2023, at 12:53 , Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
> Às 10:06 de 08/09/2023, peter dalgaard escreveu:
>> Yes, this was written a bit bone-headed (as I am allowed to say...)
>> If you look at the code, you will see inside:
>>     a <- anova(lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)),
>>         weights = w))
>> and the lm() inside should give you the direction via the sign of the regression coefficient on "score".
>>  So, at least for now, you could just doctor a copy of the code for your own purposes, as in
>>  fit <- lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)),
>>         weights = w)
>>  a <- anova(fit)
>>  and arrange to return coef(fit)["score"] at the end. Something like structure(... estimate=c(lpm.slope=coef(fit)["score"]) ....)
>> (I expect that you might also extract the t-statistic from coef(summary(fit)) and find that it is the signed square root of the Chi-square, but I won't have time to test that just now.)
>> -pd
>>> On 8 Sep 2023, at 07:22 , Thomas Subia via R-help <r-help using r-project.org> wrote:
>>>
>>> Colleagues,
>>>
>>> Thanks all for the responses.
>>>
>>> I am monitoring the daily total number of defects per sample unit.
>>> I need to know whether this daily defect proportion is trending upward (a bad thing for a manufacturing process).
>>>
>>> My first thought was to use either a u or a u' control chart for this.
>>> As far as I know, u or u' charts are poor to detect drifts.
>>>
>>> This is why I chose to use prop.trend.test to detect trends in proportions.
>>>
>>> While prop.trend.test can confirm the existence of a trend, as far as I know, it is left to the user
>>> to determine what direction that trend is.
>>>
>>> One way to illustrate trending is of course to plot the data and use geom_smooth and method lm
>>> For the non-statisticians in my group, I've found that using this method along with the p-value of prop.trend.test, makes it easier for the users to determine the existence of trending and its direction.
>>>
>>> If there are any other ways to do this, please let me know.
>>>
>>> Thomas Subia
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thursday, September 7, 2023 at 10:31:27 AM PDT, Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>
>>>
>>>
>>>
>>>
>>> Às 14:23 de 07/09/2023, Thomas Subia via R-help escreveu:
>>>>
>>>> Colleagues
>>>>
>>>>    Consider
>>>> smokers  <- c( 83, 90, 129, 70 )
>>>> patients <- c( 86, 93, 136, 82 )
>>>>
>>>>    prop.trend.test(smokers, patients)
>>>>
>>>>    Output:
>>>>
>>>>        Chi-squared Test for Trend inProportions
>>>>
>>>>    data:  smokers out of patients ,
>>>>
>>>> using scores: 1 2 3 4
>>>>
>>>> X-squared = 8.2249, df = 1, p-value = 0.004132
>>>>
>>>>    # trend test for proportions indicates proportions aretrending.
>>>>
>>>>    How does one identify the direction of trending?
>>>>    # prop.test indicates that the proportions are unequal but doeslittle to indicate trend direction.
>>>> All the best,
>>>> Thomas Subia
>>>>
>>>>
>>>>     [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> and provide commented, minimal, self-contained, reproducible code.
>>> Hello,
>>>
>>> By visual inspection it seems that there is a decreasing trend.
>>> Note that the sample estimates of prop.test and smokers/patients are equal.
>>>
>>>
>>> smokers  <- c( 83, 90, 129, 70 )
>>> patients <- c( 86, 93, 136, 82 )
>>>
>>> prop.test(smokers, patients)\$estimate
>>> #>    prop 1    prop 2    prop 3    prop 4
>>> #> 0.9651163 0.9677419 0.9485294 0.8536585
>>>
>>> smokers/patients
>>>
>>> #> [1] 0.9651163 0.9677419 0.9485294 0.8536585
>>>
>>> plot(smokers/patients, type = "b")
>>>
>>>
>>>
>>> Hope this helps,
>>>
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> and provide commented, minimal, self-contained, reproducible code.
> Hello,
>
> Actually, the t-statistic is not the signed square root of the X-squared test statistic. I have edited the function, assigned the lm fit and returned it as is. (print.htest won't print this new list member so the output is not cluttered with irrelevant noise.)
>
>
> smokers  <- c( 83, 90, 129, 70 )
> patients <- c( 86, 93, 136, 82 )
>
> edit(prop.trend.test, file = "ptt.R")
> source("ptt.R")
>
> # stats::prop.trend.test edited to include the results
> # of the lm fit and saved under a new name
> ptt <- function (x, n, score = seq_along(x))
> {
>  method <- "Chi-squared Test for Trend in Proportions"
>  dname <- paste(deparse1(substitute(x)), "out of", deparse1(substitute(n)),
>                 ",\n using scores:", paste(score, collapse = " "))
>  x <- as.vector(x)
>  n <- as.vector(n)
>  p <- sum(x)/sum(n)
>  w <- n/p/(1 - p)
>  a <- anova(fit <- lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)),
>                       weights = w))
>  chisq <- c(`X-squared` = a["score", "Sum Sq"])
>  structure(list(statistic = chisq, parameter = c(df = 1),
>                 p.value = pchisq(as.numeric(chisq), 1, lower.tail = FALSE),
>                 method = method, data.name = dname
>                 , lmfit = fit
>                 ), class = "htest")
> }
>
>
> ht <- ptt(smokers, patients)
> ht\$statistic |> sqrt()
> #> X-squared
> #>  2.867913
> ht\$lmfit |> summary() |> coef()
> #>                Estimate Std. Error   t value   Pr(>|t|)
> #> (Intercept)  1.02187141 0.04732740 21.591539 0.00213815
> #> score       -0.03341563 0.01723384 -1.938954 0.19207036
>
>
> Hope this helps,
>