[R] How to run prop.test on 3-level factors?

Jim Lemon
Tue Nov 16 10:59:03 CET 2021


Hi Luigi,
Maybe multinomial regression? prop.test expects counts of successes, and
Class has three levels rather than two.

https://www.r-bloggers.com/2020/05/multinomial-logistic-regression-with-r/
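
A minimal sketch of what that could look like, assuming the nnet package
(one of R's recommended packages) and its multinom() function. The toy
data below are simulated purely for illustration and only mirror the
column names in the str(df) output quoted below; the formula
Class ~ MR + FCN is an assumption about which predictors matter.

```r
# Minimal sketch: multinomial logistic regression on simulated data.
# `toy` and all its values are invented; only the column names
# (MR, FCN, Class) follow the original data frame.
library(nnet)

set.seed(1)
n <- 300
toy <- data.frame(
  MR  = runif(n, 0, 0.5),
  FCN = runif(n, 2, 50)
)
toy$Class <- factor(
  sample(c("negative", "positive", "uncertain"), n, replace = TRUE),
  levels = c("negative", "positive", "uncertain")
)

# With three levels, multinom() fits two sets of coefficients,
# each relative to the reference level "negative".
fit <- multinom(Class ~ MR + FCN, data = toy, trace = FALSE)

# Fitted class probabilities: one row per observation, one column per level.
probs <- predict(fit, newdata = toy, type = "probs")
head(probs)
```

summary(fit) gives the coefficients and standard errors if you need
them, though on 5 million rows the fit may take a while.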

Jim

On Tue, Nov 16, 2021 at 7:33 PM Luigi Marongiu wrote:
>
> Hello,
> I have a large database with a column containing a factor:
> ```
> > str(df)
> 'data.frame': 5000000 obs. of  4 variables:
> $ MR   : num  0.000809 0.001236 0.001663 0.002089 0.002516 ...
> $ FCN  : num  2 2 2 2 2 2 2 2 2 2 ...
> $ Class: Factor w/ 3 levels "negative","positive",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ Set  : int  1 1 1 1 1 1 1 1 1 1 ...
> - attr(*, "out.attrs")=List of 2
> ..$ dim     : Named int [1:2] 1000 1000
> .. ..- attr(*, "names")= chr [1:2] "X1" "X2"
> ..$ dimnames:List of 2
> .. ..$ X1: chr [1:1000] "X1=0.0008094667" "X1=0.0012360955" "X1=0.0016627243" "X1=0.0020893531" ...
> .. ..$ X2: chr [1:1000] "X2= 2.000000" "X2= 2.048048" "X2= 2.096096" "X2= 2.144144" ...
> ```
> I would like to run prop.test on df$Class, but:
> ```
> > prop.test(x=point$Class, n=length(point$Class),
> + conf.level=.95, correct=FALSE)
> Error in prop.test(x = point$Class, n = length(point$Class), conf.level = 0.95,  :
>   'x' and 'n' must have the same length
> ```
> Since `x` is "a vector of counts of successes, a one-dimensional table
> with two entries, or a two-dimensional table (or matrix) with 2
> columns, giving the counts of successes and failures, respectively." I
> provided point$Class. The total number of tests is
> length(point$Class).
> There are three levels:
> ```
> > unique(df$Class)
> [1] negative  positive  uncertain
> Levels: negative positive uncertain
> ```
> I tried to remove the levels to check if the levels were interfering
> with the test:
> ```
> > df$Class = levels(droplevels(df$Class))
> Error in `$<-.data.frame`(`*tmp*`, Class, value = c("negative", "positive",  :
> replacement has 3 rows, data has 5000000
> ```
> What would be the syntax for this test? The idea is to get the most
> common value for each unique(df$Set), and prop.test will also provide
> the 95% CI for the estimate.
> Thanks
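
On the prop.test syntax itself: the error arises because x was given the
factor (one entry per row) rather than a count. A minimal sketch of one
way around it is to tabulate Class within each Set and test the most
common level against all the others combined. The data frame `toy` and
the helper most_common_ci() are hypothetical names made up for this
example, and the intervals are not adjusted for looking at several
levels or sets at once.

```r
# Minimal sketch on invented data; `toy` stands in for the real df.
set.seed(1)
toy <- data.frame(
  Class = factor(sample(c("negative", "positive", "uncertain"), 600,
                        replace = TRUE, prob = c(0.6, 0.3, 0.1))),
  Set   = rep(1:3, each = 200)
)

# prop.test() wants counts, so tabulate the factor first.
# Hypothetical helper: modal level of `cls`, its share, and a 95% CI.
most_common_ci <- function(cls) {
  counts <- table(cls)
  top    <- names(counts)[which.max(counts)]   # most common level
  res    <- prop.test(x = counts[[top]], n = sum(counts),
                      conf.level = 0.95, correct = FALSE)
  data.frame(level = top,
             prop  = unname(res$estimate),
             lower = res$conf.int[1],
             upper = res$conf.int[2])
}

# One row per Set.
by_set <- do.call(rbind, lapply(split(toy$Class, toy$Set), most_common_ci))
by_set
```

On the real data the last step would be split(df$Class, df$Set). This
collapses the three levels to "top level versus the rest" for each test,
so it does not replace a proper multinomial model.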


