[R] calculating p-values of columns in a dataframe
Uwe Ligges
ligges at statistik.uni-dortmund.de
Sun Jul 8 13:03:13 CEST 2007
Thomas Pujol wrote:
> I have a dataframe ("mydf") that contains "differences of means".
> I wish to test whether these differences are significantly different from zero.
>
> Below, I calculate the t-statistic for each column.
>
> What is a "good" method to calculate/look-up the p-value for each column?
>
>
> mydf=data.frame(a=c(1,-22,3,-4),b=c(5,-6,-7,9))
>
> mymean=mean(mydf)
> mysd=sd(mydf)
> mynn=sapply(mydf, function(x) {sum ( as.numeric(x) >= -Inf) })
> myse=mysd/sqrt(mynn)
> myt=mymean/myse
> myt
You can do the whole lot with
L <- lapply(mydf, t.test)
or if you only want the t statistics and p-values now:
sapply(L, "[", c("statistic", "p.value"))
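For reference, here is a minimal self-contained sketch of the same idea using the data from the question. The names tstat, pval and result are only illustrative. The "[" call above returns a matrix whose cells are list elements, so this version pulls the numbers out explicitly to get plain numeric columns.

```r
# Data from the original question
mydf <- data.frame(a = c(1, -22, 3, -4), b = c(5, -6, -7, 9))

# One-sample, two-sided t-test of H0: mean = 0 for every column
L <- lapply(mydf, t.test)

# Extract the t statistic and p-value of each test as numeric vectors
tstat <- sapply(L, function(res) unname(res$statistic))
pval  <- sapply(L, function(res) res$p.value)

# One row per column of mydf
result <- data.frame(t = tstat, p.value = pval)
print(result)
```

t.test() defaults to mu = 0 and a two-sided alternative, which matches the question of whether the differences are significantly different from zero.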
If you want to continue quickly with your initial approach, you can get
the two-sided p-values from the cumulative distribution function of the
t distribution (3 degrees of freedom for your data) with
2 * pt(-abs(myt), df = nrow(mydf) - 1)
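A hedged sketch of the complete manual route follows. In current versions of R, mean() and sd() no longer work column-wise on a data frame, so this version uses sapply() for each step. The names myp and ref are illustrative additions, and the NA handling is an assumption (the example data has no missing values).

```r
# Data from the original question
mydf <- data.frame(a = c(1, -22, 3, -4), b = c(5, -6, -7, 9))

# Column-wise mean, standard deviation and count of non-missing values
mymean <- sapply(mydf, mean, na.rm = TRUE)
mysd   <- sapply(mydf, sd, na.rm = TRUE)
mynn   <- sapply(mydf, function(x) sum(!is.na(x)))

# Standard error and t statistic per column
myse <- mysd / sqrt(mynn)
myt  <- mymean / myse

# Two-sided p-value: twice the lower tail beyond -|t|, with n - 1 df per column
myp <- 2 * pt(-abs(myt), df = mynn - 1)

# Cross-check against t.test(), which should give the same p-values
ref <- sapply(mydf, function(x) t.test(x)$p.value)
print(all.equal(myp, ref))
```

Using mynn - 1 instead of nrow(mydf) - 1 only matters if some columns contain NAs. For the data above both give 3 degrees of freedom.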
Uwe Ligges