[R] Problem with NA data when computing standard error
Paul Johnson
pauljohn32 at gmail.com
Tue Apr 8 23:00:57 CEST 2008
On Tue, Apr 8, 2008 at 12:44 PM, LeCzar <sirnixu at gmail.com> wrote:
>
> Hey,
>
> I want to compute means and standard errors as two tables like this:
>
> se<-function(x)sqrt(var(x)/length(x))
>
>
The missings are not your main problem.
When x is a matrix or data frame, the command var computes the
variance-covariance matrix. Some covariance values can be negative,
so taking the square root of the whole matrix is a mistake.
For example, run
> example(var)
to get some matrices to work with.
> C1[3,4] <- NA
> C1[3,5] <- NA
Observe that you can calculate
> var(C1, na.rm=TRUE)
but taking sqrt of that result is no use, because sqrt returns NaN
(with a warning) wherever a covariance is negative.
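A small self-contained illustration of that point, as a sketch only: the matrix m below is made-up data, not the poster's, chosen so that the two columns move in opposite directions.

```r
# Hypothetical 4 x 2 matrix: column b falls as column a rises.
m <- cbind(a = c(1, 2, 3, 4), b = c(8, 6, 5, 1))

v <- var(m)     # 2 x 2 variance-covariance matrix
v["a", "b"]     # off-diagonal covariance, negative for this data

# sqrt() of the full matrix gives NaN (with a warning) at negative entries;
# the warning is suppressed here only to keep the output quiet.
s <- suppressWarnings(sqrt(v))
is.nan(s["a", "b"])

# The diagonal holds the variances, so its square roots are the column SDs.
sqrt(diag(v))
apply(m, 2, sd)
```

The last two lines should agree, which is why only the diagonal is of interest for standard errors.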
To get the standard errors, it is necessary to reconsider the problem
and do something like
> diag(var(C1, na.rm=TRUE))
That gives the diagonal elements, which are the variances and so are
never negative, so
> sqrt(diag(var(C1, na.rm=TRUE)))
works as well.
But you have the separate problem of dividing each one by the square
root of the number of observations, and since there are missings that
number is not the same for every column. Maybe somebody knows a
smarter way, but this appears to give the correct answer. Count the
non-missing values in each column:
validX <- colSums( ! is.na(C1))
This gives the square roots of those counts:
sqrt(validX)
Putting that together, it seems to me you could try
se <- function(x) {
    myDiag <- sqrt(diag(var(x, na.rm=TRUE)))
    validX <- colSums(! is.na(x))
    myDiag/sqrt(validX)
}
That works for me:
> se(C1)
Fertility Agriculture Examination Education
50.740226 110.808614 39.390611 39.303898
Catholic Infant.Mortality
328.272207 4.513863
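One caveat may be worth checking before relying on this. In current versions of R, var(x, na.rm=TRUE) on a matrix drops every row that contains an NA before computing anything, while colSums(!is.na(x)) counts the non-missing values column by column, so the numerator and the denominator can be based on different numbers of rows. A minimal sketch of a per-column variant follows, assuming each column should use all of its own non-missing values. The name se_by_column and the matrix m are hypothetical, introduced here only for illustration.

```r
# Standard error of the mean for each column, computed from that
# column's own non-missing values. se_by_column is a hypothetical name.
se_by_column <- function(x) {
  apply(as.matrix(x), 2, function(col) {
    ok <- !is.na(col)              # TRUE where the value is present
    sd(col[ok]) / sqrt(sum(ok))    # SD over valid values / sqrt(valid n)
  })
}

# Hypothetical data: column b has one missing value.
m <- cbind(a = c(1, 2, 3, 4), b = c(2, NA, 4, 6))
se_by_column(m)
```

Here column a uses all four rows and column b uses its three valid rows, so a missing value in one column does not discard data from the others.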
--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas