Corey Sparks corey.sparks at utsa.edu
Wed Mar 10 19:10:13 CET 2010

Hi R users,
I'm using the survey package to calculate summary statistics for a large
health survey (the Demographic and Health Survey for Honduras, 2006), and
when I try to calculate the variances for several variables, I get negative
numbers.  I thought it might be my data, so I ran the example from the help
page:

## one-stage cluster sample
library(survey)
data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

svyvar(~api00+enroll+api.stu+api99, dclus1)
        variance     SE
api00    11182.8 1386.4
api00    11516.3 1412.9
api.stu  -4547.1 3164.9
api99    12735.2 1450.1

If I look at the full matrix for the variances (and covariances):
test<-svyvar(~api00+enroll+api.stu+api99, dclus1)

print(test, covariance=TRUE)
                variance      SE
api00:api00      11182.8  1386.4
enroll:api00     -5492.4  3458.1
api.stu:api00    -4547.1  3164.9
api99:api00      11516.3  1412.9
api00:enroll     -5492.4  3458.1
enroll:enroll   136424.3 41377.2
api.stu:enroll  114035.7 34153.9
api99:enroll     -3922.3  3589.9
api00:api.stu    -4547.1  3164.9
enroll:api.stu  114035.7 34153.9
api.stu:api.stu  96218.9 28413.7
api99:api.stu    -3060.0  3260.9
api00:api99      11516.3  1412.9
enroll:api99     -3922.3  3589.9
api.stu:api99    -3060.0  3260.9
api99:api99      12735.2  1450.1

I see that, for api.stu, the function is actually returning the covariance
of api.stu with api00 (-4547.1) rather than the variance of api.stu
(96218.9).

I can get the correct variances if I just take
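
The expression itself is missing from the line above. One reading, and it is
only an assumption, is that the diagonal of the covariance matrix is meant.
A minimal sketch under that assumption: it rebuilds the matrix from the
values printed above (the names vars, vc and variances are made up for this
illustration) and, if the survey package is installed, applies the same idea
to the svyvar result. Using unclass() on that result assumes it is stored as
a plain matrix with extra attributes.

```r
# Covariance estimates copied from the print(test, covariance=...) output above.
vars <- c("api00", "enroll", "api.stu", "api99")
vc <- matrix(
  c( 11182.8,  -5492.4,  -4547.1,  11516.3,
     -5492.4, 136424.3, 114035.7,  -3922.3,
     -4547.1, 114035.7,  96218.9,  -3060.0,
     11516.3,  -3922.3,  -3060.0,  12735.2),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

# The variances are the diagonal entries of the covariance matrix.
variances <- diag(vc)
print(variances)

# Same idea on the fitted object, only if the survey package is available.
# Assumption: the svyvar result is a matrix carrying extra attributes.
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  data(api)
  dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
  test <- svyvar(~api00 + enroll + api.stu + api99, dclus1)
  print(diag(unclass(test)))
}
```

All four diagonal entries are positive, so the negative value reported for
api.stu comes from an off-diagonal cell, not from the variance itself.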

I was just wondering whether anyone else has run into this problem.  I'm using:
> sessionInfo()
R version 2.10.1 Patched (2009-12-20 r50794)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] survey_3.19

loaded via a namespace (and not attached):
[1] tools_2.10.1

I get the same error on a Linux server.

