[R] Various Errors using Survey Package

Thompson, Trevor tkt2 at cdc.gov
Wed Feb 12 19:38:03 CET 2003


Hi,

I have been experimenting with the new Survey package.  Specifically, I was
trying to use some of the functions on the public-use survey data from NHIS
(2000 Sample Adult file).  

Error 1).  The first error occurs when I try to specify the complex survey
design.

> nhis.design <- svydesign(ids=~psu, probs=~probs, strata=~strata,
+                          data=nhis.df, check.strata=TRUE)
Error in svydesign(ids = ~psu, probs = ~probs, strata = ~strata, data = nhis.df,  : 
        Clusters not nested in strata

My data are sorted by strata and then by psu.  Can someone tell me what
structure the data must have for a stratified sample with clustering?
Looking at the code, the check [i.e. any(sc > 1)] appears to me to reject
any design with more than 1 observation per psu.
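
One possible reading of the "Clusters not nested in strata" message is that
the same psu label appears in more than one stratum, which happens when PSUs
are numbered 1, 2, ... within each stratum. The sketch below is base R only
and uses a made-up toy data frame (the names toy, strata_per_psu and
psu_unique are hypothetical, not part of the survey package). It counts the
distinct strata per psu label and then builds a label that is unique across
strata. Whether this is the actual cause for the NHIS file is an assumption.

```r
# Toy data (hypothetical): PSU labels restart at 1 inside every stratum.
toy <- data.frame(
  strata = c(1, 1, 1, 1, 2, 2, 2, 2),
  psu    = c(1, 1, 2, 2, 1, 1, 2, 2)
)

# Number of distinct strata in which each psu label occurs.
strata_per_psu <- tapply(toy$strata, toy$psu,
                         function(s) length(unique(s)))

# TRUE means at least one psu label is shared by several strata.
any(strata_per_psu > 1)

# Combine stratum and psu into one label that is unique across strata.
toy$psu_unique <- interaction(toy$strata, toy$psu, drop = TRUE)

# Repeat the count with the combined label; each label now sits in
# exactly one stratum.
strata_per_unique <- tapply(toy$strata, toy$psu_unique,
                            function(s) length(unique(s)))
any(strata_per_unique > 1)
```

If the first check is TRUE on the real data and the second is FALSE, passing
the combined label as the ids variable would be one thing to try.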

Error 2).  If I go ahead and specify check.strata=FALSE, then svydesign runs
ok.  I then tried using the svymean function.  In the following example, if
I specify na.rm=TRUE, I get the error below:

> svymean(nhis.df$crc10yr, design=nhis.design, na.rm=TRUE)
Error in rowsum.default(x, strata) : Incorrect length for 'group'

I traced this to the svyCprod call within svymean.  svyCprod calls rowsum,
and the group argument ("strata") appears to have the full length of that
column rather than the length of the subset with non-missing data.
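
The length mismatch described above can be reproduced with rowsum alone. This
is a minimal sketch on made-up vectors (x, strata and keep are hypothetical
names), not the code inside svyCprod. It shows that rowsum fails when the
data are subset to non-missing values but the group vector is not, and works
when both are subset with the same index.

```r
# Made-up data: two of the six values are missing.
x      <- c(1, NA, 3, 4, NA, 6)
strata <- c("a", "a", "b", "b", "c", "c")

# Index of the non-missing observations.
keep <- !is.na(x)

# Subsetting x but not the group vector: lengths 4 and 6 disagree,
# so rowsum signals an error, which is captured here instead of stopping.
bad <- tryCatch(rowsum(x[keep], strata), error = function(e) e)
inherits(bad, "error")

# Subsetting both with the same index keeps the lengths aligned.
good <- rowsum(x[keep], strata[keep])
good
```

Until the function handles this internally, dropping the rows with missing
values from the data frame before calling svydesign might serve as a
workaround, assuming those rows can be excluded for the analysis at hand.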

Error 3).  I then tried svymean on another variable with na.rm=FALSE.  I got
the following error:

> svymean(nhis.df$age, design=nhis.design)
Error in drop(rval) : names attribute must be the same length as the vector 

I also traced this error to a call to rowsum within the function svyCprod.
I'm not sure what names attribute this is referring to because the arguments
to rowsum and the rval object do not appear to have a names attribute.  Does
anyone know what the problem here might be?

Has anyone else used the survey package on public-use survey datasets like
BRFSS or NHIS?  Was there anything special you had to do to those datasets
before specifying the survey design?  I know that's a pretty vague question.
For the SUDAAN users among you, what I basically mean is: does the data have
to be structured differently from what you pass into a SUDAAN procedure?

Thanks in advance for any suggestions!  I am using R 1.6.2 on Windows 2000.

-Trevor



