[R] Using svychisq inside user-defined function

Thomas Lumley tlumley at u.washington.edu
Thu May 20 17:23:04 CEST 2010


On Wed, 19 May 2010, Sabatier, Jennifer F. (CDC/OID/NCHHSTP) wrote:

> Hi R-help,
>
> Yes, this is my second request for assistance in a single day....
>
> I am attempting to use svychisq() inside a function  I made.  The goal
> of this function is to produce a table of summary statistics that I can
> later output to EXCEL (simple frequencies and sample sizes from regular
> crosstabulation on dataset "data" but the chi-square using survey
> methods on "audit").
>
> Here's my code (I can't supply data for you as I am not that
> sophisticated and the real data is not cleared for public consumption -
> I really apologize):
>

You could have checked to see if the same problem occurs using one of the built-in data sets, but it turns out that the diagnosis is fairly straightforward. Also, since you use the function 'crosstab' without saying which package you got it from, it might be hard to get your code to run anyway.

Your problem is that you are passing a bare name and trying to substitute it into a formula, and that isn't how R function arguments or formulas work.

Simplifying the problem to the essentials, your function would work only if

myFormulaMakingFunction<-function(svyX){
    ~svyX+SEX
}

myFormulaMakingFunction(con)

returned ~con+SEX.  It actually returns ~svyX+SEX

There are various ways around this.  My preferred one would be

myPreferredWay<-function(formula){
  update(formula, ~.+SEX)
}

myPreferredWay(~con)

which does return ~con+SEX.

Then, as an example:

data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

myIllustrativeExample<-function(X,formula){
    tab<-table(X,apiclus1$stype)
    chisq<-svychisq(update(formula, ~.+stype), statistic="adjWald",design=dclus1)
    list(tab,chisq)
}

myIllustrativeExample(apiclus1$comp.imp, ~comp.imp)


Now, this can be improved: it appears that the vector argument is supposed to be the same variable as the formula argument, so we shouldn't make the user supply both of them.

mySupererogatoryEffort <- function(formula, design){
     X<-model.frame(formula, model.frame(design))[[1]]
    tab<-table(X, model.frame(design)$stype)
    chisq<-svychisq(update(formula, ~.+stype), statistic="adjWald",design=design)
    list(table=tab, F=chisq$statistic, df=chisq$parameter, p=chisq$p.value)
}

mySupererogatoryEffort(~comp.imp, dclus1)


As a final note, there is no round= argument to svychisq().  There is a round= argument to svytable(), which is documented on the same help page, but 4 is not one of the allowed values.

      -thomas

> # create my svydesign object
>
> audit <- svydesign(id~id, strata=~field, weights=~wt, data=data,
> fpc=~AllocProportion)
>
> # my function to create my table
>
> mkMyCrossTable <- function(X, svyX, T) {
>
> tbl <- crosstab(X, data$SEX, prop.c=TRUE)
> tbl <- data.frame(cbind(tbl$t, tbl$prop.col))
> tbl$var <- rownames(tbl)
>
> chisq <- svychisq(~svyX + SEX, design=audit, statistic="adjWald",
> round=4)
> chisq <- data.frame(do.call("cbind", chisq)
> chisq <- data.frame(chisq[,3])
>
> Table <- data.frame(tbl$var,
>                    paste(formatC(tbl$X0.1*100, format="f", digits=1),
> "%", sep=""),
>         		  tbl$X0,
> 			  paste(formatC(tbl$X1.1*100, format="f",
> digits=1), "%", sep=""),
> 			  tbl$X1,
> 			  chisq[1])
> Table[2: length(Table[,1]), 6] <- NA
> Table <- NAToUnkown(Table, unknown = " ")
> Colnames(Table) <- c(T, "Male (%)", "Male (n)", "Female (%)", "Female
> (n)", "p-value")
> Table
>
> }
>
> con3 <- mkMyCrossTable(data$con, con, "Constituency")
>


Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle



More information about the R-help mailing list