# [R] tapply error svyby function "survey" package

Martin Canon martin.canon at gmail.com
Wed Nov 12 13:59:17 CET 2014

```Hi.

I'm trying to calculate the weighted mean score of a quality of life
measure (ovt) in patients with irritable bowel syndrome by their
marital status (d7).

This is a summary of the structure of the dataset:

> str(sii.tesis)
'data.frame':    1063 obs. of  75 variables:
$ id         : int  51 52 53 54 55 56 57 58 59 60 ...
$ stratum    : Factor w/ 6 levels "MEst","MAcad",..: 1 4 NA 4 4 1 6 NA 4 4 ...
$ expfc      : num  22.8 17.1 NA 17.1 17.1 ...
$ d6         : Factor w/ 3 levels "Estudiante","Profesor",..: 1 1 NA 1 1 1 3 NA 1 1 ...
$ d7         : Factor w/ 6 levels "Soltero","Casado",..: 1 1 NA 1 1 1 1 NA 1 1 ...
$ d7c        : Factor w/ 2 levels "No estable","Estable": 1 1 NA 1 1 1 1 NA 1 1 ...
$ s1cm       : Factor w/ 2 levels "No","Si": 1 2 NA 1 1 1 2 NA 1 1 ...
$ ovt        : num  NA 93.4 NA NA NA ...

I declared the sampling design:

> sii.design <- svydesign(
    id = ~1,
    strata = ~stratum,
    weights = ~expfc,
    data = subset(sii.tesis, !is.na(stratum)))

Then I tried to get the result:

> svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)

but I get the error:

Error in tapply(1:NROW(x), list(factor(strata)), function(index) { :
arguments must have same length

Both variables have the same length: wherever ovt has a value,
there is a matching d7 value in the data frame.

I tried the same thing using another variable, "role" (d6), instead
and it works:

> svyby(~ovt, ~d6, sii.design, svymean, na.rm = TRUE, level = 0.95)
                           d6      ovt       se
Estudiante         Estudiante 71.01805 1.370569
Profesor             Profesor 72.30923 6.518378

If I use the recategorized d7 variable (d7c,  two levels only) it works too:

> svyby(~ovt, ~d7c, sii.design, svymean, na.rm = TRUE, level = 0.95)
                  d7c      ovt      se
No estable No estable 70.92344 1.37460
Estable       Estable 74.53719 4.16954

What could be the problem?

Regards.

Martin Canon
Colombia, South America

```
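One way to start narrowing this down, offered only as a sketch: since the call works for d6 and d7c but not for d7, it may help to count how many non-missing ovt values fall in each d7 level, and in each d7 level within each stratum. The code below does that in base R on a small toy data frame. The toy values and the level "Viudo" are hypothetical stand-ins, not the real sii.tesis data, and sparse or empty levels are only an assumption worth checking, not a confirmed cause of the error.

```r
# Toy stand-in for sii.tesis (hypothetical values, not the real data).
toy <- data.frame(
  stratum = factor(c("MEst", "MEst", "MAcad", "MAcad", "MEst", NA)),
  d7      = factor(c("Soltero", "Soltero", "Casado", "Soltero", "Viudo", "Casado"),
                   levels = c("Soltero", "Casado", "Viudo")),
  ovt     = c(70.1, NA, 93.4, 65.0, NA, 80.2)
)

# Same subsetting as in the svydesign() call: drop rows with a missing stratum.
used <- subset(toy, !is.na(stratum))

# Keep only rows where the outcome is observed, as na.rm = TRUE would.
obs <- used[!is.na(used$ovt), ]

# Usable observations per d7 level, and per d7 level within each stratum.
n_by_d7   <- table(obs$d7)
n_by_cell <- table(obs$d7, droplevels(obs$stratum))

print(n_by_d7)
print(n_by_cell)

# Levels of d7 with fewer than two usable observations.
sparse_levels <- names(n_by_d7)[n_by_d7 < 2]
print(sparse_levels)
```

On the real data, the same tabulation would use sii.tesis in place of toy. If some d7 levels turn out to have zero or one usable observation, that would at least distinguish d7 from d6 and d7c and would be worth mentioning in a follow-up.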