[R] svyglm error message

Bert Gunter gunter.berton at gene.com
Tue Feb 11 23:11:21 CET 2014


Disclaimer:

I have not followed this thread and claim no statistical expertise. I
just wanted to point out a couple of misconceptions that may be
relevant. Inline below.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Tue, Feb 11, 2014 at 1:56 PM, Claire Wladis <cwladis at gmail.com> wrote:
> Thanks for your reply, Thomas.
>
> Yes, this is NCES data.
>
> There are no negative or missing weights.
>
> I am not a programmer and so I'm afraid I don't understand what you mean by
> not being able to have blank cells in a data.frame object

(In my opinion) This claim does not absolve you of the responsibility
of learning how to properly use R. If you do not wish to put in the
requisite effort, then you should not use R. Find something else.

> -  What I mean
> specifically is that in the csv file which I imported into R to create the
> dataset (using read.csv) there were blank cells for any missing data.  This
> has never given me problems with R in the past using glm or related
> functions.
>
> Traceback gives the following:
> 3: glm.fit(XX, YY, weights = wi/sum(wi), start = beta0, offset = offs,
>        family = fam, control = contrl, intercept = incpt)
> 2: svyglm.svyrep.design(model, design = surveydatastructure, family =
> quasibinomial())
> 1: svyglm(model, design = surveydatastructure, family = quasibinomial())
>
> As for the model: I need to run this code on a number of different models.
>  By playing around with this a lot, I have found that I get the error if I
> include one particular variable (HSCRDANY) in the model.  I have checked
> all the values for that variable, and there are only three: "yes", "no" and
> empty cells for missing data (or however one correctly phrases that for a
> dataframe in R).

There is no such thing as "empty cells" -- R is **not** Excel (thank
heaven!). **Blank** values are **not** missing in character vectors:
they are blank characters, "" (if that is, in fact, what your data
input did -- I'm never sure with .csv files). "An Introduction to R"
or, if you prefer, various good R web tutorials explain this. If you
do not care to put in the effort to learn about it, as I said above,
you probably shouldn't be using R.
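
To make the distinction concrete, here is a minimal sketch using a made-up
two-column CSV. Only the column names are borrowed from your message; the
values are invented for illustration. It assumes your blank fields arrive as
empty strings, which I cannot confirm without your file, and I make no claim
that this is what triggers the svyglm error.

```r
# Hypothetical CSV text: one blank field in a character column (HSCRDANY)
# and one blank field in a numeric column (GPA). Values are invented.
csv_text <- "HSCRDANY,GPA\nyes,3.1\n,2.7\nno,\n"

# Default import: the blank in the character column is kept as "",
# while the blank in the numeric column is converted to NA.
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(table(raw$HSCRDANY, useNA = "ifany"))  # "" is tallied as an ordinary value
print(is.na(raw$GPA))

# Ask read.csv to treat empty strings as missing as well.
fixed <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                  na.strings = c("NA", ""))
print(table(fixed$HSCRDANY, useNA = "ifany"))  # the blank now counts as NA
```

Running table(..., useNA = "ifany") on HSCRDANY and HSGPA in your own data
would show whether either one carries "" as a value of its own rather than
as NA.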



> Another variable, HSGPA, which has empty cells for all
> the same individuals, and which is also a categorical variable, does not
> have this problem.
>
> So, for example, this model works fine:
> DISTEDUC~1+RACE+GENDER+RISKINDX+GPA+REMEVER+HSGPA+PELLAMT+FEDBEND+CAGI+PAREDUC+PRIMLANG+CITIZEN2
>
>
> But this model returns the error message listed above:
> DISTEDUC~1+RACE+GENDER+RISKINDX+GPA+REMEVER+HSGPA+HSCRDANY
> +PELLAMT+FEDBEND+CAGI+PAREDUC+PRIMLANG+CITIZEN2
>
> I don't understand what it is about the specific variable HSCRDANY which
> would prompt this error message?  I'm not sure what else to look for to try
> to figure out what the issue with this particular variable may be?
>
> Thanks again for your time!
>
>
>
>
>
> On Tue, Feb 11, 2014 at 1:05 PM, Thomas Lumley <tlumley at uw.edu> wrote:
>
>> This is some sort of NCES data, right?
>>
>> I can't see any way to get that particular error (which happens inside
>> glm.fit()) for a logistic model.
>>   Are there any negative or missing weights?
>>   What do you mean 'represented by blank cells' -- you can't have blank
>> cells in a data.frame object?
>>   What does traceback() give after the error?
>>   What is the model?
>>
>>    -thomas
>>
>>
>>
>> On Mon, Feb 10, 2014 at 4:10 PM, Claire Wladis <cwladis at gmail.com> wrote:
>>
>>> Hello,
>>> I am using the survey package for the first time to analyze a dataset that
>>> has both weights and 200 BRR replication weights.  When I try to run svyglm
>>> on the output from svrepdesign, I get an error message that I do not know
>>> how to interpret, and an extended period of time searching for this error
>>> on the web hasn't returned any results that seem relevant to my situation.
>>>  I have no idea how to proceed with my analysis at this point, so I am
>>> hoping that someone with more experience with this package and with R in
>>> general would be willing to help me figure out what the problem is.
>>>
>>> Here is my code:
>>> surveydatastructure <- svrepdesign(repweights=dataset[, 29:228] ,
>>> data=dataset, weights=dataset$WTA000)
>>>
>>> modeloutput <- svyglm(model, design=surveydatastructure,
>>> family=quasibinomial() )
>>>
>>> The model is defined in an earlier line of code, but for the sake of
>>> readability here, I have not included it.  The dataset has a binary
>>> dependent variable and a combination of categorical and continuous
>>> variables as independent variables.  There is missing data in the dataset,
>>> represented by blank "cells" in the data frame.  The data itself is
>>> restricted but I can describe any part of it as necessary.
>>>
>>>
>>> Here is the error message that R returns when I enter the svyglm function
>>> line of code:
>>>
>>> Error in if (!(validmu(mu) && valideta(eta))) stop("cannot find valid
>>> starting values: please specify some",  :  missing value where TRUE/FALSE
>>> needed
>>>
>>> Thanks for reading my post, and thanks in advance for any help!
>>> Sincerely,
>>> Claire
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Thomas Lumley
>> Professor of Biostatistics
>> University of Auckland
>>



