[R] using weights in lrm

Frank E Harrell Jr f.harrell at vanderbilt.edu
Tue Jul 4 17:08:07 CEST 2006


Stephan Lindner wrote:
> Dear all,
> 
> here's my own answer to the first warning message -- the warning
> message comes from the handling of missing values, which defaults to
> na.delete in lrm.
> 
> Cheers,
> 
> Stephan

You're right, and I found the bug.  Until we update the Design package 
please put this command at the top of your script to get the corrected 
version of lrm:

source('http://biostat.mc.vanderbilt.edu/cgi-bin/cvsweb.cgi/~checkout~/Design/R/lrm.s?rev=1.4;content-type=text%2Fplain')

Thanks

Frank
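
A minimal base-R sketch of the kind of mismatch discussed here, not the
actual lrm code: when rows with missing values are dropped, the weights
have to be dropped along with them, otherwise a full-length weights vector
ends up being assigned into a shorter object and R emits the first warning
quoted in this thread. The data frame d and its columns y, x and w are
made up for illustration, and the rescaling step only mimics what
normwt=TRUE is assumed to do (scale the weights to sum to the number of
observations used).

```r
# Hypothetical toy data (not from the thread): one missing response.
d <- data.frame(y = c(NA, 0, 1, 1, 0, 1),
                x = c(1, 2, 3, 4, 5, 6),
                w = c(2, 1, 1, 3, 1, 2))

# model.frame() applies the NA action to the weights as well as to the
# model variables, so the weights stay aligned with the rows that are kept.
mf <- model.frame(y ~ x, data = d, weights = w, na.action = na.omit)
w.used <- model.weights(mf)

n.full <- nrow(d)    # rows before NA deletion
n.used <- nrow(mf)   # rows a fit would actually use
stopifnot(length(w.used) == n.used)

# Rescale so the weights sum to the number of rows used
# (assumed to be the idea behind normwt=TRUE).
w.norm <- w.used * n.used / sum(w.used)

# Base-R illustration of the warning: the full-length weights are
# assigned into a slot sized for the reduced data.
slot <- numeric(n.used)
msg <- tryCatch({ slot[seq_len(n.used)] <- d$w; "no warning" },
                warning = function(cond) conditionMessage(cond))
print(msg)
```

If the weights are taken from the model frame rather than from the
original data frame, the lengths agree by construction and the replacement
warning should not arise.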

> 
> 
> 
> 
> # Consider a toy data frame:
> 
> 
>> d.temp
>    y.js h.hhsize h.work.frac h.age  h.sex h.popgroup weightsd cluster
> 1    No        3   0.3333333    20 female   Coloured 47.80062    1001
> 2    No        5   0.6000000    18 female   Coloured 47.80062    1001
> 3   Yes        4   0.7500000    18 female      White 47.80062    1001
> 4   Yes        6   0.5000000    21 female   Coloured 49.71264    1002
> 5    No        6   0.5000000    15 female   Coloured 49.71264    1002
> 6    No        3   0.6666667    20 female      White 49.71264    1002
> 7    No        3   0.3333333    21 female      White 49.71264    1002
> 8   Yes        6   0.6666667    19 female      White 49.71264    1002
> 9    No        6   0.6666667    16   male      White 49.71264    1002
> 10   No        3   0.3333333    16   male   Coloured 49.71264    1002
> 11   No        5   0.4000000    15   male   Coloured 42.85572    1003
> 12   No        6   0.6666667    18   male      White 42.85572    1003
> 13   No        4   0.2500000    17   male      White 45.88860    1004
> 14   No        3   0.3333333    15 female   Coloured 45.88860    1004
> 15   No        4   0.5000000    19 female      White 45.88860    1004
> 16  Yes        4   0.5000000    16 female      White 45.88860    1004
> 17  Yes        6   0.3333333    21 female   Coloured 45.88860    1004
> 18   No        3   0.6666667    15 female      White 46.03022    1005
> 19  Yes        5   0.4000000    20 female      White 46.03022    1005
> 20   No        5   1.0000000    19 female      White 46.03022    1005
> 
> 
> # The dependent variable has no missing values. Then, lrm works fine. 
> 
> results <- robcov(ols.results <- lrm(y.js ~
>                                      + h.hhsize             
>                                      + h.work.frac           
>                                      + factor(h.age)        
>                                      + h.sex
>                                      + h.popgroup           
>                                      
>                                     ,data=d.temp,x=T,y=T
>                                     ,weights=weightsd, normwt=TRUE),
>                                     d.temp$cluster)
> 
> 
> # Now change the first observation to a missing value:
> 
> d.temp$y.js[1] <- NA
> 
> # and doing the same again produces the warning:
> 
> 
> results <- robcov(ols.results <- lrm(y.js ~
>                                      + h.hhsize             
>                                      + h.work.frac           
>                                      + factor(h.age)        
>                                      + h.sex
>                                      + h.popgroup           
>                                      
>                                     ,data=d.temp,x=T,y=T
>                                     ,weights=weightsd, normwt=TRUE),
>                                     d.temp$cluster)
> 
> 
> 
> # But specifying na.action="na.exclude" resolves it.
> 
> results <- robcov(ols.results <- lrm(y.js ~
>                                      + h.hhsize             
>                                      + h.work.frac           
>                                      + factor(h.age)        
>                                      + h.sex
>                                      + h.popgroup           
>                                      
>                                     ,data=d.temp,x=T,y=T, na.action="na.exclude"
>                                     ,weights=weightsd, normwt=TRUE),
>                                     d.temp$cluster)
> 
> 
> # ------------------------------------------- #
> 
> 
> On Tue, Jul 04, 2006 at 07:59:31AM -0500, Frank E Harrell Jr wrote:
>> Stephan Lindner wrote:
>>> Dear all,
>>>
>>>
>>> just a quick question regarding weights in logistic regression. I do 
>>>
>>>
>>>
>>> results <- lrm(y.js ~
>>>                h.hhsize             
>>>               + h.death1              
>>>               + h.ill1                  
>>>               + h.ljob1              
>>>               + h.fin1 
>>>               + h.div1 
>>>               + h.fail1 
>>>               + h.sex
>>>               + h.ch.1      
>>>               + h.ch.5      
>>>               + h.ch.12     
>>>               + h.ch.13     
>>>               + h.popgroup
>>>               + y.school.now
>>>               ,x=T,y=T, data=d.caps1y, weights=weightsd, normwt=TRUE
>>>                )
>>>
>>>
>>> The regression works (in the sense that the results are not way off
>>> those without weighting the sample), but I get the following warning messages:
>>>
>>> Warning messages:
>>> 1: number of items to replace is not a multiple of replacement length 
>>> 2: currently weights are ignored in model validation and bootstrapping lrm 
>>> fits in: lrm(y.js ~ h.hhsize + h.death1 + h.ill1 + h.ljob1 + h.fin1 +  
>>>
>>> Perhaps someone can help me clarify the warning messages -- thanks
>>> a lot in advance !
>> I think the second warning is clear.  Regarding the first, make sure 
>> that the weights vector has length equal to the number of rows in 
>> d.caps1y.  Sometimes you have to subset weights.  If that's not the 
>> problem, try to create a minimal failing example and we'll work on it.
>>
>> Frank Harrell
>>
>>>
>>> Cheers,
>>>
>>> Stephan
>>>
>>> 	
>>>
>>>
>>
>> -- 
>> Frank E Harrell Jr   Professor and Chair           School of Medicine
>>                      Department of Biostatistics   Vanderbilt University
>>
>>
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


