[R] Case weighting
David Winsemius
dwinsemius at comcast.net
Sat Mar 3 20:16:25 CET 2012
You might want to look at the various wtd.* functions in the Hmisc
package:
require(Hmisc)
?wtd.stats
'wtd.mean' is just one of the functions supplied. You might want to
contemplate the simplicity of Harrell's function code, since it is not
hidden. Just type:
wtd.mean
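
As a minimal sketch of what such a function computes (this is not Harrell's code, and the vectors `x` and `w` are made up for illustration), a weighted mean is the sum of weight times value divided by the total weight. Base R's `weighted.mean` gives the same number, and if Hmisc is installed, `wtd.mean(x, weights = w)` should agree as well:

```r
# Made-up illustration data: four values and their case weights.
x <- c(2, 4, 6, 8)
w <- c(10, 1, 1, 1)

# Weighted mean by hand: sum of weight * value over the sum of weights.
by_hand <- sum(w * x) / sum(w)

# Base R equivalent (stats package, attached by default).
base_r <- weighted.mean(x, w)

# For whole-number weights this matches the plain mean of the replicated cases.
replicated <- mean(rep(x, times = w))

print(c(by_hand = by_hand, base_r = base_r, replicated = replicated))

# Optional cross-check, run only if Hmisc happens to be installed.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  print(Hmisc::wtd.mean(x, weights = w))
}
```

The hand-computed version also works with fractional weights, where replicating cases does not.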
--
David.
On Mar 3, 2012, at 2:04 PM, Hed Bar-Nissan wrote:
> Following David's example, if I just wanted to compute means,
> would multiplying the cases according to the weight do the job?
>
>
> Something like this on a data.frame
> (Must be a simpler way to do it with R - the sapply scope confused me)
>
>
> weightBy <- function(origDataFrame,weightVector)
> {
> case_Number_After_Weighting = sum(weightVector);
> #print ( "case_Number_After_Weighting = ");#print(case_Number_After_Weighting );
>
> data.weighted.local = data.frame( 1:case_Number_After_Weighting );
> assign("data.weighted.tmp",data.weighted.local,env=globalenv())
>
> sapply(1:NCOL(origDataFrame),
> function(colNo) {
> #print ( "dealing with column ");#print(colNo);
> data.weighted.tmp[,colNo] =
> unlist(
> sapply(1:NROW(origDataFrame),
> function(x) rep(origDataFrame[x,colNo],
> times=weightVector[x] )
> )
> )
> names(data.weighted.tmp)[colNo] <- names(origDataFrame)[colNo]
>
> assign("data.weighted.tmp",data.weighted.tmp,env=globalenv())
> #print (data.weighted.tmp);
> }
> )
> data.weighted.local = data.weighted.tmp;
> rm(data.weighted.tmp, envir=globalenv());
> return(data.weighted.local);
> }
>
>
>
> data.recieved <- data.frame(
> f1 = factor(c(2,1,1,1), labels = c("Yes", "No")),
> f2 = factor(c(1,2,3,4), labels = c("One", "Two","Three","Four"))
> );
>
> weight=c(10, 1, 1, 1)
>
>
> weightBy(data.recieved,weight);
>
>
>
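
For whole-number weights, the row replication quoted above can probably be sketched much more briefly by indexing the data frame with a repeated row index. This is only a sketch under that assumption: `rep()` truncates fractional `times` values, so it would not be faithful to non-integer weights. The names `row_index` and `data.weighted` are introduced here purely for illustration.

```r
data.recieved <- data.frame(
  f1 = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  f2 = factor(c(1, 2, 3, 4), labels = c("One", "Two", "Three", "Four"))
)
weight <- c(10, 1, 1, 1)

# Repeat each row index weight[i] times, then select those rows.
row_index <- rep(seq_len(nrow(data.recieved)), times = weight)
data.weighted <- data.recieved[row_index, , drop = FALSE]
rownames(data.weighted) <- NULL  # discard the duplicated-row labels

print(nrow(data.weighted))       # equals sum(weight)
print(table(data.weighted$f1))   # counts now reflect the weights

# Weighted counts without replicating any rows; also fine for fractional weights.
weighted_counts <- xtabs(weight ~ f1, data = data.recieved)
print(weighted_counts)
```

No global assignment is needed, and the `xtabs` call avoids building the expanded data frame at all.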
> On Fri, Feb 24, 2012 at 8:03 AM, Thomas Lumley <tlumley at uw.edu> wrote:
> > On Fri, Feb 24, 2012 at 9:40 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> >
> > On Feb 23, 2012, at 3:27 PM, Hed Bar-Nissan wrote:
> >
> >> It's really weighting - it's just that my simplified example was
> >> too simplified
> >> Here is my real weight vector:
> >> > sc$W_FSCHWT
> >> [1] 14.8579 61.9528 3.0420 2.9929 5.1239 14.7507 2.7535
> >> 2.2693 3.6658 8.6179 2.5926 2.5390 1.7354 2.9767 9.0477
> >> 2.6589 3.4040 3.0519
> >> ....
> >
> >
> > You should always convey the necessary complexity of the problem.
> >
> >>
> >>
> >> And still it should somehow set the case weight.
> >> I could multiply all by 10000 and maybe use your method, but it would
> >> create such a bloated data frame.
> >>
> >> Working with numeric data only, I could probably create weighted means.
> >>
> >> But something as simple as WEIGHTED BY would be nice.
> >
> >
> > The survey package by Thomas Lumley provides for a wide variety of
> > weighted analyses.
>
> Yes. It doesn't do everything that SPSS WEIGHTED BY will do, but it
> does a lot. SPSS is more general partly because it cheats -- it
> doesn't always compute the right standard errors if the weights are
> sampling weights [SPSS now has some proper survey analysis commands,
> which do get the right standard errors, but are more limited]
>
> - thomas
>
> --
> Thomas Lumley
> Professor of Biostatistics
> University of Auckland
>
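
One way to see why treating sampling weights as case counts can mislead on standard errors is the rough sketch below, using made-up data. Scaling all weights by 10 leaves the weighted mean unchanged, yet the naive standard error from the expanded cases shrinks. The helper `naive_se` is hypothetical, and the guarded `survey` calls are only an assumption about how a simple weights-only design might be declared, not a recommendation for any particular data set.

```r
# Made-up data: four observations with whole-number weights.
x <- c(2, 4, 6, 8)
w <- c(10, 1, 1, 1)

# Naive standard error of the mean after expanding cases by their weights.
naive_se <- function(x, w) {
  expanded <- rep(x, times = w)
  sd(expanded) / sqrt(length(expanded))
}

se_w   <- naive_se(x, w)        # weights as given
se_10w <- naive_se(x, 10 * w)   # same relative weights, scaled by 10

# The weighted mean is unchanged by the scaling, but the naive SE shrinks.
print(c(mean_w = weighted.mean(x, w), mean_10w = weighted.mean(x, 10 * w)))
print(c(se_w = se_w, se_10w = se_10w))

# Design-based alternative, run only if the survey package is installed.
if (requireNamespace("survey", quietly = TRUE)) {
  d <- data.frame(x = x, w = w)
  des <- survey::svydesign(ids = ~1, weights = ~w, data = d)
  print(survey::svymean(~x, des))
}
```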
David Winsemius, MD
West Hartford, CT