[R] Case weighting
David Winsemius
dwinsemius at comcast.net
Thu Feb 23 21:40:04 CET 2012
On Feb 23, 2012, at 3:27 PM, Hed Bar-Nissan wrote:
> It's really weighting - it's just that my simplified example was too
> simplified.
> Here is my real weight vector:
> > sc$W_FSCHWT
> [1] 14.8579 61.9528 3.0420 2.9929 5.1239 14.7507
> 2.7535 2.2693 3.6658 8.6179 2.5926 2.5390 1.7354
> 2.9767 9.0477 2.6589 3.4040 3.0519
> ....
You should always convey the necessary complexity of the problem.
>
>
> And still it should somehow set the case weight.
> I could multiply them all by 10000 and maybe use your method, but it would
> create such a bloated data frame.
>
> Working with numeric variables only, I could probably create weighted means.
>
> But something as simple as WEIGHT BY would be nice.
The survey package by Thomas Lumley provides for a wide variety of
weighted analyses.
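
As a rough sketch of what that could look like on the toy data quoted further down: base R's xtabs() already sums a weight column within each factor level, and the survey package lets you declare the weights once in a design object. The design here (ids = ~1, no clusters or strata) is only an assumption for illustration and is not meant to describe the actual PISA sampling design.

```r
# Toy data from the quoted example further down the thread.
data.recieved <- data.frame(
  kindergarten_attendance = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  weight = c(10, 1, 1, 1)
)

# Base R: sum the weights within each level. No rows are replicated,
# so fractional weights such as 14.8579 are not a problem.
weighted.counts <- xtabs(weight ~ kindergarten_attendance, data = data.recieved)
print(weighted.counts)

# survey package, only if it is installed: declare the weights once,
# then ask for a weighted table.
# Assumption: a simple design with no clustering (ids = ~1).
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(ids = ~1, weights = ~weight, data = data.recieved)
  print(survey::svytable(~kindergarten_attendance, design = des))
}
```

For a numeric variable x, the base function weighted.mean(x, w) gives the corresponding weighted mean without expanding the data.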
--
David.
>
> tnx
> Hed
>
>
>
>
>
> On Thu, Feb 23, 2012 at 7:43 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Feb 23, 2012, at 10:49 AM, Hed Bar-Nissan wrote:
>
> The need comes from the PISA data. (http://www.pisa.oecd.org)
>
> In the data there are many cases and each of them carries a numeric
> variable that signifies its weight.
> In SPSS the command would be "WEIGHT BY"
>
> In simpler words, here is an R sample (what I get vs. what I want
> to get)
>
>
> data.recieved <- data.frame(
> + kindergarten_attendance = factor(c(2,1,1,1), labels = c("Yes",
> "No")),
> + weight=c(10, 1, 1, 1)
> + );
> data.recieved;
> kindergarten_attendance weight
> 1 No 10
> 2 Yes 1
> 3 Yes 1
> 4 Yes 1
>
>
>
> data.weighted <- data.frame(
> + kindergarten_attendance = factor(c(2,2,2,2,2,2,2,2,2,2,1,1,1),
> labels =
> c("Yes", "No")) );
>
> You want "case repetition", not case weighting; I would reserve the
> latter term for estimation problems:
>
> > ( data.weighted <- unlist(sapply(1:NROW(data.recieved),
> function(x) rep(data.recieved[x,1], times=data.recieved[x,2] )) ) )
> [1] No No No No No No No No No No Yes Yes Yes
> Levels: Yes No
>
>
>
>
> par(mfrow=c(1,2));
> plot(data.recieved$kindergarten_attendance,main="What i get");
> plot(data.weighted$kindergarten_attendance,main="What i want to get");
>
> Seems to work with the factor vector, although I didn't replicate
> dataframe rows, but I guess you could.
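
A minimal sketch of the data frame variant, assuming whole-number weights as in the toy example: repeat each row index according to its weight and subset with the result. The name row.index is introduced here only for illustration.

```r
# Toy data from the example above.
data.recieved <- data.frame(
  kindergarten_attendance = factor(c(2, 1, 1, 1), labels = c("Yes", "No")),
  weight = c(10, 1, 1, 1)
)

# Repeat row i weight[i] times (assumes non-negative whole-number weights).
row.index <- rep(seq_len(nrow(data.recieved)), times = data.recieved$weight)

# Subset by the repeated indices; drop = FALSE keeps a data frame.
data.weighted <- data.recieved[row.index, "kindergarten_attendance", drop = FALSE]
rownames(data.weighted) <- NULL

print(table(data.weighted$kindergarten_attendance))
```

With data.weighted built this way, the plot(data.weighted$kindergarten_attendance, ...) call above should work unchanged. This only makes sense for integer weights; for fractional weights, summing the weights per level avoids the expansion altogether.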
>
>
>
> tnx in advance
> Hed
>
> David Winsemius, MD
> West Hartford, CT
>
>
David Winsemius, MD
West Hartford, CT