[R] svyboxplot - library (survey)
Muhuri, Pradip (SAMHSA/CBHSQ)
Pradip.Muhuri at samhsa.hhs.gov
Fri Oct 19 02:56:18 CEST 2012
Hi Dr. Lumley,
Further thoughts: to get a histogram of age with proportions (relative frequencies) on the y-axis, I probably need to rescale the weights for each subgroup separately, so that the rescaled weights sum to 1 within that subgroup. Am I correct?
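A minimal sketch of that per-subgroup rescaling, run on the apistrat example data that ships with the survey package rather than on the NHIS file. Here sch.wide stands in for xspd2, and pw_grp is a made-up name for the rescaled weight. The bins are assumed to be of equal width so that freq=TRUE draws without the area warning.

```r
library(survey)
data(api)
options(survey.lonely.psu = "adjust")

# Rescale the weights within each subgroup so they sum to 1 per level of sch.wide
# (sch.wide stands in for xspd2; pw_grp is a hypothetical column name)
apistrat$pw_grp <- ave(apistrat$pw, apistrat$sch.wide,
                       FUN = function(w) w / sum(w))
grp_totals <- tapply(apistrat$pw_grp, apistrat$sch.wide, sum)
print(grp_totals)

# fpc is left out to keep the sketch short; it affects standard errors,
# not the bar heights
d_grp <- svydesign(id = ~1, strata = ~stype, weights = ~pw_grp, data = apistrat)

# Equal-width bins that span the data
brks <- pretty(range(apistrat$enroll, na.rm = TRUE), n = 10)

par(mfrow = c(1, 2))
h_yes <- svyhist(~enroll, subset(d_grp, sch.wide == "Yes"), breaks = brks,
                 freq = TRUE, main = "sch.wide = Yes", col = "red")
h_no <- svyhist(~enroll, subset(d_grp, sch.wide == "No"), breaks = brks,
                freq = TRUE, main = "sch.wide = No", col = "yellow")

# svyhist() returns the histogram object invisibly; the bar heights
# in each panel should add up to 1
print(c(sum(h_yes$counts), sum(h_no$counts)))
```

With the weights summing to 1 inside each subgroup, the bar heights in each panel are that subgroup's own relative frequencies, so both panels can share one ylim.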
Thanks,
Pradip Muhuri
________________________________________
From: Muhuri, Pradip (SAMHSA/CBHSQ)
Sent: Thursday, October 18, 2012 4:45 PM
To: 'Thomas Lumley'
Cc: Anthony Damico; R help; Muhuri, Pradip (SAMHSA/CBHSQ)
Subject: RE: [R] svyboxplot - library (survey)
Hello Dr. Lumley,
Thank you for your advice/suggestions.
I have rescaled the weight (i.e., "original weight" divided by "total weighted count" averaged across 8 surveys - NHIS). As can be seen below (R console), the new weight sums to 1.
I have used the freq=TRUE argument in the svyhist () function along with a new svydesign object which includes the rescaled weight. There are two issues:
1) I am getting a warning message: In plot.histogram(h, ..., freq = freq, xlab = xlab, main = main) : the AREAS in the plot are wrong -- rather use freq=FALSE.
2) The scale of two graphs looks different (please see the attachment).
Any thoughts on how to resolve these issues?
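A possible explanation, offered as a sketch and not as a confirmed diagnosis. The warning is what plot.histogram gives when freq=TRUE is combined with bins of unequal width, and MyBreaks starts with a 7-year bin (18 to 25) followed by 10-year bins. The differing scales would follow if the weights sum to 1 over the whole sample, since each subset then carries only its own share of that total. One way around both is to skip the histogram machinery and estimate the share in each bin directly with svymean on a cut() factor. The example uses the apistrat data from the survey package; sch.wide stands in for xspd2, and the break points and the name enroll_grp are illustrative.

```r
library(survey)
data(api)
options(survey.lonely.psu = "adjust")

dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Unequal-width bins, in the spirit of MyBreaks (values here are illustrative)
brks <- c(0, 250, 500, 1000, 2000, Inf)
dstrat <- update(dstrat, enroll_grp = cut(enroll, brks, include.lowest = TRUE))

# Weighted share of each bin within a subgroup; the shares sum to 1 by
# construction, so the weights do not need to be rescaled
p_yes <- coef(svymean(~enroll_grp, subset(dstrat, sch.wide == "Yes")))
p_no  <- coef(svymean(~enroll_grp, subset(dstrat, sch.wide == "No")))

# Bars of equal width, so height alone carries the relative frequency
shares <- rbind(Yes = p_yes, No = p_no)
colnames(shares) <- sub("^enroll_grp", "", colnames(shares))
barplot(shares, beside = TRUE, col = c("red", "yellow"), ylim = c(0, 1),
        ylab = "Relative frequency", legend.text = rownames(shares))
```

Because every bar has the same width, the plot no longer implies anything through area, and the two subgroups sit on a common 0 to 1 scale.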
Regards,
Pradip Muhuri
###### R console is appended below ######
> options (width=120)
> sum (tor$new_wt)
[1] 1
>
> # object with survey design variables and data with new_wt (rescaled) that sums to 1
> xnhis <- svydesign (id=~psu,strat=~stratum, weights=~new_wt, data=tor, nest=TRUE)
>
> MyBreaks <- c(18, 25, 35, 45, 55, 65, 75, 85, 95)
>
> par(mfrow=c(2,2))
> # Chart 1
>
> options( survey.lonely.psu = "adjust" )
> svyhist (~age_p,
+ subset (xnhis, xspd2=='SPD'), breaks=MyBreaks,
+ #ylim = c(0,0.040),
+ main= " ", freq=TRUE,
+ col="red",
+ xlab="Age at Interview (SPD Category)"
+ )
Warning message:
In plot.histogram(h, ..., freq = freq, xlab = xlab, main = main) :
the AREAS in the plot are wrong -- rather use freq=FALSE
> #lines (svysmooth(~age_p, bandwidth=5,subset(nhis, xspd2=='SPD')), lwd=2)
>
> #Chart 2
>
> options( survey.lonely.psu = "adjust" )
> svyhist (~age_p,
+ subset (xnhis, xspd2=='No SPD'), breaks=MyBreaks,
+ #ylim = c(0,0.040),
+ main= " ", freq=TRUE,
+ col="yellow", xlab="Age at Interview (No SPD Category)"
+ )
Warning message:
In plot.histogram(h, ..., freq = freq, xlab = xlab, main = main) :
the AREAS in the plot are wrong -- rather use freq=FALSE
Pradip K. Muhuri
Statistician
Substance Abuse & Mental Health Services Administration
The Center for Behavioral Health Statistics and Quality
Division of Population Surveys
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260
e-mail: Pradip.Muhuri at samhsa.hhs.gov
The Center for Behavioral Health Statistics and Quality your feedback. Please click on the following link to complete a brief customer survey: http://cbhsqsurvey.samhsa.gov
-----Original Message-----
From: Thomas Lumley [mailto:tlumley at uw.edu]
Sent: Wednesday, October 17, 2012 11:13 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: Anthony Damico; R help
Subject: Re: [R] svyboxplot - library (survey)
On Thu, Oct 18, 2012 at 2:04 PM, Muhuri, Pradip (SAMHSA/CBHSQ)
<Pradip.Muhuri at samhsa.hhs.gov> wrote:
> Hello,
>
> I understand that svyhist () provides density histograms with density values on the y-axis (R code shown below). Is there a way one can have relative frequency histograms with relative frequencies on the y-axis?
You get frequencies just by asking for them with freq: compare
svyhist(~enroll, dstrat, main="Survey weighted",col="purple",freq=TRUE)
svyhist(~enroll, dstrat, main="Survey weighted",col="purple")
If you mean that you want the heights of the bars to sum to 1, the
simplest way I know of is to rescale the weights to sum to 1 and use
freq=TRUE
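A small sketch of that rescaling on the apistrat example data, assuming the default equal-width breaks. The names pw1 and dstrat1 are made up for the illustration, and fpc is dropped only to keep the example short.

```r
library(survey)
data(api)

# Rescale the sampling weights so that they sum to 1 over the whole sample
# (pw1 is a hypothetical column name)
apistrat$pw1 <- apistrat$pw / sum(apistrat$pw)

# fpc is omitted here; it affects standard errors, not the bar heights
dstrat1 <- svydesign(id = ~1, strata = ~stype, weights = ~pw1, data = apistrat)

# svyhist() returns the histogram object invisibly, so the heights can be checked
h <- svyhist(~enroll, dstrat1, freq = TRUE,
             main = "Survey weighted, relative frequency", col = "purple")

# The bar heights should add up to 1
print(sum(h$counts))
```

Note that this makes the heights sum to 1 for the full design only. A subset of dstrat1 keeps only its share of the total weight, so its bars would sum to less than 1.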
-thomas
> Any advice/help would be appreciated.
>
> Thanks,
>
> Pradip Muhuri
>
>
>
>
>
> ###### svyhist - Density Histogram
>
> options( survey.lonely.psu = "adjust" )
> svyhist (~age_p,
> subset (nhis, xspd2=='SPD'), breaks=MyBreaks,
> ylim = c(0,0.040),
> main= " ",
> col="red",
> xlab="Age at Interview (SPD Category)"
> )
> lines (svysmooth(~age_p, bandwidth=5,subset(nhis, xspd2=='SPD')), lwd=2)
>
>
> ________________________________________
> From: Anthony Damico [ajdamico at gmail.com]
> Sent: Monday, October 01, 2012 10:07 AM
> To: Muhuri, Pradip (SAMHSA/CBHSQ)
> Cc: R help
> Subject: Re: [R] svyboxplot - library (survey)
>
> using a slight modification of the example shown in ?svyboxplot
>
>
> # load survey library
> library(survey)
>
> # load example data
> data(api)
>
> # create an example svydesign
> dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat,
> fpc = ~fpc)
>
> # set the plot window to display 1 plot x 2 plots
> par(mfrow=c(1,2))
>
> # generate two example boxplots
> svyboxplot(enroll~stype,dstrat,all.outliers=TRUE)
> svyboxplot(enroll~1,dstrat)
>
> # done
>
>
>
> # alternative: not as nice
>
> # set the plot window to display 2 plots x 1 plot
> par(mfrow=c(2,1))
>
> # generate two example boxplots
> svyboxplot(enroll~stype,dstrat,all.outliers=TRUE)
> svyboxplot(enroll~1,dstrat)
>
> # done
>
>
>
>
>
>
>
> On Mon, Oct 1, 2012 at 9:50 AM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
> Hello,
>
> I have used the survey package to draw boxplots with the following code.
>
> Could anyone please tell me why I am getting only 1 boxplot instead of 2 (1-SPD, 2-No SPD)?
>
> What changes in the following code would be required to get 2 boxplots in the same plot frame?
>
> Thanks,
>
> Pradip
>
> ###################################################
> nhis <- svydesign (id=~psu, strat=~stratum, weights=~wt8,
> data=tor, nest=TRUE)
>
> svyboxplot (dthage~xspd2, subset (nhis, mortstat==1), col="gray80",
> varwidth=TRUE, ylab="Age at Death", xlab="SPD Status: 1-SPD, 2=No SPD")
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Thomas Lumley
Professor of Biostatistics
University of Auckland