[R] Boxplot

Jeffrey Joh johjeffrey at hotmail.com
Sun Nov 27 06:15:38 CET 2011

I'm trying to do the second case among Jim's suggestions.  I used Bert's suggestion and it works great.

I would also like to ask if anyone is familiar with a package for making boxplots.  I would like to bin my data points at defined X intervals and display a boxplot for each bin on the same chart.  In Stata, there is a tool for making these, and it varies the width of each box based on the number of points in its bin.  I am hoping there is a similar tool for R.
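
For what it's worth, base R's boxplot() has a varwidth argument that may come close to the Stata behaviour: with varwidth = TRUE, box widths are drawn proportional to the square root of the number of observations in each group. A minimal sketch, assuming a data frame called xypoints with columns x and y; the simulated data and the break points below are made up for illustration only:

```r
# Simulated stand-in for the real data; column names "x" and "y" are assumed.
set.seed(1)
xypoints <- data.frame(x = 100 * rbeta(10000, 2, 5), y = rnorm(10000))

# Bin x at defined break points (illustrative: every 10 units from 0 to 100).
breaks <- seq(0, 100, by = 10)
xypoints$bin <- cut(xypoints$x, breaks = breaks, include.lowest = TRUE)

# Drop bins that contain no points so every box has data behind it.
xypoints$bin <- droplevels(xypoints$bin)

# One box per bin on the same chart; varwidth = TRUE scales each box width
# by the square root of the number of points in that bin.
bp <- boxplot(y ~ bin, data = xypoints, varwidth = TRUE,
              xlab = "x bin", ylab = "y", las = 2)

# bp$n holds the number of points behind each box.
print(bp$n)
```

Note that the widths scale with the square root of the counts, not the counts themselves, so this is only an approximation of strictly proportional widths.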

Thank you,

> Date: Tue, 22 Nov 2011 18:51:05 +1100
> From: jim at bitwrit.com.au
> To: johjeffrey at hotmail.com
> CC: r-help at r-project.org
> Subject: Re: [R] Binned line plot
> On 11/22/2011 04:29 PM, Jeffrey Joh wrote:
> >
> > I have a scatter plot with 10000 points. I would like to add a line that bins every 50 points and connects the average of each bin. I'm looking for something similar to line type "m" in Stata.
> >
> > With this dataset of 10000 points, I would also like to bin the data and make boxplots at certain intervals, so that I have a set of boxplots to represent each bin. I would also like the width of each box to be proportional to the number of points in each bin.
> >
> > How can I make these plots? Is there a simple package to use?
> >
> Hi Jeffrey,
> There are three possibilities that come to mind:
> 1) You want to bin the points based on their order in the data frame.
> 2) You want to bin the points based on the x or y values of the coordinates.
> 3) You want to bin the points based on the x _and_ y values of the
> coordinates.
> Number 1 is trivial and has already been answered (assume a two-column
> data frame of coordinates named "xypoints").
> # Number 1: average each consecutive block of 50 rows
> binsize <- 50
> nbins <- nrow(xypoints) %/% binsize
> meanx <- numeric(nbins)
> meany <- numeric(nbins)
> for (index in seq_len(nbins)) {
>   rows <- (1 + binsize * (index - 1)):(binsize * index)
>   meanx[index] <- mean(xypoints[rows, "x"])
>   meany[index] <- mean(xypoints[rows, "y"])
> }
> # connect the bin averages with a line
> plot(meanx, meany, type = "l")
> Number 2 requires that you sort the pairs by whichever coordinate (x or y)
> you want to bin on, then apply the same process as in 1 to the sorted pairs.
> Number 3 is somewhat more difficult. I don't do this much, and some of the
> people who do map analysis will probably come up with a much better method,
> but one approach is:
> a) Find the most extreme point.
> b) Find the 49 points closest to that point; together they form a group.
> c) Remove those points from the data frame.
> d) Go back to a) if there are any points left.
> You will end up with 200 groups of 50 points that are spatially grouped.
> Get the centroid of each group and plot as above.
> Another wild guess from
> Jim
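
The grouping steps Jim describes for case 3 can be sketched roughly as below. This is only one reading of his outline, not a tested method: "most extreme" is assumed here to mean the point farthest from the centroid of the points still remaining, distances are plain Euclidean, and the data are simulated stand-ins for the real "xypoints".

```r
# Greedy spatial grouping: repeatedly peel off the most extreme point
# together with its nearest neighbours.
set.seed(42)
xypoints <- data.frame(x = rnorm(10000), y = rnorm(10000))  # simulated data
bin_size <- 50

remaining <- as.matrix(xypoints[, c("x", "y")])
n_groups <- ceiling(nrow(remaining) / bin_size)
centroids <- matrix(NA_real_, nrow = n_groups, ncol = 2,
                    dimnames = list(NULL, c("x", "y")))

for (g in seq_len(n_groups)) {
  # Assumption: the "most extreme" point is the one farthest from the
  # centroid of the points that have not been grouped yet.
  centre <- colMeans(remaining)
  d_centre <- sqrt(rowSums(sweep(remaining, 2, centre)^2))
  seed <- remaining[which.max(d_centre), ]

  # The seed point (distance 0) plus its 49 nearest neighbours form a group.
  d_seed <- sqrt(rowSums(sweep(remaining, 2, seed)^2))
  members <- order(d_seed)[seq_len(min(bin_size, nrow(remaining)))]

  # Record the group's centroid, then remove its points.
  centroids[g, ] <- colMeans(remaining[members, , drop = FALSE])
  remaining <- remaining[-members, , drop = FALSE]
}

# Show the group centroids on top of the raw scatter plot.
plot(xypoints, col = "grey", pch = ".")
points(centroids, pch = 19)
```

Because the groups are built greedily, the last few groups can end up made of scattered leftover points, so the result is worth checking visually.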
