[R] Grouped Histogram (colored)

Greg Snow Greg.Snow at imail.org
Sat Oct 18 03:24:06 CEST 2008


Is this more what you want?

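# Simulated times for three groups, each a mixture of three normals
g1 <- rnorm(100, rep( c(50,100,150), c(25,50,25)), 10 )
g2 <- rnorm(135, rep( c(55,95,145), c(30,75,30)), 11 )
g3 <- rnorm(90,  rep( c(45, 105, 150), c(30,40,20)), 9 )


# Take the breaks from the combined data so all groups share the same bins
tmp <- c(g1,g2,g3)
br <- hist(tmp, plot=FALSE)$breaks

# One row of bin counts per group
mydata <- list(g1,g2,g3)
h <- t(sapply( mydata, function(x) hist(x, breaks=br, plot=FALSE)$counts ))

# Stacked bars, one per bin; tmp2 holds the bar midpoints
tmp2 <- barplot(h, space=0, width=1,
        legend.text=c('Group 1','Group 2','Group 3'))

# Label the bin edges with the break values
axis(1, at=c( tmp2[1]-0.5, tmp2+0.5), labels=br)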
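
If your data are in a data frame shaped like the CSV in your original post, the same common-breaks idea could be sketched roughly as below. This is only a sketch: the data frame name 'events' and the simulated values are my assumptions (only the column names time and event come from your post). Splitting on the event strings means the legend is labelled with them directly.

```r
# Hypothetical data shaped like the CSV in the original question
# (columns 'time' and 'event'); the values here are simulated.
set.seed(1)
events <- data.frame(
  time  = c(rnorm(40, 10, 3), rnorm(60, 20, 4), rnorm(30, 28, 2)),
  event = rep(c("cookies", "pie", "coffee"), c(40, 60, 30))
)

# One vector of times per event; the list names are the event strings
by_event <- split(events$time, events$event)

# Common breaks from the combined data, then per-event counts in each bin
br <- hist(events$time, plot = FALSE)$breaks
counts <- t(sapply(by_event,
                   function(x) hist(x, breaks = br, plot = FALSE)$counts))

# Stacked bars, one per time bin, with the legend labelled by event name
mids <- barplot(counts, space = 0, width = 1, col = 2:4,
                legend.text = rownames(counts),
                xlab = "Time", ylab = "Frequency")
axis(1, at = c(mids[1] - 0.5, mids + 0.5), labels = br)
```

The rows of counts come out in the order of the sorted event names, so the colors follow that order rather than the order in the file.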

# Or

# Common x range, so every density is evaluated on the same grid
xl <- do.call( range, mydata )
tmp3 <- sapply( mydata, function(x) density(x, from=xl[1], to=xl[2]) )

# Rows of h2 are the group densities; h3 stacks them cumulatively
h2 <- do.call(rbind, tmp3['y',])
h3 <- apply(h2, 2, cumsum)

plot(tmp3[,1], ylim=c(0, max(h3)), type='n', xlab='Time',ylab='')

# Draw the tallest (cumulative) polygon first so the lower ones overlay it
for( i in rev(seq(along=h3[,1]))) {
        xi <- tmp3[['x',i]]
        polygon( c(xi[1], xi, xi[length(xi)]),
                        c(0, h3[i,], 0), col=i+1)
}

legend('topright', fill=2:4, legend=paste('Group',1:3))


Note that stacked plots like these are rarely more informative than the simple versions, and the extra colors can be distracting, which makes them less informative still.  It is hard to see the pattern of an individual group because you have to compare the heights of bars that do not share a baseline, which is much harder than comparing bars that start from a common base.
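
To illustrate the common-base point, here is a rough sketch that draws one histogram per group in stacked panels, all with the same breaks and the same y-axis. The list name 'groups' and the panel layout are my choices for illustration; the data are simulated the same way as in the example above.

```r
# Simulated groups, generated like g1, g2, g3 in the example above
set.seed(2)
groups <- list(
  "Group 1" = rnorm(100, rep(c(50, 100, 150), c(25, 50, 25)), 10),
  "Group 2" = rnorm(135, rep(c(55, 95, 145), c(30, 75, 30)), 11),
  "Group 3" = rnorm(90,  rep(c(45, 105, 150), c(30, 40, 20)), 9)
)

# Shared breaks and a shared y limit keep the panels directly comparable
br <- hist(unlist(groups), plot = FALSE)$breaks
counts <- lapply(groups,
                 function(x) hist(x, breaks = br, plot = FALSE)$counts)
ymax <- max(unlist(counts))

# One panel per group; every bar starts from zero
op <- par(mfrow = c(length(groups), 1), mar = c(3, 4, 2, 1))
for (i in seq_along(groups)) {
  hist(groups[[i]], breaks = br, ylim = c(0, ymax), col = i + 1,
       main = names(groups)[i], xlab = "")
}
par(op)  # restore the previous graphics settings
```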

If you are interested in the overall pattern, then just plot the histogram/density of the combined data.  If you want to compare shapes/patterns between the groups, then something like:

yl <- do.call( range, tmp3['y',] )

plot( xl, yl, type='n', xlab='Time',ylab='')
for (i in seq(along=h3[,1]) ) lines( tmp3[['x',i]], tmp3[['y',i]], col=i+1)

legend('topright', col=2:4, lty=1, paste('Group',1:3))


Makes that a lot easier.
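
For the overall pattern, a minimal sketch of the combined histogram with a density curve on top might look like this. The vector name 'combined' is mine, and the data are simulated the same way as the groups above.

```r
# Simulated data for all three groups, pooled into one vector
set.seed(3)
combined <- c(
  rnorm(100, rep(c(50, 100, 150), c(25, 50, 25)), 10),
  rnorm(135, rep(c(55, 95, 145), c(30, 75, 30)), 11),
  rnorm(90,  rep(c(45, 105, 150), c(30, 40, 20)), 9)
)

# freq = FALSE puts the histogram on the density scale,
# so the density curve can be drawn over it
d <- density(combined)
hist(combined, freq = FALSE, xlab = "Time", main = "All groups combined")
lines(d, lwd = 2)
```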

Hope this helps,





--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of x0rr0x
> Sent: Friday, October 17, 2008 12:07 PM
> To: r-help at r-project.org
> Subject: Re: [R] Grouped Histogram (colored)
>
>
> here are two graphs from spss which may help illustrate my needs ;-)
>
> http://pics.foruni.de/getimg/balken_time_one_censored.jpg
>
> http://pics.foruni.de/getimg/one_time_sample0_censored.jpg
>
> thanks a lot for your time and energy!
>
> Regards
>
>
>
>
> Greg Snow-2 wrote:
> >
> > I don't understand what you want, do you want 3 different histograms on 1
> > plot? Do you want it to look like a barplot with side by side bars (rather
> > than stacked)?
> >
> > For labeling the colors, you can use the legend function to add a legend
> > to a plot, or you can use the text function to place text directly on a
> > plot.
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.snow at imail.org
> > 801.408.8111
> >
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> >> project.org] On Behalf Of x0rr0x
> >> Sent: Friday, October 17, 2008 3:02 AM
> >> To: r-help at r-project.org
> >> Subject: Re: [R] Grouped Histogram (colored)
> >>
> >>
> >> first of all: thank you for your replies!
> >>
> >>
> >> hadley wrote:
> >> >
> >> > On Thu, Oct 16, 2008 at 11:42 AM, x0rr0x <till.salzgeber at gmail.com> wrote:
> >> >>
> >> >> Hi all,
> >> >>
> >> >> I'm trying to create a histogram which shows the frequency of variables
> >> >> within a certain timeframe.
> >> >>
> >> >> I've been using SPSS before, but I didn't quite like it...
> >> >>
> >> >> To describe my problem further here are some example variables:
> >> >>
> >> >> the "event" is actually a string which I recoded using:
> >> >> [code]
> >> >> data$event_class = as.numeric(as.factor(data$event))
> >> >> [/code]
> >> >> I've recoded them into numerics
> >> >>
> >> >>
> >> >> csv:
> >> >> [code]
> >> >> time,event,event_class
> >> >> 01,cookies,1
> >> >> 05,cookies,1
> >> >> 06,pie,2
> >> >> 07,coffee,3
> >> >> 08,cookies,1
> >> >> 30,pie,2
> >> >> 31,coffee,3
> >> >> [/code]
> >> >> and so on...
> >> >>
> >> >> Now I'd like to create a histogram where X is the time, the color of the
> >> >> area is the event_class
> >> >> and Y is defined by the frequency of event_class around some accumulated
> >> >> time
> >> >
> >> > install.packages("ggplot2")
> >> > library(ggplot2)
> >> >
> >> > qplot(time, fill = event, data = mydata, geom = "histogram")
> >> >
> >> > You can find out more about ggplot2 at http://had.co.nz/ggplot2 - it's
> >> > inspired by the Grammar of Graphics, which is also the theory that
> >> > underlies SPSS's plotting systems.
> >> >
> >> > Hadley
> >> >
> >> > --
> >> > http://had.co.nz/
> >> >
> >> > ______________________________________________
> >> > R-help at r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide
> >> > http://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal, self-contained, reproducible code.
> >> >
> >> >
> >>
> >> In install.packages("ggplot2") : package ‘ggplot2’ is not available
> >> strange....
> >>
> >>
> >> Greg Snow-2 wrote:
> >> >
> >> > Does this do what you want?
> >> >
> >> > colhist <- function(x,col){
> >> >          tmp <- hist(x,plot=FALSE)
> >> >          br <- tmp$breaks
> >> >          w <- as.numeric(cut(x,br,include.lowest=TRUE))
> >> >          sy <- unlist(lapply(tmp$counts,function(x)seq(length=x)))
> >> >          sy <- sy[order(order(x))]
> >> >          plot( range(br), range( 0, sy ), xlab=deparse(substitute(x)),
> >> >                ylab='Frequency', type='n')
> >> >          rect(br[w], sy-1, br[w+1], sy,
> >> >             col=col,
> >> >             border=NA)
> >> >          rect(br[-length(br)], 0, br[-1], tmp$counts)
> >> >      }
> >> >
> >> > x <- rnorm(75, rep( c(90,100,110), each=25), 5 )
> >> > g <- rep( c('red','green','blue'), each=25 )
> >> >
> >> > colhist(x,g)
> >> >
> >> > note: this colhist function is a modified version of the one from the
> >> > help file for the tkBrush function in the TeachingDemos package.
> >> >
> >> > --
> >> > Gregory (Greg) L. Snow Ph.D.
> >> > Statistical Data Center
> >> > Intermountain Healthcare
> >> > greg.snow at imail.org
> >> > 801.408.8111
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> >> >> project.org] On Behalf Of x0rr0x
> >> >> Sent: Thursday, October 16, 2008 10:42 AM
> >> >> To: r-help at r-project.org
> >> >> Subject: [R] Grouped Histogram (colored)
> >> >>
> >> >>
> >> >> Hi all,
> >> >>
> >> >> I'm trying to create a histogram which shows the frequency of variables
> >> >> within a certain timeframe.
> >> >>
> >> >> I've been using SPSS before, but I didn't quite like it...
> >> >>
> >> >> To describe my problem further here are some example variables:
> >> >>
> >
> >> >> the "event" is actually a string which I recoded using:
> >> >> [code]
> >> >> data$event_class = as.numeric(as.factor(data$event))
> >> >> [/code]
> >> >> I've recoded them into numerics
> >> >>
> >> >>
> >> >> csv:
> >> >> [code]
> >> >> time,event,event_class
> >> >> 01,cookies,1
> >> >> 05,cookies,1
> >> >> 06,pie,2
> >> >> 07,coffee,3
> >> >> 08,cookies,1
> >> >> 30,pie,2
> >> >> 31,coffee,3
> >> >> [/code]
> >> >> and so on...
> >> >>
> >> >> Now I'd like to create a histogram where X is the time, the color of the
> >> >> area is the event_class
> >> >> and Y is defined by the frequency of event_class around some accumulated
> >> >> time
> >> >>
> >> >>
> >> >>
> >> >> In SPSS I used this:
> >> >> Graphs -> Chart Builder
> >> >> Gallery->Histogram
> >> >> use some horizontal histogram
> >> >> put "time" on the x-axis
> >> >> select "grouping/stacking variables" in the "groups/point id" tab
> >> >> and then set "Stack: set-color" to event_class
> >> >> the y-axis will be automatically set to "histogram"
> >> >>
> >> >> thanks a lot in advance!
> >> >>
> >> >> Regards,
> >> >> - x0rr0x
> >> >> --
> >> >> View this message in context: http://www.nabble.com/Grouped-
> >> Histogram-
> >> >> %28colored%29-tp20015941p20015941.html
> >> >> Sent from the R help mailing list archive at Nabble.com.
> >> >>
> >> >
> >> >
> >>
> >> not quite.
> >>
> >> maybe I described it wrong. the occurrences of event_class shouldn't add up
> >> till the end. the graph should create blocks in certain time frames and then
> >> display how often each event_class turns up during this timeframe.
> >>
> >> also, is there a way to label the colors, preferably with their "data$event"
> >> strings?
> >>
> >>
> >> --
> >> View this message in context: http://www.nabble.com/Grouped-
> Histogram-
> >> %28colored%29-tp20015941p20029605.html
> >> Sent from the R help mailing list archive at Nabble.com.
> >>
> >
> >
>
> --
> View this message in context: http://www.nabble.com/Grouped-Histogram-
> %28colored%29-tp20015941p20038094.html
> Sent from the R help mailing list archive at Nabble.com.
>

