[R] geom_bar with missing data in package ggplot
Dennis Murphy
djmuser at gmail.com
Wed Nov 16 13:22:45 CET 2011
Hi:
Here's one way, but it puts the two countries side by side rather than
stacked (I'm not a big fan of stacked bar charts except in certain
contexts). The first version uses the original data, but one can see
immediately that there is no distinction between NA and 0:
ggplot(g, aes(x = Date, y = value, fill = var2)) +
    geom_bar(position = 'dodge', stat = 'identity') +
    facet_wrap(~ variable, nrow = 1) +
    scale_fill_manual('Country', breaks = levels(g$var2),
                      values = c('red', 'blue')) +
    opts(legend.position = c(0.87, 0.88),
         legend.background = theme_rect(fill = 'white'))
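For readers on a more recent ggplot2: opts() and theme_rect() have since been replaced by theme() and element_rect(). A rough equivalent of the call above might look like the sketch below. It assumes a ggplot2 version that provides geom_col(), rebuilds g compactly from the values in the original post, and names the colours explicitly because var2 is a character column in the posted data, so levels(g$var2) returns NULL.

```r
library(ggplot2)

# g rebuilt from the values in the original post (same dates, same order)
g <- data.frame(
  Date     = rep(as.Date(c(11322, 11687, 12052), origin = '1970-01-01'), times = 4),
  variable = rep(c('Govt Revenues to GDP', 'Structural Budget Position'), each = 6),
  var2     = rep(rep(c('United States', 'Japan'), each = 3), times = 2),
  value    = c(NA, 34.288, 31.831, 29.636, 30.539, 29.093,
               NA, 0, -2.7, -7.4, -5.7, -7),
  stringsAsFactors = FALSE
)

# geom_col() is shorthand for geom_bar(stat = 'identity')
p <- ggplot(g, aes(x = Date, y = value, fill = var2)) +
  geom_col(position = 'dodge') +
  facet_wrap(~ variable, nrow = 1) +
  scale_fill_manual('Country',
                    values = c('Japan' = 'red', 'United States' = 'blue')) +
  # a numeric legend.position places the legend inside the panel area;
  # newer releases may warn and prefer legend.position.inside instead
  theme(legend.position = c(0.87, 0.88),
        legend.background = element_rect(fill = 'white'))

# print(p) draws the plot; rows with NA values are dropped with a warning
```

As in the original call, the NA and the zero both show up as an empty slot, so this only modernises the syntax.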
To compensate, I copied the data to a new object g2 and replaced the
zero (row 8) with a small negative value so that its bar becomes visible:
g2 <- g
g2$value[8] <- -0.01
ggplot(g2, aes(x = Date, y = value, fill = var2)) +
    geom_bar(position = 'dodge', stat = 'identity') +
    facet_wrap(~ variable, nrow = 1) +
    scale_fill_manual('Country', breaks = levels(g2$var2),
                      values = c('red', 'blue')) +
    opts(legend.position = c(0.87, 0.88),
         legend.background = theme_rect(fill = 'white'))
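If hard-coding row 8 feels fragile, the zero can also be located programmatically. A minimal base R sketch, with g rebuilt from the posted values; the name is_zero is only illustrative:

```r
# g rebuilt from the values in the original post (same dates, same order)
g <- data.frame(
  Date     = rep(as.Date(c(11322, 11687, 12052), origin = '1970-01-01'), times = 4),
  variable = rep(c('Govt Revenues to GDP', 'Structural Budget Position'), each = 6),
  var2     = rep(rep(c('United States', 'Japan'), each = 3), times = 2),
  value    = c(NA, 34.288, 31.831, 29.636, 30.539, 29.093,
               NA, 0, -2.7, -7.4, -5.7, -7),
  stringsAsFactors = FALSE
)

g2 <- g

# TRUE only for genuine zeros; the !is.na() guard keeps NA rows out,
# since NA == 0 evaluates to NA rather than FALSE
is_zero <- !is.na(g2$value) & g2$value == 0

# replace every true zero with the same small negative stand-in
g2$value[is_zero] <- -0.01
```

The missing values stay NA in g2, so they still leave a gap while the former zero gets a thin visible bar.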
An additional improvement could be made by keeping the original data
and adding some text that indicates where the NAs reside; to do this,
we need to offset the date a bit so that the text sits over the empty
slot rather than on the tick mark:
ggplot(g, aes(x = Date, y = value, fill = var2)) +
    geom_bar(position = 'dodge', stat = 'identity') +
    facet_wrap(~ variable, nrow = 1) +
    scale_fill_manual('Country', breaks = levels(g$var2),
                      values = c('red', 'blue')) +
    geom_text(aes(x = as.Date('2001-3-31'), y = 1, label = 'NA'),
              size = 6) +
    opts(legend.position = c(0.87, 0.88),
         legend.background = theme_rect(fill = 'white'))
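The fixed date in geom_text() works here because both panels are missing the same observation. If NAs can turn up elsewhere, the label positions could instead be derived from the data. The following is only a sketch in current ggplot2 syntax: na_lab and its x, y and label columns are names I made up, the 90-day offset simply mirrors the hand-picked date above, and the sign of the offset assumes the dodge order is alphabetical (Japan left, United States right).

```r
library(ggplot2)

# g rebuilt from the values in the original post (same dates, same order)
g <- data.frame(
  Date     = rep(as.Date(c(11322, 11687, 12052), origin = '1970-01-01'), times = 4),
  variable = rep(c('Govt Revenues to GDP', 'Structural Budget Position'), each = 6),
  var2     = rep(rep(c('United States', 'Japan'), each = 3), times = 2),
  value    = c(NA, 34.288, 31.831, 29.636, 30.539, 29.093,
               NA, 0, -2.7, -7.4, -5.7, -7),
  stringsAsFactors = FALSE
)

# one label row per missing observation, keeping the facet variable
na_lab <- g[is.na(g$value), c('Date', 'variable', 'var2')]

# shift each label towards the slot its bar would have occupied
# (assumption: Japan is dodged to the left, United States to the right)
na_lab$x     <- na_lab$Date + ifelse(na_lab$var2 == 'Japan', -90, 90)
na_lab$y     <- 1
na_lab$label <- 'NA'

p <- ggplot(g, aes(x = Date, y = value, fill = var2)) +
  geom_col(position = 'dodge') +
  facet_wrap(~ variable, nrow = 1) +
  scale_fill_manual('Country',
                    values = c('Japan' = 'red', 'United States' = 'blue')) +
  # the labels use their own data and aesthetics, so only the
  # panels that actually contain an NA get one
  geom_text(data = na_lab, aes(x = x, y = y, label = label),
            size = 6, inherit.aes = FALSE)

# print(p) draws the plot
```

With the posted data this puts one label in each panel at the same date used above; setting further values to NA would add labels without touching the plotting code.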
HTH,
Dennis
On Wed, Nov 16, 2011 at 1:55 AM, Aidan Corcoran
<aidan.corcoran11 at gmail.com> wrote:
> Dear all,
>
> I was hoping someone could help with a ggplot question. I would like
> to generate a faceted bar chart, but missing data are causing
> problems.
>
> g<-structure(list(Date = structure(c(11322, 11687, 12052, 11322,
> 11687, 12052, 11322, 11687, 12052, 11322, 11687, 12052), class = "Date"),
> variable = c("Govt Revenues to GDP", "Govt Revenues to GDP",
> "Govt Revenues to GDP", "Govt Revenues to GDP", "Govt Revenues to GDP",
> "Govt Revenues to GDP", "Structural Budget Position",
> "Structural Budget Position",
> "Structural Budget Position", "Structural Budget Position",
> "Structural Budget Position", "Structural Budget Position"
> ), var2 = c("United States", "United States", "United States",
> "Japan", "Japan", "Japan", "United States", "United States",
> "United States", "Japan", "Japan", "Japan"), value = c(NA,
> 34.288, 31.831, 29.636, 30.539, 29.093, NA, 0, -2.7, -7.4,
> -5.7, -7)), .Names = c("Date", "variable", "var2", "value"
> ), row.names = c(21L, 22L, 23L, 169L, 170L, 171L, 206L, 207L,
> 208L, 354L, 355L, 356L), class = "data.frame")
>
> gp <- ggplot(g, aes(Date, value))
> gp<- gp + geom_line()
> gp <-gp + facet_grid(var2 ~ variable)
> gp
>
> this works, but trying to get a bar chart version
>
> gp <- ggplot(g, aes(Date, value))
> gp<- gp + geom_bar(stat="identity")
> gp <-gp + facet_grid(var2 ~ variable)
> gp
>
> gives the error
> Error in if (!is.null(data$ymin) && !all(data$ymin == 0))
> warning("Stacking not well defined when ymin != 0", :
> missing value where TRUE/FALSE needed
>
> Is there something I can do to have a gap for missing data, as happens
> with the line version?
>
> More generally, I may also have missing data between present data: e.g.
>
> is.na(g[5,4])<-TRUE
>
> and I would like if possible to simply see gaps at these points.
>
> Thanks for any help.
>
> Aidan