[R] Constructing stacked bar plot

Jeff Reichman re|chm@nj @end|ng |rom @bcg|ob@|@net
Sun Jun 27 18:10:26 CEST 2021


R-help Forum

I am attempting to create a stacked bar chart, but I have too many categories.
The following code works and I end up plotting all 134 countries, but I really
only need (say) the top 50 or so.

I am trying to figure out how to filter the data further so that only the
countries with the largest total medal counts are plotted. The line that
creates medal_data (the filter() call just before the plot) is where I think
this should happen. I've tried several different methods, but to no avail.
Any suggestions?


# Load data file matching NOCs with map regions (countries)
noc <- read_csv("~/NGA_Files/JuneMakeoverMonday/noc_regions.csv",
                col_types = cols(
                  NOC = col_character(),
                  region = col_character()
                ))

# Add regions to data and remove missing points
data_regions <- data %>% 
  left_join(noc,by="NOC") %>%
  filter(!is.na(region))

# Subset to variables of interest
medals <- data_regions %>% 
  select(region, Medal)

# count number of medals awarded to each Team
medal_counts_ctry <- medals %>% filter(!is.na(Medal))%>%
  group_by(region, Medal) %>%
  summarize(Count=length(Medal)) 

#head(medal_counts_ctry)

# order Team by total medal count
levs_medal <- medal_counts_ctry %>%
  group_by(region) %>%
  summarize(Total=sum(Count)) %>%
  arrange(desc(Total))

medal_counts_ctry$region <- factor(medal_counts_ctry$region,
                                   levels = levs_medal$region)

# Attempted filter to keep only the top countries (the step that is not working)
medal_data <- medal_counts_ctry %>% filter(medal_counts_ctry$.rows > 100)

# plot
ggplot(medal_data, aes(x=region, y=Count, fill=Medal)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(values=c("darkorange3","darkgoldenrod1","cornsilk3")) +
  ggtitle("Historical medal counts from Country Teams") +
  theme(plot.title = element_text(hjust = 0.5))
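
A minimal sketch of one way the top-N filtering could be done, assuming dplyr 1.0.0 or later (for slice_head() and the .groups argument). The toy data and the name top_n_regions are made up for illustration; with the real data the cutoff would be 50 rather than 2:

```r
library(dplyr)

# Toy stand-in for medal_counts_ctry (values are invented for illustration)
medal_counts_ctry <- tibble(
  region = c("A", "A", "A", "B", "B", "C", "D", "D"),
  Medal  = c("Bronze", "Gold", "Silver", "Bronze", "Gold", "Gold",
             "Bronze", "Silver"),
  Count  = c(10L, 20L, 15L, 3L, 4L, 1L, 30L, 5L)
)

top_n_regions <- 2  # hypothetical cutoff; the question asks for about 50

# Total medals per region, largest first, keeping only the top N regions
levs_medal <- medal_counts_ctry %>%
  group_by(region) %>%
  summarize(Total = sum(Count), .groups = "drop") %>%
  arrange(desc(Total)) %>%
  slice_head(n = top_n_regions)

# Keep only the rows for those regions and order the factor by total
medal_data <- medal_counts_ctry %>%
  ungroup() %>%
  filter(region %in% levs_medal$region) %>%
  mutate(region = factor(region, levels = levs_medal$region))
```

The resulting medal_data can be passed to the same ggplot() call. Because the factor levels contain only the retained regions, the dropped countries do not show up as empty bars.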


> str(medal_counts_ctry)
grouped_df [323 x 3] (S3: grouped_df/tbl_df/tbl/data.frame)
 $ region: Factor w/ 134 levels "USA","Russia",..: 101 70 70 70 29 29 29 73 73 73 ...
 $ Medal : Factor w/ 3 levels "Bronze","Gold",..: 1 1 2 3 1 2 3 1 2 3 ...
 $ Count : int [1:323] 2 8 5 4 91 91 92 9 2 5 ...
 - attr(*, "groups")= tibble [134 x 2] (S3: tbl_df/tbl/data.frame)
  ..$ region: Factor w/ 134 levels "USA","Russia",..: 1 2 3 4 5 6 7 8 9 10 ...
  ..$ .rows : list<int> [1:134] 
  .. ..$ : int [1:3] 307 308 309
  .. ..$ : int [1:3] 235 236 237
  .. ..$ : int [1:3] 102 103 104
  .. ..$ : int [1:3] 296 297 298
  .. ..$ : int [1:3] 95 96 97
  .. ..$ : int [1:3] 138 139 140
  .. ..$ : int [1:3] 263 264 265
  .. ..$ : int [1:3] 46 47 48
  .. ..$ : int [1:3] 11 12 13
  .. ..$ : int [1:3] 117 118 119
  .. ..$ : int [1:3] 194 195 196
  .. ..$ : int [1:3] 208 209 210
  .. ..$ : int [1:3] 52 53 54
  .. ..$ : int [1:3] 147 148 149
  .. ..$ : int [1:3] 92 93 94
  .. ..$ : int [1:3] 266 267 268
  .. ..$ : int [1:3] 232 233 234
  .. ..$ : int [1:3] 69 70 71
  .. ..$ : int [1:3] 253 254 255 ..........
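
The str() output above lists .rows inside the "groups" attribute rather than among the data columns, which may be why filtering on medal_counts_ctry$.rows gets nowhere. A small sketch with made-up data, using dplyr's group_data() to inspect that attribute:

```r
library(dplyr)

# Tiny grouped tibble, invented for illustration
grouped <- tibble(
  region = c("A", "A", "B"),
  Count  = c(1L, 2L, 3L)
) %>%
  group_by(region)

# Columns of the data itself: no .rows here
data_cols <- names(grouped)

# Columns of the grouping metadata: this is where .rows lives
group_cols <- names(group_data(grouped))

# .rows holds row positions per group, not medal totals
has_rows_column <- ".rows" %in% data_cols
```

Since .rows only records which row positions belong to each group, it is not a measure of how many medals a country has, so any filter on totals would need a summed Count instead.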

Jeff Reichman
