[R] dotchart and dotplot(lattice) plot with two/three conditioning variables

Anupam Tyagi @nupty@g| @end|ng |rom gm@||@com
Wed Sep 4 15:16:18 CEST 2024


Hello, I am trying to make a Cleveland dot plot with two, if possible
three, variables on the vertical axis. I was able to do it in Stata
with two variables, Year and Population (see graph at the link:
https://drive.google.com/file/d/1SiIfmmqk6IFa_OI5i26Ux1ZxkN2oek-o/view?usp=sharing
). I hope the link to the graph works. I have never tried this before.

I want to make a similar (possibly better) graph in R. I tried several
ways to make it in R with dotchart() and dotplot(lattice). I have been
only partially successful thus far. I would like Year, Population and
popGroup on the vertical axis. If popGroup occupies too much space,
then I would like a gap between the groups of Cities and Villages, so
they can be seen as distinct "Populations". My code and made-up data
are below (the actual data have 18 categories in "Population",
instead of only six in the made-up data). How can I make this type of
graph?

# Only for 2004-05. How to plot 2011-12 on the same plot?
y0405 <- test$Year == "2004-05"   # rows for the first survey year
dotchart(test$X0_50[y0405], labels = test$Population[y0405],
         xlab = "Income Share",
         main = "Income shares of percentiles of population", xlim = c(12, 50))
points(test$X50_90[y0405], 1:6, pch = 2)
points(test$X90_100[y0405], 1:6, pch = 16)
legend(x = "topleft",
       legend = c("0-50%", "50-90%", "90-100%"),
       pch = c(1, 2, 16))
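One possible way to get both years onto a single base-R chart is to pass dotchart() a matrix instead of a vector. The sketch below is only an illustration under assumptions: it rebuilds just the columns it needs from the made-up data in this post, it handles the 0-50% share only, and the object names d and m are made up for the example.

```r
# Subset of the made-up data from this post (0-50% share only).
d <- data.frame(
  Population = rep(c("Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"),
                   times = 2),
  Year       = rep(c("2004-05", "2011-12"), each = 6),
  X0_50      = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
                 17.43, 17.99, 18.04, 14.95, 16.33, 28.98)
)

# One row per Year, one column per Population. Each cell holds a single
# value, so mean() simply returns it.
m <- tapply(d$X0_50, list(d$Year, d$Population), mean)

# For a matrix, dotchart() uses the rows as labels and the columns as groups,
# so each Population becomes a group containing one dot per Year.
dotchart(m, xlab = "Income share, 0-50%", xlim = c(12, 50),
         main = "Income shares of percentiles of population")
```

The columns of m come out in alphabetical order; reordering them (for example by seqCode) would be a separate step.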

# Reorder so Year 2004-05 is plotted before Year 2011-12. This is not
# plotting correctly for the second and third variables. The gap between
# different Cities and Villages is quite large.
test2 <- test[order(test$seqCode, test$Year, decreasing = TRUE), ]

dotchart(test2$X0_50, labels = test2$Year, xlab = "Income Share",
         main = "Income shares of percentiles of population",
         groups = as.factor(test2$Population), xlim = c(12, 50))
points(test2$X50_90, 1:12, pch = 2)
points(test2$X90_100, 1:12, pch = 16)
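As far as I can tell, part of the problem with the attempt above is that dotchart() leaves gaps between groups, so the rows no longer sit at heights 1:12. One way around this is to compute the row heights by hand and draw everything with plot(), points() and mtext(). The following is a minimal sketch under assumptions, not a finished solution: the data are the made-up values from this post rebuilt compactly, the names d, pos, y, popMid and grpTop are made up for the example, and the gap sizes and margin lines are arbitrary choices.

```r
# Made-up data from this post, rebuilt compactly (same values as the dput()).
test <- data.frame(
  seqCode    = rep(1:6, times = 2),
  popGroup   = rep(rep(c("City", "Village"), each = 3), times = 2),
  Population = rep(c("Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"),
                   times = 2),
  Year       = rep(c("2004-05", "2011-12"), each = 6),
  X0_50   = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
              17.43, 17.99, 18.04, 14.95, 16.33, 28.98),
  X50_90  = c(44.12, 43.25, 45.72, 46.15, 43.84, 46.24,
              44.39, 44.08, 43.62, 42.89, 44.57, 47.14),
  X90_100 = c(40.42, 35.47, 36.24, 38.24, 37.27, 29.39,
              38.18, 37.93, 38.34, 42.16, 39.11, 23.88)
)

# Rows in top-to-bottom order: by seqCode, then 2004-05 before 2011-12.
d <- test[order(test$seqCode, test$Year), ]

# Row positions: one unit per row, one extra unit at each change of
# Population, and two further units at each change of popGroup.
newPop <- c(FALSE, d$Population[-1] != d$Population[-nrow(d)])
newGrp <- c(FALSE, d$popGroup[-1] != d$popGroup[-nrow(d)])
pos <- seq_len(nrow(d)) + cumsum(newPop) + 2 * cumsum(newGrp)
y <- max(pos) + 1 - pos          # flip so the first row is at the top

op <- par(mar = c(4, 11, 3, 1))  # wide left margin for three label columns
plot(NA, type = "n", xlim = c(12, 50), ylim = c(0, max(y) + 3),
     yaxt = "n", xlab = "Income share", ylab = "",
     main = "Income shares of percentiles of population")
abline(h = y, col = "gray", lty = "dotted")
points(d$X0_50,   y, pch = 1)
points(d$X50_90,  y, pch = 2)
points(d$X90_100, y, pch = 16)

# Year next to each row, Population centred on its two rows, popGroup as a
# bold heading in the gap above each group.
mtext(d$Year, side = 2, line = 0.5, at = y, las = 1, cex = 0.8)
popMid <- tapply(y, d$Population, mean)
mtext(names(popMid), side = 2, line = 5, at = popMid, las = 1)
grpTop <- tapply(y, d$popGroup, max) + 1.5
mtext(names(grpTop), side = 2, line = 10, at = grpTop, las = 1,
      adj = 0, font = 2)
legend("top", legend = c("0-50%", "50-90%", "90-100%"),
       pch = c(1, 2, 16), horiz = TRUE, bty = "n")
par(op)
```

Because the positions are computed from the data rather than typed in, the same code should extend to the 18 categories in the actual data, although the margins would probably need adjusting.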


# use lattice library
library(lattice)
dotplot(reorder(Population, -seqCode) ~ X0_50 + X50_90 + X90_100,
        data = test, auto.key = TRUE)

testLong <- reshape(test, idvar = c("Population", "Year"), varying = list(5:7),
                           v.names = "ptile", direction = "long")

dotplot(reorder(Population, -seqCode) ~ ptile | Year, data = testLong,
        groups = time, auto.key = TRUE)
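A possible lattice variant puts Year on the vertical axis of each panel and stacks one panel per Population, with extra space between the City and Village panels. This is a hedged sketch, assuming the made-up data from this post: the names long, band, lab and panel are made up for the example, and the panel labels of the form "City: Dallas" are just one way to show popGroup.

```r
library(lattice)

# Made-up data from this post, rebuilt compactly (same values as the dput()).
test <- data.frame(
  seqCode    = rep(1:6, times = 2),
  popGroup   = rep(rep(c("City", "Village"), each = 3), times = 2),
  Population = rep(c("Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"),
                   times = 2),
  Year       = rep(c("2004-05", "2011-12"), each = 6),
  X0_50   = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
              17.43, 17.99, 18.04, 14.95, 16.33, 28.98),
  X50_90  = c(44.12, 43.25, 45.72, 46.15, 43.84, 46.24,
              44.39, 44.08, 43.62, 42.89, 44.57, 47.14),
  X90_100 = c(40.42, 35.47, 36.24, 38.24, 37.27, 29.39,
              38.18, 37.93, 38.34, 42.16, 39.11, 23.88)
)

# Long format: one row per Population, Year and percentile band.
long <- reshape(test, idvar = c("Population", "Year"),
                varying = list(c("X0_50", "X50_90", "X90_100")),
                v.names = "ptile", timevar = "band",
                times = c("0-50%", "50-90%", "90-100%"),
                direction = "long")
long$band <- factor(long$band)

# Factor levels are drawn bottom to top, so list 2011-12 first to get
# 2004-05 as the upper row in each panel.
long$Year <- factor(long$Year, levels = c("2011-12", "2004-05"))

# One panel per Population, labelled with its popGroup, ordered by seqCode.
lab <- paste(long$popGroup, long$Population, sep = ": ")
long$panel <- factor(lab, levels = unique(lab[order(long$seqCode)]))

p <- dotplot(Year ~ ptile | panel, data = long, groups = band,
             layout = c(1, 6), as.table = TRUE,
             between = list(y = c(0, 0, 1, 0, 0)),  # gap after third panel
             auto.key = list(columns = 3),
             xlab = "Income share",
             main = "Income shares of percentiles of population")
print(p)
```

With 18 categories in the actual data, a single column of panels may become too tall; changing layout to two columns, one for Cities and one for Villages, would be an alternative to the gap.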

The data frame, named "test" in my code, is given below as dput() output.

structure(list(seqCode = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L,
4L, 5L, 6L), popGroup = c("City", "City", "City", "Village",
"Village", "Village", "City", "City", "City", "Village", "Village",
"Village"), Population = c("Dallas", "Boston", "Chicago", "Kip",
"Von", "Dan", "Dallas", "Boston", "Chicago", "Kip", "Von", "Dan"
), Year = c("2004-05", "2004-05", "2004-05", "2004-05", "2004-05",
"2004-05", "2011-12", "2011-12", "2011-12", "2011-12", "2011-12",
"2011-12"), X0_50 = c(15.47, 21.29, 18.04, 15.62, 18.89, 24.37,
17.43, 17.99, 18.04, 14.95, 16.33, 28.98), X50_90 = c(44.12,
43.25, 45.72, 46.15, 43.84, 46.24, 44.39, 44.08, 43.62, 42.89,
44.57, 47.14), X90_100 = c(40.42, 35.47, 36.24, 38.24, 37.27,
29.39, 38.18, 37.93, 38.34, 42.16, 39.11, 23.88)),
class = "data.frame", row.names = c(NA, -12L))

--
Anupam.


