[R] graphically representing frequency of words in a speech?
Mike Lawrence
Mike.Lawrence at dal.ca
Mon Jun 8 02:00:16 CEST 2009
Below are various attempts using ggplot2
(http://had.co.nz/ggplot2/). First I try random positioning, then
random positioning with alpha, then a quasi-random position scheme in
polar coordinates:
#this demo has random number generation
# so best to set a seed to make it
# reproducible.
set.seed(1)
#generate some fake data
a = data.frame(
    word = month.name
    , freq = sample(1:10, 12, replace = TRUE)
)
#add arbitrary location information
a$x = sample(1:12,12)
a$y = sample(1:12,12)
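If you want to start from real text rather than fake data, something like the following might work for building the word/freq table. This is only a minimal sketch in base R: the `speech` string, the `word_freq` name and the splitting regex are my own placeholders, and real text would probably also need stop-word removal.

```r
# hypothetical input: any character string holding the speech text
speech <- "We choose to go to the moon. We choose to go to the moon in this decade."

# split on anything that is not a letter or apostrophe, then lower-case
words <- tolower(unlist(strsplit(speech, "[^A-Za-z']+")))
words <- words[nchar(words) > 0]  # drop empty strings left by the split

# count occurrences, most frequent first
counts <- sort(table(words), decreasing = TRUE)

# same two columns as the fake data: word and freq
word_freq <- data.frame(
    word = names(counts)
    , freq = as.vector(counts)
    , stringsAsFactors = FALSE
)
print(word_freq)
```

The x and y columns could then be added to `word_freq` in the same way as for the fake data.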
#load ggplot2
library(ggplot2)
#initialize a ggplot object
my_plot = ggplot()
#create an object for the text layer
my_text = geom_text(
    data = a
    , aes(
        x = x
        , y = y
        , label = word
        , size = freq
    )
)
#create an object for the text size limits
my_size_scale = scale_size(
    to = c(3, 20)
)
#create an object to expand the x-axis limits
# (ensures that text isn't cropped)
my_x_scale = scale_x_continuous(
    expand = c(.5, 0)
)
#ditto for the y axis
my_y_scale = scale_y_continuous(
    expand = c(.5, 0)
)
#create an opts object that removes
# plot elements unnecessary in a tag cloud
my_opts = opts(
    legend.position = 'none'
    , panel.grid.minor = theme_blank()
    , panel.grid.major = theme_blank()
    , panel.background = theme_blank()
    , axis.line = theme_blank()
    , axis.text.x = theme_blank()
    , axis.text.y = theme_blank()
    , axis.ticks = theme_blank()
    , axis.title.x = theme_blank()
    , axis.title.y = theme_blank()
)
#show the plot
print(
    my_plot +
    my_text +
    my_size_scale +
    my_x_scale +
    my_y_scale +
    my_opts
)
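A note for anyone running this on a later ggplot2 release: as far as I know, opts() was replaced by theme(), theme_blank() by element_blank(), and the `to` argument of scale_size() by `range`. Under that assumption, the first plot might be written roughly as below. The names `a2` and `p` are placeholders of mine, and this is a sketch rather than a tested drop-in replacement for the code above.

```r
library(ggplot2)

# same fake data as above, rebuilt so this block runs on its own
set.seed(1)
a2 <- data.frame(
    word = month.name
    , freq = sample(1:10, 12, replace = TRUE)
    , x = sample(1:12, 12)
    , y = sample(1:12, 12)
)

p <- ggplot(a2, aes(x = x, y = y, label = word, size = freq)) +
    geom_text() +
    # 'range' is assumed to be the newer name for 'to'
    scale_size(range = c(3, 20)) +
    # expand the axes so text isn't cropped
    scale_x_continuous(expand = c(.5, 0)) +
    scale_y_continuous(expand = c(.5, 0)) +
    # theme()/element_blank() assumed to replace opts()/theme_blank()
    theme(
        legend.position = "none"
        , panel.grid.minor = element_blank()
        , panel.grid.major = element_blank()
        , panel.background = element_blank()
        , axis.line = element_blank()
        , axis.text.x = element_blank()
        , axis.text.y = element_blank()
        , axis.ticks = element_blank()
        , axis.title.x = element_blank()
        , axis.title.y = element_blank()
    )
print(p)
```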
#to aid readability amidst overlap, set alpha in
# the call to geom_text
my_text_with_alpha = geom_text(
    data = a
    , aes(
        x = x
        , y = y
        , label = word
        , size = freq
    )
    , alpha = .5
)
#show the version with alpha
print(
    my_plot +
    my_text_with_alpha +
    my_size_scale +
    my_x_scale +
    my_y_scale +
    my_opts
)
#alternatively, in polar coordinates,
# which maps x to angle and y to radius,
# making a nice circle
print(
    my_plot +
    my_text_with_alpha +
    my_size_scale +
    my_opts +
    coord_polar()
)
#(note omission of my_y_scale &
# my_x_scale, which seem to be ignored
# when coord_polar() is called. I'll
# report this possible bug to the ggplot2
# maintainer)
#a possible way to avoid overlap is to
# map radius (y) to frequency so that
# larger text is in the periphery
# where there is more room. This
# necessitates adding some random
# noise to the frequency so that
# the low frequency words don't
# jumble in the center too badly
a$freq2 = a$freq+rnorm(12)
#now map radius (y) to freq2
my_text_with_alpha_and_freq2 = geom_text(
    data = a
    , aes(
        x = x
        , y = freq2
        , label = word
        , size = freq
    )
    , alpha = .5
)
#show the version with alpha & radius mapped to freq2
print(
    my_plot +
    my_text_with_alpha_and_freq2 +
    my_size_scale +
    my_opts +
    coord_polar()
)
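A deterministic alternative to the random noise (not what I did above, just another option) would be to use ranks with ties broken by position, so that no two words share a radius. A minimal sketch with made-up frequencies; `freq` and `radius` here are illustrative names only:

```r
# made-up frequencies with ties
freq <- c(3, 7, 3, 1, 7, 7)

# rank with ties broken by order of appearance:
# every word gets a distinct radius, larger freq further out
radius <- rank(freq, ties.method = "first")
print(radius)
# 2 4 3 1 5 6
```

In the demo this would mean something like a$freq2 = rank(a$freq, ties.method = "first") in place of the rnorm() line. The trade-off is that the radius then reflects only the ordering of the frequencies, not their actual spacing.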
--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar
~ Certainty is folly... I think. ~