[R] grey colored lines and overwriting labels in ggplot2

Brian Diggs diggsb at ohsu.edu
Tue Jul 12 21:12:26 CEST 2011


Merging two posts (data and questions); see inline below.

On 7/11/2011 7:55 PM, Sigrid wrote:
> Thank you, Dennis.
>
>
> This is my regenerated dput codes. They should be correct as I closed off R
> and re-ran them based on the dput output.

NB: this is the data frame called test in the code below (assign the
structure(...) output to test).

> structure(list(year = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), treatment =
> structure(c(1L,
> 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L,
> 6L, 7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L,
> 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L,
> 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L,
> 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L,
> 7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L,
> 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
> 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 2L,
> 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L,
> 7L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L,
> 6L, 6L, 6L, 7L, 7L, 7L), .Label = c("A", "B", "C", "D", "E",
> "F", "G"), class = "factor"), total = c(135L, 118L, 121L, 64L,
> 53L, 49L, 178L, 123L, 128L, 127L, 62L, 129L, 126L, 99L, 183L,
> 45L, 57L, 45L, 72L, 30L, 71L, 123L, 89L, 102L, 60L, 44L, 59L,
> 124L, 145L, 126L, 103L, 67L, 97L, 66L, 76L, 108L, 36L, 48L, 41L,
> 69L, 47L, 57L, 167L, 136L, 176L, 85L, 36L, 82L, 222L, 149L, 171L,
> 145L, 122L, 192L, 136L, 164L, 154L, 46L, 57L, 57L, 70L, 55L,
> 102L, 111L, 152L, 204L, 41L, 46L, 103L, 156L, 148L, 155L, 103L,
> 124L, 176L, 111L, 142L, 187L, 43L, 52L, 75L, 64L, 91L, 78L, 196L,
> 314L, 265L, 44L, 39L, 98L, 197L, 273L, 274L, 89L, 91L, 74L, 91L,
> 112L, 98L, 140L, 90L, 121L, 120L, 161L, 83L, 230L, 266L, 282L,
> 35L, 53L, 57L, 315L, 332L, 202L, 90L, 79L, 89L, 67L, 116L, 109L,
> 44L, 68L, 75L, 29L, 52L, 52L, 253L, 203L, 87L, 105L, 234L, 152L,
> 247L, 243L, 144L, 167L, 165L, 95L, 300L, 128L, 125L, 84L, 183L,
> 88L, 153L, 185L, 175L, 226L, 216L, 118L, 118L, 94L, 224L, 259L,
> 176L, 175L, 147L, 197L, 141L, 176L, 187L, 87L, 92L, 148L, 86L,
> 139L, 122L), country = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
> ), .Label = c("high", "low"), class = "factor")), .Names = c("year",
> "treatment", "total", "country"), class = "data.frame", row.names = c(NA,
> -167L))
>
> I hope be useful for you when giving me a hand with my difficulties.


On 7/9/2011 8:24 PM, Sigrid wrote:
> I created this graph in ggplot and added ablines to the different facets by
> specifying with subset commands.  As you might see, there are still a few
> issues.
>
>
> 1.)	I would like to have the diamonds in a grey scale instead of colors. I
> accomplished this (see graph 2) until I overwrote the label title for the
> treatments and the colors came back (graph 1). I used these two commands:
> p=ggplot(data = test, aes(x = YEAR, y = TOTAL, colour = TREATMENT)) +
> geom_point() + facet_wrap(~country)+scale_colour_grey()+
> scale_y_continuous("number of votes")+ scale_x_continuous("Years")+
> scale_x_continuous(breaks=1:4) + scale_colour_hue(breaks='A', labels='label
> A')+ scale_colour_hue(breaks='B', labels='label B')
>
> How can I keep the grey scale, but avoid changing back to colors when using
> the scale_colour_hue command?

You should only have one scale_ call for each scale type.  Here, you 
have three scale_colour_ calls: the first selects a grey scale, the 
second defines a single break with its label (and thus implicitly 
subsets on that single break value), and the third defines a 
different break/label/subset.  Each call replaces the one before it, 
so only the last one has any effect.
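
A minimal sketch of that replacement behaviour, on a made-up three-row data frame. The names d and is_grey are illustrative and not from the original post, and the sketch assumes a ggplot2 version that exports layer_data():

```r
library(ggplot2)

# Hypothetical toy data, only to show how the scales interact.
d <- data.frame(x = 1:3, y = 1:3, g = factor(c("A", "B", "C")))

# One colour scale: the grey scale is used.
p_grey <- ggplot(d, aes(x, y, colour = g)) +
  geom_point() +
  scale_colour_grey()

# A second colour scale replaces the first one entirely.
p_both <- p_grey + scale_colour_hue()

# Helper: TRUE if every colour has equal red, green and blue parts.
is_grey <- function(cols) {
  m <- col2rgb(cols)
  all(m[1, ] == m[2, ] & m[2, ] == m[3, ])
}

is_grey(layer_data(p_grey)$colour)  # greys survive with a single scale
is_grey(layer_data(p_both)$colour)  # hues win once scale_colour_hue is added
```

ggplot2 also prints a message when the second scale is added, saying that the existing colour scale will be replaced.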

> http://r.789695.n4.nabble.com/file/n3657119/color_graph.gif
>
>
> 2.)	Furthermore, only one of the overwritten labels of the treatments came
> up, despite putting in two commands (graph 1).  What could have happened
> here?
>
> p + scale_colour_hue(breaks='A', labels='label A')+
> scale_colour_hue(breaks='B', labels='label B')

See previous answer.  Presumably you want:

ggplot(data = test, aes(x = year, y = total, colour = treatment)) +
	geom_point() +
	facet_wrap(~country) +
	scale_colour_grey(breaks=c('A','B','C','D','E','F','G'),
		labels=c('label A','label B','label C','label D',
			'label E','label F','label G')) +
	scale_y_continuous("number of votes") +
	scale_x_continuous("Years", breaks=1:4)

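Note that I also collapsed the two scale_x_continuous calls into one, 
and used the lower-case column names (year, total, treatment) that 
appear in the dput output.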


> 3.) I would like to add the lines so it matches the default grey scale
> (graph 2), but I do not know the name of the different shades in the grey
> scale. I added the lines in the following way:
>> p + geom_abline(intercept = 81.476, slope=47.267, colour = "green", size =
>> 1, subset = .(country == 'low'))+ geom_abline(intercept = 31.809,
>> slope=20.234, colour = "blue", size = 1, subset = .(country == 'low'))
>> +.....
>
> http://r.789695.n4.nabble.com/file/n3657119/color_graph_2.gif
>
> And now I would like to add lines fitting accordingly with the grey scale.
> Where can I find out the names of the grey tones?

Where did you get these slopes and intercepts?  They don't seem to be 
simple linear regressions.  However, in any case, you want to get these 
into a data frame with each slope and intercept associated with the 
appropriate country and treatment.  For a linear regression, that would 
just be:

library(plyr)  # provides ddply

regressions <-
ddply(test, c("country","treatment"),
	function(x) {
		coef(lm(total~year, x))
	})
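
To show what such a table looks like without the full dataset, here is a sketch on a small invented data frame. It uses base R (split/lapply/rbind) in place of ddply, so it is an equivalent and not the call above; toy and groups are hypothetical names:

```r
# Hypothetical toy data: 2 countries x 2 treatments x 4 years.
toy <- data.frame(
  country   = rep(c("high", "low"), each = 8),
  treatment = rep(rep(c("A", "B"), each = 4), times = 2),
  year      = rep(1:4, times = 4),
  total     = c(10, 12, 14, 16,  5, 5, 5, 5,  20, 18, 16, 14,  1, 3, 5, 7)
)

# One piece per country/treatment combination.
groups <- split(toy, list(toy$country, toy$treatment), drop = TRUE)

# Fit total ~ year within each piece and keep the two coefficients.
# check.names = FALSE preserves the column name "(Intercept)".
regressions <- do.call(rbind, lapply(groups, function(g) {
  data.frame(country = g$country[1],
             treatment = g$treatment[1],
             t(coef(lm(total ~ year, data = g))),
             check.names = FALSE)
}))

str(regressions)
names(regressions)
```

The coefficient columns are named after the model terms, "(Intercept)" and "year", which is why the slope is mapped to year and the intercept needs backticks in an aes() call.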

Look at the structure of this data.frame to see what you are trying to 
get. You can integrate this into the previous plot as another layer:

ggplot(data = test, aes(x = year, y = total, colour = treatment)) +
	geom_point() +
	geom_abline(aes(slope = year, intercept = `(Intercept)`,
			colour = treatment),
		data = regressions) +
	facet_wrap(~country) +
	scale_colour_grey(breaks=c('A','B','C','D','E','F','G'),
		labels=c('label A','label B','label C','label D',
			'label E','label F','label G')) +
	scale_y_continuous("number of votes") +
	scale_x_continuous("Years", breaks=1:4)

By setting the colour aesthetic on the geom_abline, it will follow the 
same mapping between treatment and the shade of grey.
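
If the actual shade codes are still wanted, the following sketch lists them. It assumes that scale_colour_grey() with its default start = 0.2 and end = 0.8 draws its colours from grDevices::grey.colors(), which is my understanding of the current implementation; shades and toy are illustrative names:

```r
library(ggplot2)

# Seven greys over the default range of scale_colour_grey().
shades <- grey.colors(7, start = 0.2, end = 0.8)
names(shades) <- LETTERS[1:7]
print(shades)

# Compare with what ggplot2 assigns to seven factor levels
# on a hypothetical toy data frame.
toy <- data.frame(x = 1:7, y = 1:7, treatment = factor(LETTERS[1:7]))
p <- ggplot(toy, aes(x, y, colour = treatment)) +
  geom_point() +
  scale_colour_grey()
print(layer_data(p)$colour)
```

These hex codes can be passed as colour = "..." to a layer drawn outside the scale, although mapping colour to treatment as above avoids having to look them up.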

> 4.)	I would like to add different shapes. However, when I type
>
>> p+ geom_point(aes(shape = factor(TREATMENT))) + scale_shape(solid = FALSE)
>
> I get this error message:
>   Error: scale_shape_discrete can deal with a maximum of 6 discrete values,
> but you have 7.  See ?scale_manual for a possible alternative.
>
> I did not find anything useful looking at the scale_manual pages. Any tips
> on how to add another symbol?

By default, scale_shape will only allow 6 different shapes.  If you want 
more, you have to use the manual version, scale_shape_manual, and 
define which ones you want.  Here, 0-6 are not a bad set of 7; going 
further will make it hard to pick apart different symbols.

ggplot(data = test, aes(x = year, y = total, colour = treatment)) +
	geom_point(aes(shape = treatment)) +
	geom_abline(aes(slope = year, intercept = `(Intercept)`,
			colour = treatment),
		data = regressions) +
	facet_wrap(~country) +
	scale_colour_grey(breaks=c('A','B','C','D','E','F','G'),
		labels=c('label A','label B','label C','label D',
			'label E','label F','label G')) +
	scale_shape_manual(breaks=c('A','B','C','D','E','F','G'),
		labels=c('label A','label B','label C','label D',
			'label E','label F','label G'),
		values = c(0, 1, 2, 3, 4, 5, 6)) +
	scale_y_continuous("number of votes") +
	scale_x_continuous("Years", breaks=1:4)

Do note that I gave the same breaks and labels arguments to the colour 
and shape scales.  Since they are mapped to the same variable, this 
makes sense.  And having them identical allows the legend to be 
collapsed into a single legend.
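
One way to keep the two scales identical is to define the breaks and labels once and pass the same objects to both. This is a sketch on an invented seven-row data frame; toy, trt_breaks and trt_labels are illustrative names:

```r
library(ggplot2)

# Hypothetical toy data with the same seven treatment levels.
toy <- data.frame(x = 1:7, y = 1:7, treatment = factor(LETTERS[1:7]))

# Define breaks and labels once so the scales cannot drift apart.
trt_breaks <- levels(toy$treatment)
trt_labels <- paste("label", trt_breaks)

p <- ggplot(toy, aes(x, y, colour = treatment, shape = treatment)) +
  geom_point() +
  scale_colour_grey(breaks = trt_breaks, labels = trt_labels) +
  scale_shape_manual(breaks = trt_breaks, labels = trt_labels,
                     values = 0:6)

print(p)
```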

> 5. ) Finally, how can I remove the grey background in the graph?

If you just want to remove the gray background, add (+)

opts(panel.background = theme_blank())

but another way is to use the bw theme which does this and more:

ggplot(data = test, aes(x = year, y = total, colour = treatment)) +
	geom_point(aes(shape = treatment)) +
	geom_abline(aes(slope = year, intercept = `(Intercept)`,
			colour = treatment),
		data = regressions) +
	facet_wrap(~country) +
	scale_colour_grey(breaks=c('A','B','C','D','E','F','G'),
		labels=c('label A','label B','label C','label D',
			'label E','label F','label G')) +
	scale_shape_manual(breaks=c('A','B','C','D','E','F','G'),
		labels=c('label A','label B','label C','label D',
			'label E','label F','label G'),
		values = c(0, 1, 2, 3, 4, 5, 6)) +
	scale_y_continuous("number of votes") +
	scale_x_continuous("Years", breaks=1:4) +
	theme_bw()
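
For readers on later ggplot2 releases: opts() and theme_blank() were deprecated in ggplot2 0.9.2 in favour of theme() and element_blank(). A sketch of the equivalent calls on an invented data frame (toy, p_blank and p_bw are illustrative names):

```r
library(ggplot2)

# Hypothetical toy data, only to have something to draw.
toy <- data.frame(x = 1:4, y = c(2, 4, 3, 5))

base <- ggplot(toy, aes(x, y)) + geom_point()

# Remove only the grey panel background.
p_blank <- base + theme(panel.background = element_blank())

# Or switch to the black-and-white theme, which changes more.
p_bw <- base + theme_bw()

print(p_blank)
print(p_bw)
```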


> Thank you for all input!
>


-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University


