[R] Illustrating kernel distribution in wheat ears
Carl-Göran CG. Pettersson
CG.Pettersson at vpe.slu.se
Wed Jan 13 13:38:01 CET 2010
Hi,
Thanks a lot for your suggestions and the very detailed instructions, I needed them...
Everything also worked fine on the full dataset, up until the last suggestion (the box plots).
There I also got an error message, but a different one from what you got, and no output.
Here are the last two command lines and the error message:
> q <- ggplot(spikes.long, aes(side, value))
> q + geom_boxplot() + facet_grid(~ cultivar)
Error in `[.data.frame`(plot$data, , setdiff(cond, names(df)), drop = FALSE) :
undefined columns selected
I used the same variable names and have done the steps suggested up to this point, but with a much bigger dataset than in the question sample.
Sorry to say, I don't understand the error message.
But the first two plot variants worked nicely and I can use them.
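On the error message quoted above: one plausible reading is that facet_grid(~ cultivar) asks for a column named cultivar, while the long data frame built in the quoted instructions keeps the cultivar names in cn. The sketch below checks the column names on a toy stand-in for spikes.long (values taken from the first two rows of the sample data) and facets on cn instead. This is an assumption about the cause, not a confirmed diagnosis.

```r
# Toy stand-in for spikes.long: first four spikelets of two sample rows
spikes.long <- data.frame(
  cn    = rep(c("Lans", "Kranich"), each = 4),
  side  = factor(rep(c(1, 2), times = 4)),
  value = c(1.8, 3.1, 3.5, 3.8, 0.6, 2.4, 3.4, 4.2)
)

# A faceting variable has to be a column of the plotted data
has_cultivar <- "cultivar" %in% names(spikes.long)  # FALSE
has_cn       <- "cn" %in% names(spikes.long)        # TRUE

# Facet on the column that exists (built only if ggplot2 is installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  q <- ggplot(spikes.long, aes(side, value))
  p <- q + geom_boxplot() + facet_grid(~ cn)  # print(p) draws the plot
}
```

If names(spikes.long) on the full dataset does list a cultivar column, the cause lies elsewhere.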
All the best
/CG
________________________________________
From: Dennis Murphy [djmuser at gmail.com]
Sent: 11 January 2010 15:03
To: Carl-Göran CG. Pettersson
Cc: r-help at r-project.org
Subject: Re: [R] Illustrating kernel distribution in wheat ears
Hi:
It wasn't clear to me precisely what you wanted, but here are a couple of ideas in the hope that it will help.
I used ggplot2 for the graphics, so it requires some manipulation of your dataset from 'wide' format to 'long'.
I also add an indicator for the side of the ear (odd is side 1 (L?), even is side 2) and a variable I call 'loc' that
holds the spikelet number taken from the splxx variable name.
I read the data into a data frame called spikelets. The first step is to remove the rows in which every response is missing:
naind <- apply(spikelets[, -1], 1, function(x) all(is.na(x)))
spikelets2 <- spikelets[!naind, ]
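As a small check of this step, here is a sketch on three rows of the sample data (two columns only). It uses rowSums() in place of apply(), which is an alternative I am swapping in, not the code above; both flag rows whose responses are all NA.

```r
# Three rows from the sample data, first two spikelets only
spikelets <- data.frame(
  cn    = c("Lans", "Kranich", "Boomer"),
  spl01 = c(1.8, 0.6, NA),
  spl02 = c(3.1, 2.4, NA)
)

# TRUE for rows with no observed response at all
naind <- rowSums(!is.na(spikelets[, -1])) == 0
spikelets2 <- spikelets[!naind, ]
nrow(spikelets2)  # 2
```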
Next, I use the melt() function from the reshape package to convert the data frame from 'wide' to 'long' form:
library(ggplot2) # attaches the reshape and plyr packages in the loading process
spikes.long <- melt(spikelets2, id = 'cn')
The variable 'variable' contains the variable names as a vector (spl01, spl02, ..., spl14)
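If melt() is not available, the same wide-to-long step can be sketched in base R with stack(). This is an alternative under the assumption that only the cn, variable and value columns are needed; the column renaming is my own choice to match the names used here.

```r
# Two rows and two spikelet columns from the sample data
wide <- data.frame(
  cn    = c("Lans", "Kranich"),
  spl01 = c(1.8, 0.6),
  spl02 = c(3.1, 2.4)
)

# stack() puts the columns end to end; cn is repeated once per column
long <- data.frame(cn = rep(wide$cn, times = ncol(wide) - 1),
                   stack(wide[, -1]))
names(long) <- c("cn", "value", "variable")
long$variable <- as.character(long$variable)
```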
Next, I create a variable called loc, which holds the numeric part of the spl variable names, and then
create a variable side to distinguish one side of the ear from the other. 'variable' is then removed:
spikes.long$loc <- as.numeric(substring(spikes.long$variable, 4))
spikes.long$side <- factor(2 - spikes.long$loc %% 2)
spikes.long$variable <- NULL
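To see what these two lines produce, here is a short sketch on four of the variable names: substring(x, 4) keeps everything from the fourth character on, and 2 - loc %% 2 maps odd numbers to 1 and even numbers to 2.

```r
variable <- c("spl01", "spl02", "spl13", "spl14")

loc  <- as.numeric(substring(variable, 4))  # 1 2 13 14
side <- factor(2 - loc %% 2)                # 1 2 1 2 (odd = 1, even = 2)
```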
Now we're in a position to plot. The first is a scatterplot of the response by location, stratified by cultivar;
it contains color to distinguish sides.
# With color:
p <- qplot(loc, value, data = spikes.long, group = cn,
colour = side)
p + facet_grid(cn ~ .)
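qplot() has been deprecated in recent ggplot2 releases, so here is a sketch of the same plot written with ggplot(). It assumes a current ggplot2 installation and uses a toy data frame (first four spikelets of two sample rows) in place of the full spikes.long.

```r
# Toy stand-in for spikes.long
spikes.long <- data.frame(
  cn    = rep(c("Lans", "Kranich"), each = 4),
  loc   = rep(1:4, times = 2),
  value = c(1.8, 3.1, 3.5, 3.8, 0.6, 2.4, 3.4, 4.2)
)
spikes.long$side <- factor(2 - spikes.long$loc %% 2)

# Same aesthetics as the qplot() call, one panel per cultivar
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(spikes.long, aes(loc, value, colour = side, group = cn)) +
    geom_point() +
    facet_grid(cn ~ .)  # print(p) draws the plot
}
```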
The color is not terribly informative, so to get rid of it, remove the colour = side argument. One could
also merge the plots together and fit smooths to the different cultivars.
ggplot(spikes.long, aes(loc, value, colour = cn)) +
geom_point() + geom_smooth(se = FALSE)
I also came up with boxplot pairs by side for each cultivar, shown below:
q <- ggplot(spikes.long, aes(side, value))
q + geom_boxplot() + facet_grid(~ cultivar)
For some reason, I kept getting these messages from every ggplot2 call:
Error in recordGraphics(drawGTree(x), list(x = x), getNamespace("grid")) :
invalid graphics state
but all of the plots rendered as expected.
HTH,
Dennis
2010/1/10 Carl-Göran CG. Pettersson <CG.Pettersson at vpe.slu.se>
Dear all
R 2.10, Windows XP
I have a dataset dealing with the way different wheat cultivars build their yield.
Wheat ears are organised in spikelets where the spikelets can be numbered from the bottom, with even numbers on one side and odd on the other.
I know how many kernels there were in each spikelet after some months spent counting them...
Now I want to illustrate the differences between the cultivars in how the kernels are distributed in the ears.
In the best of all possible worlds it would be possible to place histograms or boxplots on adjacent sides of vertical lines representing different cultivars.
I have done some experimenting using boxplot() but I am stuck and out of ideas right now.
All ideas are welcome!
/CG
Here is a sample dataset with the kernel counts for the first 14 spikelets:
cn spl01 spl02 spl03 spl04 spl05 spl06 spl07 spl08 spl09 spl10 spl11 spl12 spl13 spl14
Lans 1.8 3.1 3.5 3.8 3.8 4.1 4.2 4.3 4.4 4.5 4.2 4.1 3.9 3.8
Kranich 0.6 2.4 3.4 4.2 4.5 4.7 4.9 4.9 4.8 4.7 4.4 4.1 4.1 3.9
Loyal 1.1 2.7 3.6 3.7 4.1 4.4 4.4 4.6 4.3 4.5 4.3 4.1 3.8 3.7
Boomer NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Oakley NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Hereford 0.6 2.3 3.3 3.6 3.9 4 4.2 4.1 4.1 3.9 3.9 3.6 3.4 3.2
Kranich 0.3 2.5 3.6 4 4.4 4.5 4.3 4.8 4.7 4.6 4.4 4.3 4.1 4
Oakley 0.5 2.1 3.2 3.4 3.8 4.4 4.3 4.3 4.3 4.2 4.2 3.9 3.8 3.6
Loyal 1.6 3.3 3.9 4.2 4.3 4.4 4.4 4.6 4.6 4.5 4.3 4.3 4.2 4
Hereford NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Oakley 0.5 2.1 3.2 3.6 4 4 4.1 4.4 4.4 4.2 4.1 3.8 3.8 4
Kranich NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Lans 1.4 3 3.3 3.8 3.9 4.3 4 4.3 4.3 4.3 4 4.1 4 4
Hereford 1.2 2.7 3.6 3.8 4 4 4.1 4.2 4.1 4.1 3.9 3.6 3.8 3.3
Boomer 0.3 2.5 3.1 3.8 3.9 4.4 4.1 4.2 4.3 4 4.2 4 3.8 3.7
Lans NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Boomer 0.2 1.9 3 3.4 3.7 3.9 3.9 4 4 4 3.8 3.8 3.6 3.4
Loyal NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Boomer NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Kranich NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Kranich 0.3 1.1 2.9 3.5 3.9 4.3 4.4 4.4 4 4.2 4.2 4 3.9 3.8
Hereford 0.5 2.1 3.1 3.6 3.7 3.9 4 3.8 4 3.8 3.6 3.6 3.1 3
Loyal NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Boomer 0.3 0.8 2.8 3 3.6 3.7 3.8 4 3.8 3.5 3.3 3.2 3.2 2.9
Oakley 0.5 2.7 3.4 3.8 4 3.9 4.2 4.5 4.3 4.4 4 4 3.9 3.9
Loyal 0.9 2.6 3.6 3.8 3.8 4.4 4.2 4.4 4.2 3.9 3.8 4 3.4 3.7
Oakley NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Hereford 0.7 2.9 3.6 4 4 3.9 4 4 4 3.9 3.8 3.7 3 3
Hereford NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Loyal 0.7 2.3 3.5 3.7 3.9 3.8 4.2 4.1 4.1 4.1 4 4 3.4 3.6
Boomer 0.7 2 3.3 3.5 3.9 3.7 4 3.9 3.8 4 3.7 3.8 3.5 3.4
Lans NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Lans 1.9 3 3.7 3.8 3.9 4 3.9 4.3 4.1 4.1 4.1 3.8 3.8 3.9
Lans 1.1 2.6 3.3 3.7 4.1 4 4.2 4.2 4.2 4 4.1 4.1 3.8 3.6
Kranich 0.5 1.3 2.9 3.8 3.8 4.3 4.3 4.4 4.4 4 4.3 3.9 3.6 3.4
Oakley 0.1 2 3.1 3.5 4.1 3.9 4.1 4.2 4.2 4.2 4.1 4 3.9 3.8
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.