[R] Illustrating kernel distribution in wheat ears

Carl-Göran CG. Pettersson CG.Pettersson at vpe.slu.se
Wed Jan 13 13:38:01 CET 2010


Hi,

Thanks a lot for your suggestions and the very detailed instructions; I needed them.
Everything also worked fine on the full dataset, up until the last suggestion (the box plots).

There I also got an error message, but a different one from yours, and no output.

Here are the last two command lines and the error message:

> q <- ggplot(spikes.long, aes(side, value))
> q + geom_boxplot() + facet_grid(~ cultivar)
Error in `[.data.frame`(plot$data, , setdiff(cond, names(df)), drop = FALSE) : 
  undefined columns selected

I used the same variable names and have done the steps suggested up to this point, but with a much bigger dataset than in the question sample.

Sorry to say, I don't understand the error message.
But the first two variants of the plots worked nicely and are usable for me.
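
A possible reading of the error, offered as a sketch rather than a confirmed diagnosis: facet_grid() looks up the variables named in its formula among the columns of the plot data, and the steps in the quoted message produce a data frame with the columns cn, value, loc and side, but none called cultivar. The base-R check below uses a small hypothetical stand-in for spikes.long (four values taken from the first two spikelets of two sample rows) and lists any facet variables that are absent from the data.

```r
# Hypothetical stand-in for spikes.long, with the columns the quoted steps create
spikes.long <- data.frame(
  cn    = c("Lans", "Lans", "Kranich", "Kranich"),
  value = c(1.8, 3.1, 0.6, 2.4),
  loc   = c(1, 2, 1, 2),
  side  = factor(c(1, 2, 1, 2))
)

# The facet formula used in the failing call
facet_formula <- ~ cultivar

# Variables the formula asks for that are not columns of the data
facet_vars   <- all.vars(facet_formula)
missing_vars <- setdiff(facet_vars, names(spikes.long))
print(missing_vars)
```

If the check reports cultivar, faceting on the existing column, for example facet_grid(. ~ cn), or renaming cn to cultivar before plotting, should avoid the error.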

All the best
/CG

________________________________________
From: Dennis Murphy [djmuser at gmail.com]
Sent: 11 January 2010 15:03
To: Carl-Göran CG. Pettersson
Cc: r-help at r-project.org
Subject: Re: [R] Illustrating kernel distribution in wheat ears

Hi:

It wasn't clear to me precisely what you wanted, but here are a couple of ideas in the hope that they will help.
I used ggplot2 for the graphics, so it requires some manipulation of your dataset from 'wide' format to 'long'.
I also add an indicator for the side of the ear (odd is side 1 (L?), even is side 2) and a variable I call 'loc' to
hold the spikelet number encoded in the name of each splxx variable.

I read the data into a data frame called spikelets. The first step is to remove the rows of missing responses:

naind <- apply(spikelets[, -1], 1, function(x) all(is.na(x)))
spikelets2 <- spikelets[!naind, ]
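
For readers who want to reproduce this step, here is a minimal sketch of one way the data frame could be created and filtered. It is an assumption about the setup, not necessarily how the data were read originally: read.table() with its text argument is used on a truncated copy of the sample (four rows, first four spikelets only).

```r
# Truncated copy of the sample dataset: four rows, first four spikelets only
spikelets <- read.table(header = TRUE, text = "
cn      spl01   spl02   spl03   spl04
Lans    1.8     3.1     3.5     3.8
Kranich 0.6     2.4     3.4     4.2
Boomer  NA      NA      NA      NA
Oakley  0.5     2.1     3.2     3.4
")

# Flag rows in which every spikelet count is missing, then drop those rows
naind <- apply(spikelets[, -1], 1, function(x) all(is.na(x)))
spikelets2 <- spikelets[!naind, ]
print(spikelets2)
```

Only the all-NA Boomer row is dropped; a row with some counts missing would be kept.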

Next, I use the melt() function from the reshape package to convert the data frame from 'wide' to 'long' form:

library(ggplot2)         # older ggplot2 versions attached reshape and plyr on load
library(reshape)         # provides melt(); load explicitly with newer ggplot2
spikes.long <- melt(spikelets2, id = 'cn')
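
To see what the wide-to-long conversion produces without depending on a particular package version, here is a small base-R sketch. It swaps in stack() for melt(), so it is an illustration of the resulting layout rather than the call used above; the data frame wide is a hypothetical two-row, two-spikelet excerpt of the sample.

```r
# Hypothetical two-row excerpt of the wide data
wide <- data.frame(
  cn    = c("Lans", "Kranich"),
  spl01 = c(1.8, 0.6),
  spl02 = c(3.1, 2.4)
)

# stack() concatenates the measurement columns and records each source column name
stacked <- stack(wide[, -1, drop = FALSE])

# Repeat the cultivar labels once per measurement column to match the stacked values
long <- data.frame(
  cn       = rep(wide$cn, times = ncol(wide) - 1),
  variable = stacked$ind,
  value    = stacked$values
)
print(long)
```

Each row of long now holds one cultivar, one spikelet name and one kernel count, which is the layout ggplot2 expects.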

The variable 'variable' contains the original column names (spl01, spl02, ..., spl14).
Next, I create a variable called loc, which represents the numeric part of the spl variables, and then
create a variable side to distinguish one side of the ear from the other. 'variable' is then removed.

spikes.long$loc <- as.numeric(substring(spikes.long$variable, 4))
spikes.long$side <- factor(2 - spikes.long$loc %% 2)
spikes.long$variable <- NULL
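
As a small worked example of the two derived columns, the sketch below applies the same expressions to a hypothetical vector of four spikelet names: substring(x, 4) keeps the digits after "spl", and 2 - loc %% 2 maps odd positions to side 1 and even positions to side 2.

```r
# Hypothetical subset of the spikelet column names
variable <- c("spl01", "spl02", "spl07", "spl14")

# Numeric position: drop the first three characters ("spl") and convert
loc <- as.numeric(substring(variable, 4))

# Odd positions give 2 - 1 = 1, even positions give 2 - 0 = 2
side <- factor(2 - loc %% 2)

print(data.frame(variable, loc, side))
```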

Now we're in a position to plot. The first is a scatterplot of the response by location, stratified by cultivar;
it contains color to distinguish sides.

# With color:
p <- qplot(loc, value, data = spikes.long, group = cn,
           colour = side)
p + facet_grid(cn ~ .)

The color is not terribly informative, so to get rid of it, remove the colour = side argument. One could
also merge the plots together and fit smooths to the different cultivars.

ggplot(spikes.long, aes(loc, value, colour = cn)) +
    geom_point() + geom_smooth(se = FALSE)

I also came up with boxplot pairs by side for each cultivar, shown below:

q <- ggplot(spikes.long, aes(side, value))
q + geom_boxplot() + facet_grid(. ~ cn)   # the cultivar column in spikes.long is named 'cn'

For some reason, I kept getting these messages from every ggplot2 call:

Error in recordGraphics(drawGTree(x), list(x = x), getNamespace("grid")) :
  invalid graphics state

but all of the plots rendered as expected.


HTH,
Dennis

2010/1/10 Carl-Göran CG. Pettersson <CG.Pettersson at vpe.slu.se>
Dear all

R 2.10, WinXP

I have a dataset dealing with the way different wheat cultivars build their yield.
Wheat ears are organised in spikelets where the spikelets can be numbered from the bottom, with even numbers on one side and odd on the other.
I know how many kernels there were in each spikelet after some months spent counting them...

Now I want to illustrate the differences between the cultivars in how the kernels are distributed in the ears.
In the best of all possible worlds it would be possible to place histograms or boxplots on adjacent sides of vertical lines representing different cultivars.
I have done some experimenting using boxplot() but I am stuck and out of ideas right now.

All ideas are welcome!
/CG


Here is a sample dataset with the kernel counts for the first 14 spikelets:

cn        spl01   spl02   spl03   spl04   spl05   spl06   spl07   spl08   spl09   spl10   spl11   spl12   spl13   spl14
Lans      1.8     3.1     3.5     3.8     3.8     4.1     4.2     4.3     4.4     4.5     4.2     4.1     3.9     3.8
Kranich   0.6     2.4     3.4     4.2     4.5     4.7     4.9     4.9     4.8     4.7     4.4     4.1     4.1     3.9
Loyal     1.1     2.7     3.6     3.7     4.1     4.4     4.4     4.6     4.3     4.5     4.3     4.1     3.8     3.7
Boomer    NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Oakley    NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Hereford  0.6     2.3     3.3     3.6     3.9     4       4.2     4.1     4.1     3.9     3.9     3.6     3.4     3.2
Kranich   0.3     2.5     3.6     4       4.4     4.5     4.3     4.8     4.7     4.6     4.4     4.3     4.1     4
Oakley    0.5     2.1     3.2     3.4     3.8     4.4     4.3     4.3     4.3     4.2     4.2     3.9     3.8     3.6
Loyal     1.6     3.3     3.9     4.2     4.3     4.4     4.4     4.6     4.6     4.5     4.3     4.3     4.2     4
Hereford  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Oakley    0.5     2.1     3.2     3.6     4       4       4.1     4.4     4.4     4.2     4.1     3.8     3.8     4
Kranich   NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Lans      1.4     3       3.3     3.8     3.9     4.3     4       4.3     4.3     4.3     4       4.1     4       4
Hereford  1.2     2.7     3.6     3.8     4       4       4.1     4.2     4.1     4.1     3.9     3.6     3.8     3.3
Boomer    0.3     2.5     3.1     3.8     3.9     4.4     4.1     4.2     4.3     4       4.2     4       3.8     3.7
Lans      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Boomer    0.2     1.9     3       3.4     3.7     3.9     3.9     4       4       4       3.8     3.8     3.6     3.4
Loyal     NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Boomer    NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Kranich   NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Kranich   0.3     1.1     2.9     3.5     3.9     4.3     4.4     4.4     4       4.2     4.2     4       3.9     3.8
Hereford  0.5     2.1     3.1     3.6     3.7     3.9     4       3.8     4       3.8     3.6     3.6     3.1     3
Loyal     NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Boomer    0.3     0.8     2.8     3       3.6     3.7     3.8     4       3.8     3.5     3.3     3.2     3.2     2.9
Oakley    0.5     2.7     3.4     3.8     4       3.9     4.2     4.5     4.3     4.4     4       4       3.9     3.9
Loyal     0.9     2.6     3.6     3.8     3.8     4.4     4.2     4.4     4.2     3.9     3.8     4       3.4     3.7
Oakley    NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Hereford  0.7     2.9     3.6     4       4       3.9     4       4       4       3.9     3.8     3.7     3       3
Hereford  NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Loyal     0.7     2.3     3.5     3.7     3.9     3.8     4.2     4.1     4.1     4.1     4       4       3.4     3.6
Boomer    0.7     2       3.3     3.5     3.9     3.7     4       3.9     3.8     4       3.7     3.8     3.5     3.4
Lans      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
Lans      1.9     3       3.7     3.8     3.9     4       3.9     4.3     4.1     4.1     4.1     3.8     3.8     3.9
Lans      1.1     2.6     3.3     3.7     4.1     4       4.2     4.2     4.2     4       4.1     4.1     3.8     3.6
Kranich   0.5     1.3     2.9     3.8     3.8     4.3     4.3     4.4     4.4     4       4.3     3.9     3.6     3.4
Oakley    0.1     2       3.1     3.5     4.1     3.9     4.1     4.2     4.2     4.2     4.1     4       3.9     3.8
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


