[R] How to understand the mentality behind tidyverse and ggplot2?

John jwd @end|ng |rom @urewe@t@net
Thu Nov 19 00:24:52 CET 2020


On Tue, 17 Nov 2020 12:43:21 -0500
C W <tmrsg11 using gmail.com> wrote:

> Dear R list,
> 
> I am an old-school R user. I use apply(), with(), and which() in base
> package instead of filter(), select(), separate() in Tidyverse. The
> idea of pipeline (i.e. %>%) my code was foreign to me for a while. It
> makes the code shorter, but sometimes less readable?
> 
> With ggplot2, I just don't understand how it is organized. Take this
> code:
> 
> > ggplot(diamonds, aes(x=carat, y=price)) +
> >   geom_point(aes(color=cut)) +
> >   geom_smooth()
> 
> There are three plus signs. How do you know when to "add" and what to
> "add"? I've seen more plus signs.
> 
> To me, aes() stands for aesthetic, meaning looks. So, anything
> related to looks like points and smooth should be in aes().
> Apparently, it's not the case.
> 
> So, how does ggplot2 work? Could someone explain this for an
> old-school R user?
> 
> Thank you!
> 
A really short form is to consider that ggplot2 syntax defines an
object, and each additional call simply adds to it, which is what all
the plus signs are.  Ideally, you start a ggplot call by assigning the
result to a name:

Instead of:
ggplot(diamonds, aes(x=carat, y=price)) + ...

use something like:

fig1 <- ggplot(diamonds, aes(x=carat, y=price)) + ...

This stores the plot as an object in your workspace that can then be
further modified. Learning the syntax is a chore, but the output tends
to be fine, especially for publications and final graphics. On the
other hand, it's slower and fussier than some of the more traditional
approaches, which are what I would prefer for EDA.
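
As a rough sketch of the "object plus additions" idea, the original
example can be built up one piece at a time. This assumes the ggplot2
package is installed (the diamonds data set ships with it); the names
fig1, fig2 and fig3 are only illustrative.

```r
# Sketch only; assumes ggplot2 is installed. fig1..fig3 are made-up names.
library(ggplot2)

# Start with the data and the axis mappings; no layers yet.
fig1 <- ggplot(diamonds, aes(x = carat, y = price))

# Each "+" returns a new plot object with one more layer attached.
fig2 <- fig1 + geom_point(aes(color = cut))
fig3 <- fig2 + geom_smooth()

# The layers accumulate on the object.
length(fig1$layers)
length(fig3$layers)

# Nothing is drawn until the object is printed.
print(fig3)
```

Because each step is an ordinary R object, an intermediate version such
as fig2 can be kept around and extended in a different direction later.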

JWDougherty


