[R] Multi-word column names in a data frame

Jeff Newmiller jdnewmil sending from dcn.davis.ca.us
Thu Sep 6 09:04:10 CEST 2018


You forgot to reply-all ... I don't do private consulting, so please keep 
the conversation on the mailing list.

Here are some ideas for extending your example. However, whether you WANT 
to or not, you really need to learn to manipulate your data BEFORE you 
give it to ggplot.
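
As a minimal sketch of what manipulating the data first can look like when the column name is held in a string: base `[[` indexing accepts the string directly, so the transformation receives numbers rather than text. The function name `pct_change` and the `pct` column are made up for this illustration, and `dplyr::lag` is spelled out because `stats::lag` does not shift a plain vector.

```r
# Small stand-in data frame with a multi-word column name
MyData <- data.frame( RefDate = as.Date( c( "2010-11-01", "2010-12-01", "2011-01-01" ) )
                    , `Number of people` = c( 20, 30, 40 )
                    , check.names = FALSE
                    )
A <- "Number of people"   # column name stored as a string

# hypothetical helper: percent change from the previous row
pct_change <- function( x ) { 100 * x / dplyr::lag( x, 1 ) - 100 }

# [[ ]] looks the column up by its string name and returns a numeric vector
MyData[[ "pct" ]] <- pct_change( MyData[[ A ]] )
print( MyData )
```

The first element of `pct` is NA because there is no previous row to compare against, which is also why ggplot later warns about a removed row.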

#########################################
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)
library(rlang)
`RefDate` <- as.Date(c("2010-11-1","2010-12-01","2011-01-01"))
`Number of vegetables` <- c(14,23,45)
`Number of people` <- c(20,30,40)
MyData <- data.frame( RefDate
                     , `Number of vegetables`
                     , `Number of people`
                     , check.names=FALSE
                     )
MyVars <- c("Number of vegetables","Number of people")

# simple approach... notice "RefDate" is a string

for (A in MyVars) {
  g2 <- ggplot( MyData
              , aes_string( x = "RefDate"
                          , y = paste0( "`", A, "`")
                          )
              ) +
    geom_line() +
    labs( title = paste( A, "adjusted" ) )
  print( g2 )
  # ggsave( paste0( A,".jpg" )
  #       ,g2
  #       ,height=5
  #       ,width=8
  #       ,dpi=300
  #       )
}


# Using function FQPC here - works, but an inferior use of ggplot
# because ggplot does not build legends with wide data

FQPC <- function(x) {100*x/lag(x,1)-100} # % change
for (A in MyVars) {
  g2 <- ggplot( MyData
              , aes( x = RefDate
                   , y = FQPC( !!sym( A ) )
                   )
              ) +
    geom_line() +
    labs(title = paste( A,"adjusted" ) )
  print( g2 )
  # ggsave( paste0( A,".jpg" )
  #         ,g2
  #         ,height=5
  #         ,width=8
  #         ,dpi=300
  # )
}
#> Warning: Removed 1 rows containing missing values (geom_path).
#> Warning: Removed 1 rows containing missing values (geom_path).

# superior approach is to do the computations first

for ( A in MyVars ) {
   DF <- MyData[ , c( "RefDate", A ) ]
   DF[[ 2 ]] <- FQPC( DF[[ 2 ]] )
   g2 <- ( ggplot( DF, aes( x = RefDate, y = !!sym( A ) ) )
         + geom_line()
         + labs( title = paste( A,"adjusted" ) )
         )
   print( g2 )
}
#> Warning: Removed 1 rows containing missing values (geom_path).
#> Warning: Removed 1 rows containing missing values (geom_path).

# Another way to do the computations first
resultDF <- data_frame( variable = MyVars
                       , data = lapply( MyVars
                                      , function( A ) {
                                           DF <- setNames( MyData[ , c( "RefDate", A ) ]
                                                         , c( "RefDate", "value" )
                                                         )
                                           DF[[ 2 ]] <- FQPC( DF[[ 2 ]] )
                                           DF
                                        }
                                      )
                       )
for ( i in seq.int( nrow( resultDF ) ) ) {
   A <- resultDF$variable[ i ]
   g2 <- ( ggplot( resultDF$data[[ i ]], aes( x = RefDate, y = value ) )
         + geom_line()
         + ylab( A )
         + labs( title = paste( A, "adjusted" ) )
         )
   print( g2 )
}
#> Warning: Removed 1 rows containing missing values (geom_path).
#> Warning: Removed 1 rows containing missing values (geom_path).

# or put them together in order determined by MyVars:

resultDF %>%
mutate( variable = factor( variable, levels = MyVars ) ) %>%
# flatten the separate data frames in the data column into one long data frame
unnest( data ) %>%
ggplot( aes( x = RefDate, y = value ) ) +
   geom_line() +
   facet_grid( variable ~ ., scales = "free_y" )
#> Warning: Removed 1 rows containing missing values (geom_path).

#' Created on 2018-09-05 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
#########################################
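
The faceted plot above gives one panel per variable. If a single panel with a legend is wanted instead, the same compute-first idea can be sketched in long format. This is only a sketch: it assumes tidyr 1.0.0 or later for `pivot_longer()` (which postdates the reprex above), and `longDF` is a made-up name.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

MyData <- data.frame( RefDate = as.Date( c( "2010-11-01", "2010-12-01", "2011-01-01" ) )
                    , `Number of vegetables` = c( 14, 23, 45 )
                    , `Number of people` = c( 20, 30, 40 )
                    , check.names = FALSE
                    )
MyVars <- c( "Number of vegetables", "Number of people" )
FQPC <- function( x ) { 100 * x / dplyr::lag( x, 1 ) - 100 } # % change

# reshape to one row per date and variable, then compute % change within each variable
longDF <- MyData %>%
  pivot_longer( all_of( MyVars ), names_to = "variable", values_to = "value" ) %>%
  mutate( variable = factor( variable, levels = MyVars ) ) %>%
  group_by( variable ) %>%
  arrange( RefDate, .by_group = TRUE ) %>%
  mutate( value = FQPC( value ) ) %>%
  ungroup()

# mapping colour to the variable column is what lets ggplot build the legend
g <- ggplot( longDF, aes( x = RefDate, y = value, colour = variable ) ) +
  geom_line()
print( g )
```

Grouping before calling `FQPC` matters here: without `group_by()`, the lag would carry the last value of one variable into the first row of the next.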

On Wed, 5 Sep 2018, philipsm using cpanel1.stormweb.net wrote:

> Thanks again for your help. Your suggested solution using aes_string(), which 
> I was not familiar with, worked perfectly when I plotted the column 
> variables. However, I also want to plot transformations of those variables 
> and the aes_string() approach does not work in that case. I implement the 
> transformations with a function and the function is expecting a numeric 
> rather than a string.
>
> For example, when I use the function:
>
> library(dplyr)
> library(ggplot2)
> `RefDate` <- as.Date(c("2010-11-1","2010-12-01","2011-01-01"))
> `Number of vegetables` <- c(14,23,45)
> `Number of people` <- c(20,30,40)
> MyData <- data.frame(RefDate
>                    ,`Number of vegetables`
>                    ,`Number of people`
>                    ,check.names=FALSE
>                    )
> MyVars <- c("Number of vegetables","Number of people")
>
> # No function here - it works
>
> for (A in MyVars) {
> g2 <- ggplot(MyData
>              ,aes_string( x = RefDate
>                         , y = paste0( "`", A, "`")
>                         )
>              ) +
>   geom_line() +
>   labs(title = paste( A,"adjusted" ) )
> g2
> ggsave( paste0( A,".jpg" )
>       ,g2
>       ,height=5
>       ,width=8
>       ,dpi=300
>       )
> }
>
> # Using function FQPC here - it does not work
>
> FQPC <- function(x) {100*x/lag(x,1)-100} # % change
> for (A in MyVars) {
> g2 <- ggplot(MyData
>              ,aes_string( x = RefDate
>                           , y = FQPC(paste0( "`", A, "`"))
>              )
> ) +
>   geom_line() +
>   labs(title = paste( A,"adjusted" ) )
> g2
> ggsave( paste0( A,".jpg" )
>         ,g2
>         ,height=5
>         ,width=8
>         ,dpi=300
> )
> }
>
>
> # I get the error: "Error in 100*x : non-numeric argument to binary operator".
>
> I need some way to convert the string representation of the variable, 'A', 
> back to a column name representation and, presumably, use aes() instead of 
> aes_string(). Any further thoughts about this?
>
> Philip
>
>
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil using dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k



