[R] violin plot help

Jim Lemon drjimlemon at gmail.com
Thu May 18 00:25:04 CEST 2017


Hi Omar,
You may want to try subsetting your data and then passing each morsel
to be plotted as a violin plot, either as separate calls to ggplot or
directly to a violin plotting routine.

vioplot (vioplot)
violin_plot (plotrix)
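
A minimal sketch of the subsetting approach, assuming ggplot2 is installed. The merged data frame below is made-up toy data standing in for the real one (only the column names come from this thread), and pieces and plots are hypothetical names. split() cuts the data into one piece per watershed/parameter combination and lapply() builds one plot per piece, so no facets are involved:

```r
library(ggplot2)

# Made-up toy data standing in for the poster's `merged` data frame
set.seed(1)
merged <- expand.grid(
  Wshed = c("C-100", "AC"),
  PARAMETER = c("Chlorophyll-A", "Phosphorus, Total (TP)"),
  Geo = c("Bay", "Central", "East"),
  year = 2013:2014,
  rep = 1:10,
  stringsAsFactors = FALSE
)
merged$RESULT <- rlnorm(nrow(merged), meanlog = -3)

# One data frame per Wshed/PARAMETER combination
pieces <- split(merged, list(merged$Wshed, merged$PARAMETER), drop = TRUE)

# One ggplot object per piece; nothing is drawn until print() is called
plots <- lapply(names(pieces), function(nm) {
  ggplot(pieces[[nm]], aes(x = Geo, y = RESULT, fill = factor(year))) +
    geom_violin() +
    labs(title = nm, fill = "year")
})
names(plots) <- names(pieces)

# To draw each plot on its own page of a PDF (file name is arbitrary):
# pdf("violins.pdf"); invisible(lapply(plots, print)); dev.off()
```

Each element of plots can be printed on its own, or all of them sent to a multi-page PDF as in the commented lines.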

Jim

On Thu, May 18, 2017 at 5:59 AM, Abdelrahman, Omar (RER)
<Omar.Abdelrahman at miamidade.gov> wrote:
> Thank you, curly quotes got me! I was able to subset the data and produce the violin plot. Now, is there a way to generate multiple plots separately (no facets)? With so many levels of each variable, I am trying to avoid doing it iteratively. Neither ggplot2 books nor web searches have yielded anything (so far).
> Also I want a violin for each year within Geo. I did try to specify year with the following:
> ggplot () +
> facet_grid (PARAMETER ~Wshed~year, scales="free_y") +
> geom_violin (data=subdf, aes(x=Geo, y=RESULT, fill=Geo))
>
> which yielded
> Error in combine_vars(data, params$plot_env, cols, drop = params$drop) :
>   At least one layer must contain all variables used for faceting
>
> Also tried:
> ggplot () +
> facet_grid (PARAMETER ~Wshed, scales="free_y") +
> geom_violin (data=subdf, aes(x=Geo~year, y=RESULT, fill=Geo))
>
> Do I need to specify "year(date)"; I loaded lubridate?
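
The faceting error most likely means that subdf has no column called year, only DATE. A sketch of one way to add it, assuming DATE holds month/day/year text as in the sample data further down; the subdf below is a made-up stand-in and the 2014 rows are invented for illustration:

```r
library(ggplot2)

# Made-up stand-in for `subdf`; DATE is assumed to be month/day/year text
subdf <- data.frame(
  Geo = rep(c("Bay", "Central"), each = 6),
  Wshed = "C-100",
  PARAMETER = "Phosphorus, Total (TP)",
  DATE = rep(c("1/10/2013", "2/7/2013", "3/7/2013",
               "1/9/2014", "2/6/2014", "3/6/2014"), times = 2),
  RESULT = c(0.004, 0.002, 0.003, 0.005, 0.004, 0.006,
             0.005, 0.007, 0.010, 0.015, 0.007, 0.008),
  stringsAsFactors = FALSE
)

# Parse the text dates, then pull the year out into its own column
subdf$DATE <- as.Date(subdf$DATE, format = "%m/%d/%Y")
subdf$year <- as.integer(format(subdf$DATE, "%Y"))
# With lubridate loaded, year(subdf$DATE) should give the same values

# `year` now exists in the data, so it can be mapped to an aesthetic:
# one violin per year, grouped within each Geo
p <- ggplot(subdf, aes(x = Geo, y = RESULT, fill = factor(year))) +
  facet_grid(PARAMETER ~ Wshed, scales = "free_y") +
  geom_violin()
```

Mapping year to fill keeps Geo on the x axis and puts the yearly violins side by side within each Geo level.
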
>
>
> -----Original Message-----
> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
> Sent: Wednesday, May 17, 2017 10:05 AM
> To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>
> Cc: R-help <r-help at r-project.org>
> Subject: RE: [R] violin plot help
>
> Here is an example that works... a reproducible example always includes code AND enough sample data to exercise the code:
>
> ########
> dta <- read.table( text=
> "STATION        Geo     Wshed   DATE            PARAMETER                 RESULT
> BB36            Bay     C-100   1/10/2013       'Phosphorus, Total (TP)'  0.004
> BB36            Bay     C-100   1/10/2013       'Chlorophyll-A'           0.2
> BB52            Bay     C-100   1/10/2013       'Phosphorus, Total (TP)'  0.003
> BB52            Bay     C-100   1/10/2013       'Chlorophyll-A'           0.39
> CD01A           Mouth   C-100   1/10/2013       'Phosphorus, Total (TP)'  0.017
> CD01A           Mouth   C-100   1/10/2013       'Chlorophyll-A'           0.64
> CD02            East    C-100   1/10/2013       'Phosphorus, Total (TP)'  0.01
> CD05            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.005
> CD06            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.01
> CD09            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.007
> BB36            Bay     C-100   2/7/2013        'Chlorophyll-A'           0.18
> BB36            Bay     C-100   2/7/2013        'Phosphorus, Total (TP)'  0.002
> BB52            Bay     C-100   2/7/2013        'Phosphorus, Total (TP)'  0.002
> BB52            Bay     C-100   2/7/2013        'Chlorophyll-A'           0.31
> CD01A           Mouth   C-100   2/7/2013        'Phosphorus, Total (TP)'  0.004
> CD01A           Mouth   C-100   2/7/2013        'Chlorophyll-A'           0.4
> CD02            East    C-100   2/7/2013        'Phosphorus, Total (TP)'  0.011
> CD05            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.007
> CD06            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.015
> CD09            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.008
> CD01A           Mouth   C-100   3/7/2013        'Phosphorus, Total (TP)'  0.007
> ", header=TRUE)
> # prints result to console without assigning it to a new variable
> subset( dta, Geo == "East" )
> ########
>
> Note that [1] and [2] suggest the use of the dput function to help create R code that creates the object just as you have it before the troublesome line of code:
>
> ########
> dta <- structure(list(STATION = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 3L) , .Label = c("BB36", "BB52", "CD01A", "CD02", "CD05", "CD06", "CD09")
>       , class = "factor"),
>      Geo = structure(c(1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L,
>      1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L, 4L), .Label = c("Bay",
>      "Central", "East", "Mouth"), class = "factor"),
>      Wshed = structure(c(1L,
>      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>      1L, 1L, 1L, 1L, 1L), .Label = "C-100", class = "factor"),
>      DATE = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>      2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L), .Label = c("1/10/2013",
>      "2/7/2013", "3/7/2013"), class = "factor"),
>      PARAMETER = structure(c(2L,
>      1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
>      2L, 2L, 2L, 2L, 2L), .Label = c("Chlorophyll-A",
>      "Phosphorus, Total (TP)"
>      ), class = "factor"), RESULT = c(0.004, 0.2, 0.003, 0.39,
>      0.017, 0.64, 0.01, 0.005, 0.01, 0.007, 0.18, 0.002, 0.002,
>      0.31, 0.004, 0.4, 0.011, 0.007, 0.015, 0.008, 0.007)),
>      .Names = c("STATION", "Geo", "Wshed", "DATE", "PARAMETER", "RESULT"),
>      class = "data.frame", row.names = c(NA, -21L))
> subset( dta, Geo == "East" )
> ########
>
> Note that the structure() call generated by dput is mostly insensitive to extra newlines, except inside quoted strings.
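
As a small illustration of the dput round trip described above (the data frame small and the other names are hypothetical): the text that dput prints is ordinary R code, so evaluating it should rebuild an equivalent object.

```r
# Hypothetical two-row data frame, just to show the mechanics
small <- data.frame(
  Geo = c("Bay", "East"),
  RESULT = c(0.004, 0.01),
  stringsAsFactors = TRUE
)

# dput() prints a structure(...) call that recreates the object
dput(small)

# Capture that text and evaluate it, as a reader pasting it would
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# Compare the rebuilt object with the original
all.equal(small, rebuilt)
```
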
>
> So the above examples work for me. What doesn't work for you?
>
> One thought: Are you editing your R code with a plain text editor or are you editing it with a word processor that might replace your plain quotes with curly quotes?
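
Curly quotes of the kind a word processor inserts are not valid string delimiters in R. A rough sketch of how to spot and replace them in a line of code held as text; the variable names are hypothetical and the offending line is modelled on the subset call quoted further down:

```r
# A line of code as a word processor might have mangled it
# (\u201c and \u201d are the left and right curly double quotes)
bad <- "subset(merged, Wshed == \u201cAC\u201d)"

# Detect any curly double or single quotes
has_curly <- grepl("[\u201c\u201d\u2018\u2019]", bad)

# Replace them with plain ASCII quotes
fixed <- gsub("[\u201c\u201d]", "\"", bad)
fixed <- gsub("[\u2018\u2019]", "'", fixed)

# The repaired text now parses as R code
expr <- parse(text = fixed)
```
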
>
> [1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>
> [2] http://adv-r.had.co.nz/Reproducibility.html
>
> On Wed, 17 May 2017, Abdelrahman, Omar (RER) wrote:
>
>> Thanks again
>> RE: "so all the more reason to give us an example that we can run to trigger the same error." Are you asking for an example of the data? Below is a "small" example, but with so many levels of the different variables I am not sure it can be useful.
>>
>> STATION       Geo     Wshed   DATE            PARAMETER               RESULT
>> BB36          Bay     C-100   1/10/2013       Phosphorus, Total (TP)  0.004
>> BB36          Bay     C-100   1/10/2013       Chlorophyll-A           0.2
>> BB52          Bay     C-100   1/10/2013       Phosphorus, Total (TP)  0.003
>> BB52          Bay     C-100   1/10/2013       Chlorophyll-A           0.39
>> CD01A         Mouth   C-100   1/10/2013       Phosphorus, Total (TP)  0.017
>> CD01A         Mouth   C-100   1/10/2013       Chlorophyll-A   0.64
>> CD02          East    C-100   1/10/2013       Phosphorus, Total (TP)  0.01
>> CD05          Central C-100   1/10/2013       Phosphorus, Total (TP)  0.005
>> CD06          Central C-100   1/10/2013       Phosphorus, Total (TP)  0.01
>> CD09          Central C-100   1/10/2013       Phosphorus, Total (TP)  0.007
>> BB36          Bay     C-100   2/7/2013        Chlorophyll-A           0.18
>> BB36          Bay     C-100   2/7/2013        Phosphorus, Total (TP)  0.002
>> BB52          Bay     C-100   2/7/2013        Phosphorus, Total (TP)  0.002
>> BB52          Bay     C-100   2/7/2013        Chlorophyll-A           0.31
>> CD01A         Mouth   C-100   2/7/2013        Phosphorus, Total (TP)  0.004
>> CD01A         Mouth   C-100   2/7/2013        Chlorophyll-A           0.4
>> CD02          East    C-100   2/7/2013        Phosphorus, Total (TP)  0.011
>> CD05          Central C-100   2/7/2013        Phosphorus, Total (TP)  0.007
>> CD06          Central C-100   2/7/2013        Phosphorus, Total (TP)  0.015
>> CD09          Central C-100   2/7/2013        Phosphorus, Total (TP)  0.008
>> CD01A         Mouth   C-100   3/7/2013        Phosphorus, Total (TP)  0.007
>>
>> Hope this is not too much
>>
>> -----Original Message-----
>> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
>> Sent: Tuesday, May 16, 2017 12:30 PM
>> To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>; R-help
>> <r-help at r-project.org>
>> Subject: RE: [R] violin plot help
>>
>> Please use reply-all or equivalent to keep the list in the conversation. I don't do private online consultation.
>>
>> Your example suggested you did not know the difference, but your error suggests a completely different expression triggered the error, so all the more reason to give us an example that we can run to trigger the same error.
>>
>> Items B and C are recommendations to read the help pages for those syntax elements. You should already have read enough of an introduction to R to have encountered the use of the question mark to bring up the help pages. If not, please do.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On May 16, 2017 9:00:09 AM PDT, "Abdelrahman, Omar (RER)" <Omar.Abdelrahman at miamidade.gov> wrote:
>>> Thanks Jeff. I will send plain text from now on. I am not sure what B
>>> or C mean; is there a guide that I can reference? I know the
>>> difference between "=" and "==" , they work the same in Stata and SAS.
>>>
>>> Omar
>>> -----Original Message-----
>>> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
>>> Sent: Tuesday, May 16, 2017 11:43 AM
>>> To: r-help at r-project.org; Abdelrahman, Omar (RER)
>>> <Omar.Abdelrahman at miamidade.gov>; 'r-help at r-project.org'
>>> <r-help at r-project.org>
>>> Subject: Re: [R] violin plot help
>>>
>>> Read
>>> A) the Posting Guide (re plain text only... your emails may be
>>> damaged by the mailing list if you send html-formatted email... only
>>> you can solve this by figuring out how to use your email software)
>>> B) Help on assignment (?`=`)
>>> C) Help on logical tests (?`==`)
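
To make items B and C concrete, here is a sketch on a made-up three-row data frame (dta, all_rows and ac_rows are hypothetical names). Inside a call to subset, a single = is read as naming an argument, so no row filtering takes place, while == compares each row against the value:

```r
# Made-up data with two watersheds
dta <- data.frame(
  Wshed = c("C-100", "AC", "AC"),
  RESULT = c(0.004, 0.2, 0.39)
)

# `=` passes an argument named Wshed, which subset() ignores,
# so every row comes back
all_rows <- subset(dta, Wshed = "AC")

# `==` is the logical test, so only matching rows come back
ac_rows <- subset(dta, Wshed == "AC")

nrow(all_rows)
nrow(ac_rows)
```
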
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On May 16, 2017 7:06:40 AM PDT, "Abdelrahman, Omar (RER)"
>>> <Omar.Abdelrahman at miamidade.gov> wrote:
>>>> I am trying to produce multiple violin plots by 3 categorical
>>>> variables, each violin representing 1 year worth of data. The
>>>> variables are:
>>>>
>>>> Watershed (7 levels: county canals)
>>>>
>>>> Geography (5 levels: west; central; east; mouth; bay)
>>>>
>>>> Parameter (8 levels: water quality chemical parameters)
>>>>
>>>> Year (25 levels: 1992-2017)
>>>>
>>>> I want to produce 1 plot for each Parameter-Watershed subdivided
>>>> into Geography with a violin for each year. I used facets with the
>>>> following code (not by year):
>>>>
>>>> ggplot () +
>>>> facet_grid (PARAMETER ~Wshed, scales="free_y") +
>>>> geom_violin (data=merged, aes(x=Geo, y=RESULT))
>>>>
>>>>
>>>>
>>>> I do not want facets, they crowd the information so it is unreadable. I
>>>> just started with R this week and have not been able to figure out the
>>>> foreach protocol, or any other loop protocol. I tried to subset the
>>>> data to do it iteratively with the following code:
>>>>
>>>>
>>>>
>>>> subdf<-subset (merged, Wshed = "AC")
>>>>
>>>>
>>>>
>>>> but got an error: Error: unexpected input in "subdf=subset (merged,
>>>> Wshed == ""
>>>>
>>>> Any help would be greatly appreciated.
>>>>
>>>> Thanks,
>>>>
>>>> Omar Abdelrahman, Biologist II
>>>> Miami-Dade County, Department of Regulatory and Economic Resources
>>>> Division of Environmental Resources Management (DERM)
>>>> Overtown Transit Village
>>>> 701 NW 1st Court, 5th Floor
>>>> Miami, FL 33136-3912
>>>> (305) 372-6872
>>>> abdelo at miamidade.gov<mailto:abdelo at miamidade.gov>
>>>> www.miamidade.gov/environment<http://www.miamidade.gov/environment/>
>>>>
>>>>
>>>>     [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                        Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>


