[R] violin plot help
Abdelrahman, Omar (RER)
Omar.Abdelrahman at miamidade.gov
Thu May 18 14:31:03 CEST 2017
Many thanks Jeff! I knew it would require a loop approach, so I will now explore that with the code you suggested.
-----Original Message-----
From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
Sent: Wednesday, May 17, 2017 5:19 PM
To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>; R-help <r-help at r-project.org>
Subject: RE: [R] violin plot help
Your request is outside of the scope of ggplot2. There are a variety of ways to achieve your ends, but they all involve loops of one sort or another... e.g.
wsheds <- unique( merged$Wshed )
for ( w in wsheds ) {
  print( ggplot( data=subset( merged, Wshed == w ), ... ) )
}
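A fuller sketch of the loop above, under stated assumptions: the small `merged` data frame here is made up for illustration (its column names follow the sample data quoted later in this thread), and `year` is derived from `DATE` assuming month/day/year text. Mapping `x = Geo` and `fill = year` is one way to get a violin per year within each Geo; it is not the only option.

```r
library(ggplot2)

# Made-up stand-in for `merged`; columns mirror the sample data in this thread.
set.seed(1)
merged <- expand.grid(
  Wshed = c("C-100", "AC"),
  Geo = c("Bay", "Central"),
  DATE = c("1/10/2013", "2/7/2014"),
  rep = 1:10,
  stringsAsFactors = FALSE
)
merged$PARAMETER <- "Chlorophyll-A"
merged$RESULT <- runif(nrow(merged))

# Assumes DATE is month/day/year text; year becomes a factor for grouping.
merged$year <- factor(format(as.Date(merged$DATE, format = "%m/%d/%Y"), "%Y"))

plots <- list()
for (w in unique(merged$Wshed)) {
  for (p in unique(merged$PARAMETER)) {
    d <- merged[merged$Wshed == w & merged$PARAMETER == p, ]
    if (nrow(d) == 0) next  # skip combinations with no data
    plots[[paste(w, p, sep = " / ")]] <-
      ggplot(d, aes(x = Geo, y = RESULT, fill = year)) +
      geom_violin() +
      ggtitle(paste(p, w, sep = ", "))
  }
}

# Inside a loop, ggplot objects must be printed explicitly to be drawn.
for (g in plots) print(g)
```

Keeping the plots in a named list makes it easy to redraw or save a single one later, e.g. `plots[["AC / Chlorophyll-A"]]`.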
I have become quite used to embedding my R code into rmarkdown files, which allows me to mix multiple graphs, tables and text commentary in one place. RStudio makes this about as easy as it can be, though support for that IDE is off topic here.
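If rmarkdown is more than is needed, a plainer alternative (not the approach described above, just a sketch) is to print each plot to a multi-page PDF device. The output file name and the toy data are assumptions for illustration only.

```r
library(ggplot2)

# Toy data; in practice this would be the `merged` data frame.
set.seed(2)
merged <- data.frame(
  Wshed = rep(c("C-100", "AC"), each = 40),
  Geo = rep(c("Bay", "Central"), times = 40),
  RESULT = runif(80)
)

out_file <- file.path(tempdir(), "violins.pdf")  # hypothetical output path
pdf(out_file, width = 7, height = 5)
for (w in unique(merged$Wshed)) {
  # Each print() call starts a new page in the open PDF device.
  print(
    ggplot(subset(merged, Wshed == w), aes(x = Geo, y = RESULT)) +
      geom_violin() +
      ggtitle(w)
  )
}
dev.off()  # close the device so the file is written
```

This gives one page per watershed in a single file, without facets.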
--
Sent from my phone. Please excuse my brevity.
On May 17, 2017 12:59:36 PM PDT, "Abdelrahman, Omar (RER)" <Omar.Abdelrahman at miamidade.gov> wrote:
>Thank you, curly quotes got me! I was able to subset the data and
>produce the violin plot. Now, is there a way to generate multiple plots
>separately (no facets)? With so many levels of each variable, I am
>trying to avoid doing it iteratively. Neither ggplot2 books nor web
>searches have yielded anything (so far).
>Also I want a violin for each year within Geo. I did try to specify
>year with the following:
>ggplot () +
>facet_grid (PARAMETER ~Wshed~year, scales="free_y") + geom_violin
>(data=subdf, aes(x=Geo, y=RESULT, fill=Geo))
>
>which yielded
>-Error in combine_vars(data, params$plot_env, cols, drop = params$drop)
>:
> At least one layer must contain all variables used for faceting
>
>Also tried:
>ggplot () +
>facet_grid (PARAMETER ~Wshed, scales="free_y") + geom_violin
>(data=subdf, aes(x=Geo~year, y=RESULT, fill=Geo))
>
>Do I need to specify "year(date)"; I loaded lubridate?
>
>
>-----Original Message-----
>From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
>Sent: Wednesday, May 17, 2017 10:05 AM
>To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>
>Cc: R-help <r-help at r-project.org>
>Subject: RE: [R] violin plot help
>
>Here is an example that works... a reproducible example always includes
>code AND enough sample data to exercise the code:
>
>########
>dta <- read.table( text=
>"STATION Geo Wshed DATE PARAMETER RESULT
>BB36 Bay C-100 1/10/2013 'Phosphorus, Total (TP)' 0.004
>BB36 Bay C-100 1/10/2013 'Chlorophyll-A' 0.2
>BB52 Bay C-100 1/10/2013 'Phosphorus, Total (TP)' 0.003
>BB52 Bay C-100 1/10/2013 'Chlorophyll-A' 0.39
>CD01A Mouth C-100 1/10/2013 'Phosphorus, Total (TP)' 0.017
>CD01A Mouth C-100 1/10/2013 'Chlorophyll-A' 0.64
>CD02 East C-100 1/10/2013 'Phosphorus, Total (TP)' 0.01
>CD05 Central C-100 1/10/2013 'Phosphorus, Total (TP)' 0.005
>CD06 Central C-100 1/10/2013 'Phosphorus, Total (TP)' 0.01
>CD09 Central C-100 1/10/2013 'Phosphorus, Total (TP)' 0.007
>BB36 Bay C-100 2/7/2013 'Chlorophyll-A' 0.18
>BB36 Bay C-100 2/7/2013 'Phosphorus, Total (TP)' 0.002
>BB52 Bay C-100 2/7/2013 'Phosphorus, Total (TP)' 0.002
>BB52 Bay C-100 2/7/2013 'Chlorophyll-A' 0.31
>CD01A Mouth C-100 2/7/2013 'Phosphorus, Total (TP)' 0.004
>CD01A Mouth C-100 2/7/2013 'Chlorophyll-A' 0.4
>CD02 East C-100 2/7/2013 'Phosphorus, Total (TP)' 0.011
>CD05 Central C-100 2/7/2013 'Phosphorus, Total (TP)' 0.007
>CD06 Central C-100 2/7/2013 'Phosphorus, Total (TP)' 0.015
>CD09 Central C-100 2/7/2013 'Phosphorus, Total (TP)' 0.008
>CD01A Mouth C-100 3/7/2013 'Phosphorus, Total (TP)' 0.007
>", header=TRUE)
># prints result to console without assigning it to a new variable
>subset( dta, Geo == "East" )
>########
>
>Note that [1] and [2] suggest the use of the dput function to help
>create R code that creates the object just as you have it before the
>troublesome line of code:
>
>########
>dta <- structure(list(STATION = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
>5L, 6L, 7L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 3L) , .Label =
>c("BB36", "BB52", "CD01A", "CD02", "CD05", "CD06", "CD09")
> , class = "factor"),
> Geo = structure(c(1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L,
> 1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L, 4L), .Label = c("Bay",
> "Central", "East", "Mouth"), class = "factor"),
> Wshed = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L), .Label = "C-100", class = "factor"),
> DATE = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L), .Label = c("1/10/2013",
> "2/7/2013", "3/7/2013"), class = "factor"),
> PARAMETER = structure(c(2L,
> 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
> 2L, 2L, 2L, 2L, 2L), .Label = c("Chlorophyll-A",
> "Phosphorus, Total (TP)"
> ), class = "factor"), RESULT = c(0.004, 0.2, 0.003, 0.39,
> 0.017, 0.64, 0.01, 0.005, 0.01, 0.007, 0.18, 0.002, 0.002,
> 0.31, 0.004, 0.4, 0.011, 0.007, 0.015, 0.008, 0.007)),
> .Names = c("STATION", "Geo", "Wshed", "DATE", "PARAMETER", "RESULT"),
>class = "data.frame", row.names = c(NA, -21L))
>subset( dta, Geo == "East" )
>########
>
>Note that the "structure" call generated by dput is mostly
>insensitive to extra newlines, except inside quotes.
>
>So the above examples work for me. What doesn't work for you?
>
>One thought: Are you editing your R code with a plain text editor or
>are you editing it with a word processor that might replace your plain
>quotes with curly quotes?
>
>[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>
>[2] http://adv-r.had.co.nz/Reproducibility.html
>
>On Wed, 17 May 2017, Abdelrahman, Omar (RER) wrote:
>
>> Thanks again
>> RE: "so all the more reason to give us an example that we can run to
>> trigger the same error." Are you asking for an example of the data?
>> Below is a "small" example, but with so many levels of the different
>> variables I am not sure it can be useful.
>>
>> STATION Geo Wshed DATE PARAMETER RESULT
>> BB36 Bay C-100 1/10/2013 Phosphorus, Total (TP) 0.004
>> BB36 Bay C-100 1/10/2013 Chlorophyll-A 0.2
>> BB52 Bay C-100 1/10/2013 Phosphorus, Total (TP) 0.003
>> BB52 Bay C-100 1/10/2013 Chlorophyll-A 0.39
>> CD01A Mouth C-100 1/10/2013 Phosphorus, Total (TP) 0.017
>> CD01A Mouth C-100 1/10/2013 Chlorophyll-A 0.64
>> CD02 East C-100 1/10/2013 Phosphorus, Total (TP) 0.01
>> CD05 Central C-100 1/10/2013 Phosphorus, Total (TP) 0.005
>> CD06 Central C-100 1/10/2013 Phosphorus, Total (TP) 0.01
>> CD09 Central C-100 1/10/2013 Phosphorus, Total (TP) 0.007
>> BB36 Bay C-100 2/7/2013 Chlorophyll-A 0.18
>> BB36 Bay C-100 2/7/2013 Phosphorus, Total (TP) 0.002
>> BB52 Bay C-100 2/7/2013 Phosphorus, Total (TP) 0.002
>> BB52 Bay C-100 2/7/2013 Chlorophyll-A 0.31
>> CD01A Mouth C-100 2/7/2013 Phosphorus, Total (TP) 0.004
>> CD01A Mouth C-100 2/7/2013 Chlorophyll-A 0.4
>> CD02 East C-100 2/7/2013 Phosphorus, Total (TP) 0.011
>> CD05 Central C-100 2/7/2013 Phosphorus, Total (TP) 0.007
>> CD06 Central C-100 2/7/2013 Phosphorus, Total (TP) 0.015
>> CD09 Central C-100 2/7/2013 Phosphorus, Total (TP) 0.008
>> CD01A Mouth C-100 3/7/2013 Phosphorus, Total (TP) 0.007
>>
>> Hope this is not too much
>>
>> -----Original Message-----
>> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
>> Sent: Tuesday, May 16, 2017 12:30 PM
>> To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>; R-help
>> <r-help at r-project.org>
>> Subject: RE: [R] violin plot help
>>
>> Please use reply-all or equivalent to keep the list in the
>> conversation. I don't do private online consultation.
>>
>> Your example suggested you did not know the difference, but your
>> error suggests a completely different expression triggered the error,
>> so all the more reason to give us an example that we can run to trigger
>> the same error.
>>
>> Items B and C are recommendations to read the help pages for those
>> syntax elements. You should already have read enough of an introduction
>> to R to have encountered the use of the question mark to bring up the
>> help pages. If not, please do.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On May 16, 2017 9:00:09 AM PDT, "Abdelrahman, Omar (RER)"
>> <Omar.Abdelrahman at miamidade.gov> wrote:
>>> Thanks Jeff. I will send plain text from now on. I am not sure what B
>>> or C mean; is there a guide that I can reference? I know the
>>> difference between "=" and "==", they work the same in Stata and SAS.
>>>
>>> Omar
>>> -----Original Message-----
>>> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
>>> Sent: Tuesday, May 16, 2017 11:43 AM
>>> To: r-help at r-project.org; Abdelrahman, Omar (RER)
>>> <Omar.Abdelrahman at miamidade.gov>; 'r-help at r-project.org'
>>> <r-help at r-project.org>
>>> Subject: Re: [R] violin plot help
>>>
>>> Read
>>> A) the Posting Guide (re plain text only... your emails may be
>>> damaged by the mailing list if you send html-formatted email... only
>>> you can solve this by figuring out how to use your email software)
>>> B) Help on assignment (?`=`)
>>> C) Help on logical tests (?`==`)
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On May 16, 2017 7:06:40 AM PDT, "Abdelrahman, Omar (RER)"
>>> <Omar.Abdelrahman at miamidade.gov> wrote:
>>>> I am trying to produce multiple violin plots by 3 categorical
>>>> variables, each violin representing 1 year worth of data. The
>>>> variables are:
>>>>
>>>> Watershed (7 levels: county canals)
>>>>
>>>> Geography (5 levels: west; central; east; mouth; bay)
>>>>
>>>> Parameter (8 levels: water quality chemical parameters)
>>>>
>>>> Year (25 levels: 1992-2017)
>>>>
>>>> I want to produce 1 plot for each Parameter-Watershed subdivided
>>>> into Geography with a violin for each year. I used facets with the
>>>> following code (not by year):
>>>>
>>>> ggplot () +
>>>>
>>>> facet_grid (PARAMETER ~Wshed, scales="free_y") +
>>>>
>>>> geom_violin (data=merged, aes(x=Geo, y=RESULT))
>>>>
>>>>
>>>>
>>>> I do not want facets, they crowd the information so it is unreadable.
>>>> I just started with R this week and have not been able to figure out
>>>> the foreach protocol, or any other loop protocol. I tried to subset the
>>>> data to do it iteratively with the following code:
>>>>
>>>>
>>>>
>>>> subdf<-subset (merged, Wshed = "AC")
>>>>
>>>>
>>>>
>>>> but got an error: Error: unexpected input in "subdf=subset (merged,
>>>> Wshed == ""
>>>>
>>>> Any help would be greatly appreciated.
>>>>
>>>> Thanks,
>>>>
>>>> Omar Abdelrahman, Biologist II
>>>> Miami-Dade County, Department of Regulatory and Economic Resources
>>>> Division of Environmental Resources Management (DERM)
>>>> Overtown Transit Village
>>>> 701 NW 1st Court, 5th Floor
>>>> Miami, FL 33136-3912
>>>> (305) 372-6872
>>>> abdelo at miamidade.gov
>>>> www.miamidade.gov/environment
>>>>
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>---------------------------------------------------------------------------
>Jeff Newmiller                        The     .....       .....  Go Live...
>DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
>Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>---------------------------------------------------------------------------