[R] violin plot help
Abdelrahman, Omar (RER)
Omar.Abdelrahman at miamidade.gov
Wed May 17 21:59:36 CEST 2017
Thank you, curly quotes got me! I was able to subset the data and produce the violin plot. Now, is there a way to generate multiple plots separately (no facets)? With so many levels of each variable, I am trying to avoid doing it iteratively. Neither ggplot2 books nor web searches have yielded anything (so far).
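One way to get separate plots without facets is to split the data by the grouping variables and build one ggplot object per piece. What follows is only a minimal sketch, assuming a data frame shaped like subdf with columns Geo, Wshed, PARAMETER and RESULT. The toy values, the object names pieces and plots, and the output file name are made up for illustration.

```r
library(ggplot2)

# Toy stand-in for the real data (values are invented for illustration)
set.seed(1)
subdf <- expand.grid(
  Geo = c("Bay", "Central"),
  Wshed = c("C-100", "AC"),
  PARAMETER = c("Chlorophyll-A", "Phosphorus, Total (TP)"),
  rep = 1:10,
  stringsAsFactors = FALSE
)
subdf$RESULT <- runif(nrow(subdf))

# One data frame per PARAMETER/Wshed combination;
# drop = TRUE skips combinations that have no rows
pieces <- split(subdf, list(subdf$PARAMETER, subdf$Wshed), drop = TRUE)

# Build one violin plot per piece, titled with the combination name
plots <- lapply(names(pieces), function(nm) {
  ggplot(pieces[[nm]], aes(x = Geo, y = RESULT, fill = Geo)) +
    geom_violin() +
    ggtitle(nm)
})
names(plots) <- names(pieces)

# Write each plot to its own page of one PDF (location chosen arbitrarily)
out <- file.path(tempdir(), "violins.pdf")
pdf(out)
for (p in plots) print(p)
dev.off()
```

Keeping the plots in a named list means a single one can be viewed later with plots[["Chlorophyll-A.AC"]] without redrawing the rest.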
Also, I want a violin for each year within Geo. I tried to specify year with the following:
ggplot () +
facet_grid (PARAMETER ~Wshed~year, scales="free_y") +
geom_violin (data=subdf, aes(x=Geo, y=RESULT, fill=Geo))
which yielded
-Error in combine_vars(data, params$plot_env, cols, drop = params$drop) :
At least one layer must contain all variables used for faceting
Also tried:
ggplot () +
facet_grid (PARAMETER ~Wshed, scales="free_y") +
geom_violin (data=subdf, aes(x=Geo~year, y=RESULT, fill=Geo))
Do I need to specify "year(date)"? I have loaded lubridate.
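The faceting error above says that no layer's data contains a column named year, so the variable has to exist in the data frame before it can be used. A minimal sketch of one possible fix, assuming DATE holds month/day/year text as in the sample data: parse it, store the year as its own column, and map that column to fill so each Geo gets one violin per year. The toy values are invented. Base R is used for the parsing here, though lubridate's year(mdy(DATE)) should give the same years.

```r
library(ggplot2)

# Toy stand-in for subdf; DATE is text in month/day/year form (values invented)
set.seed(42)
subdf <- data.frame(
  Geo = rep(c("Bay", "Mouth"), each = 40),
  Wshed = "C-100",
  PARAMETER = "Chlorophyll-A",
  DATE = rep(c("1/10/2013", "2/7/2013", "1/9/2014", "2/6/2014"), times = 20),
  RESULT = runif(80),
  stringsAsFactors = FALSE
)

# Parse the text dates, then keep the year as a factor column
subdf$DATE <- as.Date(subdf$DATE, format = "%m/%d/%Y")
subdf$year <- factor(format(subdf$DATE, "%Y"))

# year is now a real column, so it can be used in aes() or in a facet formula;
# with fill = year the violins for each year are dodged side by side within Geo
p <- ggplot(subdf, aes(x = Geo, y = RESULT, fill = year)) +
  geom_violin() +
  facet_grid(PARAMETER ~ Wshed, scales = "free_y")
print(p)
```

Note that a formula such as Geo~year inside aes() is not a way to combine two variables; mapping one to x and the other to fill (or using interaction(Geo, year) as x) is the usual route.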
-----Original Message-----
From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
Sent: Wednesday, May 17, 2017 10:05 AM
To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>
Cc: R-help <r-help at r-project.org>
Subject: RE: [R] violin plot help
Here is an example that works... a reproducible example always includes code AND enough sample data to exercise the code:
########
dta <- read.table( text=
"STATION Geo Wshed DATE PARAMETER RESULT
BB36 Bay C-100 1/10/2013 'Phosphorus, Total (TP)' 0.004
BB36 Bay C-100 1/10/2013 'Chlorophyll-A' 0.2
BB52 Bay C-100 1/10/2013 'Phosphorus, Total (TP)' 0.003
BB52 Bay C-100 1/10/2013 'Chlorophyll-A' 0.39
CD01A Mouth C-100 1/10/2013 'Phosphorus, Total (TP)' 0.017
CD01A Mouth C-100 1/10/2013 'Chlorophyll-A' 0.64
CD02 East C-100 1/10/2013 'Phosphorus, Total (TP)' 0.01
CD05 Central C-100 1/10/2013 'Phosphorus, Total (TP)' 0.005
CD06 Central C-100 1/10/2013 'Phosphorus, Total (TP)' 0.01
CD09 Central C-100 1/10/2013 'Phosphorus, Total (TP)' 0.007
BB36 Bay C-100 2/7/2013 'Chlorophyll-A' 0.18
BB36 Bay C-100 2/7/2013 'Phosphorus, Total (TP)' 0.002
BB52 Bay C-100 2/7/2013 'Phosphorus, Total (TP)' 0.002
BB52 Bay C-100 2/7/2013 'Chlorophyll-A' 0.31
CD01A Mouth C-100 2/7/2013 'Phosphorus, Total (TP)' 0.004
CD01A Mouth C-100 2/7/2013 'Chlorophyll-A' 0.4
CD02 East C-100 2/7/2013 'Phosphorus, Total (TP)' 0.011
CD05 Central C-100 2/7/2013 'Phosphorus, Total (TP)' 0.007
CD06 Central C-100 2/7/2013 'Phosphorus, Total (TP)' 0.015
CD09 Central C-100 2/7/2013 'Phosphorus, Total (TP)' 0.008
CD01A Mouth C-100 3/7/2013 'Phosphorus, Total (TP)' 0.007
", header=TRUE)
# prints result to console without assigning it to a new variable
subset( dta, Geo == "East" )
########
Note that [1] and [2] suggest the use of the dput function to help create R code that creates the object just as you have it before the troublesome line of code:
########
dta <- structure(list(STATION = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 3L) , .Label = c("BB36", "BB52", "CD01A", "CD02", "CD05", "CD06", "CD09")
, class = "factor"),
Geo = structure(c(1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L, 4L), .Label = c("Bay",
"Central", "East", "Mouth"), class = "factor"),
Wshed = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), .Label = "C-100", class = "factor"),
DATE = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L), .Label = c("1/10/2013",
"2/7/2013", "3/7/2013"), class = "factor"),
PARAMETER = structure(c(2L,
1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("Chlorophyll-A",
"Phosphorus, Total (TP)"
), class = "factor"), RESULT = c(0.004, 0.2, 0.003, 0.39,
0.017, 0.64, 0.01, 0.005, 0.01, 0.007, 0.18, 0.002, 0.002,
0.31, 0.004, 0.4, 0.011, 0.007, 0.015, 0.008, 0.007)),
.Names = c("STATION", "Geo", "Wshed", "DATE", "PARAMETER", "RESULT"),
class = "data.frame", row.names = c(NA, -21L))
subset( dta, Geo == "East" )
########
Note that the "structure" call generated by dput is mostly insensitive to extra newlines, except inside quotes.
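As a small illustration of that workflow (a sketch on a made-up three-row data frame, not the real data): dput prints R code, and running that code rebuilds the object. The names small, code and rebuilt are chosen only for this example.

```r
# Made-up miniature of the data frame, for illustration only
small <- data.frame(
  STATION = c("BB36", "BB52", "CD02"),
  Geo = c("Bay", "Bay", "East"),
  RESULT = c(0.004, 0.003, 0.01),
  stringsAsFactors = FALSE
)

# dput() prints the code that recreates the object;
# that printed text is what gets pasted into the email
dput(small)

# Round trip: capture the printed code as text, evaluate it, and compare
code <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = code))
identical(small, rebuilt)
```

For a large data frame, dput(head(merged, 20)) or a subset that still triggers the problem keeps the pasted code short.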
So the above examples work for me. What doesn't work for you?
One thought: Are you editing your R code with a plain text editor or are you editing it with a word processor that might replace your plain quotes with curly quotes?
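If curly quotes are the suspect, one way to check is to scan the code text for them. A minimal sketch, assuming the code is available as a character string; the helper names has_curly_quotes and straighten_quotes are hypothetical and not from any package.

```r
# Curly single and double quotes, written as Unicode escapes
curly <- "[\u2018\u2019\u201C\u201D]"

# TRUE if the text contains any curly quote (hypothetical helper)
has_curly_quotes <- function(txt) grepl(curly, txt)

# Replace curly quotes with their plain ASCII equivalents (hypothetical helper)
straighten_quotes <- function(txt) {
  txt <- gsub("[\u2018\u2019]", "'", txt)
  gsub("[\u201C\u201D]", "\"", txt)
}

# A line of code as a word processor might have mangled it
bad <- "subset(merged, Wshed == \u201CAC\u201D)"
has_curly_quotes(bad)

# The repaired line contains only plain double quotes
good <- straighten_quotes(bad)
good
```

The same check can be run on a whole script with has_curly_quotes(readLines("myscript.R")), where the file name is a placeholder.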
[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
[2] http://adv-r.had.co.nz/Reproducibility.html
On Wed, 17 May 2017, Abdelrahman, Omar (RER) wrote:
> Thanks again
> RE: "so all the more reason to give us an example that we can run to trigger the same error." Are you asking for an example of the data? Below is a "small" example, but with so many levels of the different variables I am not sure it can be useful.
>
> STATION Geo Wshed DATE PARAMETER RESULT
> BB36 Bay C-100 1/10/2013 Phosphorus, Total (TP) 0.004
> BB36 Bay C-100 1/10/2013 Chlorophyll-A 0.2
> BB52 Bay C-100 1/10/2013 Phosphorus, Total (TP) 0.003
> BB52 Bay C-100 1/10/2013 Chlorophyll-A 0.39
> CD01A Mouth C-100 1/10/2013 Phosphorus, Total (TP) 0.017
> CD01A Mouth C-100 1/10/2013 Chlorophyll-A 0.64
> CD02 East C-100 1/10/2013 Phosphorus, Total (TP) 0.01
> CD05 Central C-100 1/10/2013 Phosphorus, Total (TP) 0.005
> CD06 Central C-100 1/10/2013 Phosphorus, Total (TP) 0.01
> CD09 Central C-100 1/10/2013 Phosphorus, Total (TP) 0.007
> BB36 Bay C-100 2/7/2013 Chlorophyll-A 0.18
> BB36 Bay C-100 2/7/2013 Phosphorus, Total (TP) 0.002
> BB52 Bay C-100 2/7/2013 Phosphorus, Total (TP) 0.002
> BB52 Bay C-100 2/7/2013 Chlorophyll-A 0.31
> CD01A Mouth C-100 2/7/2013 Phosphorus, Total (TP) 0.004
> CD01A Mouth C-100 2/7/2013 Chlorophyll-A 0.4
> CD02 East C-100 2/7/2013 Phosphorus, Total (TP) 0.011
> CD05 Central C-100 2/7/2013 Phosphorus, Total (TP) 0.007
> CD06 Central C-100 2/7/2013 Phosphorus, Total (TP) 0.015
> CD09 Central C-100 2/7/2013 Phosphorus, Total (TP) 0.008
> CD01A Mouth C-100 3/7/2013 Phosphorus, Total (TP) 0.007
>
> Hope this is not too much
>
> -----Original Message-----
> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
> Sent: Tuesday, May 16, 2017 12:30 PM
> To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>; R-help
> <r-help at r-project.org>
> Subject: RE: [R] violin plot help
>
> Please use reply-all or equivalent to keep the list in the conversation. I don't do private online consultation.
>
> Your example suggested you did not know the difference, but your error suggests a completely different expression triggered the error, so all the more reason to give us an example that we can run to trigger the same error.
>
> Items B and C are recommendations to read the help pages for those syntax elements. You should already have read enough of an introduction to R to have encountered the use of the question mark to bring up the help pages. If not, please do.
> --
> Sent from my phone. Please excuse my brevity.
>
> On May 16, 2017 9:00:09 AM PDT, "Abdelrahman, Omar (RER)" <Omar.Abdelrahman at miamidade.gov> wrote:
>> Thanks Jeff. I will send plain text from now on. I am not sure what B
>> or C mean; is there a guide that I can reference? I know the
>> difference between "=" and "==" , they work the same in Stata and SAS.
>>
>> Omar
>> -----Original Message-----
>> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
>> Sent: Tuesday, May 16, 2017 11:43 AM
>> To: r-help at r-project.org; Abdelrahman, Omar (RER)
>> <Omar.Abdelrahman at miamidade.gov>; 'r-help at r-project.org'
>> <r-help at r-project.org>
>> Subject: Re: [R] violin plot help
>>
>> Read
>> A) the Posting Guide (re plain text only... your emails may be
>> damaged by the mailing list if you send html-formatted email... only
>> you can solve this by figuring out how to use your email software)
>> B) Help on assignment (?`=`)
>> C) Help on logical tests (?`==`)
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On May 16, 2017 7:06:40 AM PDT, "Abdelrahman, Omar (RER)"
>> <Omar.Abdelrahman at miamidade.gov> wrote:
>>> I am trying to produce multiple violin plots by 3 categorical
>>> variables, each violin representing 1 year worth of data. The
>>> variables are:
>>>
>>> Watershed (7 levels: county canals)
>>>
>>> Geography (5 levels: west; central; east; mouth; bay)
>>>
>>> Parameter (8 levels: water quality chemical parameters)
>>>
>>> Year (25 levels: 1992-2017)
>>>
>>> I want to produce 1 plot for each Parameter-Watershed subdivided
>>> into Geography with a violin for each year. I used facets with the
>>> following code (not by year):
>>>
>>> ggplot () +
>>>
>>> facet_grid (PARAMETER ~Wshed, scales="free_y") +
>>>
>>> geom_violin (data=merged, aes(x=Geo, y=RESULT))
>>>
>>>
>>>
>>> I do not want facets, they crowd the information so it is unreadable.
>>> I just started with R this week and have not been able to figure out
>>> the foreach protocol, or any other loop protocol. I tried to subset the
>>> data to do it iteratively with the following code:
>>>
>>>
>>>
>>> subdf<-subset (merged, Wshed = "AC")
>>>
>>>
>>>
>>> but got an error: Error: unexpected input in "subdf=subset (merged,
>>> Wshed == ""
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Thanks,
>>>
>>> Omar Abdelrahman, Biologist II
>>> Miami-Dade County, Department of Regulatory and Economic Resources
>>> Division of Environmental Resources Management (DERM) Overtown
>>> Transit Village
>>> 701 NW 1st Court, 5th Floor
>>> Miami, FL 33136-3912
>>> (305) 372-6872
>>> abdelo at miamidade.gov<mailto:abdelo at miamidade.gov>
>>> www.miamidade.gov/environment<http://www.miamidade.gov/environment/>
>
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k