[R] violin plot help

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Wed May 17 16:05:06 CEST 2017


Here is an example that works... a reproducible example always includes 
code AND enough sample data to exercise the code:

########
dta <- read.table( text=
"STATION        Geo     Wshed   DATE            PARAMETER                 RESULT
BB36            Bay     C-100   1/10/2013       'Phosphorus, Total (TP)'  0.004
BB36            Bay     C-100   1/10/2013       'Chlorophyll-A'           0.2
BB52            Bay     C-100   1/10/2013       'Phosphorus, Total (TP)'  0.003
BB52            Bay     C-100   1/10/2013       'Chlorophyll-A'           0.39
CD01A           Mouth   C-100   1/10/2013       'Phosphorus, Total (TP)'  0.017
CD01A           Mouth   C-100   1/10/2013       'Chlorophyll-A'           0.64
CD02            East    C-100   1/10/2013       'Phosphorus, Total (TP)'  0.01
CD05            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.005
CD06            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.01
CD09            Central C-100   1/10/2013       'Phosphorus, Total (TP)'  0.007
BB36            Bay     C-100   2/7/2013        'Chlorophyll-A'           0.18
BB36            Bay     C-100   2/7/2013        'Phosphorus, Total (TP)'  0.002
BB52            Bay     C-100   2/7/2013        'Phosphorus, Total (TP)'  0.002
BB52            Bay     C-100   2/7/2013        'Chlorophyll-A'           0.31
CD01A           Mouth   C-100   2/7/2013        'Phosphorus, Total (TP)'  0.004
CD01A           Mouth   C-100   2/7/2013        'Chlorophyll-A'           0.4
CD02            East    C-100   2/7/2013        'Phosphorus, Total (TP)'  0.011
CD05            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.007
CD06            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.015
CD09            Central C-100   2/7/2013        'Phosphorus, Total (TP)'  0.008
CD01A           Mouth   C-100   3/7/2013        'Phosphorus, Total (TP)'  0.007
", header=TRUE)
# prints result to console without assigning it to a new variable
subset( dta, Geo == "East" )
########

Note that [1] and [2] suggest using the dput function to generate R code 
that recreates the object exactly as you have it just before the 
troublesome line of code:

########
dta <- structure(list(STATION = structure(c(1L, 1L, 2L, 2L, 3L, 3L,
4L, 5L, 6L, 7L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 3L)
, .Label = c("BB36", "BB52", "CD01A", "CD02", "CD05", "CD06", "CD09")
      , class = "factor"),
     Geo = structure(c(1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L,
     1L, 1L, 1L, 1L, 4L, 4L, 3L, 2L, 2L, 2L, 4L), .Label = c("Bay",
     "Central", "East", "Mouth"), class = "factor"),
     Wshed = structure(c(1L,
     1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
     1L, 1L, 1L, 1L, 1L), .Label = "C-100", class = "factor"),
     DATE = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
     2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L), .Label = c("1/10/2013",
     "2/7/2013", "3/7/2013"), class = "factor"),
     PARAMETER = structure(c(2L,
     1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
     2L, 2L, 2L, 2L, 2L), .Label = c("Chlorophyll-A",
     "Phosphorus, Total (TP)"
     ), class = "factor"), RESULT = c(0.004, 0.2, 0.003, 0.39,
     0.017, 0.64, 0.01, 0.005, 0.01, 0.007, 0.18, 0.002, 0.002,
     0.31, 0.004, 0.4, 0.011, 0.007, 0.015, 0.008, 0.007)),
     .Names = c("STATION", "Geo", "Wshed", "DATE", "PARAMETER", "RESULT"),
     class = "data.frame", row.names = c(NA, -21L))
subset( dta, Geo == "East" )
########
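
For reference, here is a minimal sketch of how dput output like the above 
can be produced and checked. The data frame name "small" and its contents 
are made up for illustration; only dput, capture.output, parse and eval 
are real base R functions:

```r
# Small stand-in data frame; the name `small` and its values are hypothetical.
small <- data.frame(
  STATION = c("BB36", "CD02"),
  Geo     = c("Bay", "East"),
  RESULT  = c(0.004, 0.01),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; paste that text into the email.
dput(small)

# Round-trip check: capture the dput text, re-evaluate it, and compare.
code    <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = code))
all.equal(small, rebuilt)

# The rebuilt object can be subset exactly like the original.
subset(rebuilt, Geo == "East")
```

The exact text that dput prints can differ between R versions, but 
evaluating it should give back an equivalent object.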

Note that the structure() call generated by dput is mostly insensitive
to extra newlines, except inside quoted strings.
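
A small sketch of the point about newlines, using a made-up two-level 
factor rather than the real data (the variable names one_line, multi_line 
and broken are only for illustration):

```r
# The same structure() call written on one line and spread over several lines.
one_line <- 'structure(c(1L, 2L), .Label = c("Bay", "East"), class = "factor")'
multi_line <- 'structure(c(1L,
  2L),
  .Label = c("Bay",
  "East"),
  class = "factor")'

a <- eval(parse(text = one_line))
b <- eval(parse(text = multi_line))
identical(a, b)   # line breaks between tokens do not change the result

# A line break inside the quotes becomes part of the string itself.
broken <- 'structure(c(1L, 2L), .Label = c("Bay", "Ea
st"), class = "factor")'
levels(eval(parse(text = broken)))   # the second level now contains a newline
```

So if an email client re-wraps dput output in the middle of a quoted 
string, the recreated object will no longer match the original.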

So the above examples work for me. What doesn't work for you?

One thought: Are you editing your R code with a plain text editor or are 
you editing it with a word processor that might replace your plain quotes 
with curly quotes?
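
If you want to check for that, here is a rough sketch. The helper 
has_curly_quotes is hypothetical (not part of R or any package), and the 
"curly" string is built with Unicode escapes to imitate what a word 
processor might produce:

```r
# Hypothetical helper: TRUE if the text contains typographic (curly) quotes.
has_curly_quotes <- function(code) {
  curly <- c("\u201C", "\u201D", "\u2018", "\u2019")
  any(vapply(curly, function(q) grepl(q, code, fixed = TRUE), logical(1)))
}

plain <- 'subset( dta, Geo == "East" )'
curly <- "subset( dta, Geo == \u201CEast\u201D )"

has_curly_quotes(plain)   # FALSE
has_curly_quotes(curly)   # TRUE

# R cannot parse the curly-quoted version; capture the error instead of stopping.
res <- tryCatch(parse(text = curly), error = function(e) e)
inherits(res, "error")
if (inherits(res, "error")) conditionMessage(res)
```

Retyping the quotes in a plain text editor (or in the R console itself) 
is usually the simplest fix.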

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

[2] http://adv-r.had.co.nz/Reproducibility.html

On Wed, 17 May 2017, Abdelrahman, Omar (RER) wrote:

> Thanks again
> RE: "so all the more reason to give us an example that we can run to trigger the same error." Are you asking for an example of the data? Below is a "small" example, but with so many levels of the different variables I am not sure it can be useful.
>
> STATION	Geo	Wshed	DATE		PARAMETER		RESULT
> BB36		Bay	C-100	1/10/2013	Phosphorus, Total (TP)	0.004
> BB36		Bay	C-100	1/10/2013	Chlorophyll-A		0.2
> BB52		Bay	C-100	1/10/2013	Phosphorus, Total (TP)	0.003
> BB52		Bay	C-100	1/10/2013	Chlorophyll-A		0.39
> CD01A		Mouth	C-100	1/10/2013	Phosphorus, Total (TP)	0.017
> CD01A		Mouth	C-100	1/10/2013	Chlorophyll-A	0.64
> CD02		East	C-100	1/10/2013	Phosphorus, Total (TP)	0.01
> CD05		Central	C-100	1/10/2013	Phosphorus, Total (TP)	0.005
> CD06		Central	C-100	1/10/2013	Phosphorus, Total (TP)	0.01
> CD09		Central	C-100	1/10/2013	Phosphorus, Total (TP)	0.007
> BB36		Bay	C-100	2/7/2013	Chlorophyll-A		0.18
> BB36		Bay	C-100	2/7/2013	Phosphorus, Total (TP)	0.002
> BB52		Bay	C-100	2/7/2013	Phosphorus, Total (TP)	0.002
> BB52		Bay	C-100	2/7/2013	Chlorophyll-A		0.31
> CD01A		Mouth	C-100	2/7/2013	Phosphorus, Total (TP)	0.004
> CD01A		Mouth	C-100	2/7/2013	Chlorophyll-A		0.4
> CD02		East	C-100	2/7/2013	Phosphorus, Total (TP)	0.011
> CD05		Central	C-100	2/7/2013	Phosphorus, Total (TP)	0.007
> CD06		Central	C-100	2/7/2013	Phosphorus, Total (TP)	0.015
> CD09		Central	C-100	2/7/2013	Phosphorus, Total (TP)	0.008
> CD01A		Mouth	C-100	3/7/2013	Phosphorus, Total (TP)	0.007
>
> Hope this is not too much
>
> -----Original Message-----
> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
> Sent: Tuesday, May 16, 2017 12:30 PM
> To: Abdelrahman, Omar (RER) <Omar.Abdelrahman at miamidade.gov>; R-help <r-help at r-project.org>
> Subject: RE: [R] violin plot help
>
> Please use reply-all or equivalent to keep the list in the conversation. I don't do private online consultation.
>
> Your example suggested you did not know the difference, but your error suggests a completely different expression triggered the error, so all the more reason to give us an example that we can run to trigger the same error.
>
> Items B and C are recommendations to read the help pages for those syntax elements. You should already have read enough of an introduction to R to have encountered the use of the question mark to bring up the help pages. If not, please do.
> --
> Sent from my phone. Please excuse my brevity.
>
> On May 16, 2017 9:00:09 AM PDT, "Abdelrahman, Omar (RER)" <Omar.Abdelrahman at miamidade.gov> wrote:
>> Thanks Jeff. I will send plain text from now on. I am not sure what B
>> or C mean; is there a guide that I can reference? I know the difference
>> between "=" and "==" , they work the same in Stata and SAS.
>>
>> Omar
>> -----Original Message-----
>> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
>> Sent: Tuesday, May 16, 2017 11:43 AM
>> To: r-help at r-project.org; Abdelrahman, Omar (RER)
>> <Omar.Abdelrahman at miamidade.gov>; 'r-help at r-project.org'
>> <r-help at r-project.org>
>> Subject: Re: [R] violin plot help
>>
>> Read
>> A) the Posting Guide (re plain text only... your emails may be damaged
>> by the mailing list if you send html-formatted email... only you can
>> solve this by figuring out how to use your email software)
>> B) Help on assignment (?`=`)
>> C) Help on logical tests (?`==`)
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On May 16, 2017 7:06:40 AM PDT, "Abdelrahman, Omar (RER)"
>> <Omar.Abdelrahman at miamidade.gov> wrote:
>>> I am trying to produce multiple violin plots by 3 categorical
>>> variables, each violin representing 1 year worth of data. The
>>> variables are:
>>>
>>> Watershed (7 levels: county canals)
>>>
>>> Geography (5 levels: west; central; east; mouth; bay)
>>>
>>> Parameter (8 levels: water quality chemical parameters)
>>>
>>> Year (25 levels: 1992-2017)
>>>
>>> I want to produce 1 plot for each Parameter-Watershed subdivided into
>>> Geography with a violin for each year. I used facets with the
>>> following code (not by year):
>>>
>>> ggplot () +
>>>
>>> facet_grid (PARAMETER ~Wshed, scales="free_y") +
>>>
>>> geom_violin (data=merged, aes(x=Geo, y=RESULT))
>>>
>>>
>>>
>>> I do not want facets, they crowd the information so it is unreadable.
>>> I just started with R this week and have not been able to figure out
>>> the foreach protocol, or any other loop protocol. I tried to subset the
>>> data to do it iteratively with the following code:
>>>
>>>
>>>
>>> subdf<-subset (merged, Wshed = "AC")
>>>
>>>
>>>
>>> but got an error: Error: unexpected input in "subdf=subset (merged,
>>> Wshed == ""
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Thanks,
>>>
>>> Omar Abdelrahman, Biologist II
>>> Miami-Dade County, Department of Regulatory and Economic Resources
>>> Division of Environmental Resources Management (DERM)
>>> Overtown Transit Village
>>> 701 NW 1st Court, 5th Floor
>>> Miami, FL 33136-3912
>>> (305) 372-6872
>>> abdelo at miamidade.gov<mailto:abdelo at miamidade.gov>
>>> www.miamidade.gov/environment<http://www.miamidade.gov/environment/>
>>>
>>>
>>> 	[[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


