[R-sig-teaching] Graph Two Series over Time

Michael Weylandt michael.weylandt at gmail.com
Wed Dec 30 03:34:32 CET 2015


https://cran.r-project.org/web/packages/tidyr/index.html
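
If tidyr is not showing up in your library, it most likely just needs to be installed from CRAN first. A minimal sketch, where ensure_package is a made-up helper name (not a function from any package) and a configured CRAN mirror is assumed:

```r
# Hypothetical helper (not part of any package): install a CRAN package
# only if it is not already available, then report whether it can be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs a network connection and a CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Usage, left commented out so nothing is installed by accident:
# ensure_package("tidyr")
# library(tidyr)
```

Once the install succeeds, the library(tidyr) / require(tidyr) calls in the code quoted below should work as written.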

On Tue, Dec 29, 2015 at 6:30 PM, Steven Stoline <sstoline at gmail.com> wrote:
> Dear Randall:
>
> I could not find a package (or function) called "tidyr". I installed all
> the other packages, but could not find tidyr.
>
> with many thanks
> steve
>
> On Tue, Dec 29, 2015 at 5:43 PM, Randall Pruim <rpruim at calvin.edu> wrote:
>
>> A few more suggestions and an update to my ggplot2 plot.
>>
>>   1) I recommend using SPACES in your code to make things more readable.
>>   2) Coding things with COLOR isn’t really very useful.  This is an
>> additional variable and should be coded as such.
>>   3) I don’t really know what detected means, but I’ve coded it as a
>> logical variable.  You could use a factor or character vector instead.
>>   4) You have used inconsistent date formatting which (without my edits)
>> will cause some years to be 0005 and others to be 2005.  (This will be
>> immediately clear when the plot spans 2000 years — that’s how I detected
>> the problem.)
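
The two-digit versus four-digit year issue in point 4 can be reproduced outside lubridate. The sketch below uses base R's as.Date() as a stand-in for dmy(), so it is not the code from the post; year_of is a made-up helper name, and an English locale is assumed for the month abbreviation:

```r
# Parse the same string with a two-digit-year format (%y)
# and with a four-digit-year format (%Y).
# %b matches month names in the current locale (English assumed here).
two_digit  <- as.Date("2Jan05", format = "%d%b%y")
four_digit <- as.Date("2Jan05", format = "%d%b%Y")

# Hypothetical helper: extract the calendar year as a number
year_of <- function(d) as.POSIXlt(d)$year + 1900

year_of(two_digit)   # "05" read as a two-digit year
year_of(four_digit)  # "05" read literally as the year
```

Mixing "2Jan2005" with "7April05" in one vector invites exactly this kind of mismatch, which is why the strings were made consistent before parsing.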
>>
>> Here’s what my first draft would look like:
>>
>>
>> ### Put data into a data frame -- avoid loose vectors
>> library(dplyr); library(lubridate); require(tidyr)
>> library(ggplot2)
>>
>> # recreate your data in a data frame
>> MyData <- data_frame(
>>   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25),
>>   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11),
>>   dateString = c("2Jan05","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06",
>>                  "2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07"),
>>   date = dmy(dateString)
>> )
>>
>> # put the data into "long" format
>> MyData2 <-
>>   MyData %>%
>>   gather(location, concentration, Well1, Well2) %>%
>>   mutate(detected = TRUE)
>>
>> # hand-code your colored values (should be double checked for accuracy)
>>
>> MyData2$detected[c(1, 2, 5, 15 + 1, 15 + 5, 15 + 10)] <- FALSE
>>
>> # Create plot using ggplot2
>>
>> ggplot( data = MyData2 %>% filter(!is.na(concentration)),
>>         aes(x = date, y = concentration, colour = location)) +
>>   geom_line(alpha = 0.8) +
>>   geom_point( aes(shape = detected, group = location), size = 3, alpha = 0.8) +
>>   scale_shape_manual(values = c(1, 16)) +
>>   theme_minimal()
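
The 15 + k indices in the hand-coding step work because gather() stacks all 15 Well1 rows ahead of the 15 Well2 rows. A small base R sketch of that arithmetic follows; the names nondetect, offsets and long_rows are made up for illustration, and the positions are the ones hand-coded above (they still need to be checked against the original data):

```r
n_dates <- 15  # rows per well before reshaping

# ASSUMPTION: per-well positions of the non-detected values,
# copied from the hand-coded indices in the post
nondetect <- list(Well1 = c(1, 2, 5), Well2 = c(1, 5, 10))

# In the long format, Well1 occupies rows 1..15 and Well2 rows 16..30
offsets <- c(Well1 = 0, Well2 = n_dates)

long_rows <- unlist(lapply(names(nondetect), function(w) {
  offsets[[w]] + nondetect[[w]]
}))

# Build the logical flag the same way the post does
detected <- rep(TRUE, 2 * n_dates)
detected[long_rows] <- FALSE
```

Keeping the per-well positions in one place makes it harder to get the offset wrong if a third well is added later.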
>>
>>
>>
>>
>> > On Dec 26, 2015, at 6:02 AM, Steven Stoline <sstoline at gmail.com> wrote:
>> >
>> > Dear Randall:
>> >
>> >
>> > Thank you very much for the details and for your support and patience.
>> >
>> >
>> >
>> > ### This is what the original data look like:
>> > ### ---------------------------------------------------
>> >
>> > Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25)
>> >
>> > Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11)
>> >
>> > date<-c("2Jan2005","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06","2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07")
>> >
>> >
>> >
>> > The data values in red font are non-detected, so I need to distinguish
>> > these non-detected values from the detected ones in the graph.
>> >
>> >
>> >
>> > For example, solid circles for the detected ones, and open circles for
>> > the non-detected ones (the ones in red font).
>> >
>> > So I was trying to use pch for this.
>> >
>> >
>> >
>> > Please notice that now Well1, Well2, and date all have the same length
>> > of 15, but Well1 has one NA and Well2 has two NAs.
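
For the base-graphics route with pch, one possible approach is to build the symbol vector from a detected/non-detected flag rather than typing the pch codes by hand. This is only a sketch: the red font did not survive in the list archive, so the non-detected positions below are an assumption taken from the indices hand-coded elsewhere in this thread, and the variable names are made up:

```r
Well1 <- c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25)
Well2 <- c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11)

# ASSUMPTION: positions of the non-detected (red font) values
nondetect1 <- c(1, 2, 5)
nondetect2 <- c(1, 5, 10)

detected1 <- !(seq_along(Well1) %in% nondetect1)
detected2 <- !(seq_along(Well2) %in% nondetect2)

# pch 16 = solid circle (detected), pch 1 = open circle (non-detected)
pch1 <- ifelse(detected1, 16, 1)
pch2 <- ifelse(detected2, 16, 1)

# Only draw when run interactively; lines break at the NA values
if (interactive()) {
  plot(Well1, type = "b", pch = pch1, col = "blue",
       ylim = range(0, Well1, Well2, na.rm = TRUE),
       xlab = "Time point", ylab = "Concentration")
  lines(Well2, type = "b", pch = pch2, col = "red", lty = 2)
}
```

The x axis here is just the observation index; putting real dates on it is a separate step.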
>> >
>> >
>> > Happy Holiday and Happy Christmas (if you are celebrating)
>> >
>> > with many thanks
>> > steve
>> >
>> > On Thu, Dec 24, 2015 at 9:31 AM, Randall Pruim <rpruim at calvin.edu> wrote:
>> > Steve,
>> >
>> > This is on the edge of what R-sig-teaching is for (since it isn’t really
>> > about teaching).  But since I think there are elements of what you are
>> > doing that lead students to think that R is terrible, I’ll show you how I
>> > might approach things.
>> >
>> > First a few comments about my solution.
>> >
>> > 1) I generally avoid loose vectors.  I prefer to use data frames to keep
>> > related vectors related.
>> >
>> > 2) I prefer to code dates as dates.  I would be very nervous about code
>> > that manually sets the axis labels differently from the data.  That can
>> > lead to all sorts of bad errors down the road if you change the data and
>> > forget to change the labels and often indicates you don’t have the data
>> > formatted the way you should.  (Note:  I added day of month values to your
>> > dates that had none.)  The lubridate package makes it easy to create dates
>> > from strings.
>> >
>> > 3) I rarely use base graphics, so I’ll show you solutions using lattice
>> > and ggplot2.  There may be nice ways to do this in base graphics as well.
>> >
>> > 4) I’m ignoring the color choices, title, etc.  All that can be easily
>> > added, but I’m focusing on getting the data display correct.  That’s
>> > generally the approach I take to plotting:  First get the data display
>> > correct, then fancy up titles, colors, fonts, etc.  It saves lots of
>> > time, because often once I see the plot, I realize it isn’t what I need,
>> > so there is no reason to gussy it up.
>> >
>> > 5) I prefer (and lattice and ggplot2 encourage) keeping the data
>> > manipulation in one location and the plotting after that rather than going
>> > back and forth between those two types of operations.  I find that it makes
>> > the code easier to read.
>> >
>> > 6) One of your series has fewer points than the other.  I made the
>> > assumption that the missing value was at the end.  That should be changed
>> > to whatever is correct for your data.
>> >
>> > 7) I don’t know what you were using pch to indicate, so I created a
>> > variable called “group” with values 0 and 15.  The variable and its values
>> > should ideally be renamed to reflect what they represent.  That will make
>> > your code easier to read and produce better labeling of the plot.
>> >
>> > And one note about your code.
>> >
>> >> 6*0:max_y
>> >
>> > probably doesn’t do what you expect.  In R, : binds more tightly than *,
>> > so this is already 6 * (0:max_y); but max_y is 0.25 here, so 0:max_y is
>> > just 0 and the only tick mark ends up at 0.  Maybe you were thinking
>> > something like seq(0, max_y, length.out = 6), but that will generally
>> > give pretty ugly breakpoints.  In any case, the plots below do a fine job
>> > of setting the axes by default, and each system allows you to tune them
>> > if you disagree with the default for a particular plot.
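
How R parses 6*0:max_y is easy to check in base R. The sketch below assumes max_y = 0.25 (the largest value in the posted data); the variable names are made up for illustration, and the seq(..., by = 0.05) call is just one possible way to get the tick marks asked for in the original post:

```r
max_y <- 0.25

# The colon operator is evaluated before multiplication,
# so this is 6 * (0:max_y), and 0:0.25 contains only 0.
ticks_as_written <- 6*0:max_y

# Forcing the other grouping gives the sequence 0:0.25 again, i.e. just 0.
ticks_other_grouping <- (6*0):max_y

# Evenly spaced tick positions from 0 to max_y in steps of 0.05
ticks_by_step <- seq(0, max_y, by = 0.05)
```

With the base-graphics code from the original post, something like axis(2, las = 1, at = ticks_by_step) would then place the y-axis ticks at those positions.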
>> >
>> >
>> > With that much preamble, the code is now shorter than the introduction.
>> >
>> >
>> > ### Put data into a data frame -- avoid loose vectors
>> > library(dplyr); library(lubridate)
>> >
>> > # if i knew what you were using pch for, i would name group and its values to match
>> > MyData <- data_frame(
>> >   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25),
>> >   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11,NA),
>> >   dateString = c("1Jan05","1April05","1Jul05","1Oct05","1Jan06","1March06","1Jun06","2Oct06","17Oct06","1Jan07","1April07","1Jul07","1Oct07","1Dec07"),
>> >   date = dmy(dateString),
>> >   group = factor(c(0,0,15,15,0,15,15,15,15,15,15,15,15,15))
>> > )
>> >
>> > ## using lattice
>> > ## lattice makes plotting two series easy
>> > ## but doesn't make it as easy to have different symbols along the same series
>> >
>> > library(lattice)
>> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"), auto.key = TRUE)
>> > ## better legend
>> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"),
>> >        auto.key = list(points = TRUE, lines = TRUE))
>> >
>> > ## using ggplot2
>> > ## for highly customized plots, i generally find ggplot2 works better
>> > ## i would reshape the data with tidyr before plotting (could be done in lattice as well)
>> >
>> > library(ggplot2); library(tidyr)
>> >
>> > MyData2 <-
>> >   MyData %>%
>> >   gather(location, concentration, Well1, Well2)
>> >
>> > ggplot( data = MyData2, aes(x = date, y = concentration, colour = location)) +
>> >   geom_line() +
>> >   geom_point( aes(shape = group), size = 2)
>> >
>> > xyplot(concentration ~ date, data = MyData2, groups = location, type = c("p", "l"),
>> >        auto.key = TRUE)
>> >
>> > ## without reshaping, you can also plot 4 layers manually, but the default labeling isn’t as nice
>> >
>> > ggplot(data = MyData) +
>> >   geom_line(aes(x = date, y = Well1, colour = "Well1")) +
>> >   geom_line(aes(x = date, y = Well2, colour = "Well2")) +
>> >   geom_point(aes(x = date, y = Well1, colour = "Well1", shape = group)) +
>> >   geom_point(aes(x = date, y = Well2, colour = "Well2", shape = group))
>> >
>> >
>> > Happy Holidays.  I hope one of these approaches will get you headed in
>> > the right direction.
>> >
>> > —rjp
>> >
>> >
>> >
>> >> On Dec 24, 2015, at 7:51 AM, Steven Stoline <sstoline at gmail.com> wrote:
>> >>
>> >> Dear All:
>> >>
>> >> I am trying to plot two series in one graph, but I am having difficulty
>> >> setting the y-axis limits. Also, the second series is not graphed correctly.
>> >>
>> >> *Here is what I tried to do:*
>> >>
>> >>
>> >> ### Define 2 vectors
>> >>
>> >> Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25)
>> >>
>> >> Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11)
>> >>
>> >> ### Calculate range from 0 to max value of Well1 and Well2
>> >> ### g_range <- range(0, Well1, Well2)
>> >>
>> >> max_y <- max(Well1, Well2)
>> >>
>> >> ### Graph Groundwater Concentrations using y axis that ranges from 0 to max
>> >> ### value in Well1 or Well2 vector.  Turn off axes and
>> >> ### annotations (axis labels) so we can specify them ourselves
>> >>
>> >> plot(Well1, type="o", pch=c(0,0,15,15,0,15,15,15,15,15,15,15,15,15),
>> >> col="blue", ylim=c(0,max_y), axes=FALSE, ann=FALSE, lwd=3, cex=1.25)
>> >>
>> >> ### Make x axis using Jan 2005 - Dec 2007 labels
>> >>
>> >> axis(1, at=1:14,
>> >> lab=c("Jan05","April05","Jul05","Oct05","Jan06","March06","Jun06","2Oct06","17Oct06","Jan07","April07","Jul07","Oct07","Dec07"))
>> >>
>> >>
>> >>
>> >> ### Make y axis with horizontal labels. Here is where I have the major
>> >> ### problem.
>> >>
>> >> ### I want the y-axis to look like: 0, 0.05, 0.10, 0.15, 0.20, 0.25
>> >>
>> >> axis(2, las=0, at=6*0:max_y)  ### max_y
>> >>
>> >>
>> >> ### Create box around plot
>> >>
>> >> box()
>> >>
>> >> ### Graph Well2 with red dashed line and square points
>> >>
>> >> ### lines(Well2, type="o", pch=22, lty=2, col="red", lwd=3, cex=1.0)
>> >>
>> >> lines(Well2, type="o", pch=c(0,15,15,15,0,15,15,15,0,15,15,15,15), lty=2,
>> >> col="red", lwd=3, cex=1.25)
>> >>
>> >> ### Create a title with a red, bold/italic font
>> >>
>> >> title(main="Trichloroethene mg/L from Wells 1 and 2 - 2005-2007",
>> >> col.main="red", font.main=2)
>> >>
>> >> ### Label the x and y axes with dark green text
>> >>
>> >> title(xlab="Time Points", col.lab=rgb(0,0.5,0))
>> >>
>> >>
>> >> title(ylab="Trichloroethene mg/L", col.lab=rgb(0,0.5,0))
>> >>
>> >> ### Create a legend
>> >>
>> >> ### g_range is commented out above, so anchor the legend at max_y instead
>> >> legend(1, max_y, c("Well1","Well2"), cex=1.0, col=c("blue","red"),
>> >> pch=15, lty=1:2)
>> >>
>> >>
>> >>
>> >>
>> >> with thanks
>> >> steve
>> >> -------------------------
>> >> Steven M. Stoline
>> >> 1123 Forest Avenue
>> >> Portland, ME 04112
>> >> sstoline at gmail.com
>> >>
>> >> [[alternative HTML version deleted]]
>> >>
>> >> _______________________________________________
>> >> R-sig-teaching at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching


