[R-sig-teaching] Graph Two Series over Time

Steven Stoline sstoline at gmail.com
Wed Dec 30 01:30:51 CET 2015


Dear Randall:

I could not find a package (or function) called "*tidyr*". I installed all
the other packages, but could not find tidyr.

with many thanks
steve
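
tidyr is an add-on package distributed on CRAN rather than a function, so it has to be installed once before library(tidyr) or require(tidyr) can find it. A minimal sketch of one way to check what is missing; the helper name missing_packages is made up for this illustration, and the install line is left commented out because it needs an internet connection:

```r
# Hypothetical helper (not part of any package): return the packages from
# `pkgs` that are not installed in the current R library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages loaded in the quoted code below
needed <- c("dplyr", "lubridate", "tidyr", "ggplot2")
to_install <- missing_packages(needed)
print(to_install)

# Install only what is missing (requires an internet connection):
# if (length(to_install) > 0) install.packages(to_install)
```

If tidyr shows up in the printed vector, install.packages("tidyr") should fetch it from CRAN.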

On Tue, Dec 29, 2015 at 5:43 PM, Randall Pruim <rpruim at calvin.edu> wrote:

> A few more suggestions and an update to my ggplot2 plot.
>
>   1) I recommend using SPACES in your code to make things more readable.
>   2) Coding things with COLOR isn’t really very useful.  This is an
> additional variable and should be coded as such.
>   3) I don’t really know what detected means, but I’ve coded it as a
> logical variable.  You could use a factor or character vector instead.
>   4) You have used inconsistent date formatting which (without my edits)
> will cause some years to be 0005 and others to be 2005.  (This will be
> immediately clear when the plot spans 2000 years — that’s how I detected
> the problem.)
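
The mixed year formats mentioned in point 4 can also be spotted before any parsing is done. A rough base R sketch, assuming only the first few date strings from the original post; the object names has_four_digit_year and consistent are introduced here for illustration:

```r
# First few date strings as they appear in the original post
date_strings <- c("2Jan2005", "7April05", "17July05", "24Oct05")

# TRUE where the string ends in four digits, i.e. uses a four-digit year
has_four_digit_year <- grepl("[0-9]{4}$", date_strings)
print(date_strings[has_four_digit_year])

# Rewrite a trailing 20xx as xx so that every entry uses a two-digit year
consistent <- sub("20([0-9]{2})$", "\\1", date_strings)
print(consistent)
```

Once every string follows the same pattern, a single parsing rule can be applied to the whole vector.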
>
> Here’s what my first draft would look like:
>
>
> ### Put data into a data frame -- avoid loose vectors
> library(dplyr); library(lubridate); require(tidyr)
> library(ggplot2)
>
> # recreate your data in a data frame
> MyData <- data_frame(
>   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25),
>   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11),
>   dateString = c("2Jan05","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06",
>                  "2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07"),
>   date = dmy(dateString)
> )
>
> # put the data into "long" format
> MyData2 <-
>   MyData %>%
>   gather(location, concentration, Well1, Well2) %>%
>   mutate(detected = TRUE)
>
> # hand-code your colored values (should be double checked for accuracy)
>
> MyData2$detected[c(1, 2, 5, 15 + 1, 15 + 5, 15 + 10)] <- FALSE
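
The index arithmetic in the line above relies on the long format listing all 15 Well1 rows first and then the 15 Well2 rows, so 15 + k points at row k of Well2. A base R sketch that mimics that layout (without tidyr) to show which observations end up flagged; the data frame long and the column row_in_well are stand-ins introduced only for this check:

```r
# Values copied from the thread
Well1 <- c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25)
Well2 <- c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11)

# Base R stand-in for the long format: Well1 rows first, then Well2 rows
long <- data.frame(
  location      = rep(c("Well1", "Well2"), each = length(Well1)),
  row_in_well   = rep(seq_along(Well1), times = 2),
  concentration = c(Well1, Well2),
  detected      = TRUE
)

# Same hand-coded indices as in the quoted code
long$detected[c(1, 2, 5, 15 + 1, 15 + 5, 15 + 10)] <- FALSE

# List the observations that were marked as non-detected
flagged <- long[!long$detected, c("location", "row_in_well", "concentration")]
print(flagged)
```

Comparing the printed rows against the original (colored) data is one way to do the double check suggested in the comment.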
>
> # Create plot using ggplot2
>
> ggplot( data = MyData2 %>% filter(!is.na(concentration)),
>         aes(x = date, y = concentration, colour = location)) +
>   geom_line(alpha = 0.8) +
>   geom_point( aes(shape = detected, group = location), size = 3,
>               alpha = 0.8) +
>   scale_shape_manual(values = c(1, 16)) +
>   theme_minimal()
>
>
>
>
> > On Dec 26, 2015, at 6:02 AM, Steven Stoline <sstoline at gmail.com> wrote:
> >
> > Dear Randall:
> >
> >
> > Thank you very much for the details and for your support and patience.
> >
> >
> >
> > ### This is what the original data look like:
> > ### ---------------------------------------------------
> >
> > Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25)
> >
> > Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11)
> >
> > date<-c("2Jan2005","7April05","17July05","24Oct05","7Jan06","30March06","28Jun06","2Oct06","17Oct06","15Jan07","10April07","9July07","5Oct07","29Oct07","30Dec07")
> >
> >
> > The data values in red font are non-detected, so I need to distinguish
> > these non-detected values from the detected ones in the graph.
> >
> >
> >
> > For example, solid circles for the detected ones, and open circles for
> > the non-detected ones (the ones in red font).
> >
> >
> > So, I was trying to use pch for that.
> >
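
For reference, the solid/open circle idea can be sketched in base graphics by building a pch vector from a logical "detected" flag. This is only an illustration under stated assumptions: the dates are ISO versions of the date strings (all years taken to be 2005 to 2007), and the non-detected positions (entries 1, 2, 5 of Well1 and 1, 5, 10 of Well2) follow the hand-coded indices elsewhere in this thread and should be checked against the real data.

```r
# Dates rewritten in ISO format (assumes every year is 2005-2007)
dates <- as.Date(c("2005-01-02", "2005-04-07", "2005-07-17", "2005-10-24",
                   "2006-01-07", "2006-03-30", "2006-06-28", "2006-10-02",
                   "2006-10-17", "2007-01-15", "2007-04-10", "2007-07-09",
                   "2007-10-05", "2007-10-29", "2007-12-30"))
Well1 <- c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,NA,0.20,0.25)
Well2 <- c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,NA,0.10,0.115,0.14,0.17,NA,0.11)

# ASSUMPTION: which entries are non-detected (to be verified against the data)
detected1 <- !(seq_along(Well1) %in% c(1, 2, 5))
detected2 <- !(seq_along(Well2) %in% c(1, 5, 10))

# pch 16 is a solid circle, pch 1 is an open circle
pch1 <- ifelse(detected1, 16, 1)
pch2 <- ifelse(detected2, 16, 1)

# Empty plot with a y axis running from 0 to the largest observed value
plot(dates, Well1, type = "n", ylim = range(0, Well1, Well2, na.rm = TRUE),
     xlab = "Date", ylab = "Trichloroethene mg/L")

# Draw each line through the non-missing values, then add the symbols
lines(dates[!is.na(Well1)], Well1[!is.na(Well1)], col = "blue")
points(dates, Well1, pch = pch1, col = "blue")
lines(dates[!is.na(Well2)], Well2[!is.na(Well2)], col = "red", lty = 2)
points(dates, Well2, pch = pch2, col = "red")

legend("topleft", legend = c("Well1", "Well2"),
       col = c("blue", "red"), lty = 1:2, bty = "n")
```

Keeping the detected flag as its own vector means the symbols follow the data if values are added or reordered, instead of relying on a hand-typed pch list.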
> >
> >
> > Please notice that Well1, Well2, and date now all have the same length
> > of 15, but Well1 has one NA and Well2 has two NAs.
> >
> >
> > Happy Holiday and Happy Christmas (if you are celebrating)
> >
> > with many thanks
> > steve
> >
> > On Thu, Dec 24, 2015 at 9:31 AM, Randall Pruim <rpruim at calvin.edu> wrote:
> > Steve,
> >
> > This is on the edge of what R-sig-teaching is for (since it isn’t really
> > about teaching).  But since I think there are elements of what you are
> > doing that lead students to think that R is terrible, I’ll show you how I
> > might approach things.
> >
> > First a few comments about my solution.
> >
> > 1) I generally avoid loose vectors.  I prefer to use data frames to keep
> > related vectors related.
> >
> > 2) I prefer to code dates as dates.  I would be very nervous about code
> > that manually sets the axis labels differently from the data.  That can
> > lead to all sorts of bad errors down the road if you change the data and
> > forget to change the labels and often indicates you don’t have the data
> > formatted the way you should.  (Note:  I added day of month values to your
> > dates that had none.)  The lubridate package makes it easy to create dates
> > from strings.
> >
> > 3) I rarely use base graphics, so I’ll show you solutions using lattice
> > and ggplot2.  There may be nice ways to do this in base graphics as well.
> >
> > 4) I’m ignoring the color choices, title, etc.  All that can be easily
> > added, but I’m focusing on getting the data display correct.  That’s
> > generally the approach I take to plotting:  First get the data display
> > correct, then fancy up titles, colors, fonts, etc.  It saves lots of
> > time, because often once I see the plot, I realize it isn’t what I need,
> > so there is no reason to gussy it up.
> >
> > 5) I prefer (and lattice and ggplot2 encourage) keeping the data
> > manipulation in one location and the plotting after that rather than going
> > back and forth between those two types of operations.  I find that it makes
> > the code easier to read.
> >
> > 6) One of your series has fewer points than the other.  I made the
> > assumption that the missing value was at the end.  That should be changed
> > to whatever is correct for your data.
> >
> > 7) I don’t know what you were using pch to indicate, so I created a
> > variable called “group” with values 0 and 15.  The variable and its values
> > should ideally be renamed to reflect what they represent.  That will make
> > your code easier to read and produce better labeling of the plot.
> >
> > And one note about your code.
> >
> >> 6*0:max_y
> >
> > probably doesn’t do what you expect.  Because : is evaluated before *,
> > this is already 6 * (0:max_y), and with max_y = 0.25 the sequence 0:max_y
> > is just 0, so the result is 0 and the 6 does nothing (6 * 0 = 0).  Maybe
> > you were thinking something like seq(0, max_y, length.out = 6), but that
> > will give pretty ugly breakpoints.  In any case, the plots below do a fine
> > job of setting the axes by default, and each system allows you to tune them
> > if you disagree with the default for a particular plot.
> >
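
A small base R check of how 6*0:max_y is evaluated, together with one possible way to get the tick positions 0, 0.05, ..., 0.25 asked for in the original post. The value of max_y is taken from the data; the name ticks is introduced here only for illustration:

```r
max_y <- 0.25

# `:` is evaluated before `*`, so both lines compute 6 * (0:max_y)
a <- 6 * 0:max_y
b <- 6 * (0:max_y)
print(identical(a, b))

# 0:max_y stops at 0 because the next step, 1, would overshoot 0.25
print(0:max_y)

# Evenly spaced ticks from 0 to max_y in steps of 0.05
ticks <- seq(0, max_y, by = 0.05)
print(ticks)

# These could then be passed to base graphics, e.g. axis(2, at = ticks, las = 1)
```

Using by = 0.05 fixes the spacing directly, whereas length.out fixes the number of ticks and lets the spacing follow from the data.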
> >
> > With that much preamble, the code is now shorter than the introduction.
> >
> >
> > ### Put data into a data frame -- avoid loose vectors
> > library(dplyr); library(lubridate)
> >
> > # if i knew what you were using pch for, i would name group and its
> > # values to match
> > MyData <- data_frame(
> >   Well1 = c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25),
> >   Well2 = c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11,NA),
> >   dateString = c("1Jan05","1April05","1Jul05","1Oct05","1Jan06","1March06","1Jun06",
> >                  "2Oct06","17Oct06","1Jan07","1April07","1Jul07","1Oct07","1Dec07"),
> >   date = dmy(dateString),
> >   group = factor(c(0,0,15,15,0,15,15,15,15,15,15,15,15,15))
> > )
> >
> > ## using lattice
> > ## lattice makes plotting two series easy
> > ## but doesn't make it as easy to have different symbols along the same
> > ## series
> >
> > library(lattice)
> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"),
> >        auto.key = TRUE)
> > ## better legend
> > xyplot(Well1 + Well2 ~ date, data = MyData, type = c("p","l"),
> >        auto.key = list(points = TRUE, lines = TRUE))
> >
> > ## using ggplot2
> > ## for highly customized plots, i generally find ggplot2 works better
> > ## i would reshape the data with tidyr before plotting (could be done in
> > ## lattice as well)
> >
> > library(ggplot2); library(tidyr)
> >
> > MyData2 <-
> >   MyData %>%
> >   gather(location, concentration, Well1, Well2)
> >
> > ggplot( data = MyData2,
> >         aes(x = date, y = concentration, colour = location)) +
> >   geom_line() +
> >   geom_point( aes(shape = group), size = 2)
> >
> > xyplot(concentration ~ date, data = MyData2, groups = location,
> >        type = c("p", "l"),
> >        auto.key = TRUE)
> >
> > ## without reshaping, you can plot 4 layers well manually, but the
> > ## default labeling isn’t as nice
> >
> > ggplot(data = MyData) +
> >   geom_line(aes(x = date, y = Well1, colour = "Well1")) +
> >   geom_line(aes(x = date, y = Well2, colour = "Well2")) +
> >   geom_point(aes(x = date, y = Well1, colour = "Well1", shape = group)) +
> >   geom_point(aes(x = date, y = Well2, colour = "Well2", shape = group))
> >
> >
> > Happy Holidays.  I hope one of these approaches will get you headed in
> > the right direction.
> >
> > —rjp
> >
> >
> >
> >> On Dec 24, 2015, at 7:51 AM, Steven Stoline <sstoline at gmail.com> wrote:
> >>
> >> Dear All:
> >>
> >> I am trying to plot two series in one graph, but I am having some
> >> difficulty setting the y-axis limits. Also, the second series is not
> >> graphed correctly.
> >>
> >> *Here is what I tried to do:*
> >>
> >>
> >> ### Define 2 vectors
> >>
> >> Well1<-c(0.005,0.005,0.004,0.006,0.004,0.009,0.017,0.045,0.05,0.07,0.12,0.10,0.20,0.25)
> >>
> >> Well2<-c(0.10,0.12,0.125,0.107,0.099,0.11,0.13,0.109,0.10,0.115,0.14,0.17,0.11)
> >>
> >> ### Calculate range from 0 to max value of Well1 and Well2
> >> ### g_range <- range(0, Well1, Well2)
> >>
> >> max_y <- max(Well1, Well2)
> >>
> >> ### Graph Groundwater Concentrations using y axis that ranges from 0 to max
> >> ### value in Well1 or Well2 vector.  Turn off axes and
> >> ### annotations (axis labels) so we can specify them ourselves
> >>
> >> plot(Well1, type="o", pch=c(0,0,15,15,0,15,15,15,15,15,15,15,15,15),
> >> col="blue", ylim=c(0,max_y), axes=FALSE, ann=FALSE, lwd=3, cex=1.25)  ### axes=FALSE,
> >>
> >> ### Make x axis using Jan 2005 - Dec 2008 labels
> >>
> >> axis(1, at=1:14,
> >> lab=c("Jan05","April05","Jul05","Oct05","Jan06","March06","Jun06","2Oct06","17Oct06","Jan07","April07","Jul07","Oct07","Dec07"))
> >>
> >>
> >>
> >> *### Make y axis with horizontal labels , Here what I have the major
> >> problem*
> >>
> >> ### I want the y-axis to look like: 0, 0.05, 0.10, 0.15, 0.20, 0.25
> >>
> >> axis(2, las=0, at=6*0:max_y)  ### max_y
> >>
> >>
> >> ### Create box around plot
> >>
> >> box()
> >>
> >> ### Graph Well2 with red dashed line and square points
> >>
> >> ### lines(Well2, type="o", pch=22, lty=2, col="red", lwd=3, cex=1.0)
> >>
> >> lines(Well2, type="o", pch=c(0,15,15,15,0,15,15,15,0,15,15,15,15), lty=2,
> >> col="red", lwd=3, cex=1.25)
> >>
> >> ### Create a title with a red, bold/italic font
> >>
> >> title(main="Trichloroethene mg/L from Wells 1 and 2 - 2005-2007",
> >> col.main="red", font.main=2)
> >>
> >> ### Label the x and y axes with dark green text
> >>
> >> title(xlab="Time Points", col.lab=rgb(0,0.5,0))
> >>
> >>
> >> title(ylab="Trichloroethene mg/L", col.lab=rgb(0,0.5,0))
> >>
> >> ### Create a legend
> >>
> >> legend(1, g_range[2], c("Well1","Well2"), cex=1.0, col=c("blue","red"),
> >> pch=15:15, lty=1:2);
> >>
> >>
> >>
> >>
> >> with thanks
> >> steve
> >> -------------------------
> >> Steven M. Stoline
> >> 1123 Forest Avenue
> >> Portland, ME 04112
> >> sstoline at gmail.com
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> _______________________________________________
> >> R-sig-teaching at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
> >
> >
> >
> >
>
>


