[R] Help with plotting and date-times for climate data

Rui Barradas
Wed Sep 13 00:06:43 CEST 2023


At 21:50 on 12/09/2023, Kevin Zembower via R-help wrote:
> Hello,
> 
> I'm trying to calculate the mean maximum temperature from a file of climate
> data, and plot it over a range of days in the year. I've downloaded the
> data, and cleaned it up the way I think it should be. However, when I
> plot it, the geom_smooth line doesn't show up. I think that's because
> my x axis is characters or factors. Here's what I have so far:
> ========================================
> library(tidyverse)
> 
> data <- read_csv("Ely_MN_Weather.csv")
> 
> start_day = yday(as_date("2023-09-22"))
> end_day = yday(as_date("2023-10-15"))
>                 
> d <- as_tibble(data) %>%
>      select(DATE,TMAX,TMIN) %>%
>      mutate(DATE = as_date(DATE),
>             yday = yday(DATE),
>             md = sprintf("%02d-%02d", month(DATE), mday(DATE))
>             ) %>%
>      filter(yday >= start_day & yday <= end_day) %>%
>      mutate(md = as.factor(md))
> 
> d_sum <- d %>%
>      group_by(md) %>%
>      summarize(tmax_mean = mean(TMAX, na.rm=TRUE))
> 
> ## Here's the filtered data:
> dput(d_sum)
> 
> structure(list(md = structure(1:25, levels = c("09-21", "09-22",
> "09-23", "09-24", "09-25", "09-26", "09-27", "09-28", "09-29",
> "09-30", "10-01", "10-02", "10-03", "10-04", "10-05", "10-06",
> "10-07", "10-08", "10-09", "10-10", "10-11", "10-12", "10-13",
> "10-14", "10-15"), class = "factor"), tmax_mean = c(65,
> 62.2222222222222,
> 61.3, 63.8888888888889, 64.3, 60.1111111111111, 62.3, 60.5, 61.9,
> 61.2, 63.6666666666667, 59.5, 59.5555555555556, 61.5555555555556,
> 59.4444444444444, 58.7777777777778, 55.8888888888889, 58.125,
> 58, 55.6666666666667, 57, 55.4444444444444, 49.7777777777778,
> 48.75, 43.6666666666667)), class = c("tbl_df", "tbl", "data.frame"
> ), row.names = c(NA, -25L))
> 
> ggplot(data = d_sum, aes(x = md)) +
>      geom_point(aes(y = tmax_mean, color = "blue")) +
>      geom_smooth(aes(y = tmax_mean, color = "blue"))
> =====================================
> My questions are:
> 1. Why isn't my geom_smooth plotting? How can I fix it?
> 2. I don't think I'm handling the month and day combination correctly.
> Is there a way to encode month and day (but not year) as a date?
> 3. (Minor point) Why does my graph of tmax_mean come out red when I
> specify "blue"?
> 
> Thanks for any advice or guidance you can offer. I really appreciate
> the expertise of this group.
> 
> -Kevin
> 
Hello,

The problem is that md is a factor, not a real date, so the x axis is 
discrete, and geom_smooth does not interpolate along a discrete axis.

Paste a fake year onto md, coerce the result to Date, and plot.
I have simplified the aes() calls and added a date scale in order to 
make the x axis more readable.
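
As a minimal base-R sketch of the fake-year step on its own (the vector md 
below is a made-up sample of the factor levels, and 2023 is an arbitrary 
choice of year):

```r
# Hypothetical sample of month-day labels, stored as a factor as in d_sum
md <- factor(c("09-29", "09-30", "10-01", "10-02"))

# Prepend an arbitrary year so each string becomes a full ISO date,
# then coerce to Date; as.character() makes sure the labels are used,
# not the underlying integer codes of the factor
dates <- as.Date(paste("2023", as.character(md), sep = "-"))

class(dates)   # "Date"
diff(dates)    # one-day steps, including across the month boundary
```

Any year should work as long as every month-day in the data exists in it; 
a range that could include 02-29 would need a leap year such as 2024.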

Without the formula and method arguments, geom_smooth prints a 
message, so they are now given explicitly.
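
For a small data set like this one, the message names method = 'loess' and 
formula = 'y ~ x' as the defaults that were picked. A rough base-R sketch of 
that kind of fit, calling stats::loess directly, is below. It assumes the 
tmax_mean values from the dput() output (rounded here to at most three 
decimals) and the fake year 2023; the names x, fit and smoothed are only for 
illustration.

```r
# tmax_mean values taken from the dput() output, rounded
tmax_mean <- c(65, 62.22, 61.3, 63.89, 64.3, 60.11, 62.3, 60.5, 61.9,
               61.2, 63.67, 59.5, 59.56, 61.56, 59.44, 58.78, 55.89,
               58.125, 58, 55.67, 57, 55.44, 49.78, 48.75, 43.67)

# One date per value, 09-21 to 10-15, with an arbitrary (fake) year
dates <- seq(as.Date("2023-09-21"), by = "day",
             length.out = length(tmax_mean))

# loess() needs a numeric predictor, so use days since the first date
x <- as.numeric(dates - dates[1])

# Local regression with the loess() defaults (span = 0.75, degree = 2)
fit <- loess(tmax_mean ~ x)

# Smoothed values at the observed days
smoothed <- predict(fit)
```

This is only meant to show what is being fitted; in the plot, geom_smooth 
does the fitting and also draws the confidence band.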



suppressPackageStartupMessages({
   library(dplyr)
   library(ggplot2)
})

d_sum %>%
   mutate(md = paste("2023", md, sep = "-"),
          md = as.Date(md)) %>%
   ggplot(aes(x = md, y = tmax_mean)) +
   geom_point(color = "blue") +
   geom_smooth(
     formula = y ~ x,
     method = loess,
     color = "blue"
   ) +
   scale_x_date(date_breaks = "7 days", date_labels = "%m-%d")
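
On question 3: the points came out red because color = "blue" was inside 
aes(), where it is treated as a data value to be mapped through the default 
colour scale, not as a literal colour. A small sketch of the difference (the 
data frame df is made up for illustration):

```r
library(ggplot2)

# Hypothetical toy data, only to compare the two ways of giving a colour
df <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 4))

# Inside aes(): "blue" is a constant variable mapped via the colour scale,
# so the points get the first default palette colour and a legend appears
p_mapped <- ggplot(df, aes(x, y)) +
  geom_point(aes(color = "blue"))

# Outside aes(): "blue" is a fixed setting and is used literally
p_fixed <- ggplot(df, aes(x, y)) +
  geom_point(color = "blue")

# Colours actually assigned to the points in each plot
unique(layer_data(p_mapped)$colour)
unique(layer_data(p_fixed)$colour)
```

That is why the code above sets color = "blue" outside aes() in both 
geom_point and geom_smooth.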



Hope this helps,

Rui Barradas


