[R] Timeseries Data Plotted as Monthly Boxplots

Thomas Adams Thomas.Adams at noaa.gov
Thu Feb 17 04:55:07 CET 2011


Katrina,

What I have done, if I understand what you are after, is to create a list of values for each month of data, in chronological order, and then draw the boxplots in that same order by month/year. I do this for our ensemble streamflow forecasts. The key is to create the list of values by month.
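
A minimal sketch of that idea in base R, not the code I actually run. Everything in it is made up for illustration: zt.1.mo is a simulated monthly ts standing in for your series, and dates/vals are hypothetical daily values.

```r
# Hypothetical stand-in for the monthly series in the question:
# nine years (1948-1956) of simulated monthly values.
set.seed(1)
zt.1.mo <- ts(rnorm(108, mean = 25, sd = 15), start = c(1948, 1), frequency = 12)

# cycle() gives each value's position within the year (1 = Jan, ..., 12 = Dec),
# so split() returns a list with one numeric vector per calendar month.
by.month <- split(as.numeric(zt.1.mo), as.integer(cycle(zt.1.mo)))
names(by.month) <- month.abb

# boxplot() accepts a list and draws one box per element, in list order.
boxplot(by.month, col = "gray", xlab = "Month", ylab = "Value")

# Month/year variant for data with many values per month (assumed daily here):
# "%Y-%m" labels sort chronologically, so the list is already in time order.
dates <- seq(as.Date("1948-01-01"), as.Date("1949-12-31"), by = "day")
vals <- rnorm(length(dates))
by.ym <- split(vals, format(dates, "%Y-%m"))
boxplot(by.ym, las = 2, col = "gray", ylab = "Value")
```

The first plot pools all years into twelve calendar-month boxes, which is what the array in your message would support; the second keeps one box per month of each year.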

Regards,
Tom


On Feb 16, 2011, at 3:31 PM, Katrina Bennett <kebennett at alaska.edu> wrote:

> Hello, I'm trying to develop a box plot of time series data to look at the
> range in the data values over the entire period of record.
> 
> My data initially starts out as a list of hourly data, and then I've been
> using this code to make this data into the final ts array.
> 
> # Read in the station list
> stn.list <- read.csv("/home/kbennett/fews/stnlist3", as.is=T, header=F)
> 
> # Read in all three variables.
> vars <- c("MAT", "MAP", "MAP06")
> 
> for (stn in stn.list[[1]]) {   # assumes station IDs are in the first column
>  for (v in seq_along(vars)) {
>    # Read in year month start and end dates table & name it
>    ym.table <- read.csv(paste0("/home/kbennett/fews/", stn, vars[v], ".ym.txt"),
>                         as.is=T, header=F)
>    names(ym.table) <- c("yearstart", "monthstart", "yearend", "monthend")
> 
>    fn <- paste(stn, ".", vars[v], ".FIN", sep="")
>      if(file.exists(fn)) {
>        clim.dat <- read.csv(fn, header=F)
>        names(clim.dat) <- c("cdata")
>        year.start <- ym.table$yearstart
>        year.end <- ym.table$yearend
> 
>        mo.start <- ym.table$monthstart
>        mo.end <- ym.table$monthend
> 
>        regts.start = ISOdatetime(year.start, mo.start, 1, hour=0, min=0,
>                                  sec=0, tz="GMT")
>        regts.end = ISOdatetime(year.end, mo.end, 1, hour=18, min=0, sec=0,
>                                tz="GMT")
> 
>        zts <- zooreg(clim.dat$cdata, start = regts.start, end = regts.end,
>                      frequency = 4, deltat = 21600)
> 
>        #Create a daily average from the timeseries
>        zta <- aggregate(zts, as.POSIXct(cut(time(zts), "24 hours",
>                                             include=T)), mean)
> 
>        #Select hourly data from the timeseries based on a specific time
>        zt.hr <- aggregate(zts, as.Date, head, 4)
>        zt.hr.ym <- aggregate(zt.hr, as.yearmon, head, 4)
>        zt.hr.1 <- zt.hr.ym[,1]
>        zt.hr.2 <- zt.hr.ym[,2]
>        zt.hr.3 <- zt.hr.ym[,3]
>        zt.hr.4 <- zt.hr.ym[,4]
> 
>        zt.hr.1a <- aggregate(zt.hr.1, as.yearmon)
>        min.y <- min(zt.hr)
>        max.y <- max(zt.hr)
> 
>        frequency(zt.hr.1) <- 12
>        zt.1.mo <-  as.ts(zt.hr.1)
> 
>        #Monthly boxplots of daily averages, for the months
>        boxplot(zt.1.mo ~ month,                   ##THIS IS WHAT DOESN'T WORK HERE
>         boxwex=0.25, at=(1:12)-0.2,
>         outline = F,
>         col = "gray",
>         xlab = "Month",
>         ylab = expression(paste("( ",T^o,"C )") ),
>         ylim = c(min.y-5,max.y+5),
>         yaxs = "i",
>         xaxt = "n",
>         main = vars[v])
>        axis(1, at=c(1:12), labels=month.abb, cex.axis = 0.65)
>        legend("topright", c("Hour 00"), fill = c("gray"))
>    }
> 
> 
> 
>        #write the results to a csv file (cdat is not defined in the code shown)
>        write.csv(cdat, paste(stn, "_", vars[v], ".csv", sep=""), row.names=T)
> 
>    }
> }
> 
> 
> The final array looks like this:
> 
>         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
> 1948 28.719  4.977 39.037  9.746  8.348 36.672 47.660 54.076 38.062 34.486 11.938 39.666
> 1949 11.698 -6.675 16.844  0.950 10.349 38.752 39.785 40.544 57.603 35.476  2.308 -7.960
> 1950  0.340 45.206  6.385 17.132 19.074 38.465 48.711 54.686 48.743 33.978 23.090 10.007
> 1951 12.398 31.304 47.182  4.539 23.223 45.668 50.516 53.239 59.402 28.081 16.427 14.839
> 1952 -7.693 30.561 33.478 14.799 12.750 35.359 43.180 57.840 44.593 43.768  8.574 14.587
> 1953 -9.875 38.726 26.393 12.881 19.228 48.833 49.903 56.224 48.829 23.783 19.308 14.292
> 1954 35.943 16.706 16.021  7.806 23.593 40.418 45.310 53.113 49.203 29.480 17.228 33.068
> 1955 23.363 15.706 14.100 17.271 19.258 36.969 47.301 51.826 40.446 35.201 16.463 11.132
> 1956 45.868 -8.504 48.167 10.746 25.024 36.247 47.741 52.160 41.781 29.115 25.414 21.954
> 
> 
> 
> My main problem is that I can't access the columns (i.e. months) to subset the
> data by.
> 
> Could someone point out how I am able to get at the months in this array and
> subset them for plotting using the boxplot function?
> 
> 
> Thank you,
> 
> Katrina
> 
> -- 
> Katrina E. Bennett
> PhD Student
> University of Alaska Fairbanks
> International Arctic Research Center
> 930 Koyukuk Drive, PO Box 757340
> Fairbanks, Alaska 99775-7340
> 907-474-1939 office
> 907-385-7657 cell
> kebennett at alaska.edu
> 
> 
> Personal Address:
> UAF, PO Box 752525
> Fairbanks, Alaska 99775-2525
> bennett.katrina at gmail.com