[R] ggplot2 applying a function based on facet

David Winsemius dwinsemius at comcast.net
Tue Oct 6 05:12:07 CEST 2009


On Oct 5, 2009, at 10:45 PM, stephen sefick wrote:

> Sorry, I want the cumsum of precipitation by gauge name.
> That can then be used with the appropriate datetime stamp to make a
> cumulative rainfall plot.
>

I dunno. Here is what I get when I extract the data-gathering part of  
that extended function.

 > str(both)
'data.frame':	3973 obs. of  8 variables:
  $ gauge        : int  2102908 2102908 2102908 2102908 2102908 2102908 2102908 2102908 2102908 2102908 ...
  $ agency       : Factor w/ 1 level "USGS": 1 1 1 1 1 1 1 1 1 1 ...
  $ date         : Factor w/ 8 levels "2009-09-28","2009-09-29",..: 1 1 1 1 1 1 1 1 1 1 ...
  $ time         : Factor w/ 96 levels "00:00","00:15",..: 1 2 3 4 5 6 7 8 9 10 ...
  $ gauge_height : num  0.89 0.89 0.89 0.89 0.89 0.89 0.89 0.89 0.89 0.88 ...
  $ discharge    : num  8.4 8.4 8.4 8.4 8.4 8.4 8.4 8.4 8.4 8.1 ...
  $ precipitation: num  0 0 0 0 0 0 0 0 0 0 ...
  $ gauge_name   : Factor w/ 6 levels "CANOOCHEE RIVER NEAR CLAXTON, GA",..: 3 3 3 3 3 3 3 3 3 3 ...
 > df$c_sum_precip <- ave(DF$precipitation, DF$guage_name, cumsum)
Error in as.vector(x, mode) : invalid 'mode' argument
 > describe(DF$precipitation)  # describe from Hmisc package
DF$precipitation
        n  missing   unique     Mean      .05      .10      .25      .50      .75      .90      .95
     2255     1718       15 0.001610        0        0        0        0        0        0        0

             0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.14 0.21 0.22 0.37 0.46
Frequency 2146   60   16   10    6    3    4    3    1    1    1    1    1    1    1
%           95    3    1    0    0    0    0    0    0    0    0    0    0    0    0
 >-------------
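
For what it's worth, the error above is consistent with three slips in that
one line: ave() is defined as ave(x, ..., FUN = mean), so a function passed
positionally is treated as one more grouping variable; the column is spelled
gauge_name, not guage_name; and the result is assigned to df rather than DF.
A minimal sketch of the corrected call, run on a small made-up data frame
(the values and the gauge labels "A" and "B" are invented for illustration,
not taken from the USGS feed):

```r
# Toy data standing in for the real DF; values are invented.
DF <- data.frame(
  gauge_name    = factor(c("A", "A", "B", "B", "A")),
  precipitation = c(0.01, 0.02, 0, 0.05, 0.03)
)

# FUN must be passed by name: anything unnamed after x is a grouping variable.
DF$c_sum_precip <- ave(DF$precipitation, DF$gauge_name, FUN = cumsum)

print(DF)
```

The running total restarts within each level of gauge_name, and the rows
stay in their original order.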

Seems that a cumsum on a variable with about 43% missing values (1718 of
3973) is going to present certain problems, since cumsum returns NA for
every element from the first NA onward. I'm also not sure how this approach
will carry across the time variable, which at the moment appears to be a
factor rather than any of the date or datetime classes.
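
One possible way around both problems, offered as a sketch rather than a
tested solution for the real data: build a proper datetime from the date and
time factors, sort within gauge, and count missing readings as zero rainfall
when accumulating. Treating NA as 0 is an assumption that may not suit every
gauge; the helper name cumsum_na0, the tz = "UTC" setting, and the toy values
are all made up for illustration.

```r
# Toy data shaped like 'both'; values are invented.
both <- data.frame(
  gauge_name    = factor(c("B", "A", "A", "B", "A", "B")),
  date          = factor(c("2009-09-28", "2009-09-28", "2009-09-28",
                           "2009-09-28", "2009-09-29", "2009-09-29")),
  time          = factor(c("00:15", "00:15", "00:00",
                           "00:00", "00:00", "00:00")),
  precipitation = c(NA, 0.02, 0.01, 0.05, NA, 0.03)
)

# paste() uses the factor labels, so this yields "2009-09-28 00:15" etc.
# tz = "UTC" is an assumption; the real gauges report local time.
both$date_time <- as.POSIXct(paste(both$date, both$time),
                             format = "%Y-%m-%d %H:%M", tz = "UTC")

# A cumulative sum only makes sense in time order within each gauge.
both <- both[order(both$gauge_name, both$date_time), ]

# Hypothetical helper: accumulate with NA counted as zero rainfall.
cumsum_na0 <- function(v) cumsum(ifelse(is.na(v), 0, v))

both$c_sum_precip <- ave(both$precipitation, both$gauge_name,
                         FUN = cumsum_na0)

print(both[, c("gauge_name", "date_time", "precipitation", "c_sum_precip")])
```

With ggplot2 loaded, something like
qplot(date_time, c_sum_precip, data = both, geom = "step") +
facet_wrap(~gauge_name) should then give one cumulative curve per gauge.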

-- 
David.


> On Mon, Oct 5, 2009 at 9:03 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>
>> On Oct 5, 2009, at 9:39 PM, stephen sefick wrote:
>>
>>> Look at the bottom of the message for my question
>>> #here is a little function that I wrote
>>> USGS <- function(input="discharge", days=7){
>>> library(chron)
>>> library(gsubfn)
>>> #021973269 is the Waynesboro Gauge on the Savannah River Proper (SRS)
>>> #02102908 is the Flat Creek Gauge (ftbrfcms)
>>> #02133500 is the Drowning Creek (ftbrbmcm)
>>> #02341800 is the Upatoi Creek Near Columbus (ftbn)
>>> #02342500 is the Uchee Creek Near Fort Mitchell (ftbn)
>>> #02203000 is the Canoochee River Near Claxton (ftst)
>>>
>>> a <- "http://waterdata.usgs.gov/nwis/uv?format=rdb&period="
>>> b <- "&site_no=021973269,02102908,02133500,02341800,02342500,02203000"
>>> z <- paste(a, days, b, sep="")
>>> L <- readLines(z)
>>
>> #trimmed long comment that broke function
>>
>>> L.USGS <- grep("^USGS", L, value = TRUE)
>>> DF <- read.table(textConnection(L.USGS), fill = TRUE)
>>> colnames(DF) <- c("agency", "gauge", "date", "time", "gauge_height",
>>> "discharge", "precipitation")
>>> pat <- "^# +USGS +([0-9]+) +(.*)"
>>> L.DD <- grep(pat, L, value = TRUE)
>>> library(gsubfn)
>>> DD <- strapply(L.DD, pat, c, simplify = rbind)
>>> DDdf <- data.frame(gauge = as.numeric(DD[,1]), gauge_name = DD[,2])
>>> both <- merge(DF, DDdf, by = "gauge", all.x = TRUE)
>>>
>>> dts <- as.character(both[,"date"])
>>> tms <- as.character(both[,"time"])
>>> date_time <- as.chron(paste(dts, tms), "%Y-%m-%d %H:%M")
>>> DF <- data.frame(date_time, both)
>>> library(ggplot2)
>>> #discharge
>>> if(input=="discharge"){
>>> qplot(as.POSIXct(date_time), discharge, data=DF,
>>> geom="line")+facet_wrap(~gauge_name,
>>> scales="free_y")+coord_trans(y="log10")
>>> }else{
>>> #precipitation
>>>
>>> qplot(as.POSIXct(date_time),
>>> precipitation, data=subset(DF, !is.na(precipitation)),
>>> geom="line")+facet_wrap(~gauge_name, scales="free_y")
>>> }
>>> }
>>>
>>> USGS("precip")
>>>
>>> I would like to have the cumsum based on the facet gauge_name - in
>>> other words a cumulative rainfall amount for each gauge_name
>>
>> You want "the cumsum" of <something> but you have wrapped so much up in
>> that function (including library calls???) that I cannot see what that
>> <something> would be. Not all of us read ggplot calls. The canonical
>> route to getting cumsums by a factor is with ave. Something like:
>>
>>  DF$cum_x <- ave(DF$x, DF$fac, FUN = cumsum)
>>
>> --
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>>
>
>
>
> -- 
> Stephen Sefick
>
> Let's not spend our time and resources thinking about things that are
> so little or so large that all they really do for us is puff us up and
> make us feel like gods.  We are mammals, and have not exhausted the
> annoying little problems of being mammals.
>
> 								-K. Mullis

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



