[R] plot means ?

Sam Steingold sds at gnu.org
Mon Jul 11 23:16:43 CEST 2011


> * David Winsemius <qjvafrzvhf at pbzpnfg.arg> [2011-07-11 15:50:04 -0400]:
>
> On Jul 11, 2011, at 3:36 PM, Sam Steingold wrote:
>
>>> * David Winsemius <qjvafrzvhf at pbzpnfg.arg> [2011-07-11 15:32:26 -0400]:
>>>
>>> On Jul 11, 2011, at 3:18 PM, Sam Steingold wrote:
>>>
>>>> I need this plot:
>>>> given: x,y - numerical vectors of length N
>>>> plot xi vs mean(yj such that |xj - xi|<epsilon)
>>>> (running mean?)
>>>> alternatively, discretize X as if for histogram plotting and plot
>>>> mean y over the center of the histogram group.
>>>
>>> It sounds as though you are asking for smoothing splines with an
>>> adjustable bandwidth ... something that is very easy in R.
>>
>> Unlikely.  I do not need smoothing.  I have far too many points, far
>> too densely packed, for splines to make much sense.
>
> That in turn does not make sense (to me anyway.) Just narrow the
> bandwidth.

What do you mean?
This is not (really) a time series.
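
The running mean from the original question (mean of y_j over all j with |x_j - x_i| < epsilon) needs neither time-series structure nor a spline fit. A rough sketch follows; the name window_mean and its arguments are made up for illustration, and it assumes x and y are numeric vectors of equal length with no NAs and that findInterval supports left.open (R >= 3.3.0).

```r
# Hypothetical helper: for every i, the mean of y[j] over |x[j] - x[i]| < eps.
# Sorts once and uses cumulative sums, so it avoids the O(N^2) double loop.
window_mean <- function(x, y, eps) {
  o  <- order(x)
  xs <- x[o]
  ys <- y[o]
  cs <- c(0, cumsum(ys))                 # cs[k + 1] = sum of the first k ys
  # number of sorted points at or below the (excluded) lower edge x_i - eps
  lo <- findInterval(xs - eps, xs)
  # number of sorted points strictly below the (excluded) upper edge x_i + eps
  hi <- findInterval(xs + eps, xs, left.open = TRUE)
  m  <- (cs[hi + 1] - cs[lo + 1]) / (hi - lo)
  out <- numeric(length(x))
  out[o] <- m                            # back to the original order of x
  out
}
```

Something like plot(x, window_mean(x, y, eps)) would then give the plot asked for. Differences of cumulative sums can lose precision for very long vectors, so treat this as a starting point only.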

>> I just need the mean (with confidence bounds, if possible) over many
>> small intervals.
>
> Also sounds easy to implement. Post an example. At the moment it is not
> clear if your intervals are disjoint or overlapping.

the intervals should partition the domain, like the result of hist:

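# Plot the per-bin means m against x, with m - s in red and m + s in blue,
# keeping only the bins selected by the logical vector `good`.
plot3 <- function(x, m, s, good) {
  x <- x[good]
  m <- m[good]
  s <- s[good]
  top <- m + s   # renamed from t, which shadows base::t()
  bot <- m - s
  plot(x, m, ylim = c(min(bot), max(top)), xlab = "", ylab = "")
  points(x, bot, col = "red")
  points(x, top, col = "blue")
}

# Bin x as hist() does, then plot the mean, total and cumulative total of y
# per bin.  The default number of breaks is length(x)^(1/scale).
plot.mean <- function(x, y, scale = 2, breaks = exp(log(length(x))/scale),
                      name = "all") {
  xlab <- "time of day"
  dev.new()
  # right=FALSE makes the bins [a,b), matching the bin assignment below
  hi <- hist(x, breaks = breaks, right = FALSE,
             main = "histogram of time of day")
  br <- length(hi$mids)
  cat("breaks=", breaks, " br=", br, "\n")
  # bin index of each x; the last bin also includes its right endpoint
  bin <- findInterval(x, hi$breaks, rightmost.closed = TRUE)
  hi$means <- numeric(br)
  hi$total <- numeric(br)
  hi$stdevs <- numeric(br)
  for (i in seq_len(br)) {
    tmp <- y[bin == i]
    hi$means[i] <- mean(tmp)
    hi$total[i] <- sum(tmp)
    hi$stdevs[i] <- sd(tmp)
  }
  # count from the same bin assignment, so means and counts always agree
  hi$good <- tabulate(bin, nbins = br) >= 5
  str(hi)
  dev.new()
  plot3(hi$mids, hi$means, hi$stdevs, hi$good)
  # xlab must be named: an unnamed second argument to title() is `sub`
  title(main = paste(name, "mean pnl vs time of day"), xlab = xlab,
        ylab = "mean pnl")
  dev.new()
  plot(hi$mids, hi$total, main = paste(name, "total pnl vs time of day"),
       xlab = xlab, ylab = "total pnl")
  dev.new()
  plot(hi$mids, cumsum(hi$total), xlab = xlab, ylab = "cumulative pnl",
       main = paste(name, "cumulative pnl vs time of day"))
  hi
}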
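
For the confidence bounds mentioned above, one possible sketch is a per-bin t interval on the mean. The helper name bin_means and its min_count argument are invented here; it assumes breaks is a numeric vector of bin edges covering range(x) (for example hist(x, plot=FALSE)$breaks) and that y is roughly normal within each bin, which may not hold for pnl data.

```r
# Hypothetical helper: per-bin mean of y with an approximate 95% confidence
# interval, for bins that partition the range of x.
bin_means <- function(x, y, breaks, min_count = 5) {
  # Bins are [a, b), with the last bin closed on the right as well.
  bin <- cut(x, breaks, right = FALSE, include.lowest = TRUE)
  n  <- as.vector(table(bin))            # points per bin (0 for empty bins)
  m  <- as.vector(tapply(y, bin, mean))  # NA for empty bins
  s  <- as.vector(tapply(y, bin, sd))    # NA for bins with fewer than 2 points
  se <- s / sqrt(n)                      # standard error of the bin mean
  tq <- qt(0.975, df = pmax(n - 1, 1))   # two-sided 95% t quantile
  out <- data.frame(mid   = (head(breaks, -1) + tail(breaks, -1)) / 2,
                    n     = n,
                    mean  = m,
                    lower = m - tq * se,
                    upper = m + tq * se)
  # drop sparsely populated bins, as plot.mean does with its count cutoff
  out[out$n >= min_count, ]
}
```

The result can be drawn with plot(res$mid, res$mean, ylim = range(res$lower, res$upper)) followed by segments(res$mid, res$lower, res$mid, res$upper), where res is the returned data frame.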


-- 
Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031
http://jihadwatch.org http://camera.org http://ffii.org
http://memri.org http://pmw.org.il http://honestreporting.com http://dhimmi.com
This message is rot13 encrypted (twice!); reading it violates DMCA.