[R] Different result on using apply.

peter dalgaard pdalgd at gmail.com
Fri Jul 29 10:25:19 CEST 2011


(oops, forgot to cc. the list)

On Jul 29, 2011, at 08:09 , Ashim Kapoor wrote:

> Dear R-helpers,
> 
> In the following example I compute ret and returns the SAME way. In ret I
> compute returns for EACH column and in returns I do it for the whole
> data frame. Could someone please tell me why I see a lagged result, by which
> I mean ret and returns are different by one lag.
> 
> 
> getSymbols("GOOG",src="yahoo")
> ret<-apply(GOOG,2,function(x) diff(log(x)) / lag(x,1) )
> returns<-diff(log(GOOG))/lag(GOOG,1)
> tail(ret)
> tail(returns)

Well, first of all: This is an issue with contributed packages (at least quantmod, xts) and you should tell us so. Secondly, you also forgot to tell us about the warning about recycling, which is a pretty strong hint about what is going on. It is against your own interests to leave readers guessing like that.

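In essence, apply() hands each column to your function as a plain numeric _vector_, and a diff'ed vector is one element shorter than the lagged one (lag() on a plain vector only shifts the time attribute and leaves the values where they are). So the ratio will recycle the first diff into the last position (and create garbage), and the values come out shifted by one row relative to the xts result. diff.xts tries to be smarter and pads the differences with NA so that this doesn't happen. Notice that if you just remove the last element instead of lagging x, then you do indeed get the same results (save for a line of NA at the top):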

> tail(apply(GOOG,2,function(x)diff(log(x))/x[-1152]))
              GOOG.Open     GOOG.High      GOOG.Low    GOOG.Close
2011-07-21 -2.262875e-05  1.432963e-05 -3.784855e-06  3.252347e-05
2011-07-22  3.188905e-05  3.065345e-05  2.882942e-05  3.022824e-05
2011-07-25  2.160452e-05  1.532645e-05  2.373743e-05  1.961091e-06
2011-07-26  1.241901e-05  5.334479e-06  1.119182e-05  9.213213e-06
2011-07-27 -2.279176e-06 -1.672208e-05 -3.306823e-05 -3.997397e-05
2011-07-28 -3.178693e-05 -1.294157e-05 -4.791985e-06  1.005828e-05
            GOOG.Volume GOOG.Adjusted
2011-07-21  1.988491e-07  3.252347e-05
2011-07-22  4.835664e-09  3.022824e-05
2011-07-25 -3.379734e-08  1.961091e-06
2011-07-26 -9.265378e-08  9.213213e-06
2011-07-27  2.212510e-07 -3.997397e-05
2011-07-28 -5.989484e-08  1.005828e-05

> tail(diff(log(GOOG))/lag(GOOG,1))
              GOOG.Open     GOOG.High      GOOG.Low    GOOG.Close
2011-07-21 -2.262875e-05  1.432963e-05 -3.784855e-06  3.252347e-05
2011-07-22  3.188905e-05  3.065345e-05  2.882942e-05  3.022824e-05
2011-07-25  2.160452e-05  1.532645e-05  2.373743e-05  1.961091e-06
2011-07-26  1.241901e-05  5.334479e-06  1.119182e-05  9.213213e-06
2011-07-27 -2.279176e-06 -1.672208e-05 -3.306823e-05 -3.997397e-05
2011-07-28 -3.178693e-05 -1.294157e-05 -4.791985e-06  1.005828e-05
            GOOG.Volume GOOG.Adjusted
2011-07-21  1.988491e-07  3.252347e-05
2011-07-22  4.835664e-09  3.022824e-05
2011-07-25 -3.379734e-08  1.961091e-06
2011-07-26 -9.265378e-08  9.213213e-06
2011-07-27  2.212510e-07 -3.997397e-05
2011-07-28 -5.989484e-08  1.005828e-05
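
The recycling described above can be reproduced in base R without quantmod or xts. What follows is only a minimal sketch: the price vector is made up and stands in for a single column of GOOG, and the names x, d, bad, good and padded are purely illustrative.

```r
# Toy price series standing in for one column of GOOG (values are made up).
x <- c(100, 102, 101, 105, 110)
n <- length(x)

d <- diff(log(x))                 # plain-vector diff: n - 1 elements

# stats::lag() on a plain vector keeps all n values and only sets a shifted
# "tsp" attribute, so the division recycles d (and warns; suppressed here).
bad <- suppressWarnings(as.numeric(d / stats::lag(x, 1)))
length(bad)                       # n, not n - 1
all.equal(bad[n], d[1] / x[n])    # TRUE: the first diff lands in the last slot

# Dropping the last price instead lines numerator and denominator up.
good <- d / x[-n]

# Padding both by hand with a leading NA imitates the NA-padded layout
# described above for the xts result.
padded <- c(NA, d) / c(NA, x[-n])
all.equal(padded[-1], good)       # TRUE
```

Assuming the downloaded series has 1152 rows, x[-length(x)] inside the apply() call does the same job as the hard-coded x[-1152] and keeps working when more data arrive.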




-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg