[R-SIG-Finance] Test for new event and save data in new data.frame

R. Michael Weylandt michael.weylandt at gmail.com
Thu Oct 4 16:49:43 CEST 2012


On Thu, Oct 4, 2012 at 12:12 AM, Mark Knecht <markknecht at gmail.com> wrote:
> Hi again,
>    Again, I haven't touched R in a couple of years and am just getting
> going on a little idea. Thanks in advance.
>
>    If you all think this is better done somewhere else like
> StackOverflow let me know and I'll post there instead.
>
>    The following very simplified code is meant to represent reading
> through price data (PriceData) to extract what is happening with
> prices after a trade has started and as it progresses. The column MP
> (MarketPosition) is lagged. When MP switches from 0 to 1 a new long
> trade is entered. I'd like to collect the next 5 bars of data for that
> trade in a data.frame called TradeData. Each new trade gets a new
> column. In this example there are 4 trades occurring at bars 3, 8, 11
> & 15. The results for this data are a data.frame with 5 rows and 4
> columns. (The real one will be much larger...)
>
> QUESTIONS:
>
> 1) How do I do a logical test something like ((c3 = 1) AND (c4 = 0))
> for each row in PriceData to determine when a trade started?
>
> 2) If the test above is true, how do I copy c2[row:(row+4)] into an
> element and add it to TradeData?
>
> I'm guessing this might be a job for one of the apply functions but
> I'm not sure which one or how to do it. The couple of R books I've got
> aren't making it clear yet.
>
>    In the code below EVERYTHING below the comment block is only to
> show what I want to create.  It will all go away when the questions
> above turn into R code.
>
>    I hope this is reasonably clear. Let me know if it isn't.
>
> Thanks,
> Mark
>
>
>
>
> MyLag <- function(x, k) c(rep(NA, k), x[1:(length(x)-k)] )
>
> c1 = 1:20
> c2 = c(5,6,7,8,1,2,3,4,7,2,3,4,5,7,8,9,1,2,1,1)
> c3 = c(0,0,1,1,1,0,0,1,0,0,1,1,0,0,1,1,1,0,0,0)
> c4 = MyLag(c3, 1)
>
>
> PriceData = data.frame(cbind(c1,c2,c3,c4))
> colnames(PriceData) = c("BarNum","Price", "MP","LagMP")
>
> PriceData
>
> # This represents the data.frame I'd like to create -> (TradeData)
> #
> # When ((c3 = 1) AND (c4 = 0)) then copy c2[row:(row+4)] and
> # put it in a new column in the data frame
> # Eventually there is one column in the data.frame for
> # each trade
>
> t1 = c(7,8,1,2,3)
> t2 = c(4,7,2,3,4)
> t3 = c(3,4,5,7,8)
> t4 = c(8,9,1,2,1)
>
> TradeData = data.frame(cbind(t1,t2,t3,t4))

This is a terrible, awful, and altogether wretched idiom. Instead just
do data.frame(t1, t2, t3, t4). cbind() builds a matrix first, which
coerces all its inputs to a common class and so defeats the entire
point of a data.frame(), where each column keeps its own class.
[Don't take it personally: we've just been working to stamp it out
everywhere we see it on the main R-help list for a few months now]
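
To see why this matters, here is a small sketch with made-up data (the
price and ticker vectors are purely illustrative and not part of your
example). With all-numeric inputs like yours the damage is invisible,
but as soon as one column is non-numeric, the cbind() version loses
the numeric class of the others:

```r
# Hypothetical mixed-type data: a numeric price and a character ticker.
price  <- c(5, 6, 7)
ticker <- c("A", "B", "C")

# cbind() builds a character matrix first, so price is coerced.
bad  <- data.frame(cbind(price, ticker))

# Passing the vectors directly lets each column keep its own class.
good <- data.frame(price, ticker)

sapply(bad, class)   # price is no longer numeric
sapply(good, class)  # price stays numeric
```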

>
> TradeData
>

Best advice: don't use a data.frame() for this sort of data. Instead,
use a proper time series object, like xts or zoo. These also provide a
lag function for you:

library(xts)

PriceData <- xts(cbind(BarNum = 1:20,
                       Price  = c(5,6,7,8,1,2,3,4,7,2,3,4,5,7,8,9,1,2,1,1),
                       MP     = c(0,0,1,1,1,0,0,1,0,0,1,1,0,0,1,1,1,0,0,0)),
                 Sys.Date() + 1:20)

print(PriceData)

# Now we can add on the lag column:

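# Lag MP by one bar, then name the new column explicitly
PriceData <- cbind(PriceData, lag(PriceData[, "MP"], 1))
colnames(PriceData)[4] <- "LagMP"

# TRUE on the rows where a trade starts (MP goes from 0 to 1)
with(PriceData, (MP == 1) & (LagMP == 0))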
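
For your second question (copying the next five prices after each
entry into TradeData), something along these lines might work. This is
only a sketch in base R on plain vectors rather than xts; the names
price, mp, lag_mp, starts and n_bars are mine, and I've assumed you
want to drop any trade that starts too close to the end of the data to
have five full bars:

```r
price <- c(5,6,7,8,1,2,3,4,7,2,3,4,5,7,8,9,1,2,1,1)
mp    <- c(0,0,1,1,1,0,0,1,0,0,1,1,0,0,1,1,1,0,0,0)

# One-bar lag of mp; the first bar has no previous value.
lag_mp <- c(NA, head(mp, -1))

# Row numbers where a trade starts (mp goes from 0 to 1).
starts <- which(mp == 1 & lag_mp == 0)

# Assumption: keep only trades with a full window of n_bars prices.
n_bars <- 5
starts <- starts[starts + n_bars - 1 <= length(price)]

# One column per trade, each holding the n_bars prices from entry onward.
windows   <- vapply(starts,
                    function(i) price[i:(i + n_bars - 1)],
                    numeric(n_bars))
TradeData <- as.data.frame(windows)
names(TradeData) <- paste0("t", seq_along(starts))

TradeData
```

With your sample data this finds entries at bars 3, 8, 11 and 15 and
gives a 5 x 4 data.frame matching the t1 to t4 columns you wrote out by
hand.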

Cheers,
Michael
