[R] using foumula to calculate a column in dataframe

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Tue Jul 29 00:42:32 CEST 2014


On Mon, 28 Jul 2014, Jeff Newmiller wrote:

> On Mon, 28 Jul 2014, Pavneet Arora wrote:
>
>> Hello All,
>> I need to calculate a column (Vupper) using a formula, but I am not sure
>> how to. It will be easier to explain with an example.
>> 
>> Again this is my dataset:
>> dput(nd)
>> structure(list(week = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
>> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
>> 29, 30), value = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04,
>> 11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62,
>> 10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62,
>> 11.31, 10.52), cusum = c(-0.550000000000001, -2.56, -3.27, -1.61,
>> 0.549999999999999, 0.729999999999999, -1.23, 0.229999999999999,
>> -0.570000000000002, -0.230000000000002, -1.2, 0.269999999999998,
>> 0.779999999999998, 0.179999999999998, 0.259999999999998,
>> -0.370000000000003,
>> 0.249999999999996, 0.559999999999997, 0.559999999999997, 3.56,
>> 4.46, 3.79, 6.08, 7.58, 8.18, 9.26, 9.64, 11.26, 12.57, 13.09
>> )), .Names = c("week", "value", "cusum"), row.names = c(NA, -30L
>> ), class = "data.frame")
>> 
>> I have some constants in my data. These are:
>> sigma =1, h = 5, k = 0.5
>> 
>> The formula requires me to start from the bottom row (30th in this case).
>> The formula for the last row will be row 30th Cusi value (13.09) + h(5) *
>> sigma(1) = giving me the value of 18.1
>> 
>> Then the formula for the 29th row for Vupper uses the value of 30th Vupper
>> (18.1) + k(0.5) * sigma(1) = giving me the value of 18.6
>> 
>> Similarly the formula for the 28th row for Vupper will use value of 29th
>> Vupper(18.6) + k(0.5) * sigma(1) = giving me the value of 19.1
>> 
>> And so on?.
>
> This is a recurrence formula... each value depends on the previous value in 
> the sequence. In general these can be computationally expensive in R, but 
> there are certain very common cases that have built-in functions with which 
> you can build many of the real-world cases you might encounter (such as this 
> one).
>
>> 
>> Also, is there any way to make the formula generalised using loop or
>> functions? Because I really don?t want to have to re-write the program if
>> my number of rows increase or decrease or if I use another dataset?
>> 
>> So far my function looks like following (Without the Vupper formula in
>> there):
>> vmask2 <- function(data,target,sigma,h,k){
>>  data$deviation <- data$value - target
>>  data$cusums <- cumsum(data$deviation)
>>  data$ma <- c(NA,abs(diff(data$value)))
>>  data$Vupper <- *not sure what to put here*
>>
>>  data
>> }
>
> I avoid using the variable name "data" because there is a base function of 
> that name.
>
> sigma <- 1
> h <- 5
> k <- 0.5
>
> dta$Vupper <- rev( cumsum( c( dta[ nrow(dta), "cusum" ] + h * sigma
>                            , rep( 0, nrow(dta) - 1 )
>                            )
>                            + seq( 0, by=k * sigma, length.out=30L )
>                         )
>                 )

Oops... accounted for accumulation twice, once with cumsum and once with 
seq.

dta$Vupper <- rev( rep( dta[ nrow(dta), "cusum" ] + h * sigma, nrow(dta) )
                  + k * sigma * seq( 0L, by=1L, length.out=30L )
                  )


> Note how the terms in your algorithm are re-grouped into vectors that c and 
> seq and rep can generate, and cumsum is used to implement the recurrence, and 
> the rev function is used to reverse the vector.
> If you are going to apply this to long sequences of data, you might want to 
> fix the accumulation of floating-point error in the seq call by using 
> integers:
>
> dta$Vupper <- rev( cumsum( c( dta[ nrow(dta), "cusum" ] + h * sigma
>                            , rep( 0, nrow(dta) - 1 )
>                            )
>                            + k * sigma * seq( 0L, by=1L, length.out=30L )
>                         )
>                 )
>
>> ***********************************************************************************************************************************************************************************************************************
>> MORE TH>N is a trading style of Royal & Sun Alliance Insurance plc (No. 
>> 93792). Registered in England and Wales at St. Mark???s Court, Chart Way, 
>> Horsham, West Sussex, RH12 1XL.
>> 
>> Authorised by the Prudential Regulation Authority and regulated by the 
>> Financial Conduct Authority and the Prudential Regulation Authority.
>> ************************************************************************************************************************************************************************************************************************
>>
>> 	[[alternative HTML version deleted]]
>
> Please send your emails in plain text, as the Posting Guide requests. HTML 
> often corrupts what you send to the list.
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k



More information about the R-help mailing list