[R] using foumula to calculate a column in dataframe
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Tue Jul 29 00:42:32 CEST 2014
On Mon, 28 Jul 2014, Jeff Newmiller wrote:
> On Mon, 28 Jul 2014, Pavneet Arora wrote:
>
>> Hello All,
>> I need to calculate a column (Vupper) using a formula, but I am not sure
>> how to. It will be easier to explain with an example.
>>
>> Again this is my dataset:
>> dput(nd)
>> structure(list(week = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
>> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
>> 29, 30), value = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04,
>> 11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62,
>> 10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62,
>> 11.31, 10.52), cusum = c(-0.550000000000001, -2.56, -3.27, -1.61,
>> 0.549999999999999, 0.729999999999999, -1.23, 0.229999999999999,
>> -0.570000000000002, -0.230000000000002, -1.2, 0.269999999999998,
>> 0.779999999999998, 0.179999999999998, 0.259999999999998,
>> -0.370000000000003,
>> 0.249999999999996, 0.559999999999997, 0.559999999999997, 3.56,
>> 4.46, 3.79, 6.08, 7.58, 8.18, 9.26, 9.64, 11.26, 12.57, 13.09
>> )), .Names = c("week", "value", "cusum"), row.names = c(NA, -30L
>> ), class = "data.frame")
>>
>> I have some constants in my data. These are:
>> sigma =1, h = 5, k = 0.5
>>
>> The formula requires me to start from the bottom row (30th in this case).
>> The formula for the last row will be row 30th Cusi value (13.09) + h(5) *
>> sigma(1) = giving me the value of 18.1
>>
>> Then the formula for the 29th row for Vupper uses the value of 30th Vupper
>> (18.1) + k(0.5) * sigma(1) = giving me the value of 18.6
>>
>> Similarly the formula for the 28th row for Vupper will use value of 29th
>> Vupper(18.6) + k(0.5) * sigma(1) = giving me the value of 19.1
>>
>> And so on?.
>
> This is a recurrence formula... each value depends on the previous value in
> the sequence. In general these can be computationally expensive in R, but
> there are certain very common cases that have built-in functions with which
> you can build many of the real-world cases you might encounter (such as this
> one).
>
>>
>> Also, is there any way to make the formula generalised using loop or
>> functions? Because I really don?t want to have to re-write the program if
>> my number of rows increase or decrease or if I use another dataset?
>>
>> So far my function looks like following (Without the Vupper formula in
>> there):
>> vmask2 <- function(data,target,sigma,h,k){
>> data$deviation <- data$value - target
>> data$cusums <- cumsum(data$deviation)
>> data$ma <- c(NA,abs(diff(data$value)))
>> data$Vupper <- *not sure what to put here*
>>
>> data
>> }
>
> I avoid using the variable name "data" because there is a base function of
> that name.
>
> sigma <- 1
> h <- 5
> k <- 0.5
>
> dta$Vupper <- rev( cumsum( c( dta[ nrow(dta), "cusum" ] + h * sigma
> , rep( 0, nrow(dta) - 1 )
> )
> + seq( 0, by=k * sigma, length.out=30L )
> )
> )
Oops... accounted for accumulation twice, once with cumsum and once with
seq.
dta$Vupper <- rev( rep( dta[ nrow(dta), "cusum" ] + h * sigma, nrow(dta) )
+ k * sigma * seq( 0L, by=1L, length.out=30L )
)
> Note how the terms in your algorithm are re-grouped into vectors that c and
> seq and rep can generate, and cumsum is used to implement the recurrence, and
> the rev function is used to reverse the vector.
> If you are going to apply this to long sequences of data, you might want to
> fix the accumulation of floating-point error in the seq call by using
> integers:
>
> dta$Vupper <- rev( cumsum( c( dta[ nrow(dta), "cusum" ] + h * sigma
> , rep( 0, nrow(dta) - 1 )
> )
> + k * sigma * seq( 0L, by=1L, length.out=30L )
> )
> )
>
>> ***********************************************************************************************************************************************************************************************************************
>> MORE TH>N is a trading style of Royal & Sun Alliance Insurance plc (No.
>> 93792). Registered in England and Wales at St. Mark???s Court, Chart Way,
>> Horsham, West Sussex, RH12 1XL.
>>
>> Authorised by the Prudential Regulation Authority and regulated by the
>> Financial Conduct Authority and the Prudential Regulation Authority.
>> ************************************************************************************************************************************************************************************************************************
>>
>> [[alternative HTML version deleted]]
>
> Please send your emails in plain text, as the Posting Guide requests. HTML
> often corrupts what you send to the list.
>
> ---------------------------------------------------------------------------
> Jeff Newmiller The ..... ..... Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
More information about the R-help
mailing list