[R] Filling NA with cumprod?
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Fri May 25 18:18:15 CEST 2012
This calls for a trick I have seen before on this list. Once you
understand it, you will be able to apply it to many similar problems.
The key is the "ave" function, which applies a function to various groups
of values in a vector.
a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
at <- ifelse( is.na(a), f, a )
lt <- cumsum( !is.na( a ) )
cbind( lt, at ) # see the pattern of levels that will control ave
ave( at, lt, FUN=cumprod )
or in one statement
ave( ifelse( is.na(a), f, a ), cumsum( !is.na( a ) ), FUN=cumprod )
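
The same idea can be wrapped in a small function. What follows is only a sketch: the names fill_na_cumprod and lag_f are made up for illustration and do not come from this thread. With lag_f = FALSE it reproduces the ave() call above, so an NA at position i becomes the previous value times f[i]. With lag_f = TRUE it multiplies by f[i-1] instead, which appears to be what the o[...] listing in the original question asks for. It assumes a[1] is not NA, since a leading run of NAs has no starting value.

```r
# Hypothetical helper; name and arguments are illustrative, not from the thread.
fill_na_cumprod <- function(a, f, lag_f = FALSE) {
  # Assumption: same lengths, non-empty input, and a known first value.
  stopifnot(length(a) == length(f), length(a) > 0, !is.na(a[1]))
  # Optionally shift f one step right so position i sees f[i-1].
  if (lag_f) f <- c(NA, head(f, -1))
  at <- ifelse(is.na(a), f, a)   # factors where a is NA, known values elsewhere
  lt <- cumsum(!is.na(a))        # a new group starts at every non-NA value
  ave(at, lt, FUN = cumprod)     # cumulative product within each group
}

a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)

o_same <- fill_na_cumprod(a, f)                # uses f[i] at each NA
o_lag  <- fill_na_cumprod(a, f, lag_f = TRUE)  # uses f[i-1] at each NA
print(o_same)  # approximately 1 2 3 3.3 2.97 6 7 7.7 6.93 10
print(o_lag)   # approximately 1 2 3 2.7 2.97 6 7 6.3 6.93 10
```

The two variants differ only in which element of f is paired with each NA, so it is worth checking which one matches the intended Excel formula.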
When learning, the trickiest step is defining the vector of levels.
Usually a cumsum of booleans that mark transitions is involved. Sometimes
rev(test(rev(data))) can be useful.
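
To illustrate the rev() variant (this example is not from the thread, and the variable names are arbitrary): reversing before and after the cumsum makes each run of NAs share a level with the non-NA value that follows it, rather than the one that precedes it. A minimal sketch that fills each NA with the next observed value:

```r
a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)

# Forward levels: each NA joins the preceding non-NA value.
lt_fwd <- cumsum(!is.na(a))
# Backward levels: each NA joins the following non-NA value.
lt_bwd <- rev(cumsum(rev(!is.na(a))))

cbind(a, lt_fwd, lt_bwd)  # compare the two grouping patterns

# Within each backward group the non-NA value comes last;
# ave() recycles that single value across the whole group.
next_obs <- ave(a, lt_bwd, FUN = function(x) x[length(x)])
print(next_obs)  # 1 2 3 6 6 6 7 10 10 10
```

Trailing NAs, if any, would form a group with no known value and stay NA under this sketch.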
On Fri, 25 May 2012, David L Carlson wrote:
> This will loop only as many times as the largest number of consecutive NA's
> but uses vectorization within the loop. As currently defined, it will loop
> forever if the first value is NA.
>
> a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
> f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
>
> a1 <- a
> alag <- c(NA, a1[-length(a1)])
> # change NA to the value to use if the first value in a is NA
>
> while (sum(is.na(a1)) > 0) {
>     a1 <- ifelse(is.na(a1), f*alag, a1)
>     alag <- c(NA, a1[-length(a1)])
> }
>
> ----------------------------------------------
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
>> project.org] On Behalf Of Igor Reznikovsky
>> Sent: Friday, May 25, 2012 9:08 AM
>> To: Petr Savicky
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Filling NA with cumprod?
>>
>> Hello Petr,
>>
>> Yes, I was hoping to avoid using loops. If nothing else works, I will take
>> this approach as a last resort.
>>
>> Thank you,
>> Igor.
>> On May 25, 2012 2:26 AM, "Petr Savicky" <savicky at cs.cas.cz> wrote:
>>
>>> On Thu, May 24, 2012 at 08:24:38PM -0700, igorre25 wrote:
>>>> Hello,
>>>>
>>>> I need to build certain interpolation logic using R. Unfortunately, I just
>>>> started using R, and I'm not familiar with lots of advanced or just
>>>> convenient features of the language to make this simpler. So I struggled
>>>> for a few days and pretty much reduced the whole exercise to the following
>>>> problem, which I cannot resolve:
>>>>
>>>> Assume we have a vector of some values with NA:
>>>> a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
>>>>
>>>> and some coefficients as a vector of the same length:
>>>>
>>>> f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
>>>>
>>>> I need to come up with a function to get the following output:
>>>>
>>>> o[1] = a[1]
>>>> o[2] = a[2]
>>>> o[3] = a[3]
>>>> o[4] = o[3]*f[3] # Because a[4] is NA
>>>> o[5] = o[4]*f[4] # Because a[5] is NA; this looks like a recursive
>>>> calculation. If the rest of the elements were NA, I would use
>>>> a * c(rep(1, 3), cumprod(f[3:9])), but that's not the case
>>>> o[6] = a[6] # Not NA anymore
>>>> o[7] = a[7]
>>>> o[8] = o[7]*f[7] # Again a[8] is NA
>>>> o[9] = o[8]*f[8]
>>>> o[10] = a[10] # Not NA
>>>>
>>>> Even though my explanation may seem complex, in reality the requirement is
>>>> pretty simple and in Excel is achieved with a very short formula.
>>>>
>>>> The need to use R is to demonstrate capabilities of the language and then to
>>>> expand to more complex problems.
>>>
>>> Hello:
>>>
>>> How is the output defined, if a[1] is NA?
>>>
>>> I think you are not asking for a loop solution. However, in this case,
>>> it can be a reasonable option. For example
>>>
>>> a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
>>> f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
>>> n <- length(a)
>>> o <- rep(NA, times=n)
>>>
>>> prev <- 1
>>> for (i in 1:n) {
>>>     if (is.na(a[i])) {
>>>         o[i] <- f[i]*prev
>>>     } else {
>>>         o[i] <- a[i]
>>>     }
>>>     prev <- o[i]
>>> }
>>>
>>> A more straightforward translation of the Excel formulas is
>>>
>>> getCell <- function(i)
>>> {
>>>     if (i == 0) return(1)
>>>     if (is.na(a[i])) {
>>>         return(f[i]*getCell(i-1))
>>>     } else {
>>>         return(a[i])
>>>     }
>>> }
>>>
>>> x <- rep(NA, times=n)
>>> for (i in 1:n) {
>>>     x[i] <- getCell(i)
>>> }
>>>
>>> identical(o, x) # [1] TRUE
>>>
>>> Petr Savicky.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k