[R] HELP ME: Fill NA Values from the previous Non-NA Values

Allan Tanaka allantanaka11 at yahoo.com
Thu Mar 16 03:16:19 CET 2017


 Hi. Thanks for the function. My bad, after looking at the csv file, it seems that NA values come not only from previous Non-NA values but also from the next Non-NA values. Example:
| NCQ05 | 11.395 |
| NCQ05 | 11.395 |
| NCQ05 |  |
| NCQ06 |  |
| NCQ06 | 13 |
| NCQ06 | 13 |


If i use the function, then the blank row would be filled with 11.395, instead of filling with 11.395 and 13.
Does it mean that the function can be modified like this?
locf2 <- function(x, initial=NA, IS_BAD = is.na) {
    # Replace 'bad' values in 'x' with last previous non-bad value.
    # If no previous non-bad value, replace with 'initial'.
    stopifnot(is.function(IS_BAD))
    good <- !IS_BAD(x)
    stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
    i <- cumsum(good)
    x <- x[c(1,which(good))][i+1]
    x <- x[c(1,which(good))][i+2]
    x[i==0] <- initial
    x
}    On Thursday, 16 March 2017, 1:17, William Dunlap <wdunlap at tibco.com> wrote:
 

 You could use the following function

locf2 <- function(x, initial=NA, IS_BAD = is.na) {
    # Replace 'bad' values in 'x' with last previous non-bad value.
    # If no previous non-bad value, replace with 'initial'.
    stopifnot(is.function(IS_BAD))
    good <- !IS_BAD(x)
    stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
    i <- cumsum(good)
    x <- x[c(1,which(good))][i+1]
    x[i==0] <- initial
    x
}

as in

> locf2(c("", "A", "B", "", "", "C", ""), IS_BAD=function(x)x=="", initial="---")
[1] "---" "A"  "B"  "B"  "B"  "C"  "C"
> locf2(factor(c(NA,"Small","Medium",NA,"Large",NA,NA,NA,"Small")))
[1] <NA>  Small  Medium Medium Large  Large  Large  Large  Small
Levels: Large Medium Small
> locf2(c(12, NA, 10, 11, NA, NA))
[1] 12 12 10 11 11 11

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Mar 15, 2017 at 4:08 AM, Allan Tanaka <allantanaka11 at yahoo.com> wrote:
> The following is an example:
>
> | Item_Identifier | Item_Weight |
> | FDP10 | 19 |
> | FDP10 |  |
> | DRI11 | 8.26 |
> | DRI11 |  |
> | FDW12 | 8.315 |
> | FDW12 |  |
>
>
> The following is the one that i want to be. That is, filling NA values from the previous Non-NA values.
> | Item_Identifier | Item_Weight |
> | FDP10 | 19 |
> | FDP10 | 19 |
> | DRI11 | 8.26 |
> | DRI11 | 8.26 |
> | FDW12 | 8.315 |
> | FDW12 | 8.315 |
>
>
> My current code data frame: train <- read.csv("Train.csv", header=T,sep = ",",na.strings = c(""," ",NA))
>
>
> Some people suggest to use na.locf function but in my case, i don't have numeric unique values in my Item_Identifier coloumn but rather it's characters. Not sure what to solve this problem.
>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


   
	[[alternative HTML version deleted]]



More information about the R-help mailing list