[R] Replace missing value within group with non-missing value

Rui Barradas ruipbarradas at sapo.pt
Sat Apr 6 20:27:07 CEST 2013


Hello,

With the attached file, I could reproduce the error but I think the 
added line does the trick.
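
A minimal sketch of what seems to be going on, using made-up data rather than the attached file. It assumes the full dataset has (dn, obs) combinations that never occur together, so split() returns zero-row groups for them. The names n_rows, idx_empty and sp1 are only illustrative.

```r
# Two doctors with different observation numbers, so the (dn, obs)
# cross-product contains combinations that never occur in the data.
dat <- data.frame(
  dn  = c(4, 4, 4, 5, 5, 5),
  obs = c(1, 1, 1, 2, 2, 2),
  mth = c(NA, 487, NA, NA, NA, 488)
)

# split() on a list of grouping variables builds every level combination,
# so the unused ones (4.2 and 5.1 here) come back as zero-row data frames.
sp <- split(dat, list(dat$dn, dat$obs))
n_rows <- sapply(sp, nrow)

# For a zero-row group, which(...)[1] is NA with length 1, so a
# length(idx) > 0 test does not skip it and the assignment to x$mth fails.
idx_empty <- which(!is.na(sp[["5.1"]]$mth))[1]

# drop = TRUE discards the unused combinations before the lapply().
sp1 <- split(dat, list(dat$dn, dat$obs), drop = TRUE)
names(sp1) <- NULL
tmp <- lapply(sp1, function(x) {
  idx <- which(!is.na(x$mth))[1]   # first non-missing month in the group
  x$mth <- x$mth[idx]              # recycled over all rows of the group
  x
})
res <- do.call(rbind, tmp)
row.names(res) <- NULL
res
```

Filtering with sp[lapply(sp, nrow) != 0], as in the quoted code below, should have the same effect after the split.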

Rui Barradas

Em 06-04-2013 19:20, arun escreveu:
> Hi,
>
> dat<- read.csv("test1.csv",sep=",",stringsAsFactors=FALSE)
> sp <- split(dat, list(dat$dn, dat$obs))
>
> sp1<-sp[lapply(sp,nrow)!=0] #added here
> names(sp1) <- NULL
>
>
> tmp<- lapply(sp1,function(x){
>   idx<- which(!is.na(x$mth))[1]
>   x$mth<- x$mth[idx]
> x
>
>   }
>   )
>
>
> res<- do.call(rbind,tmp)
>   row.names(res)<-1:nrow(res)
>   dim(res)
> #[1] 1200    6
>   dim(dat)
> #[1] 1200    6
> head(res)
> #  X dn obs choice br mth
> #1 1  4   1      0  1 487
> #2 2  4   1      0  2 487
> #3 3  4   1      0  3 487
> #4 4  4   1      0  4 487
> #5 5  4   1      0  5 487
> #6 6  4   1      1  6 487
>
> A.K.
>
> ----- Original Message -----
> From: "Leask, Graham" <g.leask at aston.ac.uk>
> To: Rui Barradas <ruipbarradas at sapo.pt>
> Cc: arun <smartpink111 at yahoo.com>; "r-help at r-project.org" <r-help at r-project.org>
> Sent: Saturday, April 6, 2013 1:44 PM
> Subject: RE: [R] Replace missing value within group with non-missing value
>
> Hi Rui,
>
> I have just pasted this direct and rerun. I still get the same error.
> I am running this on the full length dataset.
>
> dim  [1] 255030      5
>
> I attach the first few rows of the file.
>
> Error pasted below.
> I am using version 2.15.2 on a windows 7 machine.
>
>> sp <- split(dat, list(dat$dn, dat$obs))
>>    names(sp) <- NULL
>> tmp <- lapply(sp, function(x){
> + idx <- which(!is.na(x$mth))[1]
> + if(length(idx) > 0)
> + x$mth <- x$mth[idx]
> + x
> + })
> Error in `$<-.data.frame`(`*tmp*`, "mth", value = NA_real_) :
>    replacement has 1 rows, data has 0
>> do.call(rbind,tmp)
> Error in do.call(rbind, tmp) : object 'tmp' not found
>
> Any thoughts on what could be causing this anomaly?
>
>
> -----Original Message-----
> From: Rui Barradas [mailto:ruipbarradas at sapo.pt]
> Sent: 06 April 2013 18:24
> To: Leask, Graham
> Cc: arun; r-help at r-project.org
> Subject: Re: [R] Replace missing value within group with non-missing value
>
> Hello,
>
> I've just run my code with your data and found no error. Anyway, try replacing the lapply instruction with this.
>
>
> tmp <- lapply(sp, function(x){
>          idx <- which(!is.na(x$mth))[1]
>          if(length(idx) > 0)
>              x$mth <- x$mth[idx]
>          x
>      })
>
>
> Rui Barradas
>
> Em 06-04-2013 18:12, Leask, Graham escreveu:
>> Hi Arun,
>>
>> How odd. Directly pasting the code from your email precisely repeats the error.
>> See below. Any thoughts on the cause of this anomaly?
>>
>>> dput(head(dat,50))
>> structure(list(dn = c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
>> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
>> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), obs = c(1, 1,
>> 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4,
>> 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8,
>> 8, 8, 8, 8, 9, 9), choice = c(0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
>> 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
>> 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0), br = c(1,
>> 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4,
>> 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1,
>> 2, 3, 4, 5, 6, 1, 2), mth = c(NA, NA, NA, NA, NA, 487, NA, NA,
>> 488, NA, NA, NA, NA, NA, NA, NA, NA, 488, NA, NA, 489, NA, NA,
>> NA, NA, NA, NA, NA, NA, 489, NA, NA, NA, NA, NA, 489, NA, NA,
>> NA, NA, NA, 490, NA, NA, NA, NA, NA, 491, NA, NA)), .Names = c("dn",
>> "obs", "choice", "br", "mth"), row.names = c("1", "2", "3", "4",
>> "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
>> "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26",
>> "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37",
>> "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48",
>> "49", "50"), class = "data.frame")
>>> sp <- split(dat, list(dat$dn, dat$obs))
>>>     names(sp) <- NULL
>>>     tmp <- lapply(sp, function(x){
>> +          idx <- which(!is.na(x$mth))[1]
>> +          x$mth <- x$mth[idx]
>> +          x
>> +      })
>> Error in `$<-.data.frame`(`*tmp*`, "mth", value = NA_real_) :
>>      replacement has 1 rows, data has 0
>>>     head(do.call(rbind, tmp),7)
>> Error in do.call(rbind, tmp) : object 'tmp' not found
>>
>> Best wishes
>>
>>
>> Graham
>>
>> -----Original Message-----
>> From: arun [mailto:smartpink111 at yahoo.com]
>> Sent: 06 April 2013 17:25
>> To: Leask, Graham
>> Cc: Rui Barradas
>> Subject: Re: [R] Replace missing value within group with non-missing value
>>
>> Hello,
>> By running Rui's code, I am getting this:
>> sp <- split(dat, list(dat$dn, dat$obs))
>>     names(sp) <- NULL
>>     tmp <- lapply(sp, function(x){
>>             idx <- which(!is.na(x$mth))[1]
>>             x$mth <- x$mth[idx]
>>             x
>>         })
>>     head(do.call(rbind, tmp),7)
>>       dn obs choice br mth
>> 1   4   1      0  1 487
>> 2   4   1      0  2 487
>> 3   4   1      0  3 487
>> 4   4   1      0  4 487
>> 5   4   1      0  5 487
>> 6   4   1      1  6 487
>> 7   4   2      0  1 488
>>
>> Couldn't reproduce the error you cited.
>> A.K.
>>
>>
>>
>>
>> ----- Original Message -----
>> From: "Leask, Graham" <g.leask at aston.ac.uk>
>> To: Rui Barradas <ruipbarradas at sapo.pt>
>> Cc: "r-help at r-project.org" <r-help at r-project.org>
>> Sent: Saturday, April 6, 2013 12:16 PM
>> Subject: Re: [R] Replace missing value within group with non-missing value
>>
>> Hi Rui,
>>
>> Data as follows
>>
>> structure(list(dn = c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
>> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
>> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), obs = c(1, 1,
>> 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4,
>> 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8,
>> 8, 8, 8, 8, 9, 9), choice = c(0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
>> 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
>> 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0), br = c(1,
>> 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4,
>> 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1,
>> 2, 3, 4, 5, 6, 1, 2), mth = c(NA, NA, NA, NA, NA, 487, NA, NA,
>> 488, NA, NA, NA, NA, NA, NA, NA, NA, 488, NA, NA, 489, NA, NA,
>> NA, NA, NA, NA, NA, NA, 489, NA, NA, NA, NA, NA, 489, NA, NA,
>> NA, NA, NA, 490, NA, NA, NA, NA, NA, 491, NA, NA)), .Names = c("dn",
>> "obs", "choice", "br", "mth"), row.names = c("1", "2", "3", "4",
>> "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
>> "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26",
>> "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37",
>> "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48",
>> "49", "50"), class = "data.frame")
>>
>> Best wishes
>>
>>
>> Graham
>>
>> -----Original Message-----
>> From: Rui Barradas [mailto:ruipbarradas at sapo.pt]
>> Sent: 06 April 2013 16:32
>> To: Leask, Graham
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Replace missing value within group with non-missing value
>>
>> Hello,
>>
>> Can't you post a data example? If your dataset is named 'dat' use
>>
>> dput(head(dat, 50))  # paste the output of this in a post
>>
>>
>> Rui Barradas
>>
>> Em 06-04-2013 15:34, Leask, Graham escreveu:
>>> Hi Rui,
>>>
>>> Thank you for your suggestion which is very much appreciated. Unfortunately running this code produces the following error.
>>>
>>> error in '$<-.data.frame' ('*tmp*', "mth", value = NA_real_) :
>>>          replacement has 1 rows, data has 0
>>>
>>> I'm sure there must be an elegant solution to this problem?
>>>
>>> Best wishes
>>>
>>>
>>>
>>> Graham
>>>
>>> On 6 Apr 2013, at 12:15, "Rui Barradas" <ruipbarradas at sapo.pt> wrote:
>>>
>>>> Hello,
>>>>
>>>> That's not a very good way of posting your data; preferably, paste the output of ?dput in a post.
>>>> Something along the lines of the following might do what you want.
>>>> It seems that the groups are established by 'dn' and 'obs' numbers.
>>>> If so, try
>>>>
>>>>
>>>> # Make up some data
>>>> dat <- data.frame(dn = 4, obs = rep(1:5, each = 6), mth = NA)
>>>> dat$mth[6] <- 487
>>>> dat$mth[9] <- 488
>>>> dat$mth[18] <- 488
>>>> dat$mth[21] <- 489
>>>> dat$mth[30] <- 489
>>>>
>>>>
>>>> sp <- split(dat, list(dat$dn, dat$obs))
>>>> names(sp) <- NULL
>>>> tmp <- lapply(sp, function(x){
>>>>             idx <- which(!is.na(x$mth))[1]
>>>>             x$mth <- x$mth[idx]
>>>>             x
>>>>         })
>>>> do.call(rbind, tmp)
>>>>
>>>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>>
>>>> Em 06-04-2013 11:33, Leask, Graham escreveu:
>>>>> Dear List members
>>>>>
>>>>> I have a large dataset organised in choice groups; see the sample below.
>>>>>
>>>>>
>>>>>      +---------------------------------------------------------------------------------------+
>>>>>      | dn  obs  choice     acid  br                date      cdate  situat~n  mth  year  set |
>>>>>      |---------------------------------------------------------------------------------------|
>>>>>   1. |  4    1       0    LOSEC   1                   .          .              .     .    1 |
>>>>>   2. |  4    1       0   NEXIUM   2                   .          .              .     .    1 |
>>>>>   3. |  4    1       0   PARIET   3                   .          .              .     .    1 |
>>>>>   4. |  4    1       0  PROTIUM   4                   .          .              .     .    1 |
>>>>>   5. |  4    1       0   ZANTAC   5                   .          .              .     .    1 |
>>>>>      |---------------------------------------------------------------------------------------|
>>>>>   6. |  4    1       1    ZOTON   6  23aug2000 01:00:00  23aug2000        NS  487  2000    1 |
>>>>>   7. |  4    2       0    LOSEC   1                   .          .              .     .    2 |
>>>>>   8. |  4    2       0   NEXIUM   2                   .          .              .     .    2 |
>>>>>   9. |  4    2       1   PARIET   3  25sep2000 01:00:00  25sep2000         L  488  2000    2 |
>>>>>  10. |  4    2       0  PROTIUM   4                   .          .              .     .    2 |
>>>>>      |---------------------------------------------------------------------------------------|
>>>>>  11. |  4    2       0   ZANTAC   5                   .          .              .     .    2 |
>>>>>  12. |  4    2       0    ZOTON   6                   .          .              .     .    2 |
>>>>>  13. |  4    3       0    LOSEC   1                   .          .              .     .    3 |
>>>>>  14. |  4    3       0   NEXIUM   2                   .          .              .     .    3 |
>>>>>  15. |  4    3       0   PARIET   3                   .          .              .     .    3 |
>>>>>      |---------------------------------------------------------------------------------------|
>>>>>  16. |  4    3       0  PROTIUM   4                   .          .              .     .    3 |
>>>>>  17. |  4    3       0   ZANTAC   5                   .          .              .     .    3 |
>>>>>  18. |  4    3       1    ZOTON   6  20sep2000 00:00:00  20sep2000         R  488  2000    3 |
>>>>>  19. |  4    4       0    LOSEC   1                   .          .              .     .    4 |
>>>>>  20. |  4    4       0   NEXIUM   2                   .          .              .     .    4 |
>>>>>      |---------------------------------------------------------------------------------------|
>>>>>  21. |  4    4       1   PARIET   3  27oct2000 00:00:00  27oct2000        NL  489  2000    4 |
>>>>>  22. |  4    4       0  PROTIUM   4                   .          .              .     .    4 |
>>>>>  23. |  4    4       0   ZANTAC   5                   .          .              .     .    4 |
>>>>>  24. |  4    4       0    ZOTON   6                   .          .              .     .    4 |
>>>>>  25. |  4    5       0    LOSEC   1                   .          .              .     .    5 |
>>>>>      |---------------------------------------------------------------------------------------|
>>>>>  26. |  4    5       0   NEXIUM   2                   .          .              .     .    5 |
>>>>>  27. |  4    5       0   PARIET   3                   .          .              .     .    5 |
>>>>>  28. |  4    5       0  PROTIUM   4                   .          .              .     .    5 |
>>>>>  29. |  4    5       0   ZANTAC   5                   .          .              .     .    5 |
>>>>>  30. |  4    5       1    ZOTON   6  23oct2000 03:00:00  23oct2000        NS  489  2000    5 |
>>>>>      +---------------------------------------------------------------------------------------+
>>>>>
>>>>> I wish to fill in the missing values in each choice set, delineated by dn (doctor), obs (observation number) and choices (1 to 6).
>>>>> For each choice set, one choice is chosen, and only that row contains the full time
>>>>> information for the set; i.e., in set 1, choice 6 was chosen and shows the month 487. The other 5 choices show mth as missing. I want to fill these with the correct mth.
>>>>>
>>>>> I am sure there must be an elegant way to do this in R?
>>>>>
>>>>>
>>>>> Best wishes
>>>>>
>>>>>
>>>>>
>>>>> Graham
>>>>>
>>>>>
>>>>>         [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>
