[R] Replace missing value within group with non-missing value
arun
smartpink111 at yahoo.com
Sat Apr 6 20:20:05 CEST 2013
Hi,
dat<- read.csv("test1.csv",sep=",",stringsAsFactors=FALSE)
sp <- split(dat, list(dat$dn, dat$obs))
sp1<-sp[lapply(sp,nrow)!=0] #added here
names(sp1) <- NULL
tmp<- lapply(sp1,function(x){
idx<- which(!is.na(x$mth))[1]
x$mth<- x$mth[idx]
x
}
)
res<- do.call(rbind,tmp)
row.names(res)<-1:nrow(res)
dim(res)
#[1] 1200 6
dim(dat)
#[1] 1200 6
head(res)
# X dn obs choice br mth
#1 1 4 1 0 1 487
#2 2 4 1 0 2 487
#3 3 4 1 0 3 487
#4 4 4 1 0 4 487
#5 5 4 1 0 5 487
#6 6 4 1 1 6 487
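The error most likely comes from empty groups: split() on list(dat$dn, dat$obs) builds one element per dn/obs combination, including combinations that never occur in the data, and assigning a single value to a column of a zero-row data frame fails. The sp1 filter above removes those elements afterwards. A minimal sketch of an alternative, assuming a small made-up dataset (the values below are illustrative, not taken from test1.csv), is to let split() drop the unused combinations itself with drop = TRUE:

```r
# Toy data, made up for illustration: doctor 4 has obs 1 and 2, doctor 5 has
# obs 3 only, so combinations such as dn = 5, obs = 1 have no rows.
dat <- data.frame(dn  = c(4, 4, 4, 4, 5, 5),
                  obs = c(1, 1, 2, 2, 3, 3),
                  mth = c(NA, 487, 488, NA, NA, 489))

# Without drop = TRUE, split() also returns zero-row data frames for the
# dn/obs combinations that never occur.
sp_all  <- split(dat, list(dat$dn, dat$obs))
n_empty <- sum(sapply(sp_all, nrow) == 0)
n_empty  # 3

# drop = TRUE keeps only the combinations present in the data.
sp <- split(dat, list(dat$dn, dat$obs), drop = TRUE)
names(sp) <- NULL
tmp <- lapply(sp, function(x) {
  idx <- which(!is.na(x$mth))[1]  # NA if the group has no non-missing mth
  x$mth <- x$mth[idx]             # the single value is recycled over the group
  x
})
res <- do.call(rbind, tmp)
res
```

A group whose mth values are all missing stays NA with either approach.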
A.K.
----- Original Message -----
From: "Leask, Graham" <g.leask at aston.ac.uk>
To: Rui Barradas <ruipbarradas at sapo.pt>
Cc: arun <smartpink111 at yahoo.com>; "r-help at r-project.org" <r-help at r-project.org>
Sent: Saturday, April 6, 2013 1:44 PM
Subject: RE: [R] Replace missing value within group with non-missing value
Hi Rui,
I have just pasted this direct and rerun. I still get the same error.
I am running this on the full length dataset.
dim [1] 255030 5
I attach the first few rows of the file.
Error pasted below.
I am using version 2.15.2 on a windows 7 machine.
> sp <- split(dat, list(dat$dn, dat$obs))
> names(sp) <- NULL
> tmp <- lapply(sp, function(x){
+ idx <- which(!is.na(x$mth))[1]
+ if(length(idx) > 0)
+ x$mth <- x$mth[idx]
+ x
+ })
Error in `$<-.data.frame`(`*tmp*`, "mth", value = NA_real_) :
replacement has 1 rows, data has 0
> do.call(rbind,tmp)
Error in do.call(rbind, tmp) : object 'tmp' not found
Any thoughts on what could be causing this anomaly?
-----Original Message-----
From: Rui Barradas [mailto:ruipbarradas at sapo.pt]
Sent: 06 April 2013 18:24
To: Leask, Graham
Cc: arun; r-help at r-project.org
Subject: Re: [R] Replace missing value within group with non-missing value
Hello,
I've just run my code with your data and found no error. Anyway, try replacing the lapply instruction with this.
tmp <- lapply(sp, function(x){
idx <- which(!is.na(x$mth))[1]
if(length(idx) > 0)
x$mth <- x$mth[idx]
x
})
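One caveat on the guard in this version, offered as a sketch rather than a confirmed diagnosis: indexing with [1] always returns a vector of length one (NA when nothing matches), so length(idx) > 0 is true even for a group with no rows. Testing nrow(x) is one way to skip such groups. The helper name fill_group and the two small data frames below are hypothetical, made up for illustration:

```r
# which() on a zero-length vector gives integer(0), but [1] pads it to one NA.
idx_empty <- which(!is.na(numeric(0)))[1]
length(idx_empty)  # 1, so length(idx) > 0 cannot detect an empty group

# Hypothetical helper: only assign when the group actually has rows.
fill_group <- function(x) {
  if (nrow(x) > 0) {
    idx <- which(!is.na(x$mth))[1]
    x$mth <- x$mth[idx]
  }
  x
}

# Made-up inputs: a zero-row group and a three-row group.
empty <- data.frame(dn = numeric(0), obs = numeric(0), mth = numeric(0))
full  <- data.frame(dn = 4, obs = 1, mth = c(NA, NA, 487))

fill_group(empty)     # returned unchanged, no error
fill_group(full)$mth  # 487 487 487
```

With a guard of this kind, lapply(sp, fill_group) should pass zero-row elements through untouched, and do.call(rbind, ...) then ignores them.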
Rui Barradas
Em 06-04-2013 18:12, Leask, Graham escreveu:
> Hi Arun,
>
> How odd. Directly pasting the code from your email precisely repeats the error.
> See below. Any thoughts on the cause of this anomaly?
>
>> dput(head(dat,50))
> structure(list(dn = c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), obs = c(1, 1,
> 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4,
> 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8,
> 8, 8, 8, 8, 9, 9), choice = c(0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
> 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
> 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0), br = c(1,
> 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4,
> 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1,
> 2, 3, 4, 5, 6, 1, 2), mth = c(NA, NA, NA, NA, NA, 487, NA, NA,
> 488, NA, NA, NA, NA, NA, NA, NA, NA, 488, NA, NA, 489, NA, NA,
> NA, NA, NA, NA, NA, NA, 489, NA, NA, NA, NA, NA, 489, NA, NA,
> NA, NA, NA, 490, NA, NA, NA, NA, NA, 491, NA, NA)), .Names = c("dn",
> "obs", "choice", "br", "mth"), row.names = c("1", "2", "3", "4",
> "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
> "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26",
> "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37",
> "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48",
> "49", "50"), class = "data.frame")
>> sp <- split(dat, list(dat$dn, dat$obs))
>> names(sp) <- NULL
>> tmp <- lapply(sp, function(x){
> + idx <- which(!is.na(x$mth))[1]
> + x$mth <- x$mth[idx]
> + x
> + })
> Error in `$<-.data.frame`(`*tmp*`, "mth", value = NA_real_) :
> replacement has 1 rows, data has 0
>> head(do.call(rbind, tmp),7)
> Error in do.call(rbind, tmp) : object 'tmp' not found
>
> Best wishes
>
>
> Graham
>
> -----Original Message-----
> From: arun [mailto:smartpink111 at yahoo.com]
> Sent: 06 April 2013 17:25
> To: Leask, Graham
> Cc: Rui Barradas
> Subject: Re: [R] Replace missing value within group with non-missing value
>
> Hello,
> By running Rui's code, I am getting this:
> sp <- split(dat, list(dat$dn, dat$obs))
> names(sp) <- NULL
> tmp <- lapply(sp, function(x){
> idx <- which(!is.na(x$mth))[1]
> x$mth <- x$mth[idx]
> x
> })
> head(do.call(rbind, tmp),7)
> dn obs choice br mth
> 1 4 1 0 1 487
> 2 4 1 0 2 487
> 3 4 1 0 3 487
> 4 4 1 0 4 487
> 5 4 1 0 5 487
> 6 4 1 1 6 487
> 7 4 2 0 1 488
>
> Couldn't reproduce the error you cited.
> A.K.
>
>
>
>
> ----- Original Message -----
> From: "Leask, Graham" <g.leask at aston.ac.uk>
> To: Rui Barradas <ruipbarradas at sapo.pt>
> Cc: "r-help at r-project.org" <r-help at r-project.org>
> Sent: Saturday, April 6, 2013 12:16 PM
> Subject: Re: [R] Replace missing value within group with non-missing value
>
> Hi Rui,
>
> Data as follows
>
> structure(list(dn = c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), obs = c(1, 1,
> 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4,
> 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8,
> 8, 8, 8, 8, 9, 9), choice = c(0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
> 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
> 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0), br = c(1,
> 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4,
> 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1,
> 2, 3, 4, 5, 6, 1, 2), mth = c(NA, NA, NA, NA, NA, 487, NA, NA,
> 488, NA, NA, NA, NA, NA, NA, NA, NA, 488, NA, NA, 489, NA, NA,
> NA, NA, NA, NA, NA, NA, 489, NA, NA, NA, NA, NA, 489, NA, NA,
> NA, NA, NA, 490, NA, NA, NA, NA, NA, 491, NA, NA)), .Names = c("dn",
> "obs", "choice", "br", "mth"), row.names = c("1", "2", "3", "4",
> "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
> "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26",
> "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37",
> "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48",
> "49", "50"), class = "data.frame")
>
> Best wishes
>
>
> Graham
>
> -----Original Message-----
> From: Rui Barradas [mailto:ruipbarradas at sapo.pt]
> Sent: 06 April 2013 16:32
> To: Leask, Graham
> Cc: r-help at r-project.org
> Subject: Re: [R] Replace missing value within group with non-missing value
>
> Hello,
>
> Can't you post a data example? If your dataset is named 'dat' use
>
> dput(head(dat, 50)) # paste the output of this in a post
>
>
> Rui Barradas
>
> Em 06-04-2013 15:34, Leask, Graham escreveu:
>> Hi Rui,
>>
>> Thank you for your suggestion which is very much appreciated. Unfortunately running this code produces the following error.
>>
>> error in '$<-.data.frame' ('*tmp*', "mth", value = NA_real_) :
>> replacement has 1 rows, data has 0
>>
>> I'm sure there must be an elegant solution to this problem?
>>
>> Best wishes
>>
>>
>>
>> Graham
>>
>> On 6 Apr 2013, at 12:15, "Rui Barradas" <ruipbarradas at sapo.pt> wrote:
>>
>>> Hello,
>>>
>>> That's not a very good way of posting your data; preferably paste the output of ?dput in a post.
>>> Something along the lines of the following might do what you want.
>>> It seems that the groups are established by 'dn' and 'obs' numbers.
>>> If so, try
>>>
>>>
>>> # Make up some data
>>> dat <- data.frame(dn = 4, obs = rep(1:5, each = 6), mth = NA)
>>> dat$mth[6] <- 487
>>> dat$mth[9] <- 488
>>> dat$mth[18] <- 488
>>> dat$mth[21] <- 489
>>> dat$mth[30] <- 489
>>>
>>>
>>> sp <- split(dat, list(dat$dn, dat$obs))
>>> names(sp) <- NULL
>>> tmp <- lapply(sp, function(x){
>>> idx <- which(!is.na(x$mth))[1]
>>> x$mth <- x$mth[idx]
>>> x
>>> })
>>> do.call(rbind, tmp)
>>>
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>>
>>> Em 06-04-2013 11:33, Leask, Graham escreveu:
>>>> Dear List members
>>>>
>>>> I have a large dataset organised in choice groups see sample below
>>>>
>>>>
>>>>       dn  obs  choice  acid     br  date                cdate      situat~n  mth  year  set
>>>> -------------------------------------------------------------------------------------------
>>>>   1.   4    1       0  LOSEC     1  .                   .                      .     .    1
>>>>   2.   4    1       0  NEXIUM    2  .                   .                      .     .    1
>>>>   3.   4    1       0  PARIET    3  .                   .                      .     .    1
>>>>   4.   4    1       0  PROTIUM   4  .                   .                      .     .    1
>>>>   5.   4    1       0  ZANTAC    5  .                   .                      .     .    1
>>>>   6.   4    1       1  ZOTON     6  23aug2000 01:00:00  23aug2000  NS        487  2000    1
>>>>   7.   4    2       0  LOSEC     1  .                   .                      .     .    2
>>>>   8.   4    2       0  NEXIUM    2  .                   .                      .     .    2
>>>>   9.   4    2       1  PARIET    3  25sep2000 01:00:00  25sep2000  L         488  2000    2
>>>>  10.   4    2       0  PROTIUM   4  .                   .                      .     .    2
>>>>  11.   4    2       0  ZANTAC    5  .                   .                      .     .    2
>>>>  12.   4    2       0  ZOTON     6  .                   .                      .     .    2
>>>>  13.   4    3       0  LOSEC     1  .                   .                      .     .    3
>>>>  14.   4    3       0  NEXIUM    2  .                   .                      .     .    3
>>>>  15.   4    3       0  PARIET    3  .                   .                      .     .    3
>>>>  16.   4    3       0  PROTIUM   4  .                   .                      .     .    3
>>>>  17.   4    3       0  ZANTAC    5  .                   .                      .     .    3
>>>>  18.   4    3       1  ZOTON     6  20sep2000 00:00:00  20sep2000  R         488  2000    3
>>>>  19.   4    4       0  LOSEC     1  .                   .                      .     .    4
>>>>  20.   4    4       0  NEXIUM    2  .                   .                      .     .    4
>>>>  21.   4    4       1  PARIET    3  27oct2000 00:00:00  27oct2000  NL        489  2000    4
>>>>  22.   4    4       0  PROTIUM   4  .                   .                      .     .    4
>>>>  23.   4    4       0  ZANTAC    5  .                   .                      .     .    4
>>>>  24.   4    4       0  ZOTON     6  .                   .                      .     .    4
>>>>  25.   4    5       0  LOSEC     1  .                   .                      .     .    5
>>>>  26.   4    5       0  NEXIUM    2  .                   .                      .     .    5
>>>>  27.   4    5       0  PARIET    3  .                   .                      .     .    5
>>>>  28.   4    5       0  PROTIUM   4  .                   .                      .     .    5
>>>>  29.   4    5       0  ZANTAC    5  .                   .                      .     .    5
>>>>  30.   4    5       1  ZOTON     6  23oct2000 03:00:00  23oct2000  NS        489  2000    5
>>>>
>>>> I wish to fill in the missing values in each choice set, delineated by dn (Doctor), obs (Observation number) and choices (1 to 6).
>>>> For each choice set, one choice is chosen, and that row contains the full time
>>>> information for the choice set, i.e. in set 1 choice 6 was chosen and shows the month 487. The other 5 choices show mth as missing. I want to fill these with the correct mth.
>>>>
>>>> I am sure there must be an elegant way to do this in R?
>>>>
>>>>
>>>> Best wishes
>>>>
>>>>
>>>>
>>>> Graham
>>>>
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>