[R] how to replace NA with a specific score that is dependant on another indicator variable
David Winsemius
dwinsemius at comcast.net
Wed Sep 1 15:55:27 CEST 2010
On Sep 1, 2010, at 9:20 AM, Chris Howden wrote:
> Hi everyone,
>
> I’m looking for a clever bit of code to replace NA’s with a specific score
> depending on an indicator variable.
>
> I can see how to do it using lots of if statements but I’m sure there must
> be a neater, better way of doing it.
>
> Any ideas at all will be much appreciated, I’m dreading coding up all those
> if statements!
>
> My problem is as follows:
>
> I have a data set with lots of missing data:
>
> EG Raw Data Set
>
> Category variable1 variable2 variable3
>        1         5        NA        NA
>        1        NA         3         4
>        2        NA         7        NA
This does not do its work by category (since I got tired of fixing
mangled htmlized datasets) but it seems to me that a tapply "wrap"
could do either of these operations within categories:
> egraw
  Category variable1 variable2 variable3
1        1         5        NA        NA
2        1        NA         3         4
3        2        NA         7        NA
> lapply(egraw, function(x) {
      mnx <- mean(x, na.rm = TRUE)
      sapply(x, function(z) if (is.na(z)) mnx else z)
  })
$Category
[1] 1 1 2
$variable1
[1] 5 5 5
$variable2
[1] 5 3 7
$variable3
[1] 4 4 4
> sapply(egraw, function(x) {
      mnx <- mean(x, na.rm = TRUE)
      sapply(x, function(z) if (is.na(z)) mnx else z)
  })
     Category variable1 variable2 variable3
[1,]        1         5         5         4
[2,]        1         5         3         4
[3,]        2         5         7         4
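For the within-category version, here is a minimal sketch that uses ave() rather than tapply(), since ave() returns a vector of the same length as its input. The helper name impute_by_group is mine, not a base R function, and the data frame is just the three rows shown above:

```r
# Rebuild the three example rows from the post
egraw <- data.frame(Category  = c(1, 1, 2),
                    variable1 = c(5, NA, NA),
                    variable2 = c(NA, 3, 7),
                    variable3 = c(NA, 4, NA))

# Hypothetical helper: within each group g, replace the NAs in x
# with the mean of the non-missing values of that group
impute_by_group <- function(x, g) {
  ave(x, g, FUN = function(v) {
    v[is.na(v)] <- mean(v, na.rm = TRUE)
    v
  })
}

# Apply the helper to every column except the grouping column
vars <- setdiff(names(egraw), "Category")
imputed <- egraw
imputed[vars] <- lapply(egraw[vars], impute_by_group, g = egraw$Category)
imputed
```

With only these three rows, category 2 has no observed values for variable1 or variable3, so those cells come back as NaN rather than a number. On the full data set each category would presumably have at least one observed value per variable.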
>
> etc
>
> Now I want to replace the NA’s with the average for each category, so if
> these averages were:
>
> EG Averages
>
> Category variable1 variable2 variable3
>        1       4.5       3.2       2.5
>        2       3.5       7.4       5.9
>
>
>
> So I’d like my data set to look like the following once I’ve replaced the
> NA’s with the appropriate category average:
>
> EG Imputed Data Set
>
> Category variable1 variable2 variable3
>        1         5       3.2       2.5
>        1       4.5         3         4
>        2       3.5         7       5.9
>
> etc
>
> Any ideas would be very much appreciated!!!!!
You might add reading the Posting Guide and setting up your mail client to
post in plain text to your TODO list.
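If the category averages are already available as a separate table, as in the "EG Averages" example, a lookup with match() may be all that is needed. This is only a sketch: the object names avgs and filled are made up for illustration, and the numbers are the ones quoted in the post:

```r
# Raw data and category averages as quoted in the original post
egraw <- data.frame(Category  = c(1, 1, 2),
                    variable1 = c(5, NA, NA),
                    variable2 = c(NA, 3, 7),
                    variable3 = c(NA, 4, NA))
avgs <- data.frame(Category  = c(1, 2),
                   variable1 = c(4.5, 3.5),
                   variable2 = c(3.2, 7.4),
                   variable3 = c(2.5, 5.9))

# For each data row, find the row of avgs holding its category
row_in_avgs <- match(egraw$Category, avgs$Category)

# Fill each NA with the average of the matching category and variable
vars <- setdiff(names(egraw), "Category")
filled <- egraw
for (v in vars) {
  miss <- is.na(filled[[v]])
  filled[[v]][miss] <- avgs[[v]][row_in_avgs[miss]]
}
filled
```

On the three example rows this reproduces the "EG Imputed Data Set" in the post. A category that is missing from avgs would leave its NAs in place, because match() returns NA for it.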
>
> thankyou
>
> Chris Howden
> .
David Winsemius, MD
West Hartford, CT