[R] How to replace missing values by mean of subgroup of a group

Bert Gunter bgunter.4567 at gmail.com
Wed May 10 02:44:28 CEST 2017


Of course, statistically, one should not do this. But that's another thread
on another site.

Cheers,
Bert



On May 9, 2017 3:36 PM, "Boris Steipe" <boris.steipe at utoronto.ca> wrote:

Great.

I am CC'ing the list - this is important so that others who may come across
this thread in the archives know that this question has been resolved.

Cheers,

B.


> On May 9, 2017, at 5:18 PM, Olu Ola <oluola2011 at yahoo.com> wrote:
>
> Thank you!!! It worked.
>
> Regards,
> Olu
>
>
> On Tuesday, May 9, 2017 4:20 PM, Boris Steipe <boris.steipe at utoronto.ca>
wrote:
>
>
> Pedestrian code, so you can analyze this easily. However entirely
untested since I have no ambitionto recreate your input data as a data
frame. This code assumes:
> - your data _is_ a data frame
> - the desired column is called food.price, not "food price" (cf.
?make.names )
>
> # define a function that imputes NA values in the same city, for the same
food
> imputeFoodPrice <- function(DF, i) {
>   sel <- DF$city == DF$city[i] & DF$food == DF$food[i]
>   imputed <- mean(DF$food.price[sel], na.rm = TRUE)
>   if (is.nan(imputed)) { # careful, there might be no other match
>     imputed <- NA
>   }
>   return(imputed)
> }
>
>
> # apply the function to replace NA values
> for (iMissing in which(is.na(myDF$food.price))) {
>   myDF$food.price[iMissing] <- imputeFoodPrice(myDF, iMissing)
> }
>
>
> B.
>
>
>
> > On May 9, 2017, at 3:14 PM, Olu Ola via R-help <r-help at r-project.org>
wrote:
> >
> > Hello,I have the following food data with some NA values in the food
prices. I will like to replace the NA values in the food price column for
each food item by the mean price of the specific food item for each city.
For example, the price of bean for the household with hhid 102 in the data
set is missing. I will like to replace the missing value with the mean
price of bean for the households living in Paxton city (that is households
101 and 103). the data set is presented below. Any help will be greatly
appreciated.
> >
> > | hhid | city | food | food price |
> > | 101 | Paxton | rice | 10 |
> > | 101 | Paxton | beans | 30 |
> > | 101 | Paxton | flour | NA |
> > | 101 | Paxton | eggs | 20 |
> > | 102 | Paxton | rice | NA |
> > | 102 | Paxton | beans | NA |
> > | 102 | Paxton | flour | 34 |
> > | 102 | Paxton | eggs | 21 |
> > | 103 | Paxton | rice | 15 |
> > | 103 | Paxton | beans | 28 |
> > | 103 | Paxton | flour | 32 |
> > | 103 | Paxton | eggs | NA |
> > | 104 | Hull | rice | NA |
> > | 104 | Hull | beans | 34 |
> > | 104 | Hull | flour | NA |
> > | 104 | Hull | eggs | 24 |
> > | 105 | Hull | rice | 18 |
> > | 105 | Hull | beans | 38 |
> > | 105 | Hull | flour | 36 |
> > | 105 | Hull | eggs | 26 |
> > | 106 | Hull | rice | NA |
> > | 106 | Hull | beans | NA |
> > | 106 | Hull | flour | 40 |
> > | 106 | Hull | eggs | NA |
> >
> >
> >     [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

	[[alternative HTML version deleted]]



More information about the R-help mailing list