[R] How to replace missing values by mean of subgroup of a group

Boris Steipe boris.steipe at utoronto.ca
Wed May 10 00:35:49 CEST 2017


Great. 

I am CC'ing the list - this is important so that others who may come across this thread in the archives know that this question has been resolved.

Cheers,

B.


> On May 9, 2017, at 5:18 PM, Olu Ola <oluola2011 at yahoo.com> wrote:
> 
> Thank you!!! It worked.
> 
> Regards,
> Olu
> 
> 
> On Tuesday, May 9, 2017 4:20 PM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
> 
> 
> Pedestrian code, so you can analyze this easily. However entirely untested since I have no ambitionto recreate your input data as a data frame. This code assumes:
> - your data _is_ a data frame
> - the desired column is called food.price, not "food price" (cf. ?make.names )
> 
> # define a function that imputes NA values in the same city, for the same food
> imputeFoodPrice <- function(DF, i) {
>   sel <- DF$city == DF$city[i] & DF$food == DF$food[i]
>   imputed <- mean(DF$food.price[sel], na.rm = TRUE)
>   if (is.nan(imputed)) { # careful, there might be no other match
>     imputed <- NA
>   }
>   return(imputed)
> }
> 
> 
> # apply the function to replace NA values
> for (iMissing in which(is.na(myDF$food.price))) {
>   myDF$food.price[iMissing] <- imputeFoodPrice(myDF, iMissing)
> }
> 
> 
> B.
> 
> 
> 
> > On May 9, 2017, at 3:14 PM, Olu Ola via R-help <r-help at r-project.org> wrote:
> > 
> > Hello,I have the following food data with some NA values in the food prices. I will like to replace the NA values in the food price column for each food item by the mean price of the specific food item for each city. For example, the price of bean for the household with hhid 102 in the data set is missing. I will like to replace the missing value with the mean price of bean for the households living in Paxton city (that is households 101 and 103). the data set is presented below. Any help will be greatly appreciated.
> > 
> > | hhid | city | food | food price |
> > | 101 | Paxton | rice | 10 |
> > | 101 | Paxton | beans | 30 |
> > | 101 | Paxton | flour | NA |
> > | 101 | Paxton | eggs | 20 |
> > | 102 | Paxton | rice | NA |
> > | 102 | Paxton | beans | NA |
> > | 102 | Paxton | flour | 34 |
> > | 102 | Paxton | eggs | 21 |
> > | 103 | Paxton | rice | 15 |
> > | 103 | Paxton | beans | 28 |
> > | 103 | Paxton | flour | 32 |
> > | 103 | Paxton | eggs | NA |
> > | 104 | Hull | rice | NA |
> > | 104 | Hull | beans | 34 |
> > | 104 | Hull | flour | NA |
> > | 104 | Hull | eggs | 24 |
> > | 105 | Hull | rice | 18 |
> > | 105 | Hull | beans | 38 |
> > | 105 | Hull | flour | 36 |
> > | 105 | Hull | eggs | 26 |
> > | 106 | Hull | rice | NA |
> > | 106 | Hull | beans | NA |
> > | 106 | Hull | flour | 40 |
> > | 106 | Hull | eggs | NA |
> > 
> > 
> >     [[alternative HTML version deleted]]
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 



More information about the R-help mailing list