[R-sig-Geo] Impute missing value using k nearest neighbour

Frede Aakmann Tøgersen frtog at vestas.com
Tue Aug 25 06:55:48 CEST 2015


One option is the knnImputation() function from the DMwR package (https://cran.r-project.org/web/packages/DMwR/DMwR.pdf):


Fill in NA values with the values of the nearest neighbours


Function that fills in all NA values using the k nearest neighbours of each case with NA values.
By default it uses the values of the neighbours and obtains a weighted (by the distance to the case)
average of their values to fill in the unknowns. If meth = "median" it uses the median/most frequent
value instead.

knnImputation(data, k = 10, scale = T, meth = "weighAvg", distData = NULL)
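
As far as I can tell from the documentation, knnImputation() finds the neighbours from the columns of data itself rather than from a user-supplied distance matrix. If the neighbours should be defined by geographic distance alone, as the question asks, a minimal base R sketch could look like the one below. It is not the DMwR implementation: the sample values, the function names haversine_matrix() and impute_knn_geo(), the spherical Earth radius of 6371 km, and the inverse-distance weighting are all assumptions made for illustration.

```r
# Hypothetical sample data: column names follow the question (FIPS, latitude,
# longitude, V1..V3); the values are invented for illustration only.
locs <- data.frame(
  FIPS = c(26081, 26083, 26085, 26087, 26089),
  lat  = c(43.03, 47.60, 43.99, 43.10, 45.15),
  lon  = c(-85.55, -88.40, -85.80, -83.22, -85.80),
  V1   = c(1.2, 0.7, 0.9, 1.5, 1.1),
  V2   = c(10, 14, 12, 9, NA),
  V3   = c(3.1, 2.4, 2.8, 3.5, NA)
)

# Great-circle (haversine) distance in km between all pairs of points,
# assuming a spherical Earth with the given radius.
haversine_matrix <- function(lat, lon, radius = 6371) {
  phi  <- lat * pi / 180
  lam  <- lon * pi / 180
  dphi <- outer(phi, phi, "-")
  dlam <- outer(lam, lam, "-")
  a <- sin(dphi / 2)^2 + outer(cos(phi), cos(phi)) * sin(dlam / 2)^2
  # Clamp at 1 to guard against rounding error before taking asin().
  2 * radius * asin(pmin(sqrt(a), 1))
}

# Fill each NA in `value` with the inverse-distance weighted mean of the
# k nearest locations that have an observed value.
impute_knn_geo <- function(value, dist_mat, k = 3) {
  filled   <- value
  observed <- which(!is.na(value))
  for (i in which(is.na(value))) {
    d       <- dist_mat[i, observed]
    nearest <- order(d)[seq_len(min(k, length(observed)))]
    w       <- 1 / pmax(d[nearest], 1e-9)  # avoid division by zero
    filled[i] <- sum(w * value[observed[nearest]]) / sum(w)
  }
  filled
}

D <- haversine_matrix(locs$lat, locs$lon)
locs$V2 <- impute_knn_geo(locs$V2, D, k = 3)
locs$V3 <- impute_knn_geo(locs$V3, D, k = 3)
```

Each variable is imputed separately here, so a location's own V1 to V3 values play no role in choosing its neighbours; only the latitude and longitude do.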

Yours sincerely

Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling


From: R-sig-Geo [mailto:r-sig-geo-bounces at r-project.org] On Behalf Of Metastate Metastate
Sent: 25 August 2015 05:05
To: r-sig-geo at r-project.org
Subject: [R-sig-Geo] Impute missing value using k nearest neighbour


I have a data set with a location ID (FIPS), the latitude and longitude of each location, and V1 to V3, which are features of that location. The data set has more than 3000 locations. Please see the attached file for a small sample. In the sample file, you can see that V2 and V3 are missing for FIPS 26089. Does anyone know of an R package that can impute the missing values with k nearest neighbours, using a distance matrix calculated from latitude and longitude?

I would really appreciate any help or suggestions.


