[R] Fwd: Re: knn - random result although use.all=TRUE
itziar irigoien
itziar.irigoien at ehu.es
Mon Nov 23 12:58:45 CET 2015
Thank you very much for your prompt response. Now I see why the results
have a random part: although all units with tied distances are included
in the neighbourhood, ties in the resulting vote still have to be broken
at random.
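
To make the point concrete, here is a minimal sketch in Python. It is not
the code of knn() itself, only an illustration of the behaviour described
above, and the function name knn_vote and the toy data are hypothetical.
Every point tied with the k-th nearest one is allowed to vote, and a random
draw is still needed when two classes receive the same number of votes.

```python
import random
from collections import Counter

def knn_vote(train, labels, query, k=3, rng=None):
    """Classify `query` by k-NN, letting every point tied with the k-th nearest vote."""
    rng = rng or random.Random()
    # Squared Euclidean distance from the query to each training point.
    dists = [sum((a - b) ** 2 for a, b in zip(p, query)) for p in train]
    # Distance of the k-th nearest point; everything at or below it votes.
    cutoff = sorted(dists)[k - 1]
    votes = Counter(lab for d, lab in zip(dists, labels) if d <= cutoff)
    # Classes sharing the highest vote count; a random draw decides among them.
    top = max(votes.values())
    winners = sorted(lab for lab, n in votes.items() if n == top)
    return rng.choice(winners), winners

# Hypothetical toy data with heavily tied distances.
train = [(0, 0), (0, 0), (1, 0), (1, 0)]
labels = ["a", "b", "a", "b"]

# Unseeded: the predicted class may differ from run to run.
pred, tied = knn_vote(train, labels, (0, 0), k=1)

# Seeded: the same seed always gives the same prediction.
pred_1, _ = knn_vote(train, labels, (0, 0), k=1, rng=random.Random(1))
pred_2, _ = knn_vote(train, labels, (0, 0), k=1, rng=random.Random(1))
```

Here both classes end up with one vote each, so only the random draw
separates them, and fixing the seed makes the outcome repeatable.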
Thank you!
Itziar Irigoien
On Fri, 20 Nov 2015 16:40, David L Carlson wrote:
> Changing your definition of cl to clase let me replicate the problem. If you set a random seed just before running knn(), the results are consistent, which indicates that the function is drawing a random number at some point.
>
> You should probably contact the package maintainer, but your toy data set is trivially simple. You have 40 total observations, but X1 has only 3 different values and X2 has only 2 different values, so there are only 6 different combinations. The distance matrix on your training set has 435 distances, but only 5 different values! As a result, there are a great many tied values, so the algorithm probably uses a random method of selecting which 3 to use.
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352