[R] NAs introduced by coercion in dist()
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Wed May 2 17:34:15 CEST 2007
Silvia Lomascolo wrote:
> It was suggested that the 'NAs introduced by coercion' message might be
> warning me that my data are not what they should be. I checked this using
> str(PeaksMatrix), as suggested, and the data seem to be what I thought they
> were:
>
> 'data.frame': 335 obs. of 127 variables:
> $ Code : Factor w/ 335 levels "A1MR","A1MU",..: 1 2 3 4 5 6 7 8 9 10 ...
> $ P3.70 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P3.97 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P4.29 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P4.90 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P6.30 : num 0 0 0 0 0 0 0 0 0 0 ...
> $ P6.45 : num 7.73 0 0 0 0 0 4.03 0 0 0 ...
> $ P6.55 : num 0 0 0 0 0 0 0 0 0 0 ...
>
> ...
>
> I do have 335 observations and 127 variables named P3.70, P3.97, P4.29,
> etc. This was a relief, but I still don't know whether the distance matrix
> is what it should be. I tried 'str(dist.PxMx)', which is the name of my
> distance matrix, but I get something that has not much meaning to me, an
> inexperienced R user:
>
> Class 'dist' atomic [1:55945] 329.6 194.9 130.1 70.7 116.9 ...
> ..- attr(*, "Size")= int 335
> ..- attr(*, "Labels")= chr [1:335] "1" "2" "3" "4" ...
> ..- attr(*, "Diag")= logi FALSE
> ..- attr(*, "Upper")= logi FALSE
> ..- attr(*, "method")= chr "euclidean"
> ..- attr(*, "call")= language dist(x = PeaksMatrix, method = "euclidean",
> diag = FALSE, upper = FALSE, p = 2)
>
> Any more suggestions, please?
>
>
>
Actually, you seem to have 126 variables plus a factor called "Code",
which has non-numeric levels. I think you probably want to lose that one
before calculating distances.
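
A minimal sketch of what that could look like, using a small made-up
stand-in for PeaksMatrix (the three rows and their values are invented for
illustration, and PeaksNum is a hypothetical name). The labels of the
factor, such as "A1MR", cannot be converted to numbers, which is where the
coercion warning comes from, so only the numeric columns are passed to
dist():

```r
# Toy stand-in for PeaksMatrix: one factor column plus numeric peak columns.
# (Invented values; the real data has 335 rows and 126 numeric columns.)
PeaksMatrix <- data.frame(
  Code  = factor(c("A1MR", "A1MU", "A2MR")),
  P6.30 = c(0, 0, 3),
  P6.45 = c(7.73, 0, 4)
)

# Keep only the numeric columns, i.e. drop the factor "Code".
is_num   <- sapply(PeaksMatrix, is.numeric)
PeaksNum <- PeaksMatrix[, is_num, drop = FALSE]

# Optional: reuse the codes as row names so they label the distances.
rownames(PeaksNum) <- as.character(PeaksMatrix$Code)

# Euclidean distances between rows; no character data left to coerce.
dist.PxMx <- dist(PeaksNum, method = "euclidean")

# A 'dist' object stores only the lower triangle; as.matrix() expands it
# into a full square matrix that is easier to inspect.
dist.mat <- as.matrix(dist.PxMx)
print(dist.mat)
```

On the real data, as.matrix(dist.PxMx)[1:5, 1:5] would show just a corner
of the 335 x 335 matrix rather than printing all of it.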
> Silvia Lomascolo wrote:
>
>> I work with Windows and use R version 2.4.1. I am JUST starting to learn
>> this program...
>>
>> I get this warning message 'NAs introduced by coercion' while trying to
>> build a distance matrix (to be analyzed with NMDS later) from a 336 x 100
>> data matrix. The original matrix has lots of zeros and no missing values,
>> but I don't think this should matter.
>>
>> I searched this forum and people have suggested that the warning should be
>> ignored but when I try to print the distance matrix I only get the row
>> numbers (the matrix seems to be 'empty') and I am unable to judge
>> whether the matrix worked or not.
>>
>> To get the distance matrix I wrote:
>> dist.PxMx <- dist (PeaksMatrix, method='euclidean', diag=FALSE,
>> upper=FALSE)
>>
>> I tried including the p argument (included in the help for dist()) and
>> leaving it out, but that didn't seem to change anything. I think that's
>> required for one distance measure though, not for euclidean dist.
>>
>> Should I really ignore this warning? If so, why am I not able to see
>> the distance matrix?
>>
>>
>
>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907