[R] dmvnorm returns NaN

David Winsemius dwinsemius at comcast.net
Fri Oct 18 19:31:09 CEST 2013


On Oct 18, 2013, at 1:12 AM, peter dalgaard wrote:

> 
> On Oct 18, 2013, at 08:37 , David Winsemius wrote:
> 
>> 
>> On Oct 17, 2013, at 9:11 PM, Steven LeBlanc wrote:
>> 
>>> Greets,
>>> 
>>> I'm using nlminb() to estimate the parameters of a multivariate normal random sample with missing values and ran into an unexpected result from my call to dmvnorm()
>> 
>> There are at least 5 different version of dmvnorm. None of them are in the default packages.
>> 
>>> within the likelihood function. Particular details are provided below.
>> 
>> Complete? Except for the name of the package that has `dmvnorm`.
>> 
> 
> More importantly, it gives no clue as to the connection between sigma and the data set. It is not the covariance matrix:
> 
>> s <- scan(what=list("",0,0))
> 1: [1,]  0.84761637  3.994261
> 2: [2,]  0.91487059  4.952595
> 3: [3,]  0.84527267  4.521837
> ....
> 40: [40,]  0.65938218  5.209301
> 41: 
> Read 40 records
>> cor(s[[2]],s[[3]])
> [1] 0.8812403
>> colMeans(cbind(s[[2]],s[[3]]))
> [1] 1.252108 5.540686
>> var(cbind(s[[2]],s[[3]]))
>         [,1]     [,2]
> [1,] 1.284475 2.536627
> [2,] 2.536627 6.450582
> 
> These are not the u and sigma stated.
> 
> Furthermore the matrix given as sigma is not a covariance matrix. Try working out the correlation coefficient:
> 
>> 2.2289513/sqrt(0.6464647*5.697834)
> [1] 1.161377
> 
> That should be enough to make any version of dmvnorm complain...

I understood the question differently, but you are my superior in both R and statistics, so I beg some education if I'm totally confused. I thought that what was being requested was the passage of the "complete" matrix (a series of 40 points in 2-space) to some unspecified version of dmvnorm with the hope of getting 40 density estimates from a theoretical MVN distribution with mean = c(1.267198, 5.475045) and the variance-covariance matrix, sigma= matrix( c( 0.6461647, 2.2289513, 2.228951,  5.697834), 2)

library(mixtools)  # just a guess mind you

 dmvnorm( c(0.84761637 , 3.994261), mu=c(1.267198, 5.475045), 
                           sigma= matrix( c( 0.6461647, 2.2289513, 2.228951,  5.697834), 2) )
[1] 0.1224835

> complete <-matrix( scan(), ncol=2, byrow=TRUE)
1: 0.84761637  3.994261
3:  0.91487059  4.952595
5:  0.84527267  4.521837
7:  2.53821358  8.374880
9:  1.16646209  6.255022
11: 
Read 10 items
> dmvnorm( complete, mu=c(1.267198, 5.475045), sigma= matrix( c( 0.6461647, 2.2289513, 2.228951,  5.697834), 2) )
[1] 0.12248353 0.14380268 0.13025764 0.06991921 0.19158060

So I don't get the same error as the OP and I am unable to explain why he was getting NaNs.

-- 
David.

> 
>>> It appears that dmvnorm() makes a call to log(eigen(sigma)). Whereas eigen(sigma) is returning a negative number, I understand log()'s complaint. However, it is a mystery to me why this data set should produce such a result.
>>> 
>>> Any suggestions?
>>> 
>>> Best Regards,
>>> Steven
>>> 
>>>> complete
>>>           [,1]      [,2]
>>> [1,]  0.84761637  3.994261
>>> [2,]  0.91487059  4.952595
>>> [3,]  0.84527267  4.521837
>>> [4,]  2.53821358  8.374880
>>> [5,]  1.16646209  6.255022
>>> [6,]  0.94706527  4.169510
>>> [7,]  0.48813564  3.349230
>>> [8,]  3.71828469  9.441518
>>> [9,]  0.08953357  1.651497
>>> [10,]  0.68530515  5.498403
>>> [11,]  1.52771645  8.484671
>>> [12,]  1.55710697  5.231272
>>> [13,]  1.89091603  4.152658
>>> [14,]  1.08483541  5.401544
>>> [15,]  0.58125385  5.340141
>>> [16,]  0.24473250  2.965046
>>> [17,]  1.59954401  8.095561
>>> [18,]  1.57656436  5.335744
>>> [19,]  2.73976992  8.572871
>>> [20,]  0.87720252  6.067468
>>> [21,]  1.18403087  3.526790
>>> [22,] -1.03145244  1.776478
>>> [23,]  2.88197343  7.720838
>>> [24,]  0.60705218  4.406073
>>> [25,]  0.58083464  3.374075
>>> [26,]  0.87913427  5.247637
>>> [27,]  1.10832692  3.534508
>>> [28,]  2.92698371  8.682130
>>> [29,]  4.04115277 11.827360
>>> [30,] -0.57913297  1.476586
>>> [31,]  0.84804365  7.009075
>>> [32,]  0.79497940  3.671164
>>> [33,]  1.58837762  5.535409
>>> [34,]  0.63412821  3.932767
>>> [35,]  3.14032433  9.271014
>>> [36,] -0.18183869  1.666647
>>> [37,]  0.57535770  6.881830
>>> [38,]  3.21417723 10.901636
>>> [39,]  0.29207932  4.120408
>>> [40,]  0.65938218  5.209301
>>>> u
>>> [1] 1.267198 5.475045
>>>> sigma
>>>        [,1]     [,2]
>>> [1,] 0.6461647 2.228951
>>> [2,] 2.2289513 5.697834
>>>> dmvnorm(x=complete,mean=u,sigma=sigma)
>> 
>>> [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
>>> [30] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
>>> Warning message:
>>> In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>>> NaNs produced
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
> 

David Winsemius
Alameda, CA, USA



More information about the R-help mailing list