[R] dmvnorm returns NaN
David Winsemius
dwinsemius at comcast.net
Fri Oct 18 19:31:09 CEST 2013
On Oct 18, 2013, at 1:12 AM, peter dalgaard wrote:
>
> On Oct 18, 2013, at 08:37 , David Winsemius wrote:
>
>>
>> On Oct 17, 2013, at 9:11 PM, Steven LeBlanc wrote:
>>
>>> Greets,
>>>
>>> I'm using nlminb() to estimate the parameters of a multivariate normal random sample with missing values and ran into an unexpected result from my call to dmvnorm()
>>
>> There are at least 5 different version of dmvnorm. None of them are in the default packages.
>>
>>> within the likelihood function. Particular details are provided below.
>>
>> Complete? Except for the name of the package that has `dmvnorm`.
>>
>
> More importantly, it gives no clue as to the connection between sigma and the data set. It is not the covariance matrix:
>
>> s <- scan(what=list("",0,0))
> 1: [1,] 0.84761637 3.994261
> 2: [2,] 0.91487059 4.952595
> 3: [3,] 0.84527267 4.521837
> ....
> 40: [40,] 0.65938218 5.209301
> 41:
> Read 40 records
>> cor(s[[2]],s[[3]])
> [1] 0.8812403
>> colMeans(cbind(s[[2]],s[[3]]))
> [1] 1.252108 5.540686
>> var(cbind(s[[2]],s[[3]]))
> [,1] [,2]
> [1,] 1.284475 2.536627
> [2,] 2.536627 6.450582
>
> These are not the u and sigma stated.
>
> Furthermore the matrix given as sigma is not a covariance matrix. Try working out the correlation coefficient:
>
>> 2.2289513/sqrt(0.6464647*5.697834)
> [1] 1.161377
>
> That should be enough to make any version of dmvnorm complain...
I understood the question differently, but you are my superior in both R and statistics, so I beg some education if I'm totally confused. I thought that what was being requested was the passage of the "complete" matrix (a series of 40 points in 2-space) to some unspecified version of dmvnorm with the hope of getting 40 density estimates from a theoretical MVN distribution with mean = c(1.267198, 5.475045) and the variance-covariance matrix, sigma= matrix( c( 0.6461647, 2.2289513, 2.228951, 5.697834), 2)
library(mixtools) # just a guess mind you
dmvnorm( c(0.84761637 , 3.994261), mu=c(1.267198, 5.475045),
sigma= matrix( c( 0.6461647, 2.2289513, 2.228951, 5.697834), 2) )
[1] 0.1224835
> complete <-matrix( scan(), ncol=2, byrow=TRUE)
1: 0.84761637 3.994261
3: 0.91487059 4.952595
5: 0.84527267 4.521837
7: 2.53821358 8.374880
9: 1.16646209 6.255022
11:
Read 10 items
> dmvnorm( complete, mu=c(1.267198, 5.475045), sigma= matrix( c( 0.6461647, 2.2289513, 2.228951, 5.697834), 2) )
[1] 0.12248353 0.14380268 0.13025764 0.06991921 0.19158060
So I don't get the same error as the OP and I am unable to explain why he was getting NaNs.
--
David.
>
>>> It appears that dmvnorm() makes a call to log(eigen(sigma)). Whereas eigen(sigma) is returning a negative number, I understand log()'s complaint. However, it is a mystery to me why this data set should produce such a result.
>>>
>>> Any suggestions?
>>>
>>> Best Regards,
>>> Steven
>>>
>>>> complete
>>> [,1] [,2]
>>> [1,] 0.84761637 3.994261
>>> [2,] 0.91487059 4.952595
>>> [3,] 0.84527267 4.521837
>>> [4,] 2.53821358 8.374880
>>> [5,] 1.16646209 6.255022
>>> [6,] 0.94706527 4.169510
>>> [7,] 0.48813564 3.349230
>>> [8,] 3.71828469 9.441518
>>> [9,] 0.08953357 1.651497
>>> [10,] 0.68530515 5.498403
>>> [11,] 1.52771645 8.484671
>>> [12,] 1.55710697 5.231272
>>> [13,] 1.89091603 4.152658
>>> [14,] 1.08483541 5.401544
>>> [15,] 0.58125385 5.340141
>>> [16,] 0.24473250 2.965046
>>> [17,] 1.59954401 8.095561
>>> [18,] 1.57656436 5.335744
>>> [19,] 2.73976992 8.572871
>>> [20,] 0.87720252 6.067468
>>> [21,] 1.18403087 3.526790
>>> [22,] -1.03145244 1.776478
>>> [23,] 2.88197343 7.720838
>>> [24,] 0.60705218 4.406073
>>> [25,] 0.58083464 3.374075
>>> [26,] 0.87913427 5.247637
>>> [27,] 1.10832692 3.534508
>>> [28,] 2.92698371 8.682130
>>> [29,] 4.04115277 11.827360
>>> [30,] -0.57913297 1.476586
>>> [31,] 0.84804365 7.009075
>>> [32,] 0.79497940 3.671164
>>> [33,] 1.58837762 5.535409
>>> [34,] 0.63412821 3.932767
>>> [35,] 3.14032433 9.271014
>>> [36,] -0.18183869 1.666647
>>> [37,] 0.57535770 6.881830
>>> [38,] 3.21417723 10.901636
>>> [39,] 0.29207932 4.120408
>>> [40,] 0.65938218 5.209301
>>>> u
>>> [1] 1.267198 5.475045
>>>> sigma
>>> [,1] [,2]
>>> [1,] 0.6461647 2.228951
>>> [2,] 2.2289513 5.697834
>>>> dmvnorm(x=complete,mean=u,sigma=sigma)
>>
>>> [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
>>> [30] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
>>> Warning message:
>>> In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>>> NaNs produced
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius
>> Alameda, CA, USA
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>
David Winsemius
Alameda, CA, USA
More information about the R-help
mailing list