[R] Generating samples from truncated multivariate Student-t distribution

Martin Maechler maechler at stat.math.ethz.ch
Wed Aug 2 09:08:17 CEST 2017


>>>>> David Winsemius <dwinsemius at comcast.net>
>>>>>     on Tue, 9 May 2017 14:33:04 -0700 writes:

    >> On May 9, 2017, at 2:05 PM, Czarek Kowalski <czarek230800 at gmail.com> wrote:
    >> 
    >> I have already posted that in attachement - pdf file.

    > I see that now. I failed to scroll to the 3rd page.

from a late reader:

Please, Czarek, and every other future reader:   This is an
unwritten "regulation" about all R lists I know (a few of which
I co-maintain): 

___>  R code must be given as unformatted plain text  rather than
___>  in a formatted document (pdf in your case) 


    >> I am posting plain text here:

good .. and please do always on these lists


Martin Maechler, ETH Zurich & R Core


    >> 
    >>> library(tmvtnorm)
    >>> meann = c(55, 40, 50, 35, 45, 30)
    >>> covv = matrix(c(  1, 1, 0, 2, -1, -1,
    >> 1, 16, -6, -6, -2, 12,
    >> 0, -6, 4, 2, -2, -5,
    >> 2, -6, 2, 25, 0, -17,
    >> -1, -2, -2, 0, 9, -5,
    >> -1, 12, -5, -17, -5, 36), 6, 6)
    >> df = 4
    >> lower = c(20, 20, 20, 20, 20, 20)
    >> upper = c(60, 60, 60, 60, 60, 60)
    >> X1 <- rtmvt(n=100000, meann, covv, df, lower, upper)
    >> 
    >> 
    >>> sum(X1[,1]) / 100000
    >> [1] 54.98258
    >> sum(X1[,2]) / 100000
    >> [1] 40.36153
    >> sum(X1[,3]) / 100000
    >> [1] 49.83571
    >> sum(X1[,4]) / 100000
    >> [1] 34.69571      # "4th element of mean vector"
    >> sum(X1[,5]) / 100000
    >> [1] 44.81081
    >> sum(X1[,6]) / 100000
    >> [1] 31.10834
    >> 
    >> And corresponding results received using equation (3) from pdf file:
    >> [54.97,
    >> 40,
    >> 49.95,
    >> 35.31, #  "4th element of mean vector"
    >> 44.94,
    >> 31.32]
    >> 

    > I get similar results for the output from your code, 

    > My 100-fold run of your calculations were:

    > meansBig <- replicate(100, {Xbig <- rtmvt(n=100000, meann, covv, df, lower, upper)
    > colMeans(Xbig)} )

    > describe(meansBig[4,])  # describe is from Hmisc package

    > meansBig[4, ] 
    > n  missing distinct     Info     Mean      Gmd      .05      .10      .25 
    > 100        0      100        1     34.7  0.01954    34.68    34.68    34.69 
    > .50      .75      .90      .95 
    > 34.70    34.72    34.72    34.73 

    > lowest : 34.65222 34.66675 34.66703 34.66875 34.67566
    > highest: 34.72939 34.73012 34.73051 34.73742 34.74441


    > So agree, 35.31 is outside the plausible range of an RV formed with that package, but I don't have any code relating to your calculations from theory.

    > Best;
    > David.


    >> On 9 May 2017 at 22:17, David Winsemius <dwinsemius at comcast.net> wrote:
    >>> 
    >>>> On May 9, 2017, at 1:11 PM, Czarek Kowalski <czarek230800 at gmail.com> wrote:
    >>>> 
    >>>> Of course I have expected the difference between theory and a sample
    >>>> of realizations of RV's and result mean should still be a random
    >>>> variable. But, for example for 4th element of mean vector: 35.31 -
    >>>> 34.69571 = 0.61429. It is quite big difference, nieprawdaż? I have
    >>>> expected that the difference would be smaller because of law of large
    >>>> numbers (for 10mln samples the difference is quite similar).
    >>> 
    >>> I for one have no idea what is meant by a "4th element of mean vector". So I have now idea what to consider "big". I have found that my intuitions about multivariate distributions, especially those where the covariate structure is as complex as you have assumed, are often far from simulated results.
    >>> 
    >>> I suggest you post some code and results.
    >>> 
    >>> --
    >>> David.
    >>> 
    >>> 
    >>>> 
    >>>> On 9 May 2017 at 21:40, David Winsemius <dwinsemius at comcast.net> wrote:
    >>>>> 
>>>>> On May 9, 2017, at 10:09 AM, Czarek Kowalski <czarek230800 at gmail.com> wrote:
>>>>> 
>>>>> Dear Members,
>>>>> I am working with 6-dimensional Student-t distribution with 4 degrees
>>>>> of freedom truncated to [20; 60]. I have generated 100 000 samples
>>>>> from truncated multivariate Student-t distribution using rtmvt
>>>>> function from package ‘tmvtnorm’. I have also calculated  mean vector
>>>>> using equation (3) from attached pdf. The problem is, that after
>>>>> summing all elements in one column of rtmvt result (and dividing by
>>>>> 100 000) I do not receive the same result as using (3) equation. Could
>>>>> You tell me, what is incorrect, why there is a difference?
    >>>>> 
    >>>>> I guess the question is why you would NOT expect a difference between theory and a sample of realizations of RV's? The result mean should still be a random variable, night wahr?
    >>>>> 
    >>>>> 
>>>>> Yours faithfully
>>>>> Czarek Kowalski
>>>>> <truncatedT.pdf>______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
    >>>>> 
    >>>>> David Winsemius
    >>>>> Alameda, CA, USA
    >>>>> 
    >>> 
    >>> David Winsemius
    >>> Alameda, CA, USA
    >>> 

    > David Winsemius
    > Alameda, CA, USA

    > ______________________________________________
    > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list