[R] QQ plotting of various distributions...

Duncan Murdoch murdoch at stats.uwo.ca
Sun Sep 27 15:06:17 CEST 2009


Eric Thompson wrote:
> The supposed example of a Q-Q plot is most certainly not how to make a
> Q-Q plot. I don't even know where to start....
>
> First off, the two "Q"s in the title of the plot stand for "quantile",
> not "random". The "answer" supplied simply plots two sorted samples of
> a distribution against each other. While this may resemble the general
> shape of a QQ plot, that is where the similarities end.
>   

The empirical quantiles of a sample are simply the sorted values.  You 
can plot empirical quantiles of one sample versus some version of 
quantiles from a distribution (what qqnorm does) or versus empirical 
quantiles of another sample (what Sunil did).  The randomness in his 
demonstration did two things: it generated some data, and it showed the 
variability of the plot under repeated sampling.
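
To make the two kinds of plot concrete, here is a minimal sketch in base R. The seed, sample sizes, and variable names are arbitrary choices for illustration. It checks that, for equal sample sizes, qqplot() returns the two sorted samples, and that qqnorm() pairs the sorted data with qnorm(ppoints(n)).

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
x <- rnorm(30)
y <- rnorm(30)

# Two-sample Q-Q plot: with equal sample sizes, qqplot() pairs the sorted values
qq2 <- qqplot(x, y, plot.it = FALSE)
same_as_sorted <- isTRUE(all.equal(qq2$x, sort(x))) &&
                  isTRUE(all.equal(qq2$y, sort(y)))

# One-sample Q-Q plot: qqnorm() pairs the data with normal quantiles
# evaluated at the plotting positions ppoints(n)
qq1 <- qqnorm(x, plot.it = FALSE)
same_as_ppoints <- isTRUE(all.equal(sort(qq1$x), qnorm(ppoints(length(x))))) &&
                   isTRUE(all.equal(sort(qq1$y), sort(x)))

print(c(two_sample = same_as_sorted, one_sample = same_as_ppoints))
```

Rerunning with a different seed changes the two-sample plot each time, which is the sampling variability referred to above.
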
> Some general advice: be careful who you take advice from on the
> internet. 

That's good advice.

Duncan Murdoch

> The Wikipedia entry for Q-Q plot may be a good start if you
> don't know what a Q-Q plot is, although you should also use it with
> caution.
>
> Let's say you have some samples that may be normally distributed:
>
> set.seed(1)
> x <- rnorm(30)
>
> # now try with R's built in function
> qqnorm(x, xlim = c(-3, 3), ylim = c(-3, 3))
>
> # Now try Sunil's "Q-Q plot" method, but for rnorm
> # rather than rgamma
> some_data <- x
> test_data <- rnorm(30)
> points(sort(some_data),sort(test_data), col = "blue")
>
> # Note that the points are NOT the same!
>
> This should have been obvious for the simple reason that the QQ plot
> should not be influenced by the random number generator that you are
> using! A QQ plot is uniquely reproducible. The more general (and
> correct) way to get the QQ plot involves choosing a plotting position
> and the quantile function (e.g. qnorm or qgamma functions in R) of the
> pertinent distribution:
>
> # Sort the data:
> x.s <- sort(x)
> n <- length(x)
>
> # Plotting position (must be careful here in general!)
> p <- ppoints(n)
>
> # Compute the quantile
> x.q <- qnorm(p)
>
> points(x.q, x.s, col = "red")
>
> # and they fall exactly on the points generated by qqnorm().
>
> Now, you should be able to generalize this for any distribution. Hope
> this helps.
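
One possible generalization of the recipe above, sketched here for a gamma distribution. The data are simulated stand-ins, and the method-of-moments estimates (shape_hat, scale_hat) are an assumption made only to keep the example in base R; they are not taken from the thread.

```r
set.seed(1)
x <- rgamma(200, shape = 6, scale = 2)   # stand-in for real data

# Method-of-moments estimates (an assumption for this sketch;
# maximum likelihood would be the more usual choice in practice).
# For a gamma distribution: mean = shape * scale, var = shape * scale^2
m <- mean(x)
v <- var(x)
shape_hat <- m^2 / v
scale_hat <- v / m

# Theoretical quantiles at the plotting positions
p   <- ppoints(length(x))
x.q <- qgamma(p, shape = shape_hat, scale = scale_hat)

plot(x.q, sort(x),
     xlab = "Theoretical gamma quantiles", ylab = "Sample quantiles")
abline(0, 1)   # points close to this line are consistent with the fitted gamma
```

Note that scale is passed by name: the third positional argument of qgamma() is rate, not scale.
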
>
>
> Eric Thompson
>
>
>
>
> 2009/9/27 Petar Milin <pmilin at ff.uns.ac.rs>:
>   
>> Thanks for the answer. Now the only remaining problem is to get the
>> parameter(s) of a given distribution. For gamma, I shall try gammafit()
>> from the mhsmm package. I shall also look for other appropriate parameter
>> estimates, and will use SuppDists too.
>>
>> Best,
>> PM
>>
>> Sunil Suchindran wrote:
>>     
>>> #same shape
>>>
>>> some_data <- rgamma(500,shape=6,scale=2)
>>> test_data <- rgamma(500,shape=6,scale=2)
>>> plot(sort(some_data),sort(test_data))
>>> # You can also use qqplot(some_data,test_data)
>>> abline(0,1)
>>>
>>> # different shape
>>>
>>> some_data <- rgamma(500,shape=6,scale=2)
>>> test_data <- rgamma(500,shape=4,scale=2)
>>> plot(sort(some_data),sort(test_data))
>>> abline(0,1)
>>>
>>> It is helpful to assess the sampling variability, by
>>> creating repeated sets of test_data, and plotting
>>> all of these along with your observations to create
>>> a confidence "envelope".
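
The envelope idea described above could be sketched as follows. This is a variation rather than the exact procedure in the quoted message: the x-axis uses theoretical gamma quantiles instead of a second sample, and the number of replicates (n_sim) and the 95% level are arbitrary assumptions.

```r
set.seed(1)
n <- 500
some_data <- rgamma(n, shape = 6, scale = 2)   # the "observed" sample

# Reference quantiles from the hypothesised distribution
ref_q <- qgamma(ppoints(n), shape = 6, scale = 2)

# Each column is one sorted simulated sample from that distribution
n_sim <- 200                                   # arbitrary number of replicates
sims <- replicate(n_sim, sort(rgamma(n, shape = 6, scale = 2)))

# Pointwise 2.5% and 97.5% limits for each order statistic
env <- apply(sims, 1, quantile, probs = c(0.025, 0.975))

plot(ref_q, sort(some_data),
     xlab = "Gamma(shape = 6, scale = 2) quantiles",
     ylab = "Sorted observations")
lines(ref_q, env[1, ], lty = 2)   # lower envelope
lines(ref_q, env[2, ], lty = 2)   # upper envelope
abline(0, 1)
```

The limits are pointwise, so a few observations falling outside the band is not by itself evidence against the hypothesised distribution.
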
>>>
>>> The SuppDists package provides the inverse Gaussian distribution.
>>>
>>>
>>> On Thu, Sep 17, 2009 at 11:46 AM, Petar Milin <pmilin at ff.uns.ac.rs> wrote:
>>>
>>>    Hello!
>>>    I am trying this question again:
>>>    I would like to test a few distributional assumptions for some
>>>    behavioral response data. There are a few theories about the true
>>>    distribution of those data, such as: normal, lognormal, gamma,
>>>    ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian) etc. The
>>>    best way would be via Q-Q plots, to show students the differences.
>>>    The first two are trivial:
>>>    qqnorm(dat$X)
>>>    qqnorm(log(dat$X))
>>>    Then, things get more "hairy". I am not sure how to make
>>>    plots for the rest. I tried gamma with:
>>>    qqmath(~ X, data=dat, distribution=function(X)
>>>      qgamma(X, shape, scale = scale))
>>>    Which should be the same as:
>>>    plot(qgamma(ppoints(dat$X), shape, scale = scale), sort(dat$X))
>>>    I got the shape and scale parameters via the mhsmm package, which has
>>>    gammafit() for estimating them.
>>>    Am I on the right track? Does anyone know how to plot the rest:
>>>    ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian)?
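
For the ex-Gaussian case, one hedged possibility that needs only base R is to approximate the quantile function by simulation, using the fact that an ex-Gaussian variable is the sum of a normal and an independent exponential variable. The helper rexgauss() and the parameter values mu, sigma, and tau below are hypothetical, chosen for illustration; in practice the parameters would have to be estimated from dat$X.

```r
set.seed(1)
# Hypothetical parameter values (assumptions, not estimates from real data)
mu <- 400; sigma <- 40; tau <- 100

# Hypothetical helper: normal plus independent exponential with mean tau
rexgauss <- function(n, mu, sigma, tau) {
  rnorm(n, mean = mu, sd = sigma) + rexp(n, rate = 1 / tau)
}

x <- rexgauss(300, mu, sigma, tau)   # stand-in for dat$X

# Approximate the theoretical quantiles by the empirical quantiles
# of a large simulated sample, evaluated at the plotting positions
big <- rexgauss(1e5, mu, sigma, tau)
x.q <- quantile(big, probs = ppoints(length(x)), names = FALSE)

plot(x.q, sort(x),
     xlab = "Simulated ex-Gaussian quantiles", ylab = "Sample quantiles")
abline(0, 1)
```

The same pattern applies to any distribution that can be simulated from but lacks a ready-made quantile function; where a package supplies one, it can replace the simulation step.
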
>>>
>>>    Thanks,
>>>    PM
>>>
>>>    ______________________________________________
>>>    R-help at r-project.org mailing list
>>>    https://stat.ethz.ch/mailman/listinfo/r-help
>>>    PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>    and provide commented, minimal, self-contained, reproducible code.
>>>



