[R] QQ plotting of various distributions...
Duncan Murdoch
murdoch at stats.uwo.ca
Sun Sep 27 15:06:17 CEST 2009
Eric Thompson wrote:
> The supposed example of a Q-Q plot is most certainly not how to make a
> Q-Q plot. I don't even know where to start....
>
> First off, the two "Q"s in the title of the plot stand for "quantile",
> not "random". The "answer" supplied simply plots two sorted samples of
> a distribution against each other. While this may resemble the general
> shape of a QQ plot, that is where the similarities end.
>
The empirical quantiles of a sample are simply the sorted values. You
can plot empirical quantiles of one sample versus some version of
quantiles from a distribution (what qqnorm does) or versus empirical
quantiles of another sample (what Sunil did). The randomness in his
demonstration did two things: it generated some data, and it showed the
variability of the plot under repeated sampling.
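
A small sketch of those two cases, with an arbitrary seed and sample size chosen only for illustration (they are not from the thread). For equal-sized samples, qqplot() pairs the two sorted vectors, and qqnorm() pairs the sorted sample with qnorm(ppoints(n)):

```r
set.seed(1)        # arbitrary seed, for illustration only
x <- rnorm(30)
y <- rnorm(30)

# Two-sample case: with equal sample sizes, qqplot() returns the sorted values
qq2 <- qqplot(x, y, plot.it = FALSE)
same_as_sorting <- isTRUE(all.equal(qq2$x, sort(x))) &&
  isTRUE(all.equal(qq2$y, sort(y)))

# One-sample case: qqnorm() uses normal quantiles at the ppoints() positions
qq1 <- qqnorm(x, plot.it = FALSE)
same_as_ppoints <- isTRUE(all.equal(sort(qq1$x), qnorm(ppoints(length(x)))))
```

Both flags should come out TRUE, which is the sense in which the two kinds of plot differ only in where the reference quantiles come from.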
> Some general advice: be careful who you take advice from on the
> internet.
That's good advice.
Duncan Murdoch
> The Wikipedia entry for Q-Q plot may be a good start if you
> don't know what a Q-Q plot is, although you should also use it with
> caution.
>
> Let's say you have some samples that may be normally distributed:
>
> set.seed(1)
> x <- rnorm(30)
>
> # now try with R's built in function
> qqnorm(x, xlim = c(-3, 3), ylim = c(-3, 3))
>
> # Now try Sunil's "Q-Q plot" method, but for rnorm
> # rather than rgamma
> some_data <- x
> test_data <- rnorm(30)
> points(sort(some_data),sort(test_data), col = "blue")
>
> # Note that the points are NOT the same!
>
> This should have been obvious for the simple reason that the QQ plot
> should not be influenced by the random number generator that you are
> using! A QQ plot is uniquely reproducible. The more general (and
> correct) way to get the QQ plot involves choosing a plotting position
> and the quantile function (e.g. qnorm or qgamma functions in R) of the
> pertinent distribution:
>
> # Sort the data:
> x.s <- sort(x)
> n <- length(x)
>
> # Plotting position (must be careful here in general!)
> p <- ppoints(n)
>
> # Compute the quantile
> x.q <- qnorm(p)
>
> points(x.q, x.s, col = "red")
>
> # and they fall exactly on the points generated by qqnorm().
>
> Now, you should be able to generalize this for any distribution. Hope
> this helps.
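
One way the generalization might look, as a sketch only. The helper name qq_theoretical is hypothetical (it is not from the thread or from any package), and the gamma parameters are assumed known here, matching the values used elsewhere in the thread:

```r
# Hypothetical helper: theoretical quantiles at ppoints() positions versus
# the sorted sample, for any quantile function qfun (qnorm, qgamma, ...)
qq_theoretical <- function(x, qfun, ...) {
  x.s <- sort(x)
  p <- ppoints(length(x.s))
  list(x = qfun(p, ...), y = x.s)
}

set.seed(1)
g <- rgamma(200, shape = 6, scale = 2)   # illustrative data

# Parameters are passed by name; they are assumed known in this sketch
qq <- qq_theoretical(g, qgamma, shape = 6, scale = 2)

plot(qq$x, qq$y,
     xlab = "Theoretical gamma quantiles", ylab = "Sample quantiles")
abline(0, 1)
```

For a fixed data set and fixed parameters the result does not depend on the random number generator: calling the helper again on g returns the same points.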
>
>
> Eric Thompson
>
>
>
>
> 2009/9/27 Petar Milin <pmilin at ff.uns.ac.rs>:
>
>> Thanks for the answer. Now, the only problem is to get the parameter(s) of a
>> given distribution. For gamma, I shall try gammafit() from the mhsmm package.
>> I shall also look for other appropriate parameter estimates, and I will use
>> SuppDists too.
>>
>> Best,
>> PM
>>
>> Sunil Suchindran wrote:
>>
>>> #same shape
>>>
>>> some_data <- rgamma(500,shape=6,scale=2)
>>> test_data <- rgamma(500,shape=6,scale=2)
>>> plot(sort(some_data),sort(test_data))
>>> # You can also use qqplot(some_data,test_data)
>>> abline(0,1)
>>>
>>> # different shape
>>>
>>> some_data <- rgamma(500,shape=6,scale=2)
>>> test_data <- rgamma(500,shape=4,scale=2)
>>> plot(sort(some_data),sort(test_data))
>>> abline(0,1)
>>>
>>> It is helpful to assess the sampling variability, by
>>> creating repeated sets of test_data, and plotting
>>> all of these along with your observations to create
>>> a confidence "envelope".
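
A rough sketch of such an envelope, under assumptions that are mine rather than the poster's: 200 simulated sets, pointwise 2.5% and 97.5% limits at each order statistic, and the same gamma parameters as in the example above.

```r
set.seed(42)       # arbitrary seed, for illustration only
n <- 500
some_data <- rgamma(n, shape = 6, scale = 2)

# Repeated sets of test data; each column is one sorted simulated sample
n_sim <- 200
sims <- replicate(n_sim, sort(rgamma(n, shape = 6, scale = 2)))

# Pointwise limits for each order statistic across the simulations
lower <- apply(sims, 1, quantile, probs = 0.025)
upper <- apply(sims, 1, quantile, probs = 0.975)

# Reference quantiles for the horizontal axis
theo <- qgamma(ppoints(n), shape = 6, scale = 2)

plot(theo, sort(some_data),
     xlab = "Theoretical quantiles", ylab = "Sample quantiles")
lines(theo, lower, lty = 2)
lines(theo, upper, lty = 2)
abline(0, 1)
```

The limits are pointwise, so they describe the variability of each order statistic separately rather than giving a simultaneous band for the whole curve.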
>>>
>>> The SuppDists package provides the inverse Gaussian distribution.
>>>
>>>
>>> On Thu, Sep 17, 2009 at 11:46 AM, Petar Milin <pmilin at ff.uns.ac.rs> wrote:
>>>
>>> Hello!
>>> I am trying with this question again:
>>> I would like to test a few distributional assumptions for some
>>> behavioral response data. There are a few theories about the true
>>> distribution of those data, such as normal, lognormal, gamma,
>>> ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian), etc. The
>>> best way would be via a Q-Q plot, to show students the differences.
>>> First two are trivial:
>>> qqnorm(dat$X)
>>> qqnorm(log(dat$X))
>>> Then, things get more "hairy". I am not sure how to make
>>> plots for the rest. I tried gamma with:
>>> qqmath(~ X, data = dat, distribution = function(p)
>>>        qgamma(p, shape = shape, scale = scale))
>>> Which should be the same as:
>>> plot(qgamma(ppoints(dat$X), shape = shape, scale = scale), sort(dat$X))
>>> I got the shape and scale parameters via the mhsmm package, which has
>>> gammafit() for estimating them.
>>> Am I on the right track? Does anyone know how to plot the rest:
>>> ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian)?
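
A possible sketch for the gamma and ex-Gaussian cases using base R only. Everything specific here is an assumption for illustration: X is simulated as a stand-in for dat$X, the gamma parameters come from method-of-moments estimates rather than gammafit(), and the ex-Gaussian parameters mu, sigma and tau are simply taken as known. Note that shape and scale are passed to qgamma() by name, since its third positional argument is rate.

```r
set.seed(123)      # arbitrary seed, for illustration only

# Hypothetical stand-in for dat$X: normal plus exponential (ex-Gaussian)
X <- rnorm(300, mean = 400, sd = 40) + rexp(300, rate = 1 / 100)
n <- length(X)
p <- ppoints(n)

# Gamma: method-of-moments estimates (an assumption, not gammafit())
shape_hat <- mean(X)^2 / var(X)
scale_hat <- var(X) / mean(X)
plot(qgamma(p, shape = shape_hat, scale = scale_hat), sort(X),
     xlab = "Gamma quantiles", ylab = "Sample quantiles")
abline(0, 1)

# Ex-Gaussian: base R has no quantile function for it, so approximate the
# quantiles from a large simulated reference sample
mu <- 400; sigma <- 40; tau <- 100   # assumed known in this sketch
ref <- rnorm(1e5, mean = mu, sd = sigma) + rexp(1e5, rate = 1 / tau)
q_exg <- quantile(ref, probs = p, names = FALSE)
plot(q_exg, sort(X),
     xlab = "Ex-Gaussian quantiles (simulated)", ylab = "Sample quantiles")
abline(0, 1)
```

The same simulated-reference approach would apply to any distribution that can be sampled from but lacks a convenient quantile function; where a quantile function is available it can replace the simulation step.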
>>>
>>> Thanks,
>>> PM
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>