[R] QQ plot
Michael Dewey
Wed Nov 13 12:46:58 CET 2019
Dear Ana
As others have commented, this is getting a bit off-topic, but here are
some hints.
It is helpful to distinguish two sorts of plot: archival plots and
impact plots. If you want to have an impact plot which gives you a
picture but possibly at the cost of completeness and accuracy then why not:
1 - plot a sample of your 5 million drawn at random
2 - bin the data and plot the median p-value in each bin against the
median expected value
3 - deal with overlap by choosing a graphical device which supports
transparency and plotting the points in very light grey, so that the
overlap is more visible.
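
The three hints above could be sketched roughly as follows. This is only
a sketch under assumptions: the p-values are simulated with runif() as a
stand-in for dd$P, the helper qq_points() is a hypothetical name of my
own, and the sample size (10000), the number of bins (1000) and the
alpha level (0.2) are arbitrary choices, not recommendations.

```r
set.seed(1)
p <- runif(5e5)  # simulated stand-in for dd$P; smaller than 5 million for speed

# Hypothetical helper: pair the sorted observed p-values with their
# expected U(0,1) quantiles, both on the -log10 scale.
qq_points <- function(p) {
  data.frame(expected = -log10(ppoints(length(p))),
             observed = -log10(sort(p)))
}

# 1 - a random sample of the p-values
samp <- qq_points(sample(p, 1e4))

# 2 - bin the sorted points and keep the median of each bin
full <- qq_points(p)
bin <- cut(seq_len(nrow(full)), breaks = 1000, labels = FALSE)
binned <- data.frame(
  expected = as.numeric(tapply(full$expected, bin, median)),
  observed = as.numeric(tapply(full$observed, bin, median))
)

# 3 - semi-transparent grey points on a device that supports transparency
out <- tempfile(fileext = ".pdf")
pdf(out)
plot(samp$expected, samp$observed, pch = 16,
     col = adjustcolor("grey30", alpha.f = 0.2),
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
points(binned$expected, binned$observed, pch = 16, cex = 0.5, col = "red")
abline(0, 1)  # reference line for uniform p-values
dev.off()
```

Note that binning by median smooths away the most extreme points in the
tail, which is part of the cost in completeness mentioned above.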
Michael
On 12/11/2019 22:04, Ana Marija wrote:
> The reason I selected only those with P<0.003 to put on the QQ plot is
> that the original data set contains 5556249 points, and when I extract
> only P<0.001 I get 3713 points. Is there a way to plot the whole
> data set, or to choose only representative points?
>
> On Tue, Nov 12, 2019 at 3:42 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>
>> The smallest p-value in my dataset is 9.89e-08. How do I make that
>> visible on the new QQ plot, where the values are multiplied by 1000?
>>
>> On Tue, Nov 12, 2019 at 3:37 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>>
>>> Just one question: do I need to change the axes when I multiply by
>>> 1000, and what should I put on my axes?
>>>
>>> On Tue, Nov 12, 2019 at 3:07 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>>>
>>>> Hi Duncan,
>>>>
>>>> Yes, I chose only P<1e-3 for the QQ plot, and multiplying everything by
>>>> 1000 works great!
>>>> As I understand it, this should not influence the interpretation of
>>>> the plot; it only changes the scale of the axes.
>>>>
>>>> Thank you so much,
>>>> Ana
>>>>
>>>> On Tue, Nov 12, 2019 at 2:51 PM Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>>>
>>>>> On 12/11/2019 2:56 p.m., Jim Lemon wrote:
>>>>>> I thought about this and did a little study of GWAS and the use of
>>>>>> p-values to assess significant associations. As Ana's plot begins at
>>>>>> values of about 0.001, this seems to imply that almost everything in
>>>>>> the genome is associated to some degree. One expects that most SNPs
>>>>>> will not be associated with a particular condition (p~1), so perhaps
>>>>>> something is going wrong in the calculations that produce the
>>>>>> p-values.
>>>>>
>>>>> I may be misunderstanding your last sentence, but if there is no
>>>>> association, the p-value would usually have a uniform distribution from
>>>>> 0 to 1, it wouldn't be near 1.
>>>>>
>>>>> I'd guess we're not seeing the p values from every test, only those that
>>>>> are less than 0.001. If that's true, and there are no effects, it makes
>>>>> sense to multiply all of them by 1000 to get U(0,1) values. On the
>>>>> plot, that would correspond to subtracting 3 from -log10(p), or adding 3
>>>>> to the reference line, as Ana requested.
>>>>>
>>>>> Or just multiply them by 1000 and pass them to qq():
>>>>>
>>>>> qq(dd$P*1000, main = "Q-Q plot of small GWAS p-values")
>>>>>
>>>>> As far as I can see, there's no way to tell qqman::qq to move the
>>>>> reference line.
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>> On Wed, Nov 13, 2019 at 12:28 AM Patrick (Malone Quantitative)
>>>>>> <malone using malonequantitative.com> wrote:
>>>>>>>
>>>>>>> I agree with Abby. That would defeat the purpose of a QQ plot.
>>>>>>>
>>>>>>> On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle <spurdle.a using gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> I'm not familiar with the qqman package, or GWAS studies.
>>>>>>>> However, my guess would be that you're *not* supposed to change the
>>>>>>>> position of the line.
>>>>>>>>
>>>>>>>> On Tue, Nov 12, 2019 at 11:48 AM Ana Marija <sokovic.anamarija using gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I was using this library, qqman
>>>>>>>>> https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html
>>>>>>>>>
>>>>>>>>> to create a QQ plot (attached). How would I change the default abline to
>>>>>>>>> start from the beginning of my QQ line?
>>>>>>>>>
>>>>>>>>> This is my code:
>>>>>>>>> qq(dd$P, main = "Q-Q plot of GWAS p-values")
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Ana
