[R] Fixed! Thanks all:RE: scatterplot to boxplot translation?

Bert Gunter gunter.berton at gene.com
Fri Dec 9 22:23:58 CET 2011


Kelly:

Glad you got what you were looking for, but this whole thread raises the
question: (why) should you do this? You lose information in binning
the continuous data, of course. Perhaps your answer is that the point
scatter in the data is too noisy to clearly discern what's going on, a
legitimate response. One might then -- or in general -- consider
overlaying a fitted smooth (nonparametric) curve on the data to
reveal the "trend." There are a zillion ways to do this in R: both
lattice and ggplot have built-in capabilities to do this easily, as
does base R with ?scatter.smooth. If that's too easy, you can do it by
hand via ?lowess (or its more flexible cousin, ?loess),
?smooth.spline, etc. In actuality, your binning strategy is a crude,
non-smooth version of such smoothing, so it's not that far-fetched. Or
as some of the choicer R-Help pages say, cutting and boxplotting is to
smoothing as histograms are to nonparametric density estimates.
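
A minimal sketch of the comparison, on simulated data (x and y below are
made-up stand-ins, not the C_count and RPKM columns from the thread; the
span and f values are arbitrary choices for illustration):

```r
# Simulated stand-in data; these x and y are hypothetical.
set.seed(1)
x <- runif(500, 0, 400)
y <- 20 + 0.1 * x + 15 * sin(x / 60) + rnorm(500, sd = 10)

# Binned view: one box of y per interval of x (cut-and-boxplot).
bins <- cut(x, breaks = 4)
boxplot(y ~ bins, xlab = "x interval", ylab = "y")

# Smoothed view: the raw scatter with nonparametric fits overlaid.
plot(x, y, col = "grey60", pch = 16, cex = 0.6)

# lowess() returns a list with sorted x and fitted y, ready for lines().
lines(lowess(x, y, f = 1/3), col = "red", lwd = 2)

# loess() returns a model object; predict on a grid to draw the curve.
grid_x <- seq(min(x), max(x), length.out = 200)
fit <- loess(y ~ x, span = 0.5)
lines(grid_x, predict(fit, newdata = data.frame(x = grid_x)),
      col = "blue", lwd = 2)

# smooth.spline() picks its smoothing parameter by cross-validation
# unless told otherwise; predict() returns a list with x and y.
ss <- smooth.spline(x, y)
lines(predict(ss, grid_x), col = "darkgreen", lwd = 2)

legend("topleft", legend = c("lowess", "loess", "smooth.spline"),
       col = c("red", "blue", "darkgreen"), lwd = 2)
```

For a one-line version, scatter.smooth(x, y) draws the scatter and a loess
curve together.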

Cheers,
Bert


On Fri, Dec 9, 2011 at 12:05 PM, Vining, Kelly
<Kelly.Vining at oregonstate.edu> wrote:
> Thanks to David and Jorge - both of your helpful suggestions got me to the desired endpoint. In case anyone else has this question: I boxplotted my y variable data, but did the "cut" operation on the x variable in order to conserve the order of the y data. I see another suggestion coming in from another user that basically says this.
>
> So, my working line of code was:
>
> boxplot(count$RPKM ~ cut(count$C_count, breaks=4))
>
> Much appreciation to everyone who responded...thanks for helping with a naïve question without making me feel stupid.
>
> This discussion board is very, very good.
>
> --Kelly V.
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Friday, December 09, 2011 11:58 AM
> To: Uwe Ligges
> Cc: Vining, Kelly; r-help at r-project.org
> Subject: Re: [R] scatterplot to boxplot translation?
>
>
> On Dec 9, 2011, at 2:50 PM, Uwe Ligges wrote:
>
>>
>>
>> On 09.12.2011 20:41, Vining, Kelly wrote:
>>> Thanks for the tip on "cut," seems like it should work. I must still
>>> be missing something, though. Here, I'm cutting on the y variable,
>>> then attempting the boxplot:
>>>
>>> cutRPKM<- cut(count$RPKM, breaks=4)
>>>
>>> head(cutRPKM)
>>> [1] (-0.0995,24.8] (-0.0995,24.8] (-0.0995,24.8] (-0.0995,24.8] (-0.0995,24.8]
>>> [6] (-0.0995,24.8]
>>> Levels: (-0.0995,24.8] (24.8,49.8] (49.8,74.7] (74.7,99.6]
>>>
>>> boxplot(as.numeric(cutRPKM))
>>>
>>> This gives me a single box instead of five boxes. ??
>>
>>
>> You obviously want:
>>
>> boxplot(count$RPKM ~ cut(count$RPKM, breaks=seq(0, max(count$RPKM),
>> by=100)))
>
> In that context (having defined a cut-variable with a single-integer breaks argument), I would have thought this should work:
>
>  boxplot(count$RPKM ~ cutRPKM)
>
> --
> David.
>
>>
>>
>> Uwe Ligges
>>
>>
>>> Thanks again,
>>> --Kelly V.
>>> ________________________________________
>>> From: David Winsemius [dwinsemius at comcast.net]
>>> Sent: Friday, December 09, 2011 11:14 AM
>>> To: Vining, Kelly
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] scatterplot to boxplot translation?
>>>
>>> On Dec 9, 2011, at 2:11 PM, Vining, Kelly wrote:
>>>
>>>> My apologies if anyone is seeing this twice...looks like my previous
>>>> message didn't come through...
>>>>
>>>> Dear UseRs,
>>>> I have a feeling this is a relatively simple question, but I'm
>>>> having a hard time getting my head around it. I have a simple x-y
>>>> scatterplot with many points, as shown below (attached). I'd like to
>>>> make a boxplot of this by interval, such that there is one box
>>>> representing the points in the 0-100 interval, one for the 101-200
>>>> interval, and so on. How do I structure my R data frame to be able
>>>> to generate such a boxplot?
>>>>
>>>
>>> ?cut
>>>
>>>>
>>>> From: r-help-bounces at r-project.org
>>>> [mailto:r-help-bounces at r-project.org
>>>> ] On Behalf Of Vining, Kelly
>>>> Sent: Friday, December 09, 2011 11:01 AM
>>>> To: r-help at r-project.org
>>>> Subject: [R] scatterplot to boxplot translation?
>>>>
>>>>
>>>> <C_count_vs_RPKM.png>______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>
> David Winsemius, MD
> West Hartford, CT
>

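For anyone reading the quoted thread later, a rough sketch of why
boxplot(as.numeric(cutRPKM)) drew a single box while the formula versions
draw one box per bin. The data frame below is simulated and only borrows
the names count, C_count and RPKM from the thread; brks and x_bin are
names made up for this example.

```r
# Hypothetical stand-in for the `count` data frame in the thread.
set.seed(42)
count <- data.frame(C_count = runif(300, 0, 400))
count$RPKM <- pmax(0, 0.2 * count$C_count + rnorm(300, sd = 15))

# cut() returns a factor. as.numeric() reduces it to the bin codes 1..4,
# and boxplot() on one numeric vector draws one box of those codes.
cutRPKM <- cut(count$RPKM, breaks = 4)
boxplot(as.numeric(cutRPKM))

# A formula of the form y ~ factor draws one box per factor level.
# Binning on the x variable gives "y by interval of x".
# Breaks at 0, 100, 200, ... with the top break rounded up so that
# every x value falls inside some interval.
brks <- seq(0, 100 * ceiling(max(count$C_count) / 100), by = 100)
count$x_bin <- cut(count$C_count, breaks = brks, include.lowest = TRUE)
boxplot(RPKM ~ x_bin, data = count,
        xlab = "C_count interval", ylab = "RPKM")

# Number of points behind each box.
table(count$x_bin)
```

One caution on explicit breaks: seq(0, max(x), by = 100) stops at the last
multiple of 100 at or below the maximum, so any values above that fall
outside the breaks and come back from cut() as NA. Rounding the top break
up, as in the sketch, avoids that.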


-- 

Bert Gunter
Genentech Nonclinical Biostatistics



