[R] about ECDF display in ggplot2
Bogdan Tanasa
tanasa using gmail.com
Mon Jul 9 02:44:45 CEST 2018
Dear Jeff,
Thank you for your email.
To make the example more complete, I am attaching the following files (my
apologies for sending attachments; I do not have a web server running at
the moment):
-- the R script (R_script_display_ECDF.R), which reads the file "LENGTH" and
plots the ECDF with both the standard R function and ggplot2
-- the ECDF drawn with the standard R function
("display.R.ecdf.LENGTH.pdf")
-- the ECDF drawn with ggplot2 ("display.ggplot2.ecdf.LENGTH.pdf")
Over xlim(0,500) the two ECDFs look very different (plot(ecdf) vs ggplot2).
Could you please advise why, and what I should change in my ggplot2 code?
Thanks a lot,
- bogdan
PS: the R code is also included below:
library("ggplot2")

# Read the tab-delimited input; the LENGTH column holds the deletion sizes.
file <- read.delim("LENGTH", sep="\t", header=TRUE, stringsAsFactors=FALSE)

############################# display with PLOT FUNCTION:

pdf("display.R.ecdf.LENGTH.pdf", width=10, height=6, paper='special')

plot(ecdf(file$LENGTH), xlab="DEL SIZE",
     ylab="fraction of DEL",
     main="LENGTH of DEL",
     xlim=c(0,500),
     col = "dark red", axes = FALSE)

# The axes are drawn by hand because of axes = FALSE above.
ticks_y <- c(0, 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4)
axis(2, at=ticks_y, labels=ticks_y, col.axis="red")

ticks_x <- c(0, 100, 200, 400, 500, 600, 700, 800)
axis(1, at=ticks_x, labels=ticks_x, col.axis="blue")

dev.off()

############################# display in GGPLOT2 :

BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500,
           1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000)

barfill <- "#4271AE"
barlines <- "#1F3552"

pdf("display.ggplot2.ecdf.LENGTH.pdf", width=10, height=6, paper='special')

# The x limits are set on the scale here (not on the coordinate system).
ggplot(file, aes(LENGTH)) +
  stat_ecdf(geom = "point", colour = barlines, fill = barfill) +
  scale_x_continuous(name = "LENGTH of DEL",
                     breaks = BREAKS,
                     limits=c(0, 500)) +
  scale_y_continuous(name = "FRACTION") +
  ggtitle("ECDF of LENGTH") +
  theme_bw() +
  theme(legend.position = "bottom", legend.direction = "horizontal",
        legend.box = "horizontal",
        legend.key.size = unit(1, "cm"),
        axis.title = element_text(size = 12),
        legend.text = element_text(size = 9),
        legend.title=element_text(face = "bold", size = 9))

dev.off()
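
For comparison, here is a minimal sketch of the change suggested in the quoted reply below: keep every value in the ECDF calculation and only zoom the view with coord_cartesian(). It uses simulated data rather than the LENGTH file, so the data frame `del`, the seed, and the exponential distribution are all assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the LENGTH file: 1000 skewed deletion sizes.
set.seed(1)
del <- data.frame(LENGTH = round(rexp(1000, rate = 1 / 400)) + 1)

# Limits on the scale: values above 500 become NA and are dropped before
# stat_ecdf() runs, so the curve is computed on the remaining values only
# and reaches 1 at x = 500.
p_scale <- ggplot(del, aes(LENGTH)) +
  stat_ecdf(geom = "step") +
  scale_x_continuous(name = "LENGTH of DEL", limits = c(0, 500)) +
  ggtitle("limits on the scale")

# Limits on the coordinate system: the ECDF is computed on all values and
# the view is then zoomed to 0..500, which is what
# plot(ecdf(...), xlim = c(0, 500)) shows in base graphics.
p_coord <- ggplot(del, aes(LENGTH)) +
  stat_ecdf(geom = "step") +
  scale_x_continuous(name = "LENGTH of DEL") +
  coord_cartesian(xlim = c(0, 500)) +
  ggtitle("limits on the coordinates")
```

Printing p_scale should raise the "Removed ... rows" warning, while p_coord should not. If that holds for the real data as well, replacing limits=c(0, 500) in scale_x_continuous() with coord_cartesian(xlim = c(0, 500)) is probably the only change needed.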
On Sat, Jul 7, 2018 at 9:47 PM, Jeff Newmiller <jdnewmil using dcn.davis.ca.us>
wrote:
> It is a feature of ggplot that points excluded by limits raise warnings,
> while base graphics do not.
>
> You may find that using coord_cartesian with the xlim=c(0,500) argument
> works better with ggplot by showing the consequences of points out of the
> limits on lines within the viewport.
>
> There are other possible problems with your data that your
> non-reproducible example does not show, and sending R code in
> HTML-formatted email usually corrupts it, so please follow the
> recommendations in the Posting Guide next time you post.
>
> On July 6, 2018 4:32:41 PM PDT, Bogdan Tanasa <tanasa using gmail.com> wrote:
> >Dear all,
> >
> >I would appreciate having your advice/suggestions/comments on the
> >following:
> >
> >1 -- starting from a vector that contains LENGTHS (numerically, the
> >values are from 1 to 10 000)
> >
> >2 -- if I display the ECDF by using the R code below with some "limits" :
> >
> >BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500,
> >           1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000)
> >
> >ggplot(x, aes(LENGTH)) +
> >  stat_ecdf(geom = "point") +
> >  scale_x_continuous(name = "LENGTH of DEL",
> >                     breaks = BREAKS,
> >                     limits=c(0, 500))
> >
> >3 -- I am getting the following warning message : "Warning message:
> >Removed 109 rows containing non-finite values (stat_ecdf)."
> >
> >The question is : are these 109 values removed from VISUALIZATION as I
> >set up the "limits", or are these 109 values removed from statistical
> >CALCULATION?
> >
> >4 -- in contrast, if I use the standard R function plot(ecdf),
> >there is no "warning message"
> >
> >plot(ecdf(x$LENGTH), xlab="DEL LENGTH",
> >     ylab="Fraction of DEL", main="DEL", xlim=c(0,500),
> >     col = "dark red")
> >
> >Thanks a lot !
> >
> >-- bogdan
>
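
On the question in the first message (are the 109 values removed from the display or from the calculation), a small base-R check may help. With limits set on a ggplot2 scale, out-of-range values are dropped before the statistic is computed, so the warning count should match the number of values outside the limits. The sketch below uses simulated data; the vector `len`, the seed, and the distribution are assumptions, not the real LENGTH values.

```r
# Hypothetical data: 1000 simulated lengths, not the real LENGTH file.
set.seed(42)
len <- round(rexp(1000, rate = 1 / 300)) + 1

inside <- len >= 0 & len <= 500

# Number of values that a scale limit of c(0, 500) would discard; this is
# the kind of count the "Removed ... rows" warning reports.
n_removed <- sum(!inside)

# ECDF on all values (what plot(ecdf(x), xlim = c(0, 500)) draws) versus
# the ECDF on the values that survive the limits.
ecdf_all  <- ecdf(len)
ecdf_kept <- ecdf(len[inside])

# Compare the two curves at a few points inside the limits.
at <- c(100, 250, 500)
comparison <- data.frame(x = at,
                         all_values    = ecdf_all(at),
                         within_limits = ecdf_kept(at))
print(n_removed)
print(comparison)
```

Inside the limits, the truncated curve is the full curve multiplied by length(len) / sum(inside), so it always ends at 1 at the upper limit. That rescaling would explain two plots of the same data looking different over the same x range.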
-------------- next part --------------
A non-text attachment was scrubbed...
Name: display.ggplot2.ecdf.LENGTH.pdf
Type: application/pdf
Size: 8841 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20180708/75da9c56/attachment-0004.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: display.R.ecdf.LENGTH.pdf
Type: application/pdf
Size: 13600 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20180708/75da9c56/attachment-0005.pdf>