[R] weighted cumulative distribution with ggplot2

David Winsemius dwinsemius at comcast.net
Mon Oct 8 19:24:17 CEST 2012


On Oct 8, 2012, at 10:12 AM, David Winsemius wrote:

> 
> On Oct 8, 2012, at 9:18 AM, David Winsemius wrote:
> 
>> 
>> On Oct 8, 2012, at 8:01 AM, Francesco wrote:
>> 
>>> I think I have my answer... ggplot2 uses ecdf which does NOT allow
>>> weightings...
>>> so there is no warning or error, but the resulting plot still does not
>>> take into account the argument weight=weight
>> 
>> It was completely unclear why you expected ggplot to use 'ewcdf' when you gave a command to use 'ecdf'.
> 
> You might want to look at stat_function. It appears designed to provide a mechanism for running data through functions that do not have current support in ggplot2. I've never really grokked how one is supposed to pass arguments into ggplot constructs, and I find the help pages not so helpful in figuring this out, so this is a big fat untested guess.
> 

Here's a further stab at implementing my guess:

library(ggplot2)   # qplot, stat_function
library(spatstat)  # ewcdf

dat <- read.table(text="X     Weight Year
0      2         2001
0      1         2001
1      5         2001
2      1         2001
2      3         2001
2      2         2002", header=TRUE)

# Notice that ewcdf returns a function rather than a vector:

 with(dat, ewcdf(X, weights=Weight) )
Empirical CDF 
Call: ewcdf(X, weights = Weight)
 x[1:3] =      0,      1,      2
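
For anyone unsure what the weighted version computes, here is a minimal base-R sketch. The helper name weighted_ecdf is made up for illustration and is not part of spatstat or ggplot2. It assumes non-negative weights and builds a right-continuous step function with stepfun(), so, like ewcdf, it returns a function rather than a vector.

```r
# Hypothetical helper (not from spatstat): weighted empirical CDF.
# Assumes w is non-negative with a positive sum.
weighted_ecdf <- function(x, w) {
  ux <- sort(unique(x))
  # Total weight sitting at each distinct x value
  wsum <- vapply(ux, function(v) sum(w[x == v]), numeric(1))
  # Step function: 0 left of min(x), cumulative weight share after each jump
  stepfun(ux, c(0, cumsum(wsum) / sum(wsum)))
}

dat <- read.table(text="X Weight Year
0 2 2001
0 1 2001
1 5 2001
2 1 2001
2 3 2001
2 2 2002", header=TRUE)

Fw <- with(dat, weighted_ecdf(X, Weight))  # weighted
Fu <- ecdf(dat$X)                          # unweighted, for comparison

Fw(c(0, 1, 2))
Fu(c(0, 1, 2))
```

With this data the weighted curve reaches 3/14 at X=0 and 8/14 at X=1, where the unweighted ecdf gives 2/6 and 3/6, so the two really do differ.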

# Column names are case-sensitive: the data has Weight and Year, not weight and year
temp <- qplot(X, weight=Weight, data=dat, stat="ecdf", geom="step",
              colour=factor(Year))

temp + stat_function(fun = with(dat, ewcdf(X, weights=Weight) ), 
                      mapping=aes(x=X, weights=Weight), colour = "red", 
                      data=dat )

I'm not sure that is what was intended. I think there may still be residual points from the first qplot call, but the data does seem to be getting through to the ewcdf function. Maybe you can fix it.
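
Another route, offered only as a sketch, is to skip the ecdf stat altogether: compute the cumulative weight share per year yourself and hand the result to geom_step. The names agg and cum_w are just illustrative, and the plotting part is guarded so that the data step still runs where ggplot2 is not installed.

```r
dat <- read.table(text="X Weight Year
0 2 2001
0 1 2001
1 5 2001
2 1 2001
2 3 2001
2 2 2002", header=TRUE)

# Sum the weights at each (X, Year) so tied X values become a single jump
agg <- aggregate(Weight ~ X + Year, data = dat, FUN = sum)
agg <- agg[order(agg$Year, agg$X), ]

# Cumulative weight share within each year
agg$cum_w <- ave(as.numeric(agg$Weight), agg$Year,
                 FUN = function(w) cumsum(w) / sum(w))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(agg, aes(x = X, y = cum_w, colour = factor(Year))) +
    geom_step() +
    geom_point()
  print(p)
}
```

In this toy data 2002 has only one row, so geom_step has no line to draw for it; the points are added so that year still shows up. Each curve also starts at its first observed X rather than at zero.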


> -- 
> David.
> 
>> 
>> 
>>> 
>>> Hope that helps someone, just in case ;-)
>>> 
>>> On 8 October 2012 15:40, Francesco <cariboupad at gmx.fr> wrote:
>>>> Dear all,
>>>> 
>>>> I am trying to draw a weighted cumulative distribution (as defined
>>>> here http://rss.acs.unt.edu/Rdoc/library/spatstat/html/ewcdf.html)
>>>> with ggplot2
>>>> 
>>>> however the syntax
>>>> 
>>>> temp<-qplot(X,weight=weight,data=data,stat = "ecdf", geom =
>>>> "step",colour=factor(year))
>>>> 
>>>> seems not to produce exactly the right figure (the values seem higher
>>>> at some points)... Am I wrong in the weight definition?
>>>> 
>>>> The data is like the following
>>>> 
>>>> X     Weight Year
>>>> 0      2         2001
>>>> 0      1         2001
>>>> 1      5         2001
>>>> 2      1         2001
>>>> 2      3         2001
>>>> 2      2         2002
>>>> 3.. etc
>>>> 
>>>> Any ideas ?
>>>> Many thanks in advance
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> David Winsemius, MD
>> Alameda, CA, USA
>> 

David Winsemius, MD
Alameda, CA, USA



