# [R] ecdf() to nls() - how to transform data?

David Winsemius dwinsemius at comcast.net
Mon Jul 18 02:59:50 CEST 2011

```On Jul 17, 2011, at 8:13 PM, Jochen1980 wrote:

> Hi David,
>
> my first attempt to work through your code was successful, my
> predicted line
> is pretty close to the ecdf-function. I have no idea why you
> inverted the
> gumbel-function and what the advantage of this strategy is?

I was advised by a person who knows way more statistics than I do that
the strategy of minimizing the error around a regular step function
was statistically suspect (presumably even if the x value were a
random process). So I inverted the solution and put the random values
on the y-axis, leaving the stair-step values on the x-axis.. I think
claiming an "advantage" might be premature (especially since both
methods yield very similar values) and would probably require some
testing before acceptance. There is of course the fitdistr function in
MASS and there is a nice vignette on "Fitting Distribution in R" some
place around (yep, first google hit):

http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

>
> I interpret your (1:100/100)-trick like this: you build a sequence
> of 100
> Tics. In addition to this 100 is identical to the number of created
> points.

The ecdf function object (at least as I understood from its plotted
result) sets up its y-values (in the function's environment) to be a
regular sequence determined by the number of original points and
allows you to recover the x-values with the knots function.

--
David

>
> In my real world situation, ecdf() will produce ties, same values,
> and this
> results that I do not know what number knot() will return, then this
> will
> result in a non-symmetric-data.frame()-error.
>
> R-Code:
> ############### Tutorial 3, Gamma-Distribution-Points to Gumbel
> ################
> # --- create Points --- #
> http://r.789695.n4.nabble.com/ecdf-to-nls-how-to-transform-data-td3671754.html
> print( "Tutorial 3" )
> p <- rgamma( 100, 2)
> print( p )
> ep <- ecdf( p )
> x <- knots(ep)
> hist( p, main="Tutorial 3" )
> ypre <- seq( from=1, to=100, by=1 )
> y <- ypre / 100
> df <- data.frame( x=x, y=y )
> plot( df )
> res <- nls( y ~ exp( - exp( - ( x - mue ) / beta )), start=list(mue=2,
> beta=2), data=df, trace=TRUE )
> print( summary(res) )
> #print( str(res))
> lines( x, predict( res, list( x=x )), lty = 1, col = "blue" ) #
> predicted
> vals
>
> Console-Output:
> [1] "Tutorial 3"
>  [1] 4.75748951 5.36642043 2.03025702 1.80216440 1.40745277 0.91393251
>  [7] 1.30575505 2.07451301 1.18500815 0.24972503 1.36604865 2.36786796
> [13] 1.37386798 1.72390843 1.93700139 0.78722468 2.92828385 1.51165415
> [19] 4.17000960 2.98356424 1.20195996 1.85899533 0.94863757 1.50438053
> [25] 0.50854274 3.32760075 2.21334316 0.58463817 2.26985676 2.88033389
> [31] 2.00718903 1.19043470 1.62945752 0.32010478 0.54837755 1.32965131
> [37] 3.32024573 0.53825226 3.30077557 0.46426513 1.00681720 0.59276030
> [43] 1.31431609 2.27822419 2.56404713 0.52459218 1.05228996 2.23799673
> [49] 1.14438962 2.41311612 1.40254244 1.20379073 0.44195457 0.95880408
> [55] 1.45254027 4.49645001 2.18826796 1.07161515 1.62617544 3.15506003
> [61] 2.15611491 2.43261017 1.40293461 2.79977886 3.44503201 3.25282551
> [67] 0.91570531 0.70243512 5.67971774 3.55532192 1.71515046 2.97718949
> [73] 0.47145131 0.32879089 0.60735231 0.92388638 4.29277857 1.73681839
> [79] 0.09387383 2.81281295 1.84419058 2.63070353 1.52298124 2.89761263
> [85] 1.05251987 1.24258703 3.09130229 2.66738574 3.17035060 2.15287327
> [91] 2.63042751 1.69736779 3.21126033 1.56920999 2.68985477 0.82952068
> [97] 3.62230865 1.65286592 2.76798783 2.08091935
>
> Formula: y ~ exp(-exp(-(x - mue)/beta))
>
> Parameters:
>     Estimate Std. Error t value Pr(>|t|)
> mue  1.382744   0.005926   233.3   <2e-16 ***
> beta 0.984836   0.008816   111.7   <2e-16 ***
>
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/ecdf-to-nls-how-to-transform-data-tp3671754p3674227.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

```