[R] Siegel nonparametric regression / mblm package
Marco Besozzi
m@rco@be@o48 @end|ng |rom gm@||@com
Mon Feb 11 14:48:37 CET 2019
Thanks a lot!
Il giorno lun 11 feb 2019 alle ore 14:39 Roger Koenker <
rkoenker using illinois.edu> ha scritto:
> A quick look at the code for Siegel in mblm reveals that it is extremely
> inefficient, but it seems to be correct.
> One “explanation” for this behavior, presuming that we haven’t overlooked
> something more basic, is that such
> high breakdown estimates sacrifice some efficiency, that is to say, they
> are more variable than other methods
> when the data is well behaved, and of course, the Galton data is famously
> “almost Gaussian”.
>
> On Feb 11, 2019, at 12:47 PM, Marco Besozzi <marco.beso48 using gmail.com>
> wrote:
>
> Thank you very much for your reply.
> If I have well understood, unfortunately in this way I have lost the only
> idea I had...
> Do you believe that a problem in the R algorithm employed in the package
> mblm for Siegel regression is possible?
> And do you know if Siegel regression is available in a different package?
> I was unable to find it.
> Thanks again!
> Best regards.
>
> P.S.: sorry for my bad english...
>
> Il giorno lun 11 feb 2019 alle ore 12:54 Roger Koenker <
> rkoenker using illinois.edu> ha scritto:
>
>> My first thought was also that this was an artifact of the ties, but
>> dithering the data
>> n <- length(child)
>> child <- child + runif(n,-.5,.5)
>> parent <- parent + runif(n,-.5,.5)
>>
>> and rerunning yields the same discrepancy between the Siegel and other
>> fits. Curiously, both
>> lmsreg and ltsreg from MASS produce lines that are more steeply sloped
>> than those
>> of the other methods. Since I stupidly forgot to set.seed(), YMMV.
>>
>> > On Feb 11, 2019, at 10:24 AM, Marco Besozzi <marco.beso48 using gmail.com>
>> wrote:
>> >
>> > I employed the "galton" set of data included in the package "psych".
>> With
>> > the package "mblm" I obtained the Theil-Sen nonparametric regression and
>> > the Siegel non parametric regression, and compared them with the
>> ordinary
>> > least square regression line.
>> > The results of standard regression and Theil-Sen regression are
>> practically
>> > identical. But the Siegel regression seems to have a bias that I cannot
>> > understand. May I ask for a possible explanation? The bias may be
>> related
>> > to the number of ties in the set of data? Here's the code and the image.
>> >
>> > Best regards.
>> >
>> > Marco Besozzi
>> > # Theil-Sen and Siegel nonparametric regression with package mblm
>> > # comparison with ordinary least squares (parametric) regression
>> > # on galton set of data included in the package psych
>> > #
>> > library(psych)
>> > attach(galton)
>> > library(mblm)
>> > #
>> > reglin_yx <- lm(child ~ parent, data=galton) # ordinary least squares
>> > (parametric) regression
>> > a_yx <- reglin_yx$coefficients[1] # intercept a
>> > b_yx <- reglin_yx$coefficients[2] # slope b
>> > #
>> > regnonTS <- mblm(child ~ parent, data=galton, repeated=FALSE) #
>> Theil-Sen
>> > nonparametric regression (wait a few minutes!)
>> > a_TS <- regnonTS$coefficients[1] # intercept a
>> > b_TS <- regnonTS$coefficients[2] # slope b
>> > #
>> > regnonS = mblm(child ~ parent, data=galton, repeated=TRUE) # Siegel
>> > nonparametric regression
>> > a_S <- regnonS$coefficients[1] # intercept a
>> > b_S <- regnonS$coefficients[2] # slope b
>> > #
>> > # xy plot of data and regression lines
>> > #
>> > windows() # open a new window
>> > plot(parent, child, xlim = c(60,80), ylim = c(60,80), pch=1,
>> xlab="Parent
>> > heigt (inch)", ylab="Chile height (inch)", main="Regression lines
>> > comparison", cex.main = 0.9) # data plot
>> > abline(a_yx, b_yx, col="green", lty=1) # ordinary least squares
>> > (parametric) regression line
>> > abline(a_TS, b_TS, col="blue", lty=1) # Theil-Sen nonparametric
>> regression
>> > line
>> > abline(a_S, b_S, col="red", lty=1) # Siegel nonparametric regression
>> > legend(60, 80, legend=c("Ordinary least squares regression", "Theil-Sen
>> > nonparametric regression","Siegel nonparametric regression"),
>> > col=c("green", "blue", "red"), lty=c(4,4,1), cex=0.8) # add a legend
>> > #
>> > <Siegel.PNG>______________________________________________
>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> <http://www.r-project.org/posting-guide.html>
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list