[R] Linear regression with a rounded response variable

Victor Tian tianxu03 at gmail.com
Wed Oct 21 18:21:31 CEST 2015


Hi Ravi,

Thanks for this interesting question. My thoughts are given below.

If you believe the rounding error is indeed uniformly distributed, then the
problem is equivalent to adding a uniform random error on (-0.5, 0.5) to
every observation, in addition to the usual normal error. The combined error
term is then the sum of a normal and a uniform variable, so it is no longer
exactly normal; assuming the two are independent, its variance is the normal
variance plus 1/12.
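
As a rough numerical check of the uniform-error view above, here is a minimal
Python sketch (standard library only). The walking-time mean and SD are
made-up values, not taken from Ravi's data, and the function name
rounding_errors is hypothetical. It rounds simulated times to the nearest
integer and compares the rounding error with what a uniform (-0.5, 0.5)
variable would give (mean 0, variance 1/12).

```python
import random
import statistics

def rounding_errors(n=100_000, mean=6.0, sd=1.5, seed=1):
    """Simulate unrounded values and return (rounded - unrounded) for each."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        t = rng.gauss(mean, sd)      # hypothetical unrounded walking time (seconds)
        errors.append(round(t) - t)  # error introduced by rounding to nearest integer
    return errors

errs = rounding_errors()
print("mean of rounding error:    ", statistics.fmean(errs))
print("variance of rounding error:", statistics.pvariance(errs))
print("uniform(-0.5, 0.5) variance:", 1 / 12)
```

The approximation should be good when the spread of the unrounded values is
large relative to the rounding interval; with a very small sd the rounding
error would no longer look uniform.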

Intuitively, the impact of this added term depends on the relative scale of
the original normal error and the new uniform error. To see the exact
impact, you can simulate sets of new response variables by adding uniform
errors from (-0.5, 0.5) to the original response variables, refit the model
on each set, and compare the results.
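
The simulation described above could be sketched as follows in Python
(standard library only). Since the real data are not available here, the
sketch uses synthetic data: the single predictor (age), the coefficients, and
the error SD are all made-up assumptions, and ols is a hypothetical helper
for one-predictor least squares. It fits the model to the unrounded and the
rounded response, then refits repeatedly after adding fresh uniform
(-0.5, 0.5) errors to the rounded response.

```python
import random
import statistics

def ols(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(2015)

# Hypothetical data: walking time (seconds) increasing with age.
# The coefficients and error SD are illustrative assumptions only.
n = 500
age = [rng.uniform(60, 90) for _ in range(n)]
time_true = [1.0 + 0.06 * a + rng.gauss(0, 0.8) for a in age]
time_rounded = [float(round(t)) for t in time_true]

fit_true = ols(age, time_true)
fit_rounded = ols(age, time_rounded)

# Refit after adding fresh uniform(-0.5, 0.5) errors to the recorded response.
jittered_slopes = []
for _ in range(200):
    y_new = [t + rng.uniform(-0.5, 0.5) for t in time_rounded]
    jittered_slopes.append(ols(age, y_new)[1])

print("slope, unrounded response:", fit_true[1])
print("slope, rounded response:  ", fit_rounded[1])
print("jittered slopes, mean:    ", statistics.fmean(jittered_slopes))
print("jittered slopes, sd:      ", statistics.stdev(jittered_slopes))
```

With real data only the rounded response is observed, so only the last part
(jitter and refit) applies; the spread of the refitted coefficients gives a
sense of how much the rounding could matter relative to their standard errors.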

I wish I had a more theoretical answer, but I hope this helps.

Best,
Xu

Xu Tian, Ph.D.
Senior Statistician
Validus Research
New York, NY 10005

On Wed, Oct 21, 2015 at 10:53 AM, Ravi Varadhan <ravi.varadhan at jhu.edu>
wrote:

> Hi,
> I am dealing with a regression problem where the response variable, time
> (seconds) to walk 15 ft, is rounded to the nearest integer.  I do not care
> for the regression coefficients per se, but my main interest is in getting
> the prediction equation for walking speed, given the predictors (age,
> height, sex, etc.), where the predictions will be real numbers, and not
> integers.  The hope is that these predictions should provide unbiased
> estimates of the "unrounded" walking speed. This sounds like a measurement
> error problem, where the measurement error is due to rounding and hence
> would be uniformly distributed on (-0.5, 0.5).
>
> Are there any canonical approaches for handling this type of a problem?
> What is wrong with just doing the standard linear regression?
>
> I googled and saw that this question was asked by someone else in a
> stackexchange post, but it was unanswered.  Any suggestions?
>
> Thank you,
> Ravi
>
> Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
> Associate Professor,  Department of Oncology
> Division of Biostatistics & Bioinformatics
> Sidney Kimmel Comprehensive Cancer Center
> Johns Hopkins University
> 550 N. Broadway, Suite 1111-E
> Baltimore, MD 21205
> 410-502-2619


