# [R-sig-Geo] Alternate statistical test to linear regression?

rain1290 <rain1290 using aim.com>
Wed Oct 23 19:25:38 CEST 2019

```
Hi Greg and others,
Thank you for your very informative response! I made a mistake in my initial message: I was actually testing the y variable for normality, not the x. I will look into those packages on CRAN. Still, even if y shows some skewness, my sample size is much larger than 30 (N > 30), so might it be safe to apply a linear regression analysis, provided we can assume linearity?
A useful alternative might be to use a correlation coefficient to measure the degree of association between the x and y variables; specifically, the Pearson correlation coefficient, since both x and y are quantitative. Does that make sense?

Thanks again,

-----Original Message-----
From: Greg Snow <538280 using gmail.com>
To: rain1290 <rain1290 using aim.com>
Cc: r-sig-geo <r-sig-geo using r-project.org>
Sent: Wed, Oct 23, 2019 1:00 pm
Subject: Re: [R-sig-Geo] Alternate statistical test to linear regression?

Note that the normality assumptions are about the residuals (or about
y conditional on x), not on the x variable(s) or all of y
(non-conditional).  If x is highly skewed and the residuals are normal
then diagnostics just on y will also show skewness (if there is a
relationship between x and y).

Also, the normality assumptions are about the tests and confidence
intervals; the least squares fit is legitimate (but possibly not the
most interesting fit) whether the residuals are normal or not.  The
Central Limit Theorem also applies in regression, so if the residuals
are non-normal, but you have a large sample size then the tests and
intervals will still be approximately correct (with the quality of the
approximation depending on the degree of non-normality and sample
size).

There are many alternative tools.  There is a task view on CRAN for
Robust Statistical Methods that gives summaries of many packages and
tools for robust regression (and other things as well) which does not
depend on the normality assumptions.

On Wed, Oct 23, 2019 at 9:21 AM rain1290--- via R-sig-Geo
<r-sig-geo using r-project.org> wrote:
>
> Greetings,
> I am testing whether linear relationships exist between my x and y variables. I ran various diagnostics in R to check the normality of the x variable, using qqnorm, qqline and histograms of the data. If the data appear normally distributed in either the normal quantile plots or the histograms (i.e. a bell-shaped distribution), I assume normality and fit a linear regression model using "lm". In some cases, however, my distributions do not satisfy the normality criteria, so I feel that using the linear regression model in those cases would not be appropriate. Could you suggest an alternative to the linear regression model in R? Maybe a non-parametric counterpart?
> Thank you, and any help would be greatly appreciated!

--
Gregory (Greg) L. Snow Ph.D.
538280 using gmail.com

```
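The points raised in the thread can be illustrated with a small R sketch. Everything here is an assumption made for illustration, not something from the original messages: the data are simulated, the `skewness` helper is hypothetical (base R has no built-in skewness function), and `MASS::rlm` is only one of many robust-regression tools of the kind the CRAN task view lists. The sketch suggests that a skewed x can make the raw y look skewed even when the residuals are well behaved, and that the Pearson correlation test gives the same t statistic as the slope test from `lm`, so it is not really a separate alternative.

```r
# Simulated data (assumption, not from the thread): skewed x, normal errors.
set.seed(42)
n <- 500
x <- rexp(n, rate = 1)              # strongly right-skewed predictor
y <- 2 + 3 * x + rnorm(n, sd = 1)   # linear signal plus normal noise

fit <- lm(y ~ x)
res <- residuals(fit)

# Hypothetical helper: simple moment-based sample skewness.
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

skew_y   <- skewness(y)     # raw y inherits skewness from x
skew_res <- skewness(res)   # residuals should be roughly symmetric

# Normality diagnostics belong on the residuals, not on raw y.
if (interactive()) {
  qqnorm(res)
  qqline(res)
}

# The Pearson correlation test and the lm slope test share one t statistic.
ct      <- cor.test(x, y, method = "pearson")
slope_t <- summary(fit)$coefficients["x", "t value"]
slope_p <- summary(fit)$coefficients["x", "Pr(>|t|)"]

# A rank-based measure of association that does not assume normality.
sp <- cor.test(x, y, method = "spearman")

# One robust-regression option, if the MASS package is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  rfit <- MASS::rlm(y ~ x)          # Huber M-estimator by default
  robust_slope <- unname(coef(rfit)["x"])
}
```

Comparing `skew_y` with `skew_res`, and `ct$statistic` with `slope_t`, shows both effects on one data set. With real data, the residual Q-Q plot is the diagnostic to look at before deciding whether a robust or rank-based method is needed.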