[R] Scatterplot and Causality

R. Michael Weylandt michael.weylandt at gmail.com
Mon Apr 22 17:00:33 CEST 2013


On Mon, Apr 22, 2013 at 3:48 PM, Lorenzo Isella
<lorenzo.isella at gmail.com> wrote:
> Dear All,
> I hope this is not too off topic.
> I am given a set of scatterplots (nothing too fancy; think about a
> normal x-y 2D plot).
> I do not deal with two time series (indeed I have no info about time).
> If I call A=(A1, A2, ...) and B=(B1, B2, ...) the two variables (vectors
> of numbers in most cases, though sometimes categorical variables), I can
> plot one against the other, and I essentially need to determine whether
>
> A=f(B, noise) or B=g(A, noise)

What's the mathematical difference in these two cases? It seems only a
matter of interpretation.
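
A rough sketch of what I mean, in Python with only the standard library.
Everything here is made up for illustration: the data are simulated so
that B really does drive A, the slope 2.0 and the noise level are
arbitrary, and r_squared is a hypothetical helper written for this
example, not a library function.

```python
import random


def r_squared(x, y):
    """R^2 of the least-squares line that predicts y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(1)  # fixed seed so the run is repeatable
B = [rng.gauss(0.0, 1.0) for _ in range(200)]
A = [2.0 * b + rng.gauss(0.0, 1.0) for b in B]  # by construction, B drives A

r2_A_from_B = r_squared(B, A)  # fit A = f(B) + noise
r2_B_from_A = r_squared(A, B)  # fit B = g(A) + noise

# The two values agree up to floating-point rounding.
print(r2_A_from_B, r2_B_from_A)
```

For a straight-line fit, R^2 is the squared correlation whichever
variable you call the response, so the quality of the fit by itself
cannot pick a direction, even though we know the direction here.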

>
> where the noise is the effect of other, possibly unknown, variables,
> measurement errors, etc., and f and g are two functions.
>
> Without the noise, if I want to test if A=f(B) [B causes A], then I
> need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different
> effects must have a different cause), whereas it is not ruled out that
> f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect).
>
> However, in the presence of noise, these properties will hold only
> approximately

Do they even hold approximately?
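
To make the doubt concrete, here is a small simulated sketch (Python,
standard library only). The cause values, the function f(b) = b**2 and
the noise level are all assumptions chosen for illustration. It counts
pairs of observations that share the same cause B but show different
effects A, which is exactly what "different effects must have a
different cause" forbids.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical setup: B takes a few discrete values, A = f(B) + noise.
B = [rng.choice([1, 2, 3]) for _ in range(30)]
A = [b ** 2 + rng.gauss(0.0, 0.1) for b in B]

n = len(B)
# Pairs with the same cause.
same_cause_pairs = sum(
    1 for i in range(n) for j in range(i + 1, n) if B[i] == B[j]
)
# Pairs with the same cause but a different observed effect.
violations = sum(
    1
    for i in range(n)
    for j in range(i + 1, n)
    if B[i] == B[j] and A[i] != A[j]
)

print(same_cause_pairs, violations)
```

With continuous noise, repeated causes will essentially always show
different effects, so the rule as stated fails outright rather than
holding approximately; any usable version would need a tolerance on
the effects.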

> so... any idea for a statistical test, rather than
> eyeballing, to tell apart A=f(B, noise) and B=g(A, noise)?
> Any suggestion is welcome.

http://xkcd.com/552/

>
> Lorenzo
>


